2022 Fiscal Year Final Research Report

Deep State Space Modeling Methods for Video Understanding

Project/Area Number	19K12039
Research Category	Grant-in-Aid for Scientific Research (C)
Allocation Type	Multi-year Fund
Section	General
Review Section	Basic Section 61010: Perceptual information processing-related
Research Institution	Chiba University
Principal Investigator	Kawamoto Kazuhiko Chiba University, Graduate School of Engineering, Professor (30345376)
Project Period (FY)	2019-04-01 – 2023-03-31
Keywords	State-space model / Deep learning
Outline of Final Research Achievements	This study tackled video understanding by integrating deep learning with state-space models. First, we introduced a deep Markov model for predicting chaotic dynamics. Next, we extended the deep Markov model to a 2D convolutional neural Markov model that handles both time-series and spatial data. We also developed deep models for video generation and action recognition. We then built a deep model that enables control of video generation, and developed zero-shot image generation. Finally, we developed a sequential variational autoencoder that separates static and dynamic features in video. These studies demonstrated the effectiveness of our approach.
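Free Research Field	Computer vision
Academic Significance and Societal Importance of the Research Achievements	Integrating deep learning models with state-space models makes it possible to model video understanding tasks in computer vision appropriately, so that tasks such as action recognition, person tracking, and video generation can be carried out more accurately and efficiently. This can contribute to fields such as surveillance systems, self-driving cars, and robotics. Video generation technology is also expected to find applications in entertainment and advertising.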