2019 Fiscal Year Final Research Report
A study on shared representation learning considering the uncertainty of each modality
Project/Area Number | 19K21527 |
Project/Area Number (Other) | 18H06458 (2018) |
Research Category | Grant-in-Aid for Research Activity Start-up |
Allocation Type | Multi-year Fund (2019) / Single-year Grants (2018) |
Review Section | 1001: Information science, computer engineering, and related fields |
Research Institution | The University of Tokyo |
Principal Investigator | Suzuki Masahiro, The University of Tokyo, Graduate School of Engineering, Project Researcher (30823885) |
Project Period (FY) | 2018-08-24 – 2020-03-31 |
Keywords | Deep learning / Shared representation learning / Multimodal learning / Deep generative models |
Outline of Final Research Achievements | In this research, we addressed how to integrate several different types of information (i.e., different modalities), such as images, documents, and sounds. Previous studies did not take the differences in uncertainty across modalities into account and therefore integrated them deterministically. In this study, we proposed the probabilistic integration of different modalities based on a framework called deep generative models (an illustrative sketch of such uncertainty-weighted fusion is given at the end of this report). We then showed that this approach is effective in multiple multimodal learning problem settings. In addition, we developed a new library that simplifies the implementation of complex deep generative models, including those that capture relationships among multimodal information. |
Free Research Field | Artificial intelligence |
Academic Significance and Societal Importance of the Research Achievements | We believe that the framework for integrating different modalities proposed in this research can be applied to a wide range of domains, independent of the particular data and problem settings handled here. This is because the integration method, being based on deep generative models, focuses only on the differences in uncertainty across modalities and does not depend on the dimensionality of each modality's input space. In addition, the deep generative model library developed in this research can be used to implement a wide variety of deep generative models, not only models for multimodal learning. |
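The report itself contains no code, but one common way to realize uncertainty-aware integration of modalities in deep generative models is precision-weighted (product-of-experts) fusion of modality-specific Gaussian posteriors, as used in several multimodal VAEs. The sketch below is only an illustrative assumption, not the exact model or library API developed in this project; the function poe_fusion and the toy tensors are hypothetical.

```python
# Illustrative sketch (assumption, not the project's actual code):
# precision-weighted (product-of-experts) fusion of two modality-specific
# Gaussian posteriors over a shared latent variable z, so that the less
# certain modality automatically contributes less to the shared representation.
import torch

def poe_fusion(mu1, logvar1, mu2, logvar2):
    """Combine two diagonal Gaussian posteriors q1(z), q2(z).

    The product of Gaussians is again Gaussian: its precision is the sum of
    the individual precisions, and its mean is the precision-weighted mean.
    """
    prec1 = torch.exp(-logvar1)              # precision = 1 / variance
    prec2 = torch.exp(-logvar2)
    prec = prec1 + prec2
    var = 1.0 / prec
    mu = var * (prec1 * mu1 + prec2 * mu2)   # precision-weighted mean
    return mu, torch.log(var)

# Toy usage: a confident "image" posterior and an uncertain "text" posterior;
# the fused mean stays close to the confident modality.
mu_img, logvar_img = torch.tensor([1.0]), torch.tensor([-2.0])   # var ~ 0.14
mu_txt, logvar_txt = torch.tensor([-1.0]), torch.tensor([2.0])   # var ~ 7.4
mu, logvar = poe_fusion(mu_img, logvar_img, mu_txt, logvar_txt)
print(mu, logvar.exp())   # fused mean near 1.0, small fused variance
```

Because the fused precision is the sum of the per-modality precisions, a modality with larger posterior variance is automatically down-weighted in the shared latent, which is one concrete sense in which differences in uncertainty across modalities can be taken into account.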