2022 Fiscal Year Final Research Report
Development of Multimodal Data Retrieval Engine Based on Human Cognitive System
Project/Area Number | 19H04172 |
Research Category | Grant-in-Aid for Scientific Research (B) |
Allocation Type | Single-year Grants |
Section | General |
Review Section | Basic Section 61030: Intelligent informatics-related |
Research Institution | Osaka Gakuin University (2020-2022) Kobe University (2019) |
Principal Investigator |
|
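Co-Investigator (Kenkyū-buntansha) |
Shirahama Kimiaki (白浜 公章), Kindai University, Faculty of Informatics, Associate Professor (30467675)
Matsubara Takashi (松原 崇), Osaka University, Graduate School of Engineering Science, Associate Professor (70756197)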
|
Project Period (FY) | 2019-04-01 – 2023-03-31 |
Keywords | deep learning / multimodal data / epistemic uncertainty / aleatoric uncertainty / multimodal retrieval |
Outline of Final Research Achievements |
This project focuses on embedding, a technique for semantically aligning images with text captions. It aims to exploit characteristics of biological cognitive systems to develop embedding approaches that can decompose a complex meaning into its component concepts and analyze the attributes and relations of those concepts. One main contribution is a set of embedding methods that model uncertainty in order to evaluate the importance of information and the reliability of retrieval results. Another is a method, based on the human attention mechanism, that iteratively aligns words and phrases in a text caption with regions in an image. The effectiveness of these methods was validated on large-scale benchmark datasets and through participation in an international competition (TRECVID).
|
Free Research Field |
Artificial intelligence, especially machine learning
|
Academic Significance and Societal Importance of the Research Achievements |
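Thanks to advances in deep learning and the availability of large-scale datasets, image classification for simple semantics such as objects and scenes now exceeds human performance. However, conventional methods can neither decompose a compound concept, in which the relationships among multiple concepts are essential, nor match the individual concepts obtained from such a decomposition. Taking embedding-based cross-modal retrieval between images and text as its subject, this research therefore developed methods that evaluate the importance and reliability of concepts, together with a method that dynamically derives the meaning of higher-level concepts from lower-level ones and sequentially aligns the concepts contained in images and text. These methods are generally applicable to the semantic matching of various kinds of multimodal data, such as video and text or music and emotion.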
|