2016 Fiscal Year Final Research Report

Learning of multimodal representation and its application

Research Project

Project/Area Number	26330249
Research Category	Grant-in-Aid for Scientific Research (C)
Allocation Type	Multi-year Fund
Section	General
Research Field	Intelligent informatics
Research Institution	Shinshu University
Principal Investigator	MARUYAMA Minoru Shinshu University, Academic Assembly, Institute of Engineering, Professor (80283232)
Project Period (FY)	2014-04-01 – 2017-03-31
Keywords	image recognition / machine learning / deep learning / image retrieval / neural networks
Outline of Final Research Achievements	We studied multimodal representation learning and its application to image search. First, we used a deep Boltzmann machine (DBM) to jointly represent an image and its corresponding text, and showed that the feature vector obtained from the joint layer can yield better classification results than CNN-based features. Second, we studied an image query method built on a multimodal representation, specifically a visual-semantic embedding model based on a CNN and an LSTM. This method supports analogical reasoning over images: the properties to be added and subtracted are specified by words. We introduced a novel similarity measure based on the difference between the additive and subtractive queries. Because these methods depend heavily on CNNs for image processing, we also examined a way to reduce the CNN's computational cost by compressing a given network. We proposed a compression method based on block-wise distillation and examined its effectiveness.
Free Research Field	Intelligent informatics
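
The word-based image query described in the outline can be illustrated with a small sketch. It assumes that images and words already live in one shared embedding space; the CNN and LSTM encoders that would produce such vectors are not shown. The 4-dimensional toy vectors, the gallery names, and both function names are hypothetical. In particular, `rank_differential` is only one plausible reading of "a similarity measure based on the difference between additive and subtractive query"; the report does not give the actual formula.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (na * nb)

def vadd(a, b):
    return [x + y for x, y in zip(a, b)]

def vsub(a, b):
    return [x - y for x, y in zip(a, b)]

def rank_additive(query_img, plus_word, minus_word, gallery):
    """Standard analogical query: rank by cosine to (image + plus - minus)."""
    q = vsub(vadd(query_img, plus_word), minus_word)
    return sorted(gallery, key=lambda name: cosine(gallery[name], q), reverse=True)

def rank_differential(query_img, plus_word, minus_word, gallery):
    """ASSUMED variant, not the report's definition: score each candidate by
    cos(candidate, image + plus) - cos(candidate, image + minus)."""
    q_plus = vadd(query_img, plus_word)
    q_minus = vadd(query_img, minus_word)

    def score(name):
        c = gallery[name]
        return cosine(c, q_plus) - cosine(c, q_minus)

    return sorted(gallery, key=score, reverse=True)

# Hypothetical toy embedding with axes [dog, cat, indoor, outdoor].
query_img = [1.0, 0.0, 1.0, 0.0]    # an image of a dog indoors
plus_word = [0.0, 0.0, 0.0, 1.0]    # the word "outdoor"
minus_word = [0.0, 0.0, 1.0, 0.0]   # the word "indoor"
gallery = {
    "dog_outdoor": [1.0, 0.0, 0.0, 1.0],
    "dog_indoor": [1.0, 0.0, 1.0, 0.0],
    "cat_outdoor": [0.0, 1.0, 0.0, 1.0],
    "cat_indoor": [0.0, 1.0, 1.0, 0.0],
}

print(rank_additive(query_img, plus_word, minus_word, gallery))
print(rank_differential(query_img, plus_word, minus_word, gallery))
```

In this toy setting both rankings place `dog_outdoor` first. The two scores can diverge on real embeddings, where a candidate may be close to the combined query vector simply because it resembles the original image.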