2016 Fiscal Year Final Research Report
Learning of Multimodal Representation and Its Application
Project/Area Number | 26330249 |
Research Category | Grant-in-Aid for Scientific Research (C) |
Allocation Type | Multi-year Fund |
Section | General |
Research Field | Intelligent informatics |
Research Institution | Shinshu University |
Principal Investigator |
Project Period (FY) | 2014-04-01 – 2017-03-31 |
Keywords | Image recognition / Machine learning / Deep learning / Image retrieval / Neural networks |
Outline of Final Research Achievements |
Research on multimodal representation learning and its application to image search has been carried out. We studied a deep Boltzmann machine (DBM) that jointly represents an image and its corresponding text, and showed that the feature vector obtained from the joint layer yields better classification results than CNN-based features. We also studied an image query method based on a multimodal representation, enabled by a visual-semantic embedding model built on a CNN and an LSTM. It allows analogical reasoning over images by specifying, as words, the properties to be added to and subtracted from the query; we introduced a novel similarity measure based on the difference between the additive and subtractive queries. Since our methods depend strongly on CNNs for image processing, we also examined how to reduce their computational cost by compressing a given CNN, proposing a compression method based on block-wise distillation and demonstrating its effectiveness.
|
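The analogical image query described in the outline can be sketched as embedding arithmetic in a shared visual-semantic space: shift an image's embedding by adding and subtracting word embeddings, then rank gallery images by cosine similarity to the shifted query. The toy random embeddings, vocabulary, and dimensionality below are illustrative placeholders, not the project's actual CNN–LSTM model or its similarity measure.

```python
import numpy as np

# Placeholder embeddings: in the actual work, image vectors would come from
# a CNN and word vectors from an LSTM-based text encoder trained into a
# shared visual-semantic embedding space.
rng = np.random.default_rng(0)
word_vec = {w: rng.normal(size=8) for w in ["red", "blue", "car"]}
gallery = rng.normal(size=(100, 8))  # embeddings of 100 gallery images

def normalize(v):
    return v / np.linalg.norm(v)

def analogical_query(image_vec, add_words, sub_words):
    """Shift an image embedding by word vectors, then rank the gallery
    by cosine similarity to the shifted (normalized) query."""
    q = image_vec.copy()
    for w in add_words:
        q = q + word_vec[w]
    for w in sub_words:
        q = q - word_vec[w]
    q = normalize(q)
    # Normalize gallery rows so the dot product is cosine similarity.
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    scores = g @ q
    return np.argsort(-scores), scores  # ranking (best first), similarities

# Example: query with property "red" added and "blue" removed.
query_vec = normalize(rng.normal(size=8))
ranking, scores = analogical_query(query_vec, add_words=["red"], sub_words=["blue"])
```

In this sketch the subtractive part is folded into the same shifted vector; the project's similarity measure, by contrast, is described as comparing the additive and subtractive queries explicitly.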
Free Research Field | Intelligent informatics |