2015 Fiscal Year Final Research Report

Image Reconstruction from Bag of Visual Words using Large-scale Image Data

Research Project

Project/Area Number	26540079
Research Category	Grant-in-Aid for Challenging Exploratory Research
Allocation Type	Multi-year Fund
Research Field	Perceptual information processing
Research Institution	The University of Tokyo
Principal Investigator	Harada Tatsuya The University of Tokyo, Graduate School of Information Science and Technology, Professor (60345113)
Project Period (FY)	2014-04-01 – 2016-03-31
Keywords	Computer vision / Machine learning
Outline of Final Research Achievements	The objective of this study is to reconstruct images from Bag-of-Visual-Words (BoVW) representations. BoVW is defined here as a histogram of quantized local descriptors extracted densely on a regular grid at a single scale. The task is challenging for two reasons: 1) BoVW contains quantization errors, and 2) BoVW discards the spatial arrangement of the local descriptors. To address this, we use a large-scale image database to estimate the spatial arrangement of local descriptors. Reconstruction then becomes a jigsaw-puzzle problem over visual words, with adjacency costs and global location costs. Solving this optimization problem is itself difficult because it is NP-hard, so we propose a heuristic but efficient method to optimize it. To demonstrate the effectiveness of our method, we apply it to BoVWs extracted from images in about 100 different categories and show that it can reconstruct the original images.
Free Research Field	Intelligent mechano-informatics
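
The BoVW definition given in the outline (a histogram of quantized local descriptors taken densely on a regular grid at a single scale) can be illustrated with a minimal sketch. This is not the project's implementation: raw pixel patches stand in for the local descriptors, the codebook is sampled from the descriptors rather than learned, and the names `dense_patches` and `bovw_histogram` are hypothetical. The sketch also shows why spatial information is lost: shuffling the patches leaves the histogram unchanged.

```python
import numpy as np

def dense_patches(image, patch=8, step=8):
    """Collect flattened patches on a regular grid at a single scale.

    Assumption: raw pixel patches stand in for real local descriptors.
    """
    h, w = image.shape
    rows = range(0, h - patch + 1, step)
    cols = range(0, w - patch + 1, step)
    return np.array(
        [image[r:r + patch, c:c + patch].ravel() for r in rows for c in cols],
        dtype=float,
    )

def bovw_histogram(descriptors, codebook):
    """Quantize each descriptor to its nearest visual word and count the words."""
    # Squared Euclidean distance between every descriptor and every visual word.
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    words = d2.argmin(axis=1)  # index of the nearest visual word per descriptor
    return np.bincount(words, minlength=len(codebook)), words

rng = np.random.default_rng(0)
image = rng.random((32, 32))            # synthetic grayscale image
descriptors = dense_patches(image)      # 4 x 4 grid of 8 x 8 patches

# Assumption: a toy codebook sampled from the descriptors themselves;
# a real codebook would be learned (for example by clustering) on many images.
codebook = descriptors[rng.choice(len(descriptors), size=4, replace=False)]

hist, words = bovw_histogram(descriptors, codebook)

# Shuffling the patch order changes the layout but not the histogram,
# which is the missing spatial information the outline refers to.
shuffled = descriptors[rng.permutation(len(descriptors))]
hist_shuffled, _ = bovw_histogram(shuffled, codebook)
```

Recovering the image from `hist` alone therefore requires deciding where each counted visual word goes on the grid, which is the jigsaw-style arrangement problem described in the outline.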