Outline of Annual Research Achievements
We take Wikipedia featured articles and venue photos as basic knowledge to learn a deep correlation model for fine-grained venue discovery from Foursquare photos. Specifically, we address the challenging research problem of venue discovery from a multimodal dataset: given a photo (with a rough position) as input, the system returns the exact venue name (i.e., the venue in which the photo was taken), its category, and a textual description. This work presents the first study of visual venue discovery over an integrated venue-related multimodal dataset. We proposed a novel deep correlation learning framework to realize fine-grained venue discovery. In particular, we apply a deep learning model, deep canonical correlation analysis (DCCA), to learn the correlations between venue photos and venue descriptions obtained from Wikipedia and Foursquare. Our contribution is three-fold: i) a novel venue-related multimodal dataset is created by integrating venue photos and descriptions from Wikipedia and Foursquare, for academic research on fine-grained venue discovery; ii) an end-to-end deep network with two branches (a CNN for photos and Doc2vec for descriptions) is trained, which maps the two views into a shared space and maximizes their correlation there; and iii) extensive experiments verify the practicability of the DCCA model for fine-grained venue discovery, where DCCA outperforms state-of-the-art methods such as KCCA [4]. Part of the dataset is available at http://research.nii.ac.jp/_yiyu/VenueNet.htm for research purposes.
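As an illustration of the objective behind this framework, the sketch below computes linear canonical correlations between two already-extracted feature views (e.g., CNN photo embeddings and Doc2vec description embeddings) with NumPy. This is a minimal sketch under our own assumptions: the function name and the regularization constant are ours, and the full DCCA model additionally learns the nonlinear branch networks whose outputs feed this correlation computation.

```python
import numpy as np

def canonical_correlations(X, Y, reg=1e-4):
    """Canonical correlations between two feature views.

    X: (n, dx) view-1 features (e.g., CNN photo embeddings)
    Y: (n, dy) view-2 features (e.g., Doc2vec text embeddings)
    Returns the singular values of the whitened cross-covariance
    matrix, i.e., the per-dimension correlations DCCA maximizes.
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)          # center each view
    Yc = Y - Y.mean(axis=0)
    Sxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)

    def inv_sqrt(S):
        # Inverse matrix square root via eigendecomposition
        w, V = np.linalg.eigh(S)
        return V @ np.diag(w ** -0.5) @ V.T

    # Whitened cross-covariance: Sxx^{-1/2} Sxy Syy^{-1/2}
    T = inv_sqrt(Sxx) @ Sxy @ inv_sqrt(Syy)
    return np.linalg.svd(T, compute_uv=False)
```

For two views related by an invertible linear map, all canonical correlations are close to 1; for independent views they are close to 0, which is the contrast the training objective exploits.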
Strategy for Future Research Activity
There is still much room for improvement, as follows: i) Based on our dataset, further topics will be investigated, e.g., cross-modal retrieval and image question answering, where visual objects are described in natural language at different levels of understanding. ii) We will keep enlarging the number of image-text pairs per venue to investigate further questions, for example: is there any correlation between visual objects and venue categories? iii) We will try to incorporate more data domains, such as check-ins and tips, for personalized venue recommendation. iv) We will investigate more deep learning methods, such as long short-term memory (LSTM) networks, for processing Wikipedia articles.
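As a rough illustration of item iv, a single LSTM cell step over a word-embedding sequence can be sketched in NumPy as below. This is a generic textbook formulation, not the specific model we will adopt; the weight shapes and gate ordering are our own assumptions for the sketch.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c, W, U, b):
    """One LSTM step.

    x: (d,) input (e.g., a word embedding from a Wikipedia article)
    h: (k,) previous hidden state, c: (k,) previous cell state
    W: (4k, d), U: (4k, k), b: (4k,) stacked gate parameters
    Gate order in the stacked vector: input, forget, output, candidate.
    """
    z = W @ x + U @ h + b
    k = h.shape[0]
    i, f, o, g = z[:k], z[k:2 * k], z[2 * k:3 * k], z[3 * k:]
    c_new = sigmoid(f) * c + sigmoid(i) * np.tanh(g)   # gated memory update
    h_new = sigmoid(o) * np.tanh(c_new)                # exposed hidden state
    return h_new, c_new
```

Running this step over the tokens of an article yields a fixed-size hidden state, which could serve as the text-branch representation in place of Doc2vec.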
Reason for Carryover of Budget to the Next Fiscal Year
Last year, I planned to attend ACMMM17, held in Mountain View, CA, USA. Since I had to prepare for my courses, I did not have time to travel to the USA. This year, we have submitted papers to ACMMM18 and plan to submit a paper to ICDM18. I would therefore like to use this budget to support myself or my student in attending ACMMM18, held in Seoul, Korea, or IEEE ICDM2018, held in Singapore.