2018 Fiscal Year Research-status Report
Participatory Sensing and Felicitous Recommending of Venues
Project/Area Number | 16K16058 |
Research Institution | National Institute of Informatics |
Principal Investigator | YU Yi (National Institute of Informatics, Digital Content and Media Sciences Research Division, Project Assistant Professor) (00754681) |
Project Period (FY) | 2016-04-01 – 2020-03-31 |
Keywords | Category-Based Deep CCA / Venue discovery / Cross-modal retrieval |
Outline of Annual Research Achievements |
We take travel destinations and business locations as venues. Discovering a venue from a photo is very important for visual context-aware applications. Unfortunately, few efforts have paid attention to complicated real-world images such as user-generated venue photos. Our goal is fine-grained venue discovery from heterogeneous social multimodal data. To this end, we propose a novel deep learning model, Category-based Deep Canonical Correlation Analysis (C-DCCA). Given a photo as input, this model performs (i) exact venue search (finding the venue where the photo was taken) and (ii) group venue search (finding relevant venues in the same category as the photo), by exploiting the cross-modal correlation between the input photo and the textual descriptions of venues. In this model, data from different modalities are projected into the same space via deep networks. Pairwise correlation (between data of different modalities from the same venue), used for exact venue search, and category-based correlation (between data of different modalities from different venues in the same category), used for group venue search, are jointly optimized; a minimal sketch of this joint objective is given below. Because a single photo cannot fully reflect the rich textual description of a venue, the number of photos per venue is increased in the training phase to capture more aspects of each venue. Experimental results confirm the feasibility of the proposed method. In addition, C-DCCA can also be applied to cross-modal retrieval between audio and video.
|
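The published C-DCCA objective is formulated with DCCA-style correlation terms; the sketch below is only a minimal PyTorch illustration of the joint optimization described above, using a cosine-similarity surrogate for the two correlation terms. The names BranchNet, c_dcca_loss, and the weight alpha are illustrative, and precomputed photo/text feature vectors are assumed.

```python
# Minimal sketch (not the published implementation) of a C-DCCA-style joint objective.
import torch
import torch.nn as nn
import torch.nn.functional as F

class BranchNet(nn.Module):
    """Projects one modality (photo features or venue-text features) into the shared space."""
    def __init__(self, in_dim, out_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 512), nn.ReLU(),
            nn.Linear(512, out_dim),
        )

    def forward(self, x):
        # Unit-length embeddings, so dot products below are cosine similarities.
        return F.normalize(self.net(x), dim=-1)

def c_dcca_loss(img_emb, txt_emb, categories, alpha=0.5):
    """Jointly maximizes (i) pairwise correlation between a photo and the text of
    the SAME venue (exact venue search) and (ii) category-based correlation between
    photos and texts of DIFFERENT venues in the SAME category (group venue search).
    img_emb[i] and txt_emb[i] come from the same venue; categories[i] is its category id."""
    sim = img_emb @ txt_emb.t()                                    # (N, N) cosine similarities
    same_venue = torch.eye(sim.size(0), dtype=torch.bool, device=sim.device)
    same_cat = categories.unsqueeze(0) == categories.unsqueeze(1)  # (N, N) category matches
    pairwise = sim[same_venue].mean()                              # diagonal: same-venue pairs
    cross = same_cat & ~same_venue                                 # same category, different venue
    category = sim[cross].mean() if cross.any() else sim.new_zeros(())
    return -(alpha * pairwise + (1.0 - alpha) * category)          # minimize the negative
```

In a training step, one would feed both branches and backpropagate the loss, e.g. `loss = c_dcca_loss(img_net(photo_feats), txt_net(text_feats), venue_categories)`.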
Current Status of Research Progress |
2: Research has progressed on the whole more than it was originally planned.
Reason
The research is progressing well. Significant results have been published in a top journal. We are now working on related tasks concerning cross-modal deep hashing for efficient image-text retrieval.
|
Strategy for Future Research Activity |
Existing deep cross-modal learning models usually do not work well when the query or the database includes new data from unknown categories. To address this problem, future work will aim to develop zero-shot cross-modal learning algorithms along three lines: (i) computing modality-invariant embeddings, (ii) predicting unknown categories based on external knowledge that describes their correlation with known categories, and (iii) applying adversarial learning to enhance system performance (one possible form of this adversarial component is sketched below).
|
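Point (iii) above is a plan rather than a completed result. As one standard instantiation, the following hypothetical sketch (names ModalityDiscriminator and adversarial_losses are illustrative) shows a modality-adversarial loss that pushes the shared-space embeddings toward modality invariance: a discriminator learns to tell photo embeddings from text embeddings, while the encoders are trained to fool it.

```python
# Hedged sketch of a modality-adversarial loss for zero-shot cross-modal learning.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ModalityDiscriminator(nn.Module):
    """Predicts which modality (photo vs. text) a shared-space embedding came from."""
    def __init__(self, dim=128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, z):
        return self.net(z).squeeze(-1)  # one logit per embedding

def adversarial_losses(disc, img_emb, txt_emb):
    """Two-player objective: the discriminator separates the modalities, while the
    encoders are trained with flipped labels so that it cannot."""
    bce = F.binary_cross_entropy_with_logits
    ones = torch.ones(img_emb.size(0), device=img_emb.device)    # label 1 = photo
    zeros = torch.zeros(txt_emb.size(0), device=txt_emb.device)  # label 0 = text
    # Discriminator step: embeddings are detached so only the discriminator updates.
    d_loss = bce(disc(img_emb.detach()), ones) + bce(disc(txt_emb.detach()), zeros)
    # Encoder step: swap the modality labels to make the embeddings indistinguishable.
    g_loss = bce(disc(img_emb), zeros) + bce(disc(txt_emb), ones)
    return d_loss, g_loss
```

The two losses would be minimized alternately (discriminator step, then encoder step), in the usual GAN-style training loop.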
Causes of Carryover |
Last year, I planned to attend ACM MM 2018, held in Seoul, Republic of Korea, but I had to prepare for my courses and could not make the trip. This year, we have submitted papers to ACM MM 2019 and plan to submit a paper to IEEE ICDM 2019. I would therefore like to use this budget to support myself or my student in attending ACM MM 2019 in Nice, France, or IEEE ICDM 2019 in Beijing, China.
|
Research Products (4 results)