
2021 Fiscal Year Final Research Report

Zero-shot Cross-modal Embedding Learning

Research Project

Project/Area Number 19K11987
Research Category Grant-in-Aid for Scientific Research (C)
Allocation Type Multi-year Fund
Section General
Review Section Basic Section 60080: Database-related
Research Institution National Institute of Informatics

Principal Investigator

Yu Yi  National Institute of Informatics, Digital Content and Media Sciences Research Division, Project Assistant Professor (00754681)

Project Period (FY) 2019-04-01 – 2022-03-31
Keywords Cross-Modal Correlation / Cross-Modal Embedding
Outline of Final Research Achievements

This project focused on cross-modal embedding learning for cross-modal retrieval. The main challenge is learning joint embeddings in a shared subspace so that similarity can be computed across different modalities. Three main results were achieved:

1. A novel deep triplet neural network with cluster canonical correlation analysis (TNN-C-CCA), an end-to-end supervised architecture with an audio branch and a video branch (a minimal sketch of the two-branch idea follows this list).
2. A novel variational autoencoder (VAE) architecture for audio-visual cross-modal retrieval, which learns paired audio-visual correlation embeddings and category correlation embeddings as constraints to reinforce the mutuality of audio-visual information.
3. An unsupervised generative adversarial alignment representation (UGAAR) model that learns deep discriminative representations shared across three major musical modalities (sheet music, lyrics, and audio), with a three-branch deep neural network trained jointly.
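The report does not give implementation details, but the shared-subspace idea common to these models can be illustrated with a minimal sketch: two modality branches projecting into one embedding space, trained with a triplet loss. The feature dimensions, layer sizes, and the plain triplet loss below are illustrative assumptions, not the published TNN-C-CCA specification; the cluster-CCA, VAE, and adversarial components of the actual models are omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BranchEncoder(nn.Module):
    """One modality branch: maps modality features into the shared subspace."""
    def __init__(self, in_dim, embed_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 512),
            nn.ReLU(),
            nn.Linear(512, embed_dim),
        )

    def forward(self, x):
        # L2-normalize so that distances/similarities are comparable across modalities
        return F.normalize(self.net(x), dim=-1)

# Hypothetical input dimensions for pre-extracted audio and video features
audio_enc = BranchEncoder(in_dim=1024)
video_enc = BranchEncoder(in_dim=2048)
triplet = nn.TripletMarginLoss(margin=0.2)

def training_step(audio, video_pos, video_neg):
    """Pull a matching audio-video pair together, push a mismatched one apart."""
    a = audio_enc(audio)      # anchor: audio embedding
    p = video_enc(video_pos)  # positive: the paired video's embedding
    n = video_enc(video_neg)  # negative: a video from another category
    return triplet(a, p, n)

# Toy batch of random features standing in for real extracted features
loss = training_step(torch.randn(8, 1024), torch.randn(8, 2048), torch.randn(8, 2048))
```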

Free Research Field

Database-related

Academic Significance and Societal Importance of the Research Achievements

The distributions of data across different modalities are inconsistent, which makes it difficult to directly measure similarity between modalities. The proposed cross-modal embedding learning techniques can help improve the performance of cross-modal retrieval, recognition, and generation.
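Concretely, once both modalities are embedded in one shared subspace, cross-modal retrieval reduces to nearest-neighbor search. The snippet below is a hypothetical sketch: random L2-normalized vectors stand in for embeddings produced by trained audio and video branches, and the 128-dimensional embedding size is an assumption.

```python
import torch
import torch.nn.functional as F

# Stand-ins for embeddings already projected into the shared subspace
# (in practice these would come from the trained audio/video branches).
audio_query = F.normalize(torch.randn(1, 128), dim=-1)      # one audio clip
video_gallery = F.normalize(torch.randn(100, 128), dim=-1)  # 100 candidate videos

# On L2-normalized embeddings, cosine similarity is a plain dot product;
# comparing the two modalities is only meaningful in the shared subspace.
scores = video_gallery @ audio_query.T             # shape (100, 1)
top5 = torch.topk(scores.squeeze(1), k=5).indices  # indices of the 5 best matches
print(top5.tolist())
```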


Published: 2023-01-30  
