2023 Fiscal Year Annual Research Report
Vision and language cross-modal for training conditional GANs with long-tail data.
Project/Area Number | 22K17947 |
Research Institution | The University of Tokyo |
Principal Investigator | VO Minh Duc (ヴォ ミンデュク), The University of Tokyo, Graduate School of Information Science and Technology, Project Assistant Professor (40939906) |
Project Period (FY) | 2022-04-01 – 2024-03-31 |
Keywords | Vision and language / Novel object captioning / GANs / External knowledge / Bias mitigation |
Outline of Annual Research Achievements | We expanded our understanding of the cross-modality between the vision and language spaces, obtaining four achievements: 1. Using commonsense knowledge, we can anticipate the future given a sparse set of temporally ordered images; this was published at CVPR 2023. 2. We explored training GANs on limited and open-set datasets, as well as GAN inversion; the three resulting papers were published at WACV 2024. 3. We built a new knowledge base containing image features and their corresponding object names. Using it, we proposed a novel object captioning method that outperforms other methods while remaining comparable to LLMs; it will be published at CVPR 2024. 4. We also gained insight into bias mitigation in image classification using a mixture of bias-specific experts; this was published at ICCV 2023. |
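The knowledge base in achievement 3 pairs image features with object names, which a captioner can query to name objects unseen during training. A minimal sketch of such a feature-to-name lookup via nearest-neighbor retrieval is below; the toy vectors, names, and the `lookup_object_name` helper are illustrative assumptions, not the published implementation.

```python
import numpy as np

# Hypothetical knowledge base: each row is an image feature vector,
# paired with the name of the object it depicts.
kb_features = np.array([
    [0.9, 0.1, 0.0],   # e.g. feature of a "zebra" image region
    [0.0, 0.8, 0.2],   # e.g. feature of an "accordion" image region
    [0.1, 0.1, 0.9],   # e.g. feature of a "segway" image region
])
kb_names = ["zebra", "accordion", "segway"]

def lookup_object_name(query, features=kb_features, names=kb_names):
    """Return the object name whose stored feature is most similar
    (by cosine similarity) to the query feature."""
    q = query / np.linalg.norm(query)
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    return names[int(np.argmax(f @ q))]

# A query feature close to the first entry retrieves "zebra".
print(lookup_object_name(np.array([0.85, 0.2, 0.05])))  # → zebra
```

The retrieved name can then be injected into the caption in place of a placeholder token, which is one common way retrieval-based novel object captioners operate.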