Research Project/Area Number | 22K17947
Research Category | Grant-in-Aid for Early-Career Scientists
Allocation Type | Multi-year Fund
Review Section | Basic Section 61030: Intelligent informatics-related
Research Institution | The University of Tokyo
Principal Investigator | Vo Minh Duc, The University of Tokyo, Graduate School of Information Science and Technology, Project Assistant Professor (40939906)
Project Period (FY) | 2022-04-01 – 2024-03-31
Project Status | Completed (Fiscal Year 2023)
Budget Amount *Note | ¥2,600,000 (Direct Cost: ¥2,000,000, Indirect Cost: ¥600,000)
FY2023: ¥1,170,000 (Direct Cost: ¥900,000, Indirect Cost: ¥270,000)
FY2022: ¥1,430,000 (Direct Cost: ¥1,100,000, Indirect Cost: ¥330,000)
Keywords | Vision and language / Novel object captioning / GANs / External knowledge / Bias mitigation / Story evaluation / Dataset / Conditional GANs / Long-tail data
Outline of Research at the Start | 1) Creating a dataset for our study, because existing datasets are insufficient. 2) Constructing a vision-language cross-modal space by learning cross-modal similarity. 3) Learning data augmentation using the vision-language cross-modal space. 4) Incorporating the vision-language cross-modal space into conditional GANs.
Outline of Research Achievements | We expanded our knowledge of the cross-modality between the vision and language spaces, obtaining four achievements: 1. Using commonsense knowledge, we can anticipate the future given a sparsely, temporally ordered set of images; this work was published at CVPR 2023. 2. We explored training GANs on limited and open-set datasets, as well as GAN inversion; the three resulting papers were published at WACV 2024. 3. We built a new knowledge base containing image features and their corresponding object names; using it, we proposed a method for novel object captioning that outperforms other methods while remaining comparable to LLMs, to be published at CVPR 2024. 4. We also gained knowledge about bias mitigation in image classification using a mixture of bias-specific experts, published at ICCV 2023.