2023 Fiscal Year Annual Research Report
Vision and language cross-modal for training conditional GANs with long-tail data.
Project/Area Number | 22K17947 |
Research Institution | The University of Tokyo |
Principal Investigator | VO Minh Duc (ヴォ ミンデュク), The University of Tokyo, Graduate School of Information Science and Technology, Project Assistant Professor (40939906) |
Project Period (FY) | 2022-04-01 – 2024-03-31 |
Keywords | Vision and language / Novel object captioning / GANs / External knowledge / Bias mitigation |
Outline of Annual Research Achievements | We expanded our understanding of the cross-modality between the vision and language spaces, obtaining four achievements: 1. Using commonsense knowledge, we can anticipate the future given a sparse set of temporally ordered images; this was published at CVPR 2023. 2. We explored training GANs on limited and open-set datasets, as well as GAN inversion; the three resulting papers were published at WACV 2024. 3. We built a new knowledge base containing image features and their corresponding object names. Using it, we proposed a novel object captioning method that outperforms other methods while remaining comparable to LLMs; it will be published at CVPR 2024. 4. We also gained insight into bias mitigation in image classification using a mixture of bias-specific experts; this was published at ICCV 2023. |
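The knowledge base in achievement 3 pairs image features with object names, which a captioner can query to name objects unseen during training. A minimal sketch of such a feature-to-name lookup via nearest-neighbor retrieval is below; the toy vectors, names, and the `lookup_object_name` helper are illustrative assumptions, not the published implementation.

```python
import numpy as np

# Hypothetical knowledge base: each row is an image feature vector,
# paired with the name of the object it depicts.
kb_features = np.array([
    [0.9, 0.1, 0.0],   # e.g. feature of a "zebra" image region
    [0.0, 0.8, 0.2],   # e.g. feature of an "accordion" image region
    [0.1, 0.1, 0.9],   # e.g. feature of a "segway" image region
])
kb_names = ["zebra", "accordion", "segway"]

def lookup_object_name(query, features=kb_features, names=kb_names):
    """Return the object name whose stored feature is most similar
    (by cosine similarity) to the query feature."""
    q = query / np.linalg.norm(query)
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    return names[int(np.argmax(f @ q))]

# A query feature close to the first entry retrieves "zebra".
print(lookup_object_name(np.array([0.85, 0.2, 0.05])))  # → zebra
```

The retrieved name can then be injected into the caption in place of a placeholder token, which is one common way retrieval-based novel object captioners operate.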