Visual Question Answering System with a Knowledge Base

Research Project

Project/Area Number	18H03264
Research Category	Grant-in-Aid for Scientific Research (B)
Allocation Type	Single-year Grants
Section	General
Review Section	Basic Section 61010: Perceptual information processing-related
Research Institution	Osaka University
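Principal Investigator	Yuta Nakashima Osaka University, Institute for Datability Science, Associate Professor (70633551)
Co-Investigator(Kenkyū-buntansha)	Jin-Dong Kim (金進東) Research Organization of Information and Systems (headquarters facilities, etc.), Joint Support-Center for Data Science Research, Project Associate Professor (40536893)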
Project Period (FY)	2018-04-01 – 2022-03-31
Project Status	Completed (Fiscal Year 2021)
Budget Amount	¥17,160,000 (Direct Cost: ¥13,200,000, Indirect Cost: ¥3,960,000); Fiscal Year 2021: ¥4,030,000 (Direct Cost: ¥3,100,000, Indirect Cost: ¥930,000); Fiscal Year 2020: ¥4,030,000 (Direct Cost: ¥3,100,000, Indirect Cost: ¥930,000); Fiscal Year 2019: ¥3,900,000 (Direct Cost: ¥3,000,000, Indirect Cost: ¥900,000); Fiscal Year 2018: ¥5,200,000 (Direct Cost: ¥4,000,000, Indirect Cost: ¥1,200,000)
Keywords	Question answering / Knowledge base / Deep learning
Outline of Final Research Achievements	Visual question answering (VQA) is an interdisciplinary field at the intersection of computer vision and natural language processing, and it has advanced rapidly in recent years thanks to deep learning. Current VQA techniques rely on a largely statistical approach in which only the distribution of the training set matters. We need to go beyond this to make VQA more practical. Our core research question is: “Can VQA systems answer questions that require inference?” To address it, we worked on building a system that uses knowledge for visual question answering (knowledge-based visual question answering; KBVQA), while also exploring effective video representations.
Academic Significance and Societal Importance of the Research Achievements	In this research, toward realizing KBVQA, we built a dataset for evaluating models and then built a prototype KBVQA system on top of it. The dataset will contribute substantially to the future development of KBVQA, and we consider it to be of very high academic value. With the prototype system, we examined video representation and model transferability, both of which are issues in realizing KBVQA. In particular, for video representation, we showed that the widely used high-dimensional vector representations are insufficient, and we proposed a new video representation.

Report

(5 results)

2021 Annual Research Report
Final Research Report (PDF)
2020 Annual Research Report
2019 Annual Research Report
2018 Annual Research Report

Research Products
(40 results)

Int'l Joint Research (5 results) Journal Article (5 results) (of which Peer Reviewed: 5 results, Open Access: 3 results) Presentation (26 results) (of which Int'l Joint Research: 19 results) Remarks (4 results)

[Int'l Joint Research] University of Oulu/Tampere University (Finland)
- Related Report
  2021 Annual Research Report
[Int'l Joint Research] Carnegie Mellon University (USA)
- Related Report
  2020 Annual Research Report
[Int'l Joint Research] Tampere University/University of Oulu (Finland)
- Related Report
  2020 Annual Research Report
[Int'l Joint Research] University of Oulu/Tampere University (Finland)
- Related Report
  2019 Annual Research Report
[Int'l Joint Research] Tampere University/University of Oulu (Finland)
- Related Report
  2018 Annual Research Report
[Journal Article] The semantic typology of visually grounded paraphrases (2022)
- Author(s)
  Chu Chenhui, Oliveira Vinicius, Virgo Felix Giovanni, Otani Mayu, Garcia Noa, Nakashima Yuta
- Journal Title
  Computer Vision and Image Understanding
  Volume: 215 Pages: 103333-103333
- DOI
  10.1016/j.cviu.2021.103333
- NAID
  120007179309
- Related Report
  2021 Annual Research Report
- Peer Reviewed / Open Access
[Journal Article] A comparative study of language transformers for video question answering (2021)
- Author(s)
  Yang Zekun, Garcia Noa, Chu Chenhui, Otani Mayu, Nakashima Yuta, Takemura Haruo
- Journal Title
  Neurocomputing
  Volume: 445 Pages: 121-133
- DOI
  10.1016/j.neucom.2021.02.092
- Related Report
  2021 Annual Research Report 2020 Annual Research Report
- Peer Reviewed
[Journal Article] Visually grounded paraphrase identification via gating and phrase localization (2020)
- Author(s)
  Otani Mayu, Chu Chenhui, Nakashima Yuta
- Journal Title
  Neurocomputing
  Volume: 404 Pages: 165-172
- DOI
  10.1016/j.neucom.2020.04.066
- Related Report
  2020 Annual Research Report
- Peer Reviewed / Open Access
[Journal Article] Visually grounded paraphrase identification via gating and phrase localization (2020)
- Author(s)
  Mayu Otani, Chenhui Chu, and Yuta Nakashima
- Journal Title
  Neurocomputing
  Volume: -
- Related Report
  2019 Annual Research Report
- Peer Reviewed
[Journal Article] ContextNet: Representation and exploration for painting classification and retrieval in context (2019)
- Author(s)
  Noa Garcia, Benjamin Renoust, and Yuta Nakashima
- Journal Title
  International Journal on Multimedia Information Retrieval
  Volume: 9 Issue: 1 Pages: 17-30
- DOI
  10.1007/s13735-019-00189-4
- Related Report
  2019 Annual Research Report
- Peer Reviewed / Open Access
[Presentation] Quantifying societal bias amplification in image captioning (2022)
- Author(s)
  Yusuke Hirota, Yuta Nakashima, Noa Garcia
- Organizer
  IEEE/CVF Conference on Computer Vision and Pattern Recognition
- Related Report
  2021 Annual Research Report
- Int'l Joint Research
[Presentation] Transferring domain-agnostic knowledge in video question answering (2021)
- Author(s)
  Tianran Wu, Noa Garcia, Mayu Otani, Chenhui Chu, Yuta Nakashima, Haruo Takemura
- Organizer
  British Machine Vision Conference
- Related Report
  2021 Annual Research Report
- Int'l Joint Research
[Presentation] GCNBoost: Artwork classification by label propagation through a knowledge graph (2021)
- Author(s)
  Cheikh Brahim El Vaigh, Noa Garcia, Benjamin Renoust, Chenhui Chu, Yuta Nakashima, Hajime Nagahara
- Organizer
  ACM International Conference on Multimedia Retrieval
- Related Report
  2021 Annual Research Report
- Int'l Joint Research
[Presentation] Image retrieval by hierarchy-aware deep hashing based on multi-task learning (2021)
- Author(s)
  Bowen Wang, Liangzhi Li, Yuta Nakashima, Takehiro Yamamoto, Hiroaki Ohshima, Yoshiyuki Shoji, Kenro Aihara, Noriko Kando
- Organizer
  ACM International Conference on Multimedia Retrieval
- Related Report
  2021 Annual Research Report
- Int'l Joint Research
[Presentation] Explain me the painting: Multi-topic knowledgeable art description generation (2021)
- Author(s)
  Zechen Bai, Yuta Nakashima, Noa Garcia
- Organizer
  IEEE/CVF International Conference on Computer Vision
- Related Report
  2021 Annual Research Report
- Int'l Joint Research
[Presentation] Visual question answering with textual representations for images (2021)
- Author(s)
  Yusuke Hirota, Noa Garcia, Mayu Otani, Chenhui Chu, Yuta Nakashima, Ittetsu Taniguchi, Takao Onoye
- Organizer
  IEEE/CVF International Conference on Computer Vision Workshops
- Related Report
  2021 Annual Research Report
- Int'l Joint Research
[Presentation] The Laughing Machine: Predicting Humor in Video (2021)
- Author(s)
  Yuta Kayatani, Zekun Yang, Mayu Otani, Noa Garcia, Chenhui Chu, Yuta Nakashima, Haruo Takemura
- Organizer
  2021 IEEE Winter Conference on Applications of Computer Vision
- Related Report
  2020 Annual Research Report
- Int'l Joint Research
[Presentation] Uncovering Hidden Challenges in Query-Based Video Moment Retrieval (2020)
- Author(s)
  Mayu Otani, Yuta Nakashima, Esa Rahtu, Janne Heikkila
- Organizer
  31st British Machine Vision Conference
- Related Report
  2020 Annual Research Report
- Int'l Joint Research
[Presentation] Knowledge-Based Visual Question Answering in Videos (2020)
- Author(s)
  Noa Garcia, Mayu Otani, Chenhui Chu, Yuta Nakashima
- Organizer
  2020 Conference on Computer Vision and Pattern Recognition Workshops
- Related Report
  2020 Annual Research Report
- Int'l Joint Research
[Presentation] A Dataset and Baselines for Visual Question Answering on Art (2020)
- Author(s)
  Noa Garcia, Chentao Ye, Zihua Liu, Qingtao Hu, Mayu Otani, Chenhui Chu, Yuta Nakashima, Teruko Mitamura
- Organizer
  2020 Workshop on Computer Vision for Art
- Related Report
  2020 Annual Research Report
- Int'l Joint Research
[Presentation] Knowledge-Based Video Question Answering with Unsupervised Scene Descriptions (2020)
- Author(s)
  Noa Garcia, Yuta Nakashima
- Organizer
  European Conference on Computer Vision
- Related Report
  2020 Annual Research Report
- Int'l Joint Research
[Presentation] What We All Need Are Non-trivial Baselines and Sanity Checks (2020)
- Author(s)
  Mayu Otani, Yuta Nakashima, Esa Rahtu, Janne Heikkila
- Organizer
  23rd Meeting on Image Recognition and Understanding (MIRU)
- Related Report
  2020 Annual Research Report
[Presentation] BERT representations for video question answering (2020)
- Author(s)
  Zekun Yang, Noa Garcia, Chenhui Chu, Mayu Otani, Yuta Nakashima, and Haruo Takemura
- Organizer
  IEEE Winter Conference on Applications of Computer Vision
- Related Report
  2019 Annual Research Report
- Int'l Joint Research
[Presentation] KnowIT VQA: Answering knowledge-based questions about video (2020)
- Author(s)
  Noa Garcia, Chenhui Chu, Mayu Otani, and Yuta Nakashima
- Organizer
  AAAI Conference on Artificial Intelligence
- Related Report
  2019 Annual Research Report
- Int'l Joint Research
[Presentation] Adaptive gating mechanism for identifying visually grounded paraphrases (2019)
- Author(s)
  Mayu Otani, Chenhui Chu, and Yuta Nakashima
- Organizer
  Multi-Discipline Approach for Learning Concepts
- Related Report
  2019 Annual Research Report
- Int'l Joint Research
[Presentation] Rethinking the evaluation of video summaries (2019)
- Author(s)
  Mayu Otani, Yuta Nakashima, Esa Rahtu, and Janne Heikkila
- Organizer
  IEEE Conference on Computer Vision and Pattern Recognition
- Related Report
  2019 Annual Research Report
- Int'l Joint Research
[Presentation] Context-aware embeddings for automatic art analysis (2019)
- Author(s)
  Noa Garcia, Benjamin Renoust, and Yuta Nakashima
- Organizer
  ACM International Conference on Multimedia Retrieval
- Related Report
  2019 Annual Research Report 2018 Annual Research Report
- Int'l Joint Research
[Presentation] Video meets knowledge in visual question answering (2019)
- Author(s)
  Noa Garcia, Chenhui Chu, Mayu Otani, and Yuta Nakashima
- Organizer
  22nd Meeting on Image Recognition and Understanding (MIRU)
- Related Report
  2019 Annual Research Report
[Presentation] Collecting relation-aware video captions (2019)
- Author(s)
  Mayu Otani, Kazuhiro Ota, Yuta Nakashima, Esa Rahtu, Janne Heikkila, and Yoshitaka Ushiku
- Organizer
  22nd Meeting on Image Recognition and Understanding (MIRU)
- Related Report
  2019 Annual Research Report
[Presentation] Video question answering with BERT (2019)
- Author(s)
  Zekun Yang, Noa Garcia, Chenhui Chu, Mayu Otani, Yuta Nakashima, and Haruo Takemura
- Organizer
  22nd Meeting on Image Recognition and Understanding (MIRU)
- Related Report
  2019 Annual Research Report
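[Presentation] Laughter prediction in comedy dramas using subtitles and facial expressions (2019)
- Author(s)
  Yuta Kayatani, Mayu Otani, Chenhui Chu, Yuta Nakashima, Haruo Takemura
- Organizer
  2019 Annual Conference of the Japanese Society for Artificial Intelligence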
- Related Report
  2019 Annual Research Report
[Presentation] Understanding art through multi-modal retrieval in paintings (2019)
- Author(s)
  Noa Garcia, Benjamin Renoust, and Yuta Nakashima
- Organizer
  Language and Vision Workshop
- Related Report
  2019 Annual Research Report
- Int'l Joint Research
[Presentation] Rethinking the evaluation of video summaries (2019)
- Author(s)
  Mayu Otani, Yuta Nakashima, Esa Rahtu, and Janne Heikkila
- Organizer
  IEEE Computer Society Conference on Computer Vision and Pattern Recognition
- Related Report
  2018 Annual Research Report
- Int'l Joint Research
[Presentation] iParaphrasing: Extracting visually grounded paraphrases via an image (2018)
- Author(s)
  Chenhui Chu, Mayu Otani, and Yuta Nakashima
- Organizer
  27th International Conference on Computational Linguistics
- Related Report
  2018 Annual Research Report
- Int'l Joint Research
[Presentation] Phrase localization-based visually grounded paraphrase identification (2018)
- Author(s)
  Mayu Otani, Chenhui Chu, and Yuta Nakashima
- Organizer
  21st Meeting on Image Recognition and Understanding (MIRU)
- Related Report
  2018 Annual Research Report
[Presentation] Visually grounded paraphrase extraction via phrase grounding (2018)
- Author(s)
  Mayu Otani, Chenhui Chu, and Yuta Nakashima
- Organizer
  Workshop on Language and Vision at CVPR
- Related Report
  2018 Annual Research Report
[Remarks] KnowIT VQA
- URL
  https://knowit-vqa.github.io/
- Related Report
  2021 Annual Research Report
[Remarks] Art Description Generation
- URL
  https://sites.google.com/view/art-description-generation
- Related Report
  2021 Annual Research Report
[Remarks] KnowIT VQA Paper
- URL
  https://knowit-vqa.github.io
- Related Report
  2019 Annual Research Report
[Remarks] Knowledge VQA
- URL
  https://www.n-yuta.jp/project/knowledge-vqa/
- Related Report
  2019 Annual Research Report
