
Visual Question Answering System with a Knowledge Base

Research Project

Project/Area Number 18H03264
Research Category

Grant-in-Aid for Scientific Research (B)

Allocation Type: Single-year Grants
Section: General
Review Section: Basic Section 61010: Perceptual information processing-related
Research Institution: Osaka University

Principal Investigator

Yuta Nakashima  Osaka University, Institute for Datability Science, Associate Professor (70633551)

Co-Investigator (Kenkyū-buntansha) Jin-Dong Kim  Research Organization of Information and Systems, Joint Support-Center for Data Science Research, Specially Appointed Associate Professor (40536893)
Project Period (FY) 2018-04-01 – 2022-03-31
Project Status Completed (Fiscal Year 2021)
Budget Amount
¥17,160,000 (Direct Cost: ¥13,200,000, Indirect Cost: ¥3,960,000)
Fiscal Year 2021: ¥4,030,000 (Direct Cost: ¥3,100,000, Indirect Cost: ¥930,000)
Fiscal Year 2020: ¥4,030,000 (Direct Cost: ¥3,100,000, Indirect Cost: ¥930,000)
Fiscal Year 2019: ¥3,900,000 (Direct Cost: ¥3,000,000, Indirect Cost: ¥900,000)
Fiscal Year 2018: ¥5,200,000 (Direct Cost: ¥4,000,000, Indirect Cost: ¥1,200,000)
Keywords: Question Answering / Knowledge Base / Deep Learning
Outline of Final Research Achievements

Visual Question Answering (VQA) is an interdisciplinary field lying at the intersection of computer vision and natural language processing, and it has recently advanced drastically thanks to deep learning. Current VQA techniques rely on a rather statistical approach, in which only the distribution of the training set matters. We need to go beyond this to make VQA more practical. Our core research question is: “Can VQA systems answer questions that require inference?” We have been committed to building a system that uses knowledge for visual question answering (knowledge-based visual question answering; KBVQA), while also exploring effective video representations.
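The KBVQA idea above can be illustrated with a toy sketch: instead of answering from training-set statistics alone, the system combines what a (assumed) visual module detected in the scene with a fact retrieved from a knowledge base. All names, the scoring rule, and the tiny knowledge base here are hypothetical illustrations, not the project's actual system:

```python
# Toy knowledge-based VQA: answer by combining visual detections
# with retrieved background knowledge, not image statistics alone.

def retrieve_fact(kb: dict, entity: str) -> str:
    """Look up background knowledge about an entity detected in the scene."""
    return kb.get(entity, "")

def answer(question: str, detected_entities: list, kb: dict) -> str:
    """Score each detected entity by word overlap between the question
    and the knowledge-base fact about that entity; return the best."""
    best, best_score = "unknown", 0
    q_tokens = set(question.lower().split())
    for entity in detected_entities:
        fact = retrieve_fact(kb, entity)
        score = len(q_tokens & set(fact.lower().split()))
        if score > best_score:
            best, best_score = entity, score
    return best

# Hypothetical knowledge base; the question cannot be answered
# from the pixels alone ("flightless" is not visible).
kb = {
    "penguin": "a flightless bird that lives in cold regions",
    "sparrow": "a small bird that can fly",
}
# The visual module is assumed to have detected both birds in the image.
print(answer("which bird is flightless", ["penguin", "sparrow"], kb))  # prints "penguin"
```

The point of the sketch is that the correct answer requires an inference step over external knowledge, which is exactly the capability a purely distribution-driven VQA model lacks.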

Academic Significance and Societal Importance of the Research Achievements

Toward realizing KBVQA, this research constructed a dataset for model evaluation and, on top of it, built a prototype KBVQA system. The dataset will contribute greatly to the future development of KBVQA, and we consider it to be of very high academic value. With the prototype system, we examined video description and model transferability, which are key problems in realizing KBVQA. For video description in particular, we showed that the widely used description based on high-dimensional vectors is insufficient, and proposed a new video description.

Report

(5 results)
  • 2021 Annual Research Report   Final Research Report (PDF)
  • 2020 Annual Research Report
  • 2019 Annual Research Report
  • 2018 Annual Research Report
  • Research Products

    (40 results)


Int'l Joint Research (5 results), Journal Article (5 results; of which Peer Reviewed: 5, Open Access: 3), Presentation (26 results; of which Int'l Joint Research: 19), Remarks (4 results)

  • [Int'l Joint Research] University of Oulu/Tampere University (Finland)

    • Related Report
      2021 Annual Research Report
  • [Int'l Joint Research] Carnegie Mellon University (USA)

    • Related Report
      2020 Annual Research Report
  • [Int'l Joint Research] Tampere University/University of Oulu (Finland)

    • Related Report
      2020 Annual Research Report
  • [Int'l Joint Research] University of Oulu/Tampere University (Finland)

    • Related Report
      2019 Annual Research Report
  • [Int'l Joint Research] Tampere University/University of Oulu (Finland)

    • Related Report
      2018 Annual Research Report
  • [Journal Article] The semantic typology of visually grounded paraphrases2022

    • Author(s)
      Chu Chenhui, Oliveira Vinicius, Virgo Felix Giovanni, Otani Mayu, Garcia Noa, Nakashima Yuta
    • Journal Title

      Computer Vision and Image Understanding

      Volume: 215 Pages: 103333-103333

    • DOI

      10.1016/j.cviu.2021.103333

    • NAID

      120007179309

    • Related Report
      2021 Annual Research Report
    • Peer Reviewed / Open Access
  • [Journal Article] A comparative study of language transformers for video question answering2021

    • Author(s)
      Yang Zekun, Garcia Noa, Chu Chenhui, Otani Mayu, Nakashima Yuta, Takemura Haruo
    • Journal Title

      Neurocomputing

      Volume: 445 Pages: 121-133

    • DOI

      10.1016/j.neucom.2021.02.092

    • Related Report
      2021 Annual Research Report 2020 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Visually grounded paraphrase identification via gating and phrase localization2020

    • Author(s)
      Otani Mayu, Chu Chenhui, Nakashima Yuta
    • Journal Title

      Neurocomputing

      Volume: 404 Pages: 165-172

    • DOI

      10.1016/j.neucom.2020.04.066

    • Related Report
      2020 Annual Research Report
    • Peer Reviewed / Open Access
  • [Journal Article] Visually grounded paraphrase identification via gating and phrase localization2020

    • Author(s)
      Mayu Otani, Chenhui Chu, and Yuta Nakashima
    • Journal Title

      Neurocomputing

      Volume: -

    • Related Report
      2019 Annual Research Report
    • Peer Reviewed
  • [Journal Article] ContextNet: Representation and exploration for painting classification and retrieval in context2019

    • Author(s)
      Noa Garcia, Benjamin Renoust, and Yuta Nakashima
    • Journal Title

      International Journal on Multimedia Information Retrieval

      Volume: 9 Issue: 1 Pages: 17-30

    • DOI

      10.1007/s13735-019-00189-4

    • Related Report
      2019 Annual Research Report
    • Peer Reviewed / Open Access
  • [Presentation] Quantifying societal bias amplification in image captioning2022

    • Author(s)
      Yusuke Hirota, Yuta Nakashima, Noa Garcia
    • Organizer
      IEEE/CVF Conference on Computer Vision and Pattern Recognition
    • Related Report
      2021 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Transferring domain-agnostic knowledge in video question answering2021

    • Author(s)
      Tianran Wu, Noa Garcia, Mayu Otani, Chenhui Chu, Yuta Nakashima, Haruo Takemura
    • Organizer
      British Machine Vision Conference
    • Related Report
      2021 Annual Research Report
    • Int'l Joint Research
  • [Presentation] GCNBoost: Artwork classification by label propagation through a knowledge graph2021

    • Author(s)
      Cheikh Brahim El Vaigh, Noa Garcia, Benjamin Renoust, Chenhui Chu, Yuta Nakashima, Hajime Nagahara
    • Organizer
      ACM International Conference on Multimedia Retrieval
    • Related Report
      2021 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Image retrieval by hierarchy-aware deep hashing based on multi-task learning2021

    • Author(s)
      Bowen Wang, Liangzhi Li, Yuta Nakashima, Takehiro Yamamoto, Hiroaki Ohshima, Yoshiyuki Shoji, Kenro Aihara, Noriko Kando
    • Organizer
      ACM International Conference on Multimedia Retrieval
    • Related Report
      2021 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Explain me the painting: Multi-topic knowledgeable art description generation2021

    • Author(s)
      Zechen Bai, Yuta Nakashima, Noa Garcia
    • Organizer
      IEEE/CVF International Conference on Computer Vision
    • Related Report
      2021 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Visual question answering with textual representations for images2021

    • Author(s)
      Yusuke Hirota, Noa Garcia, Mayu Otani, Chenhui Chu, Yuta Nakashima, Ittetsu Taniguchi, Takao Onoye
    • Organizer
      IEEE/CVF International Conference on Computer Vision Workshops
    • Related Report
      2021 Annual Research Report
    • Int'l Joint Research
  • [Presentation] The Laughing Machine: Predicting Humor in Video2021

    • Author(s)
      Yuta Kayatani, Zekun Yang, Mayu Otani, Noa Garcia, Chenhui Chu, Yuta Nakashima, Haruo Takemura
    • Organizer
      2021 IEEE Winter Conference on Applications of Computer Vision
    • Related Report
      2020 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Uncovering Hidden Challenges in Query-Based Video Moment Retrieval2020

    • Author(s)
      Mayu Otani, Yuta Nakashima, Esa Rahtu, Janne Heikkila
    • Organizer
      31st British Machine Vision Conference
    • Related Report
      2020 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Knowledge-Based Visual Question Answering in Videos2020

    • Author(s)
      Noa Garcia, Mayu Otani, Chenhui Chu, Yuta Nakashima
    • Organizer
      2020 Conference on Computer Vision and Pattern Recognition Workshops
    • Related Report
      2020 Annual Research Report
    • Int'l Joint Research
  • [Presentation] A Dataset and Baselines for Visual Question Answering on Art2020

    • Author(s)
      Noa Garcia, Chentao Ye, Zihua Liu, Qingtao Hu, Mayu Otani, Chenhui Chu, Yuta Nakashima, Teruko Mitamura
    • Organizer
      2020 Workshop on Computer Vision for Art
    • Related Report
      2020 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Knowledge-Based Video Question Answering with Unsupervised Scene Descriptions2020

    • Author(s)
      Noa Garcia, Yuta Nakashima
    • Organizer
      European Conference on Computer Vision
    • Related Report
      2020 Annual Research Report
    • Int'l Joint Research
  • [Presentation] What We All Need Are Non-trivial Baselines and Sanity Checks2020

    • Author(s)
      Mayu Otani, Yuta Nakashima, Esa Rahtu, Janne Heikkila
    • Organizer
      The 23rd Meeting on Image Recognition and Understanding (MIRU)
    • Related Report
      2020 Annual Research Report
  • [Presentation] BERT representations for video question answering2020

    • Author(s)
      Zekun Yang, Noa Garcia, Chenhui Chu, Mayu Otani, Yuta Nakashima, and Haruo Takemura
    • Organizer
      IEEE Winter Conference on Applications of Computer Vision
    • Related Report
      2019 Annual Research Report
    • Int'l Joint Research
  • [Presentation] KnowIT VQA: Answering knowledge-based questions about video2020

    • Author(s)
      Noa Garcia, Chenhui Chu, Mayu Otani, and Yuta Nakashima
    • Organizer
      AAAI Conference on Artificial Intelligence
    • Related Report
      2019 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Adaptive gating mechanism for identifying visually grounded paraphrases2019

    • Author(s)
      Mayu Otani, Chenhui Chu, and Yuta Nakashima
    • Organizer
      Multi-Discipline Approach for Learning Concepts
    • Related Report
      2019 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Rethinking the evaluation of video summaries2019

    • Author(s)
      Mayu Otani, Yuta Nakashima, Esa Rahtu, and Janne Heikkila
    • Organizer
      IEEE Conference on Computer Vision and Pattern Recognition
    • Related Report
      2019 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Context-aware embeddings for automatic art analysis2019

    • Author(s)
      Noa Garcia, Benjamin Renoust, and Yuta Nakashima
    • Organizer
      ACM International Conference on Multimedia Retrieval
    • Related Report
      2019 Annual Research Report 2018 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Video meets knowledge in visual question answering2019

    • Author(s)
      Noa Garcia, Chenhui Chu, Mayu Otani, and Yuta Nakashima
    • Organizer
      The 22nd Meeting on Image Recognition and Understanding (MIRU)
    • Related Report
      2019 Annual Research Report
  • [Presentation] Collecting relation-aware video captions2019

    • Author(s)
      Mayu Otani, Kazuhiro Ota, Yuta Nakashima, Esa Rahtu, Janne Heikkila, and Yoshitaka Ushiku
    • Organizer
      The 22nd Meeting on Image Recognition and Understanding (MIRU)
    • Related Report
      2019 Annual Research Report
  • [Presentation] Video question answering with BERT2019

    • Author(s)
      Zekun Yang, Noa Garcia, Chenhui Chu, Mayu Otani, Yuta Nakashima, and Haruo Takemura
    • Organizer
      The 22nd Meeting on Image Recognition and Understanding (MIRU)
    • Related Report
      2019 Annual Research Report
  • [Presentation] Predicting laughter using subtitles and facial expressions in comedy dramas2019

    • Author(s)
      Yuta Kayatani, Mayu Otani, Chenhui Chu, Yuta Nakashima, and Haruo Takemura
    • Organizer
      2019 Annual Conference of the Japanese Society for Artificial Intelligence
    • Related Report
      2019 Annual Research Report
  • [Presentation] Understanding art through multi-modal retrieval in paintings2019

    • Author(s)
      Noa Garcia, Benjamin Renoust, and Yuta Nakashima
    • Organizer
      Language and Vision Workshop
    • Related Report
      2019 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Rethinking the evaluation of video summaries2019

    • Author(s)
      Mayu Otani, Yuta Nakashima, Esa Rahtu, and Janne Heikkila
    • Organizer
      IEEE Computer Society Conference on Computer Vision and Pattern Recognition
    • Related Report
      2018 Annual Research Report
    • Int'l Joint Research
  • [Presentation] iParaphrasing: Extracting visually grounded paraphrases via an image2018

    • Author(s)
      Chenhui Chu, Mayu Otani, and Yuta Nakashima
    • Organizer
      27th International Conference on Computational Linguistics
    • Related Report
      2018 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Phrase localization-based visually grounded paraphrase identification2018

    • Author(s)
      Mayu Otani, Chenhui Chu, and Yuta Nakashima
    • Organizer
      The 21st Meeting on Image Recognition and Understanding (MIRU)
    • Related Report
      2018 Annual Research Report
  • [Presentation] Visually grounded paraphrase extraction via phrase grounding2018

    • Author(s)
      Mayu Otani, Chenhui Chu, and Yuta Nakashima
    • Organizer
      Workshop on Language and Vision at CVPR
    • Related Report
      2018 Annual Research Report
  • [Remarks] KnowIT VQA

    • URL

      https://knowit-vqa.github.io/

    • Related Report
      2021 Annual Research Report
  • [Remarks] Art Description Generation

    • URL

      https://sites.google.com/view/art-description-generation

    • Related Report
      2021 Annual Research Report
  • [Remarks] KnowIT VQA Paper

    • URL

      https://knowit-vqa.github.io

    • Related Report
      2019 Annual Research Report
  • [Remarks] Knowledge VQA

    • URL

      https://www.n-yuta.jp/project/knowledge-vqa/

    • Related Report
      2019 Annual Research Report

Published: 2018-04-23   Modified: 2023-01-30  

Powered by NII kakenhi