
Linking Vision and Language through Computational Modelling

Research Project

Project/Area Number 19K12733
Research Category

Grant-in-Aid for Scientific Research (C)

Allocation Type Multi-year Fund
Section General
Review Section Basic Section 90030: Cognitive science-related
Research Institution Kobe City University of Foreign Studies

Principal Investigator

Chang Franklin, Kobe City University of Foreign Studies, Department of English and American Studies, Professor (60827343)

Project Period (FY) 2019-04-01 – 2024-03-31
Project Status Completed (Fiscal Year 2023)
Budget Amount
¥4,420,000 (Direct Cost: ¥3,400,000, Indirect Cost: ¥1,020,000)
Fiscal Year 2023: ¥780,000 (Direct Cost: ¥600,000, Indirect Cost: ¥180,000)
Fiscal Year 2022: ¥390,000 (Direct Cost: ¥300,000, Indirect Cost: ¥90,000)
Fiscal Year 2021: ¥780,000 (Direct Cost: ¥600,000, Indirect Cost: ¥180,000)
Fiscal Year 2020: ¥780,000 (Direct Cost: ¥600,000, Indirect Cost: ¥180,000)
Fiscal Year 2019: ¥1,690,000 (Direct Cost: ¥1,300,000, Indirect Cost: ¥390,000)
Keywords visual information / deep learning model / verbs / past tense / progressive form / endstate / children / adults / action understanding / deep learning / Japanese verbs / Vision / Language / Learning / Event understanding / Computational model / Deep Learning / Priming / Verbs / Syntax / Eyetracking / language / thematic roles / object tracking / connectionist model
Outline of Research at the Start

The first project will be the development of a computational model that can explain behavioral data from both adults and children in multiple object tracking tasks. This model will then be extended to address motion understanding, and a further project will link the resulting model of action understanding to language. To test the computational model, we will run a series of eye-tracking studies that probe its assumptions.
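To make the modelling target concrete: in a multiple object tracking task, observers follow several moving targets among distractors, and accuracy falls as the number of targets grows. The sketch below is a hypothetical toy simulation of that pattern using a simple nearest-neighbour tracker; it is not the project's model, and the function and parameter names (trial, frames, noise) are illustrative assumptions.

```python
# Toy sketch (assumption, not the project's model): limited-capacity
# tracking simulated as nearest-neighbour correspondence under noise.
import numpy as np

rng = np.random.default_rng(0)

def trial(n_targets: int, frames: int = 60, noise: float = 0.04) -> float:
    pos = rng.uniform(0, 1, (n_targets, 2))    # true positions in a unit display
    vel = rng.normal(0, 0.02, (n_targets, 2))  # constant velocities
    guess = pos.copy()                         # tracker's per-slot estimates
    for _ in range(frames):
        pos = pos + vel
        vel = np.where((pos < 0) | (pos > 1), -vel, vel)  # bounce off the edges
        pos = np.clip(pos, 0, 1)
        seen = pos + rng.normal(0, noise, pos.shape)      # noisy percept per frame
        # Greedy nearest-neighbour correspondence: each slot latches onto the
        # closest percept, so identity swaps occur when targets pass near each other.
        d = np.linalg.norm(guess[:, None] - seen[None, :], axis=2)
        guess = seen[d.argmin(axis=1)]
    # A target counts as still tracked if its slot ends nearest that target.
    d = np.linalg.norm(guess[:, None] - pos[None, :], axis=2)
    return float(np.mean(d.argmin(axis=1) == np.arange(n_targets)))

for n in (2, 4, 8):
    acc = np.mean([trial(n) for _ in range(200)])
    print(f"{n} targets: {acc:.2f} still tracked")  # accuracy declines with n
```

A behavioral model of this task would need to reproduce exactly this kind of load-dependent decline in adults and children.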

Outline of Final Research Achievements

Language is used to describe events that we see, but the relationship between visual and language representations is still not well understood. In this research, we focused on the visual cues that are used to select past tense (ran) and progressive aspect forms (is running). We created videos in which human-like characters performed actions such as running, and then added objects to the scene that signaled that the action's endstate had been reached. We found that both Japanese adults and 3- to 5-year-old children used the past tense more when the videos contained endstate information than when they did not. To understand how they mapped these visual signals onto language, we developed a deep learning model that tracked the motion of body parts and objects in the videos and used that information to generate Japanese verbs. The model explained our data and made predictions that were confirmed in a follow-up experiment. This work demonstrates that vision and language are tightly linked.
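The actual model is described in the Cognitive Science paper listed under Research Products below. Purely as a hedged sketch of the general approach (motion trajectories in, verb-form choice out), and not the authors' architecture, it might be set up along the following lines; the class name, feature dimensions, and labels here are all illustrative assumptions.

```python
# Minimal sketch (assumption, not the authors' model): a recurrent classifier
# that maps per-frame motion features (e.g., body-part and object coordinates
# extracted from a video) to a verb-form choice such as past vs. progressive.
import torch
import torch.nn as nn

class VerbFormClassifier(nn.Module):
    def __init__(self, n_features: int = 16, hidden: int = 64, n_forms: int = 2):
        super().__init__()
        # The GRU summarizes the motion trajectory over time.
        self.encoder = nn.GRU(n_features, hidden, batch_first=True)
        # A linear head scores each verb form (e.g., 0 = progressive, 1 = past).
        self.head = nn.Linear(hidden, n_forms)

    def forward(self, trajectories: torch.Tensor) -> torch.Tensor:
        # trajectories: (batch, frames, n_features)
        _, final_state = self.encoder(trajectories)  # (1, batch, hidden)
        return self.head(final_state.squeeze(0))     # (batch, n_forms) logits

model = VerbFormClassifier()
clips = torch.randn(8, 30, 16)  # 8 clips, 30 frames, 16 motion features each
logits = model(clips)
print(logits.shape)             # torch.Size([8, 2])
```

In such a setup, endstate cues in the scene would enter through the object-motion features, biasing the classifier toward the past-tense output, which is the qualitative effect reported above.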

Academic Significance and Societal Importance of the Research Achievements

This research investigated how adults and children use the visual information in videos to generate verbs and verb morphology. We developed a computational AI model that shows how Japanese speakers generate verbs from visual information. This model can help in creating visual materials to support first and second language acquisition. In addition, by helping to clarify how humans put visual information into words, this research is useful for building AI systems that speak Japanese.

Report (6 results)
  • 2023 Annual Research Report / Final Research Report (PDF)
  • 2022 Research-status Report
  • 2021 Research-status Report
  • 2020 Research-status Report
  • 2019 Research-status Report
Research Products (5 results)


Journal Article (4 results: Int'l Joint Research: 4, Peer Reviewed: 4, Open Access: 3) / Presentation (1 result)

  • [Journal Article] Visual Heuristics for Verb Production: Testing a Deep-Learning Model With Experiments in Japanese (2023)

    • Author(s)
      Chang Franklin, Tatsumi Tomoko, Hiranuma Yuna, Bannard Colin
    • Journal Title

      Cognitive Science

      Volume: 47 Issue: 8 Pages: 1-38

    • DOI

      10.1111/cogs.13324

    • Related Report
      2023 Annual Research Report
    • Peer Reviewed / Open Access / Int'l Joint Research
  • [Journal Article] Thematic role tracking difficulties across multiple visual events influences role use in language production (2022)

    • Author(s)
      Jessop Andrew, Chang Franklin
    • Journal Title

      Visual Cognition

      Volume: 30 Issue: 3 Pages: 151-173

    • DOI

      10.1080/13506285.2021.2013374

    • Related Report
      2022 Research-status Report 2021 Research-status Report
    • Peer Reviewed / Open Access / Int'l Joint Research
  • [Journal Article] Abstract structures and meaning in Japanese dative structural priming (2022)

    • Author(s)
      Chang Franklin, Tsumura Saki, Minemi Itsuki, Hirose Yuki
    • Journal Title

      Applied Psycholinguistics

      Volume: 43 Issue: 2 Pages: 411-433

    • DOI

      10.1017/s0142716421000576

    • Related Report
      2022 Research-status Report 2021 Research-status Report
    • Peer Reviewed / Open Access / Int'l Joint Research
  • [Journal Article] Thematic role information is maintained in the visual object-tracking system (2019)

    • Author(s)
      Jessop Andrew, Chang Franklin
    • Journal Title

      Quarterly Journal of Experimental Psychology

      Volume: 73 Issue: 1 Pages: 146-163

    • DOI

      10.1177/1747021819882842

    • Related Report
      2019 Research-status Report
    • Peer Reviewed / Int'l Joint Research
  • [Presentation] A deep learning verb production model using input from visual animations (2022)

    • Author(s)
      Chang, F., Tatsumi, T., Hiranuma, Y., & Bannard, C.
    • Organizer
      TL/Mental Architecture for Processing and Learning of Language conference
    • Related Report
      2022 Research-status Report


Published: 2019-04-18   Modified: 2025-01-30  
