
Predictive coding in children's context-based learning of homophones

Research Project

Project/Area Number 24KF0103
Research Category

Grant-in-Aid for JSPS Fellows

Allocation Type Multi-year Fund
Section Foreign
Review Section Basic Section 02060: Linguistics-related
Research Institution The University of Tokyo

Principal Investigator

Yukie Nagai (長井 志江)  The University of Tokyo, International Research Center for Neurointelligence, Project Professor (30571632)

Co-Investigator (Kenkyū-buntansha) LU YOUTAO  The University of Tokyo, International Research Center for Neurointelligence, JSPS International Research Fellow
Project Period (FY) 2024-07-26 – 2027-03-31
Project Status Granted (Fiscal Year 2024)
Budget Amount
¥2,100,000 (Direct Cost: ¥2,100,000)
Fiscal Year 2026: ¥300,000 (Direct Cost: ¥300,000)
Fiscal Year 2025: ¥900,000 (Direct Cost: ¥900,000)
Fiscal Year 2024: ¥900,000 (Direct Cost: ¥900,000)
Keywords homophone / word learning / corpus analysis / language development
Outline of Research at the Start

This research examines children’s homophone learning under the framework of predictive coding. It includes 1) corpus analysis to quantify the frequency of homophones and of disambiguating syntactic and semantic cues in children’s language input, 2) experiments to test children’s use of these cues when learning a novel homophone, 3) computational modeling to examine the predictability and learnability of homophones based on the provided cues, and 4) a comparison between Japanese (a homophone-rich language) and English (a homophone-sparse language) to elucidate the impact of language background.

Outline of Annual Research Achievements

Our main focus this year has been to examine the distribution of homophones (words with the same sound but different meanings) in children's early language input across languages. We selected comparable corpora from the CHILDES/TalkBank database for English, French, and Japanese, and analyzed words in maternal speech to children aged one to four.
As predicted, Japanese-learning children are exposed to significantly more homophones (about 14% of words heard) than English-learning (1.8%) and French-learning (2.6%) children. This cross-linguistic difference aligns with earlier findings in adult speech. We also found that while the proportion of homophones remains stable across age groups in English and French, it increases with age in Japanese-learning children (from 12% to 15%).
Detailed analyses showed results contrary to our hypotheses: homophone pairs or sets tend to be learned at similar ages and show smaller frequency gaps than non-homophones. This suggests that language input does not naturally separate them to aid learning. We also found that individual homophones can appear in more than one syntactic category, with frequent overlap within a homophone pair or set. While previous studies suggest toddlers use the differences in syntactic categories to learn homophones, our findings show that such cues are often inconsistent in natural input.
Our results so far suggest that homophone learning differs greatly across languages and is more complex than previously thought. It may be especially challenging for children learning Japanese.
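
As a rough illustration of how a homophone proportion like the ones reported above could be computed, the following Python sketch counts the share of word tokens whose sound form maps to more than one meaning. It is only a sketch under stated assumptions: the input format (pairs of sound form and meaning label), the function name `homophone_token_proportion`, and the toy data are hypothetical, and the report does not specify whether its percentages were computed over tokens or types, or how meanings were annotated.

```python
from collections import defaultdict

def homophone_token_proportion(tokens):
    """Share of tokens whose sound form has more than one meaning.

    tokens: list of (sound_form, meaning_label) pairs.
    This input format is an assumption made for illustration only.
    """
    if not tokens:
        return 0.0
    # Collect the set of distinct meanings attested for each sound form.
    meanings_by_form = defaultdict(set)
    for form, meaning in tokens:
        meanings_by_form[form].add(meaning)
    # A form counts as a homophone if it carries at least two meanings.
    homophone_forms = {f for f, m in meanings_by_form.items() if len(m) > 1}
    hits = sum(1 for form, _ in tokens if form in homophone_forms)
    return hits / len(tokens)

# Toy, hand-made data (not taken from CHILDES/TalkBank).
toy_tokens = [
    ("hashi", "bridge"), ("hashi", "chopsticks"), ("neko", "cat"),
    ("inu", "dog"), ("kami", "paper"), ("kami", "hair"),
    ("neko", "cat"), ("mizu", "water"),
]
proportion = homophone_token_proportion(toy_tokens)
print(f"{proportion:.2f}")  # 4 of 8 tokens are homophones: prints 0.50
```

Running the same function separately on the tokens for each age group would give the kind of age-by-age comparison described above, again assuming token-level counting.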

Current Status of Research Progress

2: Research has progressed on the whole more than it was originally planned.

Reason

We concluded that the project is progressing smoothly overall, based on our progress in both corpus analysis and experimental studies.
As summarized above, our corpus analysis has progressed beyond our initial expectations. We identified cross-linguistic differences that align with our hypotheses and previous studies, and we discovered novel developmental patterns. Furthermore, our detailed analyses yielded findings that challenge earlier results. While we are now concluding this part of the research as planned, these new findings open up many possibilities for further development. We plan to present our results at international conferences to gain further insight into potential extensions of the research.
The experimental studies, on the other hand, have been delayed. Although we have completed the preparation of the program and stimuli, unexpected findings about the syntactic categories of homophones suggest that our original experiment may not have as solid a theoretical foundation as anticipated. We are now actively discussing new experimental designs that reflect our latest insights, including one that investigates homophone learning both behaviorally and computationally within a curriculum learning framework.
Part of the delay also resulted from an unexpected change in personnel. A research assistant, who was a native speaker of both English and French, left the lab after December. This led to delays in annotating and analyzing language data in both languages. We are actively recruiting new candidates.

Strategy for Future Research Activity

The corpus analysis has provided us with a broad view of cross-linguistic differences in the distribution of homophones in children’s language input. A major direction for future work is to precisely track developmental changes in this measure. We have identified corpora in English, French, and Japanese that will allow us to rigorously compare changes in homophone exposure by month.
Thus far, our analyses have individually examined several cues that may help children distinguish homophones. We plan to extend these findings by comprehensively analyzing the predictability of homophones based on the sentences in which they appear. To do so, we will first use pretrained large language models to estimate predictability. If necessary, we also plan to train our own language models using child-directed speech to better reflect children’s actual language input.
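
To make the notion of predictability concrete, the sketch below computes the surprisal of a word given its preceding word, that is, the negative log probability of the word in context. It is a minimal sketch and not the project's method: an add-one-smoothed bigram model stands in for the pretrained large language models mentioned above, and the function names and toy sentences are hypothetical.

```python
import math
from collections import Counter

def train_bigram(sentences):
    """Count contexts and bigrams over tokenized sentences (lists of words)."""
    context_counts = Counter()
    bigram_counts = Counter()
    vocab = set()
    for sent in sentences:
        words = ["<s>"] + sent  # sentence-start marker
        vocab.update(words)
        for prev, cur in zip(words, words[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, cur)] += 1
    return context_counts, bigram_counts, len(vocab)

def surprisal(prev, word, context_counts, bigram_counts, vocab_size):
    """Surprisal in bits of `word` after `prev`, with add-one smoothing."""
    p = (bigram_counts[(prev, word)] + 1) / (context_counts[prev] + vocab_size)
    return -math.log2(p)

# Toy, hand-made sentences (not real child-directed speech).
toy_sentences = [
    ["the", "bat", "flew"],
    ["the", "bat", "flew"],
    ["a", "bat", "hit"],
    ["the", "ball", "rolled"],
]
model = train_bigram(toy_sentences)
s_bat = surprisal("the", "bat", *model)
s_ball = surprisal("the", "ball", *model)
print(s_bat < s_ball)  # "bat" is more predictable than "ball" after "the"
```

Lower surprisal means the context predicts the word better. With a real language model, the context would be the whole preceding sentence rather than a single word.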
In the next stage of the project, the focus of our research will gradually shift toward experimental studies. While previous studies acknowledge that syntactic and semantic cues can effectively support homophone learning in controlled laboratory settings, our analyses suggest that such cues may not be provided in a consistent manner in real language input. We plan to explore an alternative possibility based on our finding: a pair or a set of homophones tends to be introduced to children within a relatively short developmental window. We will examine whether this temporal proximity in exposure supports more efficient organization of the mental lexicon, and whether such effects differ across languages.

Report

(1 result)
  • 2024 Research-status Report
  • Research Products

    (1 result)

  • [Presentation] Innovations in the measurement and analysis of naturalistic infant behavior: from lab to life (2024)

    • Author(s)
      Youtao Lu, Jiarui Li, Tomoko Isomura
    • Organizer
      The 24th Annual Meeting of the Japanese Society of Baby Science (日本赤ちゃん学会第24回学術集会)
    • Related Report
      2024 Research-status Report

Published: 2024-07-29   Modified: 2025-12-26  