
Creating the first oral and written corpus of Japanese learners of Spanish as a foreign language

Research Project

Project/Area Number 23K00698
Research Category

Grant-in-Aid for Scientific Research (C)

Allocation Type Multi-year Fund
Section General
Review Section Basic Section 02100:Foreign language education-related
Research Institution Hiroshima University

Principal Investigator

GARCIA CARLOS  Hiroshima University, Institute for Foreign Language Research and Education, Associate Professor (30817169)

Co-Investigator (Kenkyū-buntansha) VALVERDE Pilar  Kansai Gaidai University, College of Foreign Studies, Associate Professor (10588205)
Project Period (FY) 2023-04-01 – 2026-03-31
Project Status Granted (Fiscal Year 2023)
Budget Amount
¥1,950,000 (Direct Cost: ¥1,500,000, Indirect Cost: ¥450,000)
Fiscal Year 2025: ¥520,000 (Direct Cost: ¥400,000, Indirect Cost: ¥120,000)
Fiscal Year 2024: ¥650,000 (Direct Cost: ¥500,000, Indirect Cost: ¥150,000)
Fiscal Year 2023: ¥780,000 (Direct Cost: ¥600,000, Indirect Cost: ¥180,000)
Keywords learner corpus / oral corpus / written corpus / Spanish / foreign language / Spanish Foreign Language
Outline of Research at the Start

In this research project, we aim to create the first oral and written corpus of learners of Spanish as a Foreign Language in Japan by merging and enriching two corpora we created in our previous work.

Outline of Annual Research Achievements

In this research project, we aim to create the first oral and written corpus of learners of Spanish as a Foreign Language (SFL) in Japan by merging and enriching two corpora we created in our previous work. The objective for FY2023 was to establish the requirements for integrating both corpora. A systematic review of SFL learner corpora available on the Internet guided us on how to maximize the potential use of the oral corpus in studies of linguistic and interactional phenomena.
We have established the steps for processing the transcriptions of the oral corpus for linguistic annotation. In short: 1) preprocessing for automatic part-of-speech (PoS) tagging; 2) PoS tagging with the Freeling tool; 3) adding metadata and XML tags.
We have also addressed two key issues: the logical structure of the metadata scheme and the XML tags. In the written corpus, each document is assigned to one participant, whereas in the oral corpus each transcript corresponds to a conversation between two participants. In addition, the current version of the written corpus uses XML tags for sentences and documents. The oral corpus also requires tags that delimit each turn and identify its participant.
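The structural difference described above can be illustrated with a minimal sketch. It is not the project's actual scheme: the element and attribute names (`doc`, `participants`, `participant`, `turn`, `who`, `s`) and the sample metadata are hypothetical, chosen only to show one document holding two participants, with turn tags added on top of the sentence and document tags.

```python
import xml.etree.ElementTree as ET


def build_conversation(doc_id, participants, turns):
    """Build one oral-corpus document.

    participants: dict mapping a participant id to a dict of metadata.
    turns: list of (speaker_id, [sentence, ...]) in conversational order.
    All tag and attribute names here are illustrative assumptions.
    """
    doc = ET.Element("doc", {"id": doc_id})
    header = ET.SubElement(doc, "participants")
    for pid, meta in participants.items():
        ET.SubElement(header, "participant", {"id": pid, **meta})
    for n, (speaker, sentences) in enumerate(turns, start=1):
        if speaker not in participants:
            raise ValueError(f"unknown participant: {speaker}")
        # Each turn records its position and which participant produced it.
        turn = ET.SubElement(doc, "turn", {"n": str(n), "who": speaker})
        for text in sentences:
            sentence = ET.SubElement(turn, "s")
            sentence.text = text
    return doc


conversation = build_conversation(
    "conv01",
    {"P01": {"l1": "ja"}, "P02": {"l1": "ja"}},
    [("P01", ["Hola", "¿Qué tal?"]), ("P02", ["Muy bien"])],
)
xml_string = ET.tostring(conversation, encoding="unicode")
print(xml_string)
```

Keeping participant metadata in a header and referring to it from each turn by id avoids repeating the metadata on every turn, which is one possible way to reconcile the one-participant-per-document layout of the written corpus with two-participant conversations.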
Finally, by using the HTML5 markup language for the oral corpus, it will be possible to read the transcription with its interactional codes and listen to the audio for each turn. This will enable studies of interactional phenomena. Preliminary tests exporting our transcriptions, made with the ELAN linguistic annotation software, to HTML5 have been successful, showing that this approach is feasible.
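As a rough illustration of what such an HTML5 version might look like, the sketch below renders each turn with its interactional codes intact, next to an `<audio>` element pointing at that turn's time span. It assumes the turns have already been extracted as (speaker, text, start, end) tuples; the file name, class name, and sample transcription codes are hypothetical. The time span uses the temporal media fragment syntax (`#t=start,end`), and how strictly playback stops at the end time depends on the browser.

```python
from html import escape


def turn_to_html(speaker, text, audio_src, start, end):
    """Render one turn as a paragraph with its own audio control."""
    # Temporal media fragment: play from `start` to `end` seconds.
    src = f"{audio_src}#t={start:.2f},{end:.2f}"
    return (
        f'<p class="turn" data-speaker="{escape(speaker, quote=True)}">'
        f"<b>{escape(speaker)}:</b> {escape(text)} "
        f'<audio controls preload="none" src="{escape(src, quote=True)}"></audio>'
        "</p>"
    )


def transcript_to_html(title, audio_src, turns):
    """Assemble a minimal HTML5 page from (speaker, text, start, end) tuples."""
    body = "\n".join(
        turn_to_html(speaker, text, audio_src, start, end)
        for speaker, text, start, end in turns
    )
    return (
        "<!DOCTYPE html>\n"
        '<html lang="es">\n'
        f'<head><meta charset="utf-8"><title>{escape(title)}</title></head>\n'
        f"<body>\n{body}\n</body>\n</html>"
    )


# Hypothetical sample data; the codes "(.)" and "<risas>" are placeholders.
sample_turns = [
    ("P01", "hola (.) qué tal", 0.0, 1.5),
    ("P02", "[bien] <risas>", 1.5, 2.75),
]
page = transcript_to_html("conv01", "conv01.wav", sample_turns)
print(page)
```

Escaping the transcription text matters here because interactional codes written with angle brackets would otherwise be read by the browser as markup.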

Current Status of Research Progress

3: Progress in research has been slightly delayed.

Reason

As our research objectives for FY2023 were successfully accomplished, we consider that this project is progressing rather smoothly. However, one crucial point could delay the project. In FY2024, funds are required as honoraria for a graduate student with sufficient knowledge of Spanish, Japanese and linguistics to prepare the data for processing with an automatic linguistic annotation tool. We have yet to find a graduate student with the required background.

Strategy for Future Research Activity

In FY2024 we aim to process the data for the integration of the oral corpus (Corpus of Natural Conversations) into the platform and online interface of the written corpus (CELEN corpus).
This processing will include: (1) preprocessing each transcription for PoS tagging (for instance, deleting transcription marks of interactional phenomena such as codes for pauses, fillers or overlapping); (2) automatic PoS tagging of each transcription using the Freeling tool; (3) assigning metadata and XML tags to each transcription; (4) preparing an HTML5 version of each transcription; (5) reviewing the materials; and (6) integrating the materials into the Sketch platform and online site of the written corpus (CELEN corpus). In addition, we will prepare the transcriptions of the oral corpus to be incorporated as HTML5 documents in the CELEN corpus online site.
In order to achieve these objectives, we need the collaboration of a graduate student under our guidance.
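The preprocessing in step (1) could be sketched as follows. The transcription conventions used here are assumptions for illustration only, not the corpus's documented codes: pauses written as `(.)` or `(1.5)`, overlapping speech enclosed in square brackets, and a small hypothetical list of fillers.

```python
import re

# Assumed conventions (hypothetical, for illustration only):
PAUSE = re.compile(r"\(\.\)|\(\d+(?:\.\d+)?\)")  # (.) or timed pauses like (1.5)
OVERLAP_BRACKETS = re.compile(r"[\[\]]")  # [ ] around overlapping speech
FILLER = re.compile(r"\b(?:eh+|em+|mm+)\b", re.IGNORECASE)  # eh, em, mm, mmm


def clean_turn(text):
    """Strip interactional marks so a PoS tagger sees only the words."""
    text = PAUSE.sub(" ", text)
    # Keep the overlapped words themselves; drop only the brackets.
    text = OVERLAP_BRACKETS.sub("", text)
    text = FILLER.sub(" ", text)
    # Collapse the whitespace left behind by the removals.
    return re.sub(r"\s+", " ", text).strip()


cleaned = clean_turn("eh (.) yo [vivo] en (1.5) mm Hiroshima")
print(cleaned)
```

With these assumed conventions, the sample turn is reduced to `yo vivo en Hiroshima`. The original transcription should be kept alongside the cleaned text, since the interactional codes are still needed for the HTML5 versions.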

Report

(1 results)
  • 2023 Research-status Report
  • Research Products

    (3 results)

All 2024 2023

All Journal Article (2 results) (of which Int'l Joint Research: 1 results,  Peer Reviewed: 2 results,  Open Access: 2 results) Presentation (1 results) (of which Int'l Joint Research: 1 results)

  • [Journal Article] Characteristics of Publicly Available Learner Corpora for the Study of Oral Interaction and Conversation in Spanish as a Foreign Language (2024)

    • Author(s)
      Carlos Garcia Ruiz-Castillo
    • Journal Title

      広島外国語教育研究 (Hiroshima Studies in Language and Language Education)

      Volume: 27 Issue: 27 Pages: 117-132

    • DOI

      10.15027/54916

    • URL

      https://hiroshima.repo.nii.ac.jp/records/2026480

    • Year and Date
      2024-03-01
    • Related Report
      2023 Research-status Report
    • Peer Reviewed / Open Access
  • [Journal Article] El corpus de aprendices japoneses CELEN y su aplicación a la docencia y la investigación en ELE (2023)

    • Author(s)
      Pilar Valverde
    • Journal Title

      TEISEL. Tecnologías para la investigación en segundas lenguas

      Volume: 3 Pages: 1-31

    • DOI

      10.1344/teisel.v3.42898

    • Related Report
      2023 Research-status Report
    • Peer Reviewed / Open Access / Int'l Joint Research
  • [Presentation] La conversación en ELE de aprendientes japoneses: posibles casos de influencia de la lengua materna y sus consecuencias en la interacción (2023)

    • Author(s)
      Carlos Garcia Ruiz-Castillo
    • Organizer
      XXXIII Congreso Internacional de ASELE
    • Related Report
      2023 Research-status Report
    • Int'l Joint Research

Published: 2023-04-13   Modified: 2024-12-25  
