Project/Area Number |
21K17779
|
Research Category |
Grant-in-Aid for Early-Career Scientists
|
Allocation Type | Multi-year Fund |
Review Section |
Basic Section 61020:Human interface and interaction-related
|
Research Institution | The University of Tokyo |
Principal Investigator |
Zhang Xinlei The University of Tokyo, Interfaculty Initiative in Information Studies, Project Researcher (60898138)
|
Project Period (FY) |
2021-04-01 – 2022-03-31
|
Project Status |
Discontinued (Fiscal Year 2021)
|
Budget Amount |
¥4,680,000 (Direct Cost: ¥3,600,000, Indirect Cost: ¥1,080,000)
Fiscal Year 2023: ¥910,000 (Direct Cost: ¥700,000, Indirect Cost: ¥210,000)
Fiscal Year 2022: ¥1,430,000 (Direct Cost: ¥1,100,000, Indirect Cost: ¥330,000)
Fiscal Year 2021: ¥2,340,000 (Direct Cost: ¥1,800,000, Indirect Cost: ¥540,000)
|
Keywords | Tutoring Agent / Speech Recognition / Multi-Modal Interface / Language Learning / Device Wakeup |
Outline of Research at the Start |
This research aims to develop a one-of-a-kind language tutoring agent that is fully automated, adaptive, and user-configurable. To achieve this goal, I plan to 1) develop an architecture that lets users generate and customize the agent through simple text editing; 2) develop a technique for chunking the speech template to adjust difficulty; and 3) evaluate the usability and learning outcomes of such agents in self-study. The agent can serve either as a complementary assistant that helps language tutors train students or as a personalized tutor for individual students studying on their own.
|
Outline of Annual Research Achievements |
During the six months of this project, I achieved two main goals: 1) I developed a prototype system that lets users generate and customize the agent through simple text editing. The system takes a text file containing the transcript of the template speech and the user's feedback mode, then creates a tutoring agent accordingly for adaptive, personalized tutoring. This work is part of a paper published at EICS 2021.
2) I developed a novel method that lets users wake a device by changing their prosody when speaking the keyword (e.g., "Alexa"), enabling more accurate device activation. Evaluation studies show significant advantages of this method over keyword-spotting-based methods. The results are summarized in a paper currently under review at a top conference.
|