Real-time Speech Support Technology for Individuals with Speech Disorders Using Deep Learning

Research Project

Project/Area Number	25K03150
Research Category	Grant-in-Aid for Scientific Research (B)
Allocation Type	Multi-year Fund
Section	General
Review Section	Basic Section 61020: Human interface and interaction-related; Basic Section 62040: Entertainment and game informatics-related (sections subject to joint review)
Research Institution	The University of Tokyo
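Principal Investigator	暦本純一 (Junichi Rekimoto), The University of Tokyo, Interfaculty Initiative in Information Studies / Graduate School of Interdisciplinary Information Studies, Professor (20463896)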
Project Period (FY)	2025-04-01 – 2028-03-31
Project Status	Granted (Fiscal Year 2025)
Budget Amount	¥18,850,000 (Direct Cost: ¥14,500,000, Indirect Cost: ¥4,350,000); Fiscal Year 2027: ¥4,290,000 (Direct Cost: ¥3,300,000, Indirect Cost: ¥990,000); Fiscal Year 2026: ¥4,680,000 (Direct Cost: ¥3,600,000, Indirect Cost: ¥1,080,000); Fiscal Year 2025: ¥9,880,000 (Direct Cost: ¥7,600,000, Indirect Cost: ¥2,280,000)
Keywords	Human-computer interaction / Speech interaction / Deep learning
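Outline of Research at the Start	This research develops deep-learning-based speech support technology for people with voice disorders caused by conditions such as vocal cord polyps or spasmodic dysphonia, for people who have lost their vocal cords, and for people with hearing impairments. The applicant has been developing a self-supervised deep learning model that can convert hoarse or whispered speech into normal speech. This project extends that model to build voice conversion technology that restores the speech of people with voice disorders more naturally and in real time. It aims to realize four things: a deep learning model that converts disordered speech into normal speech; speech enhancement that does not depend on a particular language or speaker; voice conversion in which the target voice can be that of any speaker; and real-time voice conversion. Together these are intended to improve everyday convenience for people who have difficulty speaking.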