2022 Fiscal Year Final Research Report
Construction and evaluation of a prosody control model for effective information transmission by speech to the elderly
Project/Area Number |
20K11869
|
Research Category |
Grant-in-Aid for Scientific Research (C)
|
Allocation Type | Multi-year Fund |
Section | General |
Review Section |
Basic Section 61010:Perceptual information processing-related
|
Research Institution | Suwa University of Science |
Principal Investigator |
Mizuno Hideyuki Suwa University of Science, Faculty of Engineering, Professor (30833892)
|
Co-Investigator(Kenkyū-buntansha) |
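Nakajima Hideharu Nippon Telegraph and Telephone Corporation, NTT Communication Science Laboratories, Innovative Communication Laboratory, Senior Research Scientist (90832684)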
|
Project Period (FY) |
2020-04-01 – 2023-03-31
|
Keywords | Development of speech data for the elderly / Prosody analysis / Prosody model construction / Language model construction |
Outline of Final Research Achievements |
In 2020, we prepared 136 documents with labels marking their important parts and recorded a female speaker, rated as the easiest for elderly listeners to hear, reading them in two styles: speech directed at elderly listeners and plain read-aloud speech. In 2021, we compared the prosody of the two speech types and confirmed that elderly-directed speech has a higher mean F0 and a wider F0 range, and that the maximum F0 rises at important parts. In 2022, we built a prosody control model. Objective evaluation confirmed that it controls the maximum F0 with high accuracy (coefficient of determination of 0.75), but subjective evaluation using analysis-by-synthesis speech did not find an effect on ease of hearing. We also built a language model that predicts important parts and confirmed that it does so with a high accuracy of about 81%.
|
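The outline reports the accuracy of maximum-F0 control as a coefficient of determination. The report does not say how that value was computed, so the following is only a minimal sketch that assumes the standard definition, R^2 = 1 - SS_res / SS_tot. The function name and the F0 values are hypothetical and serve only as illustration; they are not data from the project.

```python
from typing import Sequence


def coefficient_of_determination(
    observed: Sequence[float], predicted: Sequence[float]
) -> float:
    """Return R^2 = 1 - SS_res / SS_tot for paired observed/predicted values."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: squared error of the predictions.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: spread of the observations around their mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


# Hypothetical maximum-F0 values in Hz (illustrative numbers only).
observed_f0_max = [220.0, 250.0, 280.0, 310.0]
predicted_f0_max = [230.0, 240.0, 290.0, 300.0]
r2 = coefficient_of_determination(observed_f0_max, predicted_f0_max)
```

Under this definition, a value of 1.0 means the predictions match the observations exactly, and a value of 0.0 means they do no better than always predicting the observed mean.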
Free Research Field |
Speech information processing
|
Academic Significance and Societal Importance of the Research Achievements |
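1) We collected parallel recordings in which a speaker rated as easy for elderly listeners to hear spoke the same texts both with elderly listeners in mind and in a read-aloud style, and statistically analyzed the prosodic differences between the two, thereby clarifying the prosodic characteristics of speech that is easy for elderly listeners to hear. 2) We built a prosody prediction model that predicts the prosody of elderly-directed speech from read-aloud speech, showed that it predicts with high accuracy, and showed that ordinary read-aloud speech can be converted into speech that is easy for elderly listeners to hear. 3) We showed that a language model can predict with high accuracy the parts of a document that are considered important from the viewpoint of information acquisition by the elderly.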
|