1999 Fiscal Year Annual Research Report

Synthesis and Recognition of Natural Speech Based on the Formulation of Prosody

Research Project

Project/Area Number	09480061
Research Institution	The University of Tokyo
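Principal Investigator	広瀬啓吉, The University of Tokyo, Graduate School of Frontier Sciences, Professor (50111472)
Co-Investigator (Kenkyū-buntansha)	峯松信明, Toyohashi University of Technology, Faculty of Engineering, Research Associate (90273333)
Keywords	prosodic features / speech synthesis / speech recognition / dialogue speech / mora duration / mora transition probability model / beam search / academic information retrieval system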
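Research Abstract	The aim of this project is to actively exploit the prosodic features of speech in order to advance speech synthesis and speech recognition. This fiscal year, focusing mainly on the use of prosody in speech recognition, we obtained the following results.
1. We created mora duration control rules for dialogue-style speech synthesis. The basic approach is to change the duration of each mora of read-style speech to that of dialogue speech, one prosodic phrase at a time, where the prosodic phrases are defined by the fundamental frequency (F0) pattern.
2. We had already built a method that represents the F0 patterns of accent phrases with a mora transition probability model and uses it to detect accent phrase boundaries in continuous speech. After improving the coding and related details, we obtained speaker-independent detection rates of 70% to 75% with insertion error rates of 11% to 15%. Applying the method to continuous speech recognition without vocabulary constraints, we confirmed an improvement of several percent in the mora recognition rate. We also examined turning the mora transition probability model into a continuous-distribution model and obtained an improvement of several percent in the boundary detection rate.
3. Using the mora transition probability model, we developed a method that synthesizes F0 patterns from accent types and phrase boundary positions given as input. Introducing mora bi-grams into pattern generation with the Viterbi algorithm produced natural prosody.
4. We developed a method that uses prosodic boundary information obtained from F0 patterns to prune the Viterbi beam search of large-vocabulary continuous speech recognition efficiently. Pruning is loosened at prosodic boundaries and tightened between them. With the search space reduced to roughly 1/4 or less, the method achieved an equal or higher recognition rate. We further developed a method that uses prosodic boundary information to control the selection of cross-word context-dependent phoneme models, which improved performance, especially in sentence accuracy.
For the spoken dialogue system whose task is academic information retrieval, we improved its functions and gave the response speech the prosodic features of dialogue speech. Listening experiments focusing on the prosody of the response speech showed, among other results, that prosodically emphasizing the content of the answer makes the system easier to use.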
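The boundary-dependent pruning in item 4 can be illustrated with a minimal sketch. The report gives neither thresholds nor implementation details, so everything here is an assumption for illustration: the function names `beam_width` and `prune`, the representation of hypotheses as a dictionary of log scores, the `margin` parameter, and the numeric beam widths.

```python
from typing import Dict, Set


def beam_width(frame: int, boundaries: Set[int],
               wide: float, narrow: float, margin: int = 0) -> float:
    """Return the beam width (in log-score units) for a frame.

    The beam is wide (loose pruning) within `margin` frames of a
    prosodic boundary and narrow (tight pruning) everywhere else.
    """
    near_boundary = any(abs(frame - b) <= margin for b in boundaries)
    return wide if near_boundary else narrow


def prune(hyps: Dict[str, float], width: float) -> Dict[str, float]:
    """Keep hypotheses whose log score is within `width` of the best one."""
    best = max(hyps.values())
    return {h: s for h, s in hyps.items() if s >= best - width}


# Hypothetical active hypotheses and their log scores at one frame.
hyps = {"a": -1.0, "b": -3.0, "c": -8.0}
boundaries = {10}  # hypothetical prosodic boundary at frame 10

# At the boundary the beam is wide, so all three hypotheses survive.
at_boundary = prune(hyps, beam_width(10, boundaries, wide=10.0, narrow=4.0))

# Between boundaries the beam is narrow, so "c" is pruned.
between = prune(hyps, beam_width(5, boundaries, wide=10.0, narrow=4.0))
```

A decoder would call `beam_width` once per frame and pass the result to its pruning step. The intuition is that more word hypotheses compete near a prosodic boundary, so the search keeps more candidates there and fewer elsewhere.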

Research Products
(16 results)

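[Publications] 岩野公司: "Stochastic modeling of fundamental frequency patterns in mora units and its use for detecting accent phrase boundaries" Transactions of Information Processing Society of Japan. 40・4. 1356-1364 (1999)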
[Publications] 桜井淳宏: "Designing a parameter-based prosodic speech database" Proc. Oriental COCOSDA Workshop. 5-8 (1999)
[Publications] 広瀬啓吉: "Statistical modeling of prosodic features and its use for speech recognition" Proc. International Conf. on Speech Processing. 1. 43-52 (1999)
[Publications] 広瀬啓吉: "Generation of speech reply in a spoken dialogue system for literature retrieval" Proc. ESCA TR Workshop on Interactive Dialogue in Multi-Modal Systems. 29-32 (1999)
[Publications] 川波弘道: "Speech rate control for dialogue speech synthesis based on the prosodic structures" Proc. ESCA TR Workshop on Dialogue and Prosody. 59-64 (1999)
[Publications] 広瀬啓吉: "Tone recognition of Chinese continuous speech using tone critical segments" Proc. European Conf. on Speech Communication and Technology. 2. 879-882 (1999)
[Publications] 桜井淳宏: "Detecting accent sandhi in Japanese using a superpositional F_0 model" Proc. European Conf. on Speech Communication and Technology. 4. 1863-1866 (1999)
[Publications] 倪晋富: "A study on quantitative modeling of sentence fundamental frequency contours in standard Chinese" Proc. Japan-China Symposium on Advanced Information Technology. 39-46 (1999)
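[Publications] 峯松信明: "Automatic detection of stressed syllables in spoken English words using HMMs and prosodic evaluation of pronunciation proficiency based on it" Transactions of the IEICE. J82-D-II・11. 1865-1876 (1999)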
[Publications] 李時旭: "Dynamic beam search strategy using prosodic-syntactic information" Proc. IEEE Workshop on Automatic Speech Recognition and Understanding. 189-192 (1999)
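[Publications] 桜井淳宏: "Modeling and generation of F0 patterns based on mora-transition HMMs" Proc. Meeting of the Acoustical Society of Japan. I (in press). (2000)
[Publications] 倪晋富: "Quantitative analysis of phrasal fundamental frequency contours in standard Chinese" Proc. Meeting of the Acoustical Society of Japan. I (in press). (2000)
[Publications] 西澤信行: "Formant speech synthesis taking into account the influence of fundamental frequency" Proc. Meeting of the Acoustical Society of Japan. I (in press). (2000)
[Publications] 広瀬啓吉: "Technical prospects of speech synthesis toward the 21st century" Joho Shori (IPSJ Magazine). 41・3 (in press).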
[Publications] 広瀬啓吉: "Detection of prosodic word boundaries by statistical modeling of mora transitions of fundamental frequency contours and its use for continuous speech recognition" Proc. IEEE International Conf. on Acoustics, Speech, & Signal Processing. (in press). (2000)
[Publications] 張勁松: "Anchoring hypothesis and its application to tone recognition of Chinese continuous speech" Proc. IEEE International Conf. on Acoustics, Speech, & Signal Processing. (in press). (2000)
