2010 Fiscal Year Annual Research Report

A Study on Conversational Speech Synthesis for Humanoid Spoken Dialogue Systems

Research Project

Project/Area Number	21800020
Research Institution	Tokyo Institute of Technology
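Principal Investigator	Takashi Nose, Tokyo Institute of Technology, Interdisciplinary Graduate School of Science and Engineering, Assistant Professor (90550591)
Keywords	Text-to-speech synthesis / Hidden Markov model / Conversational speech / Speaker adaptation / HMM-based speech synthesis / Humanoid robot / Spoken dialogue system / Voice conversion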
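Research Abstract	This research comprises the research and development of the fundamental component technologies for recognizing and synthesizing diverse speech, with the aim of realizing a humanoid spoken dialogue system. This fiscal year, results were obtained on the following three items.
(1) For conversational speech synthesis, we proposed a two-stage model adaptation that introduces an average voice model trained on read speech into model training, so that natural synthetic speech can be generated from only a small amount of the target speaker's speech. This minimizes the use of conversational speech, which is costly to record and label. By exploiting the large read-speech databases that are already available, it became possible to synthesize speech that balances conversational quality and naturalness. We also showed that model training that accounts for expressions characteristic of conversational speech, such as emphasis and rising phrase-final intonation, allows these expressions to be reflected in the synthetic speech.
(2) For voice conversion, which is attracting attention as a technique for diversifying speech synthesis, we proposed a method based on adaptive F0 quantization to improve the conversion accuracy of fundamental frequency (F0) information, which represents the pitch of the voice. We also proposed a voice conversion method based on a speaker-independent model, aimed at easy conversion between arbitrary speakers. Furthermore, we proposed a method that combines hidden Markov models (HMMs) with Gaussian mixture models to avoid the existing dependence on phoneme recognition accuracy. Because applying these methods to read speech gave good results, we will next examine speech containing more diverse expressions, such as emotional speech and conversational speech.
(3) In HMM-based speech synthesis, the phonetic and prosodic information of speech is taken into account as context in order to appropriately represent and model variations such as pitch and rhythm. This fiscal year, in addition to last year's evaluation on Japanese speech, we also evaluated English speech with the aim of realizing multilingual speech synthesis.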

Research Products
(16 results)

Journal Article (5 results, of which Peer Reviewed: 5 results), Presentation (11 results)

[Journal Article] HMM-based voice conversion using quantized F0 context (2010)
- Author(s)
  Takashi Nose, Yuhei Ota, Takao Kobayashi
- Journal Title
  IEICE Trans. on Information and Systems
  Volume: E93-D, No. 9 Pages: 2483-2490
- Peer Reviewed
[Journal Article] Evaluation of prosodic contextual factors for HMM-based speech synthesis (2010)
- Author(s)
  Shuji Yokomizo, Takashi Nose, Takao Kobayashi
- Journal Title
  Proc. 11th Annual Conference of the International Speech Communication Association
  Pages: 430-433
- Peer Reviewed
[Journal Article] Conversational spontaneous speech synthesis using average voice model (2010)
- Author(s)
  Tomoki Koriyama, Takashi Nose, Takao Kobayashi
- Journal Title
  Proc. 11th Annual Conference of the International Speech Communication Association
  Pages: 853-856
- Peer Reviewed
[Journal Article] Speaker-independent HMM-based voice conversion using quantized fundamental frequency (2010)
- Author(s)
  Takashi Nose, Takao Kobayashi
- Journal Title
  Proc. 11th Annual Conference of the International Speech Communication Association
  Pages: 1724-1727
- Peer Reviewed
[Journal Article] HMM-based robust voice conversion using adaptive F0 quantization (2010)
- Author(s)
  Takashi Nose, Takao Kobayashi
- Journal Title
  Proc. 7th ISCA Workshop on Speech Synthesis
  Pages: 80-85
- Peer Reviewed
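[Presentation] A study of phonetic and prosodic contexts for conversational speech synthesis using the Corpus of Spontaneous Japanese (2011)
- Author(s)
  Tomoki Koriyama, Takashi Nose, Takao Kobayashi
- Organizer
  Acoustical Society of Japan 2011 Spring Meeting
- Place of Presentation
  Waseda University, Shinjuku, Tokyo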
- Year and Date
  2011-03-11
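[Presentation] A study of prosodic contexts for HMM-based speech synthesis with diverse speaking styles (2011)
- Author(s)
  Yu Maeno, Takashi Nose, Takao Kobayashi, Yusuke Ijima, Hideharu Nakajima, Hideyuki Mizuno, Osamu Yoshioka
- Organizer
  Acoustical Society of Japan 2011 Spring Meeting
- Place of Presentation
  Waseda University, Shinjuku, Tokyo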
- Year and Date
  2011-03-09
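[Presentation] A study of voice conversion with non-parallel data using synthetic speech (2011)
- Author(s)
  史潤宇, Takashi Nose, Takao Kobayashi
- Organizer
  Acoustical Society of Japan 2011 Spring Meeting
- Place of Presentation
  Waseda University, Shinjuku, Tokyo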
- Year and Date
  2011-03-09
[Presentation] Speaker-independent HMM-based voice conversion using quantized fundamental frequency (2010)
- Author(s)
  Takashi Nose, Takao Kobayashi
- Organizer
  11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010
- Place of Presentation
  Makuhari, Japan
- Year and Date
  2010-09-29
[Presentation] Conversational spontaneous speech synthesis using average voice model (2010)
- Author(s)
  Tomoki Koriyama, Takashi Nose, Takao Kobayashi
- Organizer
  11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010
- Place of Presentation
  Makuhari, Japan
- Year and Date
  2010-09-28
[Presentation] Evaluation of prosodic contextual factors for HMM-based speech synthesis (2010)
- Author(s)
  Shuji Yokomizo, Takashi Nose, Takao Kobayashi
- Organizer
  11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010
- Place of Presentation
  Makuhari, Japan
- Year and Date
  2010-09-27
[Presentation] HMM-based robust voice conversion using adaptive F0 quantization (2010)
- Author(s)
  Takashi Nose, Takao Kobayashi
- Organizer
  7th ISCA Workshop on Speech Synthesis, SSW7
- Place of Presentation
  Kyoto, Japan
- Year and Date
  2010-09-27
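[Presentation] Evaluation of prosodic contexts for HMM-based English speech synthesis (2010)
- Author(s)
  Shuji Yokomizo, Takashi Nose, Takao Kobayashi
- Organizer
  Acoustical Society of Japan 2010 Autumn Meeting
- Place of Presentation
  Kansai University, Suita, Osaka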
- Year and Date
  2010-09-16
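[Presentation] HMM-based speaker-independent voice conversion using speaker adaptation (2010)
- Author(s)
  Takashi Nose, Takao Kobayashi
- Organizer
  Acoustical Society of Japan 2010 Autumn Meeting
- Place of Presentation
  Kansai University, Suita, Osaka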
- Year and Date
  2010-09-16
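[Presentation] Quality improvement of HMM-based voice conversion using adaptive F0 quantization (2010)
- Author(s)
  Takashi Nose, Takao Kobayashi
- Organizer
  Acoustical Society of Japan 2010 Autumn Meeting
- Place of Presentation
  Kansai University, Suita, Osaka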
- Year and Date
  2010-09-16
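[Presentation] A study of conversational speech synthesis based on two-stage model adaptation (2010)
- Author(s)
  Tomoki Koriyama, Takashi Nose, Takao Kobayashi
- Organizer
  Acoustical Society of Japan 2010 Autumn Meeting
- Place of Presentation
  Kansai University, Suita, Osaka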
- Year and Date
  2010-09-15
