2009 Fiscal Year Annual Research Report

A Study on Spontaneous Speech Synthesis for Humanoid Spoken Dialogue Systems

Research Project

Project/Area Number	21800020
Research Category	Grant-in-Aid for Young Scientists (Start-up)
Research Institution	Tokyo Institute of Technology
Principal Investigator	Takashi Nose (能勢隆), Tokyo Institute of Technology, Interdisciplinary Graduate School of Science and Engineering, Assistant Professor (90550591)
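Keywords	Text-to-speech synthesis / Hidden Markov model / Spontaneous speech / Emotional speech / HMM-based speech synthesis / Humanoid robot / Spoken dialogue system / Robust speech recognition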
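Research Abstract	This research comprises the study and development of the fundamental component technologies for recognizing and synthesizing diverse speech, toward the realization of a humanoid spoken dialogue system. In this fiscal year, results were obtained on the following four items.
(1) To improve the recognition rate for speech with emotions and speaking styles, we proposed an online acoustic model adaptation technique based on the multiple-regression hidden Markov model (HMM) and confirmed the effectiveness of adapting the model to each input utterance. Experiments using the Corpus of Spontaneous Japanese (CSJ) confirmed that the technique is also effective for spontaneous speech.
(2) As a technique that can identify the emotions and speaking styles appearing in speech and also estimate the degree of their expression, we proposed a style estimation technique based on the multiple-regression hidden semi-Markov model, which simultaneously takes into account the spectrum, fundamental frequency, and phone duration of speech. Objective and subjective evaluation experiments on acted speech and spontaneous speech confirmed its effectiveness.
(3) Aiming at the synthesis of dialogue speech, the most spontaneous type of speech, we proposed an HMM-based dialogue speech synthesis technique, examined contexts for dialogue speech synthesis, and improved synthetic speech quality using an average voice based on dialogue speech. Experiments showed that the diverse expressions found in dialogue are reproduced.
(4) To make it easier to diversify speakers, emotions, and speaking styles in text-to-speech synthesis by reducing the cost of model training, we proposed an unsupervised model training technique based on the average voice and quantized fundamental frequency, and showed that it yields quality close to that of the conventional supervised training technique.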
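As a rough illustration of the "quantized fundamental frequency" idea in item (4), the following minimal sketch maps per-segment mean F0 values to a small number of discrete levels that could serve as prosodic context labels. It is not the method of the report: the function name `quantize_f0`, the uniform binning on a log scale, the choice of four levels, and the segment-level input are all assumptions made for illustration.

```python
import math


def quantize_f0(segment_f0_hz, num_levels=4):
    """Map per-segment mean F0 values (Hz) to discrete level indices.

    Hypothetical helper: unvoiced segments are passed as None and stay None.
    Voiced segments are binned uniformly on a log-F0 scale between the
    minimum and maximum observed in the input.
    """
    voiced = [math.log(f) for f in segment_f0_hz if f is not None]
    if not voiced:
        return [None] * len(segment_f0_hz)

    lo, hi = min(voiced), max(voiced)
    # Fall back to a width of 1.0 when all voiced values are identical.
    width = (hi - lo) / num_levels or 1.0

    levels = []
    for f in segment_f0_hz:
        if f is None:
            levels.append(None)
            continue
        idx = int((math.log(f) - lo) / width)
        # The maximum value lands on the upper edge; clamp it to the top bin.
        levels.append(min(idx, num_levels - 1))
    return levels


# Example: five segments, the third one unvoiced.
labels = quantize_f0([120.0, 180.0, None, 240.0, 150.0], num_levels=4)
```

Labels of this kind can be derived from the speech signal alone, which is why a quantization step is a natural fit for training without manually annotated prosodic labels.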

Research Products
(19 results)

All Journal Article (7 results) (of which Peer Reviewed: 7 results) Presentation (12 results)

[Journal Article] A rapid model adaptation technique for emotional speech recognition with style estimation based on multiple-regression HMM (2010)
- Author(s)
  Yusuke Ijima, Takashi Nose, Makoto Tachibana, Takao Kobayashi
- Journal Title
  IEICE Trans. on Information and Systems Vol.E93-D, No.
  Pages: 107-115
- Peer Reviewed
[Journal Article] A technique for estimating intensity of emotional expressions and speaking styles in speech based on multiple-regression HSMM (2010)
- Author(s)
  Takashi Nose, Takao Kobayashi
- Journal Title
  IEICE Trans. on Information and Systems Vol.E93-D, No.
  Pages: 116-124
- Peer Reviewed
[Journal Article] HMM-based speech synthesis with unsupervised labeling of accentual context based on F0 quantization and average voice model (2010)
- Author(s)
  Takashi Nose, Koujirou Ooki, Takao Kobayashi
- Journal Title
  Proc. 2010 IEEE International Conference on Acoustics, Speech and Signal Processing
  Pages: 4622-4625
- Peer Reviewed
[Journal Article] Emotional speech recognition based on style estimation and adaptation with multiple-regression HMM (2009)
- Author(s)
  Yusuke Ijima, Makoto Tachibana, Takashi Nose, Takao Kobayashi
- Journal Title
  Proc. 2009 IEEE International Conference on Acoustics, Speech and Signal Processing
  Pages: 4157-4160
- Peer Reviewed
[Journal Article] A robust speaker-adaptive HMM-based text-to-speech synthesis (2009)
- Author(s)
  Junichi Yamagishi, Takashi Nose, Heiga Zen, Zhen-Hua Ling, Tomoki Toda, Keiichi Tokuda, Simon King, Steve Renals
- Journal Title
  IEEE Trans. on Audio, Speech, and Language Processing Vol.17, No.6
  Pages: 1208-1230
- Peer Reviewed
[Journal Article] Speaking style adaptation for spontaneous speech recognition using multiple-regression HMM (2009)
- Author(s)
  Yusuke Ijima, Takeshi Matsubara, Takashi Nose, Takao Kobayashi
- Journal Title
  Proc. 10th Annual Conference of the International Speech Communication Association
  Pages: 552-555
- Peer Reviewed
[Journal Article] HMM-based speaker characteristics emphasis using average voice model (2009)
- Author(s)
  Takashi Nose, Junichi Asada, Takao Kobayashi
- Journal Title
  Proc. 10th Annual Conference of the International Speech Communication Association
  Pages: 2631-2634
- Peer Reviewed
[Presentation] HMM-based speech synthesis with unsupervised labeling of accentual context based on F0 quantization and average voice model (2010)
- Author(s)
  Takashi Nose, Koujirou Ooki, Takao Kobayashi
- Organizer
  2010 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2010
- Place of Presentation
  Dallas, Texas, USA
- Year and Date
  2010-03-17
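[Presentation] A study on utterance units for HMM-based dialogue speech synthesis (2010)
- Author(s)
  郡山知樹, 能勢隆, 小林隆夫
- Organizer
  Acoustical Society of Japan 2010 Spring Meeting
- Place of Presentation
  The University of Electro-Communications, Chofu, Tokyo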
- Year and Date
  2010-03-10
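[Presentation] Evaluation of HMM-based speech synthesis using quantized F0 prosodic context (2010)
- Author(s)
  大木康次郎, 能勢隆, 小林隆夫
- Organizer
  Acoustical Society of Japan 2010 Spring Meeting
- Place of Presentation
  The University of Electro-Communications, Chofu, Tokyo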
- Year and Date
  2010-03-09
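[Presentation] Evaluation of prosodic context in HMM-based speech synthesis (2010)
- Author(s)
  横溝秀始, 能勢隆, 小林隆夫
- Organizer
  Acoustical Society of Japan 2010 Spring Meeting
- Place of Presentation
  The University of Electro-Communications, Chofu, Tokyo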
- Year and Date
  2010-03-08
[Presentation] A study on dialogue speech synthesis based on average voice (2010)
- Author(s)
  郡山知樹, 能勢隆, 小林隆夫
- Organizer
  IEICE Technical Committee on Speech
- Place of Presentation
  Kyoto University, Kyoto
- Year and Date
  2010-01-21
[Presentation] HMM-based speech synthesis using prosodic context based on F0 quantization (2009)
- Author(s)
  大木康次郎, 能勢隆, 小林隆夫
- Organizer
  IEICE
- Place of Presentation
  The University of Tokyo, Bunkyo, Tokyo
- Year and Date
  2009-12-21
[Presentation] A study on HMM-based dialogue speech synthesis (2009)
- Author(s)
  郡山知樹, 能勢隆, 小林隆夫
- Organizer
  Acoustical Society of Japan 2009 Autumn Meeting
- Place of Presentation
  Nihon University, Koriyama, Fukushima
- Year and Date
  2009-09-15
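[Presentation] A study on unsupervised training of F0 models in HMM-based speech synthesis (2009)
- Author(s)
  大木康次郎, 能勢隆, 小林隆夫
- Organizer
  Acoustical Society of Japan 2009 Autumn Meeting
- Place of Presentation
  Nihon University, Koriyama, Fukushima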
- Year and Date
  2009-09-15
[Presentation] HMM-based speaker characteristics emphasis using average voice model (2009)
- Author(s)
  Takashi Nose, Junichi Asada, Takao Kobayashi
- Organizer
  10th Annual Conference of the International Speech Communication Association, INTERSPEECH 2009
- Place of Presentation
  Brighton, UK
- Year and Date
  2009-09-10
[Presentation] Speaking style adaptation for spontaneous speech recognition using multiple-regression HMM (2009)
- Author(s)
  Yusuke Ijima, Takeshi Matsubara, Takashi Nose, Takao Kobayashi
- Organizer
  10th Annual Conference of the International Speech Communication Association, INTERSPEECH 2009
- Place of Presentation
  Brighton, UK
- Year and Date
  2009-09-07
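[Presentation] Speaking style classification of spontaneous speech based on multiple-regression HMM (2009)
- Author(s)
  能勢隆, 松原健, 井島勇祐, 小林隆夫
- Organizer
  IEICE Technical Committee on Speech
- Place of Presentation
  Iizaka Hotel Juraku, Fukushima, Fukushima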
- Year and Date
  2009-07-18
[Presentation] Emotional speech recognition based on style estimation and adaptation with multiple-regression HMM (2009)
- Author(s)
  Yusuke Ijima, Makoto Tachibana, Takashi Nose, Takao Kobayashi
- Organizer
  2009 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2009
- Place of Presentation
  Taipei, Taiwan
- Year and Date
  2009-04-21
