2010 Fiscal Year Final Research Report

Study on speech synthesis for humanoid spoken dialog system

Research Project

Project/Area Number	21800020
Research Category	Grant-in-Aid for Research Activity Start-up
Allocation Type	Single-year Grants
Research Field	Perception information processing/Intelligent robotics
Research Institution	Tokyo Institute of Technology
Principal Investigator	NOSE Takashi Tokyo Institute of Technology, Interdisciplinary Graduate School of Science and Engineering, Assistant Professor (90550591)
Project Period (FY)	2009 – 2010
Keywords	text-to-speech synthesis / hidden Markov model / spontaneous speech / speaker adaptation / HMM-based speech synthesis / humanoid robot / spoken dialog system / voice conversion
Research Abstract	Two novel techniques and one investigation were presented as key speech synthesis technologies for developing a humanoid spoken dialog system: (1) spontaneous speech synthesis based on statistical parametric modeling; (2) speaker-independent voice conversion based on statistical parametric modeling; (3) an investigation of phonetic and prosodic contextual factors in speech synthesis.

Research Products
(20 results)

All Journal Article (12 results) (of which Peer Reviewed: 12 results) Presentation (8 results)

[Journal Article] HMM-based voice conversion using quantized F0 context (2010)
- Author(s)
  Takashi Nose, Yuhei Ota, Takao Kobayashi
- Journal Title
  IEICE Trans. on Information and Systems, vol. E93-D, no. 9
  Pages: 2483-2490
- Peer Reviewed
[Journal Article] Evaluation of prosodic contextual factors for HMM-based speech synthesis (2010)
- Author(s)
  Shuji Yokomizo, Takashi Nose, Takao Kobayashi
- Journal Title
  Proc. 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010
  Pages: 430-433
- Peer Reviewed
[Journal Article] Conversational spontaneous speech synthesis using average voice model (2010)
- Author(s)
  Tomoki Koriyama, Takashi Nose, Takao Kobayashi
- Journal Title
  Proc. 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010
  Pages: 853-856
- Peer Reviewed
[Journal Article] Speaker-independent HMM-based voice conversion using quantized fundamental frequency (2010)
- Author(s)
  Takashi Nose, Takao Kobayashi
- Journal Title
  Proc. 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010
  Pages: 1724-1727
- Peer Reviewed
[Journal Article] HMM-based robust voice conversion using adaptive F0 quantization (2010)
- Author(s)
  Takashi Nose, Takao Kobayashi
- Journal Title
  Proc. 7th ISCA Workshop on Speech Synthesis, SSW7-2010
  Pages: 80-85
- Peer Reviewed
[Journal Article] A rapid model adaptation technique for emotional speech recognition with style estimation based on multiple-regression HMM (2010)
- Author(s)
  Yusuke Ijima, Takashi Nose, Makoto Tachibana, Takao Kobayashi
- Journal Title
  IEICE Trans. on Information and Systems, vol. E93-D, no. 1
  Pages: 107-115
- Peer Reviewed
[Journal Article] A technique for estimating intensity of emotional expressions and speaking styles in speech based on multiple-regression HSMM (2010)
- Author(s)
  Takashi Nose, Takao Kobayashi
- Journal Title
  IEICE Trans. on Information and Systems, vol. E93-D, no. 1
  Pages: 116-124
- Peer Reviewed
[Journal Article] HMM-based speech synthesis with unsupervised labeling of accentual context based on F0 quantization and average voice model (2009)
- Author(s)
  Takashi Nose, Koujirou Ooki, Takao Kobayashi
- Journal Title
  Proc. 2010 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2010
  Pages: 4622-4625
- Peer Reviewed
[Journal Article] Speaking style adaptation for spontaneous speech recognition using multiple-regression HMM (2009)
- Author(s)
  Yusuke Ijima, Takeshi Matsubara, Takashi Nose, Takao Kobayashi
- Journal Title
  Proc. 10th Annual Conference of the International Speech Communication Association, INTERSPEECH 2009
  Pages: 552-555
- Peer Reviewed
[Journal Article] HMM-based speaker characteristics emphasis using average voice model (2009)
- Author(s)
  Takashi Nose, Junichi Asada, Takao Kobayashi
- Journal Title
  Proc. 10th Annual Conference of the International Speech Communication Association, INTERSPEECH 2009
  Pages: 2631-2634
- Peer Reviewed
[Journal Article] A robust speaker-adaptive HMM-based text-to-speech synthesis (2009)
- Author(s)
  Junichi Yamagishi, Takashi Nose, Heiga Zen, Zhenhua Ling, Tomoki Toda, Keiichi Tokuda, Simon King, Steve Renals
- Journal Title
  IEEE Trans. on Audio, Speech, and Language Processing, vol. 17, no. 6
  Pages: 1208-1230
- Peer Reviewed
[Journal Article] Emotional speech recognition based on style estimation and adaptation with multiple-regression HMM (2009)
- Author(s)
  Yusuke Ijima, Makoto Tachibana, Takashi Nose, Takao Kobayashi
- Journal Title
  Proc. 2009 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2009
  Pages: 4157-4160
- Peer Reviewed
[Presentation] Speaker-independent HMM-based voice conversion using quantized fundamental frequency (2010)
- Author(s)
  Takashi Nose, Takao Kobayashi
- Organizer
  Proc. 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010
- Place of Presentation
  Makuhari, Japan.
- Year and Date
  2010-09-29
[Presentation] Conversational spontaneous speech synthesis using average voice model (2010)
- Author(s)
  Tomoki Koriyama, Takashi Nose, Takao Kobayashi
- Organizer
  Proc. 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010
- Place of Presentation
  Makuhari, Japan.
- Year and Date
  2010-09-28
[Presentation] Evaluation of prosodic contextual factors for HMM-based speech synthesis (2010)
- Author(s)
  Shuji Yokomizo, Takashi Nose, Takao Kobayashi
- Organizer
  Proc. 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010
- Place of Presentation
  Makuhari, Japan.
- Year and Date
  2010-09-27
[Presentation] HMM-based robust voice conversion using adaptive F0 quantization (2010)
- Author(s)
  Takashi Nose, Takao Kobayashi
- Organizer
  Proc. 7th ISCA Workshop on Speech Synthesis, SSW7-2010
- Place of Presentation
  Kyoto, Japan.
- Year and Date
  2010-09-22
[Presentation] HMM-based speech synthesis with unsupervised labeling of accentual context based on F0 quantization and average voice model (2010)
- Author(s)
  Takashi Nose, Koujirou Ooki, Takao Kobayashi
- Organizer
  2010 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2010
- Place of Presentation
  Dallas, USA.
- Year and Date
  2010-03-17
[Presentation] HMM-based speaker characteristics emphasis using average voice model (2009)
- Author(s)
  Takashi Nose, Junichi Asada, Takao Kobayashi
- Organizer
  Proc. 10th Annual Conference of the International Speech Communication Association, INTERSPEECH 2009
- Place of Presentation
  Brighton, U.K.
- Year and Date
  2009-09-10
[Presentation] Speaking style adaptation for spontaneous speech recognition using multiple-regression HMM (2009)
- Author(s)
  Yusuke Ijima, Takeshi Matsubara, Takashi Nose, Takao Kobayashi
- Organizer
  Proc. 10th Annual Conference of the International Speech Communication Association, INTERSPEECH 2009
- Place of Presentation
  Brighton, U.K.
- Year and Date
  2009-09-07
[Presentation] Emotional speech recognition based on style estimation and adaptation with multiple-regression HMM (2009)
- Author(s)
  Yusuke Ijima, Makoto Tachibana, Takashi Nose, Takao Kobayashi
- Organizer
  Proc. 2009 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2009, 4157-4160
- Place of Presentation
  Taipei, Taiwan.
- Year and Date
  2009-04-21
