2013 Fiscal Year Final Research Report

Pronunciation education system based on the systematization of non-mother tongue speech prosody using generation process model and speech synthesis

Research Project

Project/Area Number	24652115
Research Category	Grant-in-Aid for Challenging Exploratory Research
Allocation Type	Single-year Grants
Research Field	Foreign language education
Research Institution	The University of Tokyo
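Principal Investigator	HIROSE Keikichi The University of Tokyo, Graduate School of Information Science and Technology, Professor (50111472)
Co-Investigator(Renkei-kenkyūsha)	KAWAI Goh Hokkaido University, Foreign Language Education Center, Associate Professor (70312981)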
Project Period (FY)	2012-04-01 – 2014-03-31
Keywords	Prosody systematization / Non-native speech / Generation process model / Speech synthesis / CALL for pronunciation education / Fundamental frequency contour / Word accent / Voice conversion
Research Abstract	Fundamental frequency (F0) contours of speech by native speakers and learners were analyzed using the generation process model. The analysis yielded several findings, for example that phrase components are less affected by language differences, and it showed the influence of the mother tongue in learners' utterances. Because learners' utterances contain F0 movements that are not observed in native speakers' utterances, an accent type identifier trained on native speakers' utterances does not work well on them. To solve this problem, a series of perceptual experiments was conducted using synthetic speech with systematically controlled F0 (positions of F0 movements and slope coefficients). Based on the results, a threshold method for deciding whether F0 is high or low was developed. In addition, generation process model constraints were applied to HMM-based speech synthesis, which improved speech quality. A pronunciation training system for Japanese accent types was developed and evaluated.

Research Products
(25 results)


Journal Article (6 results, of which Peer Reviewed: 5 results) / Presentation (19 results)

[Journal Article] Automatic recognition of gemination in Japanese motivated by perceptual experiments (2014)
- Author(s)
  Greg Short, Keikichi Hirose, and Nobuaki Minematsu
- Journal Title
  Acoustical Science and Technology
  Volume: Vol.35, No.2 Pages: 73-85
- URL
  https://www.jstage.jst.go.jp/browse/ast
- Peer Reviewed
[Journal Article] Toward flexible and systematic control of fundamental frequencies in HMM-based speech synthesis (2013)
- Author(s)
  Keikichi Hirose
- Journal Title
  Journal of EPSJ (English Phonetics Society of Japan)
  Volume: Vol.18 Pages: 121-128
- URL
  http://www.cc.kochi-u.ac.jp/~tamasaki/EPSJ.htm
[Journal Article] Japanese lexical accent recognition for a CALL system by deriving classification equations with perceptual experiments (2013)
- Author(s)
  Greg Short, Keikichi Hirose, and Nobuaki Minematsu
- Journal Title
  Speech Communication
  Volume: Vol.55, Issue 10 Pages: 1064-1080
- URL
  http://www.sciencedirect.com/science/article/pii/S0167639313000927
- Peer Reviewed
[Journal Article] Generation of fundamental frequency contours for Thai speech using the tone nucleus model (2013)
- Author(s)
  Oraphan Krityakien, Keikichi Hirose, and Nobuaki Minematsu
- Journal Title
  Journal of Signal Processing, Research Institute of Signal Processing
  Volume: Vol.16, No.4 Pages: 135-138
- URL
  https://www.jstage.jst.go.jp/browse/jsp
- Peer Reviewed
[Journal Article] A method for generation of Mandarin F0 contours based on tone nucleus model and superpositional model (2012)
- Author(s)
  Qinghua Sun, Keikichi Hirose, and Nobuaki Minematsu
- Journal Title
  Speech Communication
  Volume: Vol.54, Issue 8 Pages: 932-945
- URL
  http://www.sciencedirect.com/science/article/pii/S0167639312000349
- Peer Reviewed
[Journal Article] Applying generation process model constraint to fundamental frequency contours generated by hidden-Markov-model-based speech synthesis (2012)
- Author(s)
  Tatsuya Matsuda, Keikichi Hirose, and Nobuaki Minematsu
- Journal Title
  Acoustical Science and Technology, Acoustical Society of Japan
  Volume: Vol.33, No.4 Pages: 221-228
- URL
  https://www.jstage.jst.go.jp/browse/ast
- Peer Reviewed
[Presentation] Hierarchical stress generation with Fujisaki model in expressive speech synthesis (2014)
- Author(s)
  Ya Li, Jianhua Tao, Keikichi Hirose, Wei Lai, and Xiaoying Xu
- Organizer
  Proceedings of International Conference on Speech Prosody
- Place of Presentation
  Dublin(pp.1032-1036)
- Year and Date
  20140520-23
[Presentation] Selection of training data for HMM-based speech synthesis from prosodic features - Use of generation process model of fundamental frequency contours (2014)
- Author(s)
  Tomoyuki Mizukami, Hiroya Hashimoto, Keikichi Hirose, Daisuke Saito, and Nobuaki Minematsu
- Organizer
  Proceedings of International Conference on Speech Prosody
- Place of Presentation
  Dublin(pp.1042-1046)
- Year and Date
  20140520-23
[Presentation] Rhythmic patterns in native and non-native Mandarin speech (2014)
- Author(s)
  Wentao Gu and Keikichi Hirose
- Organizer
  Proceedings of International Conference on Speech Prosody
- Place of Presentation
  Dublin(pp.592-596)
- Year and Date
  20140520-23
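[Presentation] Experimental study of HMM-based speech synthesis considering F0 contour differences in the generation process model [生成過程モデルにおけるF0パターン差分を考慮したHMM音声合成の実験的検討] (2014)
- Author(s)
  百武恭汰, 橋本浩弥, 齋藤大輔, 峯松信明, 広瀬啓吉
- Organizer
  Proceedings of the Meeting of the Acoustical Society of Japan
- Place of Presentation
  Nihon University (1-R5-14, pp.407-408)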
- Year and Date
  20140310-12
[Presentation] Rhythmic Patterns of Nonnative Mandarin Speech (2014)
- Author(s)
  Wentao Gu and Keikichi Hirose
- Organizer
  Proceedings of the Meeting of the Acoustical Society of Japan
- Place of Presentation
  Nihon University (3-6-16, pp.357-360)
- Year and Date
  20140310-12
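[Presentation] A study of voice conversion based on matrix-variate Gaussian mixture distributions [行列変量ガウス混合分布に基づく声質変換の検討] (2014)
- Author(s)
  土井秀信, 齋藤大輔, 峯松信明, 広瀬啓吉
- Organizer
  Proceedings of the Meeting of the Acoustical Society of Japan
- Place of Presentation
  Nihon University (1-R5-21, pp.425-428)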
- Year and Date
  20140310-12
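[Presentation] Automatic parameter extraction for the generation process model of fundamental frequency contours using the bunsetsu as the basic unit [文節を基本単位とした基本周波数パターン生成過程モデルのパラメータ自動抽出] (2013)
- Author(s)
  橋本浩弥, 広瀬啓吉, 峯松信明
- Organizer
  Proceedings of the Meeting of the Acoustical Society of Japan
- Place of Presentation
  Toyohashi University of Technology (1-P-5a, pp.327-328)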
- Year and Date
  20130925-27
[Presentation] Context labels based on "bunsetsu" for HMM-based speech synthesis of Japanese (2013)
- Author(s)
  Hiroya Hashimoto, Keikichi Hirose, and Nobuaki Minematsu
- Organizer
  Proceedings 8th ISCA Workshop on Speech Synthesis (SSW-8)
- Place of Presentation
  Barcelona
- Year and Date
  20130831-0903
[Presentation] Generation of fundamental frequency contours for Thai speech synthesis using tone nucleus model (2013)
- Author(s)
  Oraphan Krityakien, Nobuaki Minematsu, and Keikichi Hirose
- Organizer
  Proceedings INTERSPEECH 2013
- Place of Presentation
  Lyon
- Year and Date
  20130826-29
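[Presentation] Improved focus control based on command differences in the generation process model of fundamental frequency contours [基本周波数パターン生成過程モデルの指令差分に基づく焦点制御の改良] (2013)
- Author(s)
  川口拓也, 橋本浩弥, 広瀬啓吉, 峯松信明
- Organizer
  Proceedings of the Meeting of the Acoustical Society of Japan
- Place of Presentation
  Tokyo University of Technology (3-P-34b, pp.505-506)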
- Year and Date
  20130313-15
[Presentation] Use of generation process model for synthesizing fundamental frequency contours in HMM-based speech synthesis (2012)
- Author(s)
  Keikichi Hirose, Hiroya Hashimoto, Jun Ikeshima, and Nobuaki Minematsu
- Organizer
  Proceedings IEEE International Conference on Signal Processing (ICSP'12)
- Place of Presentation
  Beijing(pp.575-578)
- Year and Date
  20121022-24
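[Presentation] A basic study of the relationship between accent nucleus perception and vowel length [アクセント核の知覚と母音長との関係の基礎的検討] (2012)
- Author(s)
  ショート・グレッグ, 広瀬啓吉, 峯松信明
- Organizer
  Proceedings of the Meeting of the Acoustical Society of Japan
- Place of Presentation
  Shinshu University (3-Q-31, pp.429-430)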
- Year and Date
  20120919-21
[Presentation] Tone nucleus model for Thai language speech synthesis (2012)
- Author(s)
  Oraphan Krityakien, Keikichi Hirose, and Nobuaki Minematsu
- Organizer
  Proceedings of the Meeting of the Acoustical Society of Japan
- Place of Presentation
  Shinshu University (3-Q14, pp.391-392)
- Year and Date
  20120919-21
[Presentation] An alignment matching method to explore pseudosyllable properties across different corpora (2012)
- Author(s)
  Raymond W. M. Ng, Thomas Hain, and Keikichi Hirose
- Organizer
  Proceedings INTERSPEECH 2012
- Place of Presentation
  Portland(CD-ROM Proceedings)
- Year and Date
  20120909-13
[Presentation] Effects of speaker adaptive training on tensor-based arbitrary speaker conversion (2012)
- Author(s)
  Daisuke Saito, Nobuaki Minematsu, and Keikichi Hirose
- Organizer
  Proceedings INTERSPEECH 2012
- Place of Presentation
  Portland(CD-ROM Proceedings)
- Year and Date
  20120909-13
[Presentation] Improved automatic extraction of generation process model commands and its use for generating fundamental frequency contours for training HMM-based speech synthesis (2012)
- Author(s)
  Hiroya Hashimoto, Keikichi Hirose, and Nobuaki Minematsu
- Organizer
  Proceedings INTERSPEECH 2012
- Place of Presentation
  Portland(CD-ROM Proceedings)
- Year and Date
  20120909-13
[Presentation] Fundamental frequency contour reshaping in HMM-based speech synthesis and realization of prosodic focus using generation process model (2012)
- Author(s)
  Keikichi Hirose, Hiroya Hashimoto, Jun Ikeshima, and Nobuaki Minematsu
- Organizer
  Proceedings of International Conference on Speech Prosody
- Place of Presentation
  Shanghai(SS1-3, pp.171-174)
- Year and Date
  20120522-25
[Presentation] Automatic segmentation of English words using phonotactic and syllable information (2012)
- Author(s)
  Raymond W. M. Ng and Keikichi Hirose
- Organizer
  Proceedings of International Conference on Speech Prosody
- Place of Presentation
  Shanghai(PS1A-6, pp.27-30)
- Year and Date
  20120522-25
[Presentation] Effects of learners' language transfer on native listeners' evaluation of the prosodic naturalness of Japanese words (2012)
- Author(s)
  Shuhei Kato, Greg Short, Nobuaki Minematsu, and Keikichi Hirose
- Organizer
  Proceedings of International Conference on Speech Prosody
- Place of Presentation
  Shanghai(pp.198-201)
- Year and Date
  20120522-25
