Pronunciation education system based on the systematization of non-mother tongue speech prosody using generation process model and speech synthesis

Research Project

Project/Area Number	24652115
Research Category	Grant-in-Aid for Challenging Exploratory Research
Allocation Type	Single-year Grants
Research Field	Foreign language education
Research Institution	The University of Tokyo
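Principal Investigator	HIROSE Keikichi The University of Tokyo, Graduate School of Information Science and Technology, Professor (50111472)
Co-Investigator(Renkei-kenkyūsha)	KAWAI Goh Hokkaido University, Foreign Language Education Center, Associate Professor (70312981)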
Project Period (FY)	2012-04-01 – 2014-03-31
Project Status	Completed (Fiscal Year 2013)
Budget Amount	¥3,640,000 (Direct Cost: ¥2,800,000, Indirect Cost: ¥840,000) Fiscal Year 2013: ¥1,950,000 (Direct Cost: ¥1,500,000, Indirect Cost: ¥450,000) Fiscal Year 2012: ¥1,690,000 (Direct Cost: ¥1,300,000, Indirect Cost: ¥390,000)
Keywords	prosody systematization / non-native speech / generation process model / speech synthesis / CALL for pronunciation education / fundamental frequency contour / word accent / voice conversion
Research Abstract	Fundamental frequency (F0) contours of speech by native speakers and learners are analyzed using the generation process model. The analysis yields several findings, for example that phrase components are less affected by language differences. In learners' utterances, the influence of their mother tongue is observed. Because learners' utterances contain F0 movements that do not occur in native speakers' utterances, an accent type identifier trained on native speakers' utterances does not work well. To solve this problem, a series of perceptual experiments is conducted using synthetic speech with systematically controlled F0 (positions of F0 movements, slope coefficients). Based on the results, a threshold method for high-low decisions on F0 is developed. In addition, generation process model constraints are applied to HMM-based speech synthesis, improving speech quality. A pronunciation training system for Japanese accent types is developed and evaluated.
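The generation process model named in the abstract is commonly known as the Fujisaki model: log F0 is the sum of a baseline value, phrase components (impulse responses), and accent components (step responses). The following is a minimal sketch of that standard formulation, not the project's own implementation. All function names, the parameter values (alpha, beta, gamma), and the example commands are illustrative assumptions.

```python
import math

def phrase_response(t, alpha=3.0):
    """Impulse response of the phrase control mechanism (assumed alpha)."""
    if t < 0:
        return 0.0
    return alpha * alpha * t * math.exp(-alpha * t)

def accent_response(t, beta=20.0, gamma=0.9):
    """Step response of the accent control mechanism, capped at gamma."""
    if t < 0:
        return 0.0
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), gamma)

def f0_contour(times, fb, phrase_cmds, accent_cmds):
    """Return F0 values (Hz) at the given times (s).

    fb          -- baseline frequency in Hz
    phrase_cmds -- list of (onset_time, magnitude) impulses
    accent_cmds -- list of (onset_time, offset_time, amplitude) steps
    """
    contour = []
    for t in times:
        log_f0 = math.log(fb)
        # Phrase components: one impulse response per phrase command.
        log_f0 += sum(ap * phrase_response(t - t0) for t0, ap in phrase_cmds)
        # Accent components: step response at onset minus step response at offset.
        log_f0 += sum(
            aa * (accent_response(t - t1) - accent_response(t - t2))
            for t1, t2, aa in accent_cmds
        )
        contour.append(math.exp(log_f0))
    return contour

# Hypothetical example: one phrase command and one accent command.
times = [i * 0.05 for i in range(21)]
contour = f0_contour(times, fb=100.0,
                     phrase_cmds=[(0.0, 0.5)],
                     accent_cmds=[(0.2, 0.5, 0.4)])
```

Separating the slowly decaying phrase component from the local accent component is what allows findings such as "phrase components are less affected by language differences" to be stated per component.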

Report

(3 results)

2013 Annual Research Report Final Research Report ( PDF )
2012 Research-status Report

Research Products
(37 results)


All Journal Article (12 results) (of which Peer Reviewed: 10 results) Presentation (25 results) (of which Invited: 1 results)

[Journal Article] Automatic recognition of gemination in Japanese motivated by perceptual experiments (2014)
- Author(s)
  Greg Short, Keikichi Hirose, and Nobuaki Minematsu
- Journal Title
  
  Acoustical Science and Technology
  
  Volume: Vol.35, No.2 Pages: 73-85
- NAID
  130003390797
- URL
  https://www.jstage.jst.go.jp/browse/ast
- Related Report
  2013 Annual Research Report 2013 Final Research Report
- Peer Reviewed
[Journal Article] Rhythmic patterns in Native and non-native Mandarin Speech (2014)
- Author(s)
  Wentao Gu, and Keikichi Hirose
- Journal Title
  
  Proceedings of International Conference on Speech Prosody
  
  Volume: 1 Pages: 592-596
- Related Report
  2013 Annual Research Report
- Peer Reviewed
[Journal Article] Hierarchical stress generation with Fujisaki model in expressive speech synthesis (2014)
- Author(s)
  Ya Li, Jianhua Tao, Keikichi Hirose, Wei Lai, Xiaoying Xu
- Journal Title
  
  Proceedings of International Conference on Speech Prosody
  
  Volume: 1 Pages: 1032-1036
- Related Report
  2013 Annual Research Report
- Peer Reviewed
[Journal Article] Toward flexible and systematic control of fundamental frequencies in HMM-based speech synthesis (2013)
- Author(s)
  Keikichi Hirose
- Journal Title
  
  Journal of EPSJ (English Phonetics Society of Japan)
  
  Volume: Vol.18 Pages: 121-128
- URL
  http://www.cc.kochi-u.ac.jp/~tamasaki/EPSJ.htm
- Related Report
  2013 Final Research Report
[Journal Article] Japanese lexical accent recognition for a CALL system by deriving classification equations with perceptual experiments (2013)
- Author(s)
  Greg Short, Keikichi Hirose, and Nobuaki Minematsu
- Journal Title
  
  Speech Communication
  
  Volume: Vol.55, Issue 10 Pages: 1064-1080
- URL
  http://www.sciencedirect.com/science/article/pii/S0167639313000927
- Related Report
  2013 Annual Research Report 2013 Final Research Report
- Peer Reviewed
[Journal Article] Generation of fundamental frequency contours for Thai speech using the tone nucleus model (2013)
- Author(s)
  Oraphan Krityakien, Keikichi Hirose, and Nobuaki Minematsu
- Journal Title
  
  Journal of Signal Processing, Research Institute of Signal Processing
  
  Volume: vol.16, no.4 Pages: 135-138
- NAID
  130004849292
- URL
  https://www.jstage.jst.go.jp/browse/jsp
- Related Report
  2013 Final Research Report
- Peer Reviewed
[Journal Article] Toward flexible and systematic control of fundamental frequencies in HMM-based speech synthesis (2013)
- Author(s)
  Keikichi Hirose
- Journal Title
  
  Journal of English Phonetics Society of Japan
  
  Volume: 18 Pages: 121-128
- Related Report
  2013 Annual Research Report
[Journal Article] A method for generation of Mandarin F0 contours based on tone nucleus model and superpositional model (2012)
- Author(s)
  Qinghua Sun, Keikichi Hirose, and Nobuaki Minematsu
- Journal Title
  
  Speech Communication
  
  Volume: Vol.54, Issue 8 Pages: 932-945
- URL
  http://www.sciencedirect.com/science/article/pii/S0167639312000349
- Related Report
  2013 Final Research Report
- Peer Reviewed
[Journal Article] Applying generation process model constraint to fundamental frequency contours generated by hidden-Markov-model-based speech synthesis (2012)
- Author(s)
  Tatsuya Matsuda, Keikichi Hirose, and Nobuaki Minematsu
- Journal Title
  
  Acoustical Science and Technology, Acoustical Society of Japan
  
  Volume: Vol.33, No.4 Pages: 221-228
- NAID
  130001853341
- URL
  https://www.jstage.jst.go.jp/browse/ast
- Related Report
  2013 Final Research Report 2012 Research-status Report
- Peer Reviewed
[Journal Article] Fundamental frequency contour reshaping in HMM-based speech synthesis and realization of prosodic focus using generation process model (2012)
- Author(s)
  Keikichi Hirose, Hiroya Hashimoto, Jun Ikeshima, and Nobuaki Minematsu
- Journal Title
  
  Proceedings of International Conference on Speech Prosody
  
  Volume: 1 Pages: 171-174
- Related Report
  2012 Research-status Report
- Peer Reviewed
[Journal Article] Improved automatic extraction of generation process model commands and its use for generating fundamental frequency contours for training HMM-based speech synthesis (2012)
- Author(s)
  Hiroya Hashimoto, Keikichi Hirose, and Nobuaki Minematsu
- Journal Title
  
  Proceedings INTERSPEECH
  
  Volume: 1
- Related Report
  2012 Research-status Report
- Peer Reviewed
[Journal Article] Use of generation process model for synthesizing fundamental frequency contours in HMM-based speech synthesis (2012)
- Author(s)
  Keikichi Hirose, Hiroya Hashimoto, Jun Ikeshima, and Nobuaki Minematsu
- Journal Title
  
  Proceedings IEEE International Conference on Signal Processing
  
  Volume: 1 Pages: 575-578
- Related Report
  2012 Research-status Report
- Peer Reviewed
[Presentation] Hierarchical stress generation with Fujisaki model in expressive speech synthesis (2014)
- Author(s)
  Ya Li, Jianhua Tao, Keikichi Hirose, Wei Lai, and Xiaoying Xu
- Organizer
  Proceedings of International Conference on Speech Prosody
- Place of Presentation
  Dublin(pp.1032-1036)
- Related Report
  2013 Final Research Report
[Presentation] Selection of training data for HMM-based speech synthesis from prosodic features - Use of generation process model of fundamental frequency contours (2014)
- Author(s)
  Tomoyuki Mizukami, Hiroya Hashimoto, Keikichi Hirose, Daisuke Saito, and Nobuaki Minematsu
- Organizer
  Proceedings of International Conference on Speech Prosody
- Place of Presentation
  Dublin(pp.1042-1046)
- Related Report
  2013 Final Research Report
[Presentation] Rhythmic patterns in Native and non-native Mandarin Speech (2014)
- Author(s)
  Wentao Gu and Keikichi Hirose
- Organizer
  Proceedings of International Conference on Speech Prosody
- Place of Presentation
  Dublin(pp.592-596)
- Related Report
  2013 Final Research Report
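[Presentation] Experimental study of HMM-based speech synthesis considering F0 contour differences in the generation process model (2014)
- Author(s)
  百武恭汰, Hiroya Hashimoto, Daisuke Saito, Nobuaki Minematsu, and Keikichi Hirose
- Organizer
  Proceedings of the Acoustical Society of Japan National Meeting
- Place of Presentation
  Nihon University (1-R5-14, pp.407-408)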
- Related Report
  2013 Final Research Report
[Presentation] Rhythmic Patterns of Nonnative Mandarin Speech (2014)
- Author(s)
  Wentao Gu and Keikichi Hirose
- Organizer
  Proceedings of the Acoustical Society of Japan National Meeting
- Place of Presentation
  Nihon University (3-6-16, pp.357-360)
- Related Report
  2013 Final Research Report
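[Presentation] A study of voice conversion based on matrix-variate Gaussian mixture models (2014)
- Author(s)
  土井秀信, Daisuke Saito, Nobuaki Minematsu, and Keikichi Hirose
- Organizer
  Proceedings of the Acoustical Society of Japan National Meeting
- Place of Presentation
  Nihon University (1-R5-21, pp.425-428)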
- Related Report
  2013 Final Research Report
[Presentation] Rhythmic patterns in native and non-native Mandarin speech (2014)
- Author(s)
  Wentao Gu
- Organizer
  International Conference on Speech Prosody
- Place of Presentation
  Dublin, Ireland
- Related Report
  2013 Annual Research Report
[Presentation] Hierarchical stress generation with Fujisaki model in expressive speech synthesis (2014)
- Author(s)
  Ya Li
- Organizer
  International Conference on Speech Prosody
- Place of Presentation
  Dublin, Ireland
- Related Report
  2013 Annual Research Report
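[Presentation] Experimental study of HMM-based speech synthesis considering F0 contour differences in the generation process model (2014)
- Author(s)
  百武恭汰
- Organizer
  Acoustical Society of Japan National Meeting
- Place of Presentation
  Nihon University, Tokyo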
- Related Report
  2013 Annual Research Report
[Presentation] Control of fundamental frequencies in HMM-based speech synthesis using generation process model (2014)
- Author(s)
  Keikichi Hirose
- Organizer
  International Symposium on Frontiers of Research on Speech and Music
- Place of Presentation
  Mysore, India
- Related Report
  2013 Annual Research Report
- Invited
[Presentation] Rhythmic patterns of nonnative Mandarin speech (2014)
- Author(s)
  Wentao Gu
- Organizer
  Acoustical Society of Japan National Meeting
- Place of Presentation
  Nihon University, Tokyo
- Related Report
  2013 Annual Research Report
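[Presentation] Automatic parameter extraction for the generation process model of fundamental frequency contours using the bunsetsu as the basic unit (2013)
- Author(s)
  Hiroya Hashimoto, Keikichi Hirose, and Nobuaki Minematsu
- Organizer
  Proceedings of the Acoustical Society of Japan National Meeting
- Place of Presentation
  Toyohashi University of Technology (1-P-5a, pp.327-328)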
- Related Report
  2013 Final Research Report
[Presentation] Context labels based on "bunsetsu" for HMM-based speech synthesis of Japanese (2013)
- Author(s)
  Hiroya Hashimoto, Keikichi Hirose, and Nobuaki Minematsu
- Organizer
  Proceedings 8th ISCA Workshop on Speech Synthesis (SSW-8)
- Place of Presentation
  Barcelona
- Related Report
  2013 Final Research Report
[Presentation] Generation of fundamental frequency contours for Thai speech synthesis using tone nucleus model (2013)
- Author(s)
  Oraphan Krityakien, Nobuaki Minematsu, and Keikichi Hirose
- Organizer
  Proceedings INTERSPEECH 2013
- Place of Presentation
  Lyon
- Related Report
  2013 Final Research Report
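[Presentation] Improved focus control based on command differences in the generation process model of fundamental frequency contours (2013)
- Author(s)
  川口拓也, Hiroya Hashimoto, Keikichi Hirose, and Nobuaki Minematsu
- Organizer
  Proceedings of the Acoustical Society of Japan National Meeting
- Place of Presentation
  Tokyo University of Technology (3-P-34b, pp.505-506)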
- Related Report
  2013 Final Research Report
[Presentation] Use of generation process model for synthesizing fundamental frequency contours in HMM-based speech synthesis (2012)
- Author(s)
  Keikichi Hirose, Hiroya Hashimoto, Jun Ikeshima, and Nobuaki Minematsu
- Organizer
  Proceedings IEEE International Conference on Signal Processing (ICSP'12)
- Place of Presentation
  Beijing(pp.575-578)
- Related Report
  2013 Final Research Report
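[Presentation] A basic study of the relationship between accent nucleus perception and vowel duration (2012)
- Author(s)
  Greg Short, Keikichi Hirose, and Nobuaki Minematsu
- Organizer
  Proceedings of the Acoustical Society of Japan National Meeting
- Place of Presentation
  Shinshu University (3-Q-31, pp.429-430)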
- Related Report
  2013 Final Research Report
[Presentation] Tone nucleus model for Thai language speech synthesis (2012)
- Author(s)
  Krityakien Oraphan, Hirose Keikichi, and Nobuaki Minematsu
- Organizer
  Proceedings of the Acoustical Society of Japan National Meeting
- Place of Presentation
  Shinshu University (3-Q14, pp.391-392)
- Related Report
  2013 Final Research Report
[Presentation] An alignment matching method to explore pseudosyllable properties across different corpora (2012)
- Author(s)
  Raymond W. M. Ng, Thomas Hain, and Keikichi Hirose
- Organizer
  Proceedings INTERSPEECH 2012
- Place of Presentation
  Portland(CD-ROM Proceedings)
- Related Report
  2013 Final Research Report
[Presentation] Effects of speaker adaptive training on tensor-based arbitrary speaker conversion (2012)
- Author(s)
  Daisuke Saito, Nobuaki Minematsu, and Keikichi Hirose
- Organizer
  Proceedings INTERSPEECH 2012
- Place of Presentation
  Portland(CD-ROM Proceedings)
- Related Report
  2013 Final Research Report
[Presentation] Improved automatic extraction of generation process model commands and its use for generating fundamental frequency contours for training HMM-based speech synthesis (2012)
- Author(s)
  Hiroya Hashimoto, Keikichi Hirose, and Nobuaki Minematsu
- Organizer
  Proceedings INTERSPEECH 2012
- Place of Presentation
  Portland(CD-ROM Proceedings)
- Related Report
  2013 Final Research Report
[Presentation] Fundamental frequency contour reshaping in HMM-based speech synthesis and realization of prosodic focus using generation process model (2012)
- Author(s)
  Keikichi Hirose, Hiroya Hashimoto, Jun Ikeshima, and Nobuaki Minematsu
- Organizer
  Proceedings of International Conference on Speech Prosody
- Place of Presentation
  Shanghai(SS1-3, pp.171-174)
- Related Report
  2013 Final Research Report
[Presentation] Automatic segmentation of English words using phonotactic and syllable information (2012)
- Author(s)
  Raymond W. M. Ng and Keikichi Hirose
- Organizer
  Proceedings of International Conference on Speech Prosody
- Place of Presentation
  Shanghai(PS1A-6, pp.27-30)
- Related Report
  2013 Final Research Report
[Presentation] Effects of learners' language transfer on native listeners' evaluation of the prosodic naturalness of Japanese words (2012)
- Author(s)
  Shuhei Kato, Greg Short, Nobuaki Minematsu, and Keikichi Hirose
- Organizer
  Proceedings of International Conference on Speech Prosody
- Place of Presentation
  Shanghai(pp.198-201)
- Related Report
  2013 Final Research Report
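[Presentation] A basic study of the relationship between accent nucleus perception and vowel duration (2012)
- Author(s)
  Greg Short
- Organizer
  Acoustical Society of Japan National Meeting
- Place of Presentation
  Shinshu University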
- Related Report
  2012 Research-status Report
