2013 Fiscal Year Annual Research Report

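Pronunciation Map and Articulatory Motion Animation Display Based on Extraction of Articulatory Movements from Speech, and Its Application to Pronunciation Correction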

Research Project

Project/Area Number	25280128
Research Category	Grant-in-Aid for Scientific Research (B)
Research Institution	Waseda University
Principal Investigator	Tsuneo Nitta (新田恒雄), Waseda University, Green Computing Systems Research Organization, Professor (70314101)
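Co-Investigator(Kenkyū-buntansha)	Goh Kawai (河合剛), Hokkaido University, other graduate school, Associate Professor (70312981); Yurie Iribe (入部百合絵), Aichi Prefectural University, Faculty of Information Science, Assistant Professor (40397500); Ryoko Hayashi (林良子), Kobe University, other graduate school, Associate Professor (20347785)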
Project Period (FY)	2013-04-01 – 2016-03-31
Keywords	pronunciation learning / articulatory feature extraction / pronunciation map / articulatory gesture
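Research Abstract	In FY2013 (H25), we concentrated on improving the baseline performance of the pronunciation learning system and examined the system design and evaluation methods.
(1) Speech-to-articulatory-feature conversion engine: We studied methods for accurately extracting 28 articulatory features (comprising manners of articulation such as plosive, fricative, … and places of articulation such as lips, alveolar ridge, …). In particular, we simulated a "speaker canonicalization" method that counters the performance degradation caused by speaker differences (children versus adults, male versus female). Compared with the conventional VTLN (vocal tract length normalization)/HMM approach (+2% over a standard HMM), it achieved higher speech recognition performance (+4% over the same baseline). This established a speaker-independent front-end for articulatory feature extraction. The articulatory feature extractor consists of a two-stage MLP (Multi-Layer Perceptron) and Gram-Schmidt orthogonalization; to improve performance further, we also examined methods that include tensor analysis.
(2) Pronunciation map subsystem and articulatory animation generation subsystem: We developed the articulation-to-coordinate converter, the core of both subsystems, and integrated it into each. The converter yields the coordinates to plot on the pronunciation map and the coordinates of the facial contour lines used for animation, so both can now be drawn dynamically in each subsystem's user interface. To collect video material of articulatory movements, we carried out MRI recordings of about 40 words on two occasions. This filled gaps in the existing video material and also provided data from new subjects.
(3) Functional specification of the pronunciation learning system: We surveyed the functions and performance of commercial English pronunciation learning software and reviewed methods for teaching English pronunciation as a second language. For intonation correction within pronunciation learning, we worked on online self-study materials for 2,600 first-year university students and on improving those materials.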
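The Gram-Schmidt orthogonalization named in item (1) can be illustrated with a minimal sketch. The report does not say how the step is applied to the MLP outputs, so the input vectors below are hypothetical stand-ins, and the function name `gram_schmidt` and the tolerance `eps` are assumptions made for illustration only.

```python
import math

def gram_schmidt(vectors, eps=1e-12):
    """Return an orthonormal basis spanning the same space as `vectors`."""
    basis = []
    for v in vectors:
        w = list(v)
        # Remove the component along each basis vector found so far.
        # Projecting the running residual (modified Gram-Schmidt) is
        # numerically more stable than projecting the original vector.
        for b in basis:
            coeff = sum(wi * bi for wi, bi in zip(w, b))
            w = [wi - coeff * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        if norm > eps:  # skip vectors that depend linearly on earlier ones
            basis.append([wi / norm for wi in w])
    return basis

# Hypothetical 3-dimensional stand-ins for MLP output vectors;
# the third is the sum of the first two, so it adds no new direction.
outputs = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [2.0, 1.0, 1.0]]
basis = gram_schmidt(outputs)
```

Each returned vector has unit length and is orthogonal to the others, which is the decorrelating property such a step is usually meant to provide.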
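Current Status of Research Progress	2: Research has progressed on the whole more than it was originally planned.
Reason
- The articulatory feature extraction algorithm was improved, and a path to higher accuracy is now in sight.
- The accuracy of both the pronunciation map subsystem and the articulatory animation generation subsystem was improved, and their output can now be drawn dynamically in each subsystem's user interface.
- MRI recording proceeded smoothly, enriching the video database of pronunciation.
- The basic specifications of the pronunciation learning system were surveyed, and design and evaluation methods were examined.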
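Strategy for Future Research Activity	Building on the first year's results, we will implement an evaluation system, coordinate with evaluations in educational settings, and continue improving the system.
(1) Speech-to-articulatory-feature conversion engine: With the goal of raising the accuracy of converting speech into articulatory features (to 85% or higher), we will simulate and implement algorithms (speaker canonicalization, eigenvector dual transformation, DNN, etc.).
(2) Pronunciation map subsystem: Aiming at plotting accuracy practical enough for use in educational settings, we will improve the coordinate converter that maps articulatory features onto the pronunciation map. We will also gather feedback from field evaluations and design an intuitive, easy-to-understand user interface (UI).
(3) Articulatory animation generation subsystem: We will collect phonemically balanced MRI images and improve the accuracy of the articulatory animation (correlation of 0.8 or higher with the original MRI images). We will also improve the UI, for example by highlighting the locations that matter for articulation, and study application to other languages.
(4) Pronunciation learning system and verification of educational effect: We will integrate the speech-to-articulatory-feature conversion engine, the pronunciation map subsystem, and the articulatory animation generation subsystem, and develop a pronunciation learning system that incorporates learning materials and a learning-history function.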
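The correlation target in item (3) can be checked with a small sketch. The report does not state which correlation measure is used, so this assumes a Pearson correlation over flattened pixel intensities; the function name `pearson` and the sample frames are hypothetical.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    if n == 0 or n != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_a * var_b)

# Hypothetical flattened pixel intensities for one MRI frame and the
# corresponding generated animation frame (not data from the project).
mri_frame = [0.0, 0.2, 0.9, 1.0, 0.1, 0.0]
anim_frame = [0.1, 0.3, 0.8, 0.9, 0.2, 0.1]

r = pearson(mri_frame, anim_frame)
meets_target = r >= 0.8  # the 0.8 threshold is the target stated in item (3)
```

In practice the comparison would presumably be averaged over many frames and subjects; a single-frame check like this only shows the arithmetic.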
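Expenditure Plans for the Next FY Research Funding	A small amount (57,095 yen) was carried over to the next fiscal year to cover expenses payable at the start of FY2014 (H26). At the start of FY2014, about 50,000 yen will be spent on PC peripherals and other items.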

Research Products
(18 results)

All 2014 2013

All Journal Article (4 results) (of which Peer Reviewed: 4 results) Presentation (13 results) (of which Invited: 1 results) Patent(Industrial Property Rights) (1 results)

[Journal Article] Mapping Articulatory-Feature to Vocal-Tract Parameters for Voice Conversion (2014)
- Author(s)
  Narpendyah Wisjnu ARIWARDHANI, Masashi KIMURA, Yurie IRIBE, Kouichi KATSURADA, and Tsuneo NITTA
- Journal Title
  
  IEICE Trans. Inf. & Syst.
  
  Volume: Vol.E97-D, No.4 Pages: 911-918
- DOI
  10.1587/transinf.E97.D911
- Peer Reviewed
[Journal Article] Solving the Phoneme Conflict in Grapheme-to-Phoneme Conversion Using a Two-Stage Neural Network-Based Approach (2014)
- Author(s)
  Seng KHEANG, Kouichi KATSURADA, Yurie IRIBE, and Tsuneo NITTA
- Journal Title
  
  IEICE Trans. Inf. & Syst.
  
  Volume: Vol.E97-D, No.4 Pages: 901-910
- DOI
  10.1587/transinf.E97.D901
- Peer Reviewed
[Journal Article] HMM speech synthesis of articulatory-feature movements using articulatory-feature-to-vocal-tract acoustic parameter conversion (2013)
- Author(s)
  木村優志，入部百合絵，桂田浩一，新田恒雄
- Journal Title
  
  IEICE Transactions on Information and Systems (Japanese Edition)
  
  Volume: Vol. J96-D No.5 Pages: pp.1356-1364
- Peer Reviewed
[Journal Article] Performance evaluation of a fast spoken term detection system using a suffix array (2013)
- Author(s)
  桂田浩一，勝浦広大，入部百合絵，新田恒雄
- Journal Title
  
  IEICE Transactions on Information and Systems (Japanese Edition)
  
  Volume: Vol. J96-D No.10 Pages: pp.2540-2548
- Peer Reviewed
[Presentation] Online learning of introductory technical writing by using captions of figures and tables for English as a foreign language (2014)
- Author(s)
  Akio Ohnishi and Goh Kawai
- Organizer
  Proc. of the American Association for Applied Linguistics Conference (AAAL 2014)
- Place of Presentation
  Portland, Oregon, USA
- Year and Date
  20140322-20140325
[Presentation] Improving the classroom discourse of non-native teachers of English language by developing, annotating, and analyzing spoken language corpora (2014)
- Author(s)
  Noriaki Katagiri and Goh Kawai
- Organizer
  Proc. of the American Association for Applied Linguistics Conference (AAAL 2014)
- Place of Presentation
  Portland, Oregon, USA
- Year and Date
  20140322-20140325
[Presentation] Speaker canonicalization based on conversion to standard-speaker vowel spectra (2014)
- Author(s)
  久保田雄一, 大町基, 小川哲司, 小林哲則, 新田恒雄
- Organizer
  Acoustical Society of Japan, 2014 Spring Meeting, March 12
- Place of Presentation
  Nihon University (College of Science and Technology)
- Year and Date
  20140310-20140312
[Presentation] Flip the classroom: Gimmick or revolution? (2013)
- Author(s)
  Don Hinkelman and Goh Kawai
- Organizer
  Sapporo Gakuin University CALL-Plus Workshop 2013
- Place of Presentation
  Sapporo, Japan
- Year and Date
  20131116-20131116
- Invited
[Presentation] Online learning of introductory technical writing using figures and tables (2013)
- Author(s)
  Goh Kawai and Akio Ohnishi
- Organizer
  Sapporo Gakuin University CALL-Plus Workshop 2013
- Place of Presentation
  Sapporo, Japan
- Year and Date
  20131116-20131116
[Presentation] How to convert a printed textbook and DVD to CALL (2013)
- Author(s)
  Goh Kawai and Akio Ohnishi
- Organizer
  Proc. of the Japan Association of Language Teachers Conference (JALT 2013)
- Place of Presentation
  Kobe, Japan
- Year and Date
  20131025-20131028
[Presentation] Characteristics of vowel production by Japanese learners of English: an examination through auditory rating experiments (2013)
- Author(s)
  張亜明，林良子，Donna Erickson，吐師道子，阿栄娜
- Organizer
  Acoustical Society of Japan, 2013 Autumn Meeting
- Place of Presentation
  Toyohashi University of Technology
- Year and Date
  20130925-20130927
[Presentation] Voice Conversion For Arbitrary Speakers Using Articulatory-Movement To Vocal-Tract Parameter Mapping (2013)
- Author(s)
  Narpendyah W. Ariwardhani, Yurie Iribe, Kouichi Katsurada, Tsuneo Nitta
- Organizer
  Proc. of MLSP 2013 (IEEE International Workshop on Machine Learning for Signal Processing)
- Place of Presentation
  Southampton, United Kingdom
- Year and Date
  20130922-20130925
[Presentation] Naturalness on Japanese pronunciation before and after shadowing training and prosody modified stimuli (2013)
- Author(s)
  Rongna A, Ryoko Hayashi, Tatsuya Kitamura
- Organizer
  Interspeech 2013 Satellite workshop on Speech and Language Technology in Education (SLaTE)
- Place of Presentation
  University of Grenoble (France)
- Year and Date
  20130830-20130901
[Presentation] Acceleration of Spoken Term Detection Using a Suffix Array by Assigning Optimal Threshold Values to Sub-Keywords (2013)
- Author(s)
  Kouichi Katsurada, Seiichi Miura, Kheang Seng, Yurie Iribe, Tsuneo Nitta
- Organizer
  Proc. of Interspeech 2013
- Place of Presentation
  Lyon, France
- Year and Date
  20130825-20130829
[Presentation] How to bring a printed textbook and DVD online (2013)
- Author(s)
  Goh Kawai and Akio Ohnishi
- Organizer
  Proc. of the Language Education and Technology Conference (LET 2013)
- Place of Presentation
  Tokyo, Japan
- Year and Date
  20130807-20130809
[Presentation] Visualizing pitch contours may improve production and reception of intonation (2013)
- Author(s)
  Akio Ohnishi and Goh Kawai
- Organizer
  Proc. of WorldCALL Conference (WorldCALL 2013)
- Place of Presentation
  Glasgow, UK
- Year and Date
  20130710-20130713
[Presentation] Introducing Articulatory Anchor-point to ANN Training for Corrective Learning of Pronunciation (2013)
- Author(s)
  Yurie Iribe, Silasak Manosavanh, Kouichi Katsurada, Ryoko Hayashi and Chunyue Zhu, Tsuneo Nitta
- Organizer
  Proc. of ICASSP 2013 (IEEE International Conference on Acoustics, Speech, and Signal Processing)
- Place of Presentation
  Vancouver, BC, Canada
- Year and Date
  20130526-20130531
[Patent(Industrial Property Rights)] Speech recognition method and speech recognition program (2014)
- Inventor(s)
  新田恒雄
- Industrial Property Rights Holder
  新田恒雄
- Industrial Property Rights Type
  Patent
- Industrial Property Number
  Japanese Patent Application No. 2014-39639
- Filing Date
  2014-02-28
