2011 Fiscal Year Annual Research Report

Study on Measurement and Modeling of Speech Production from Higher-Order Functions to Peripheral Functions

Research Project

Project/Area Number	22500150
Research Institution	Japan Advanced Institute of Science and Technology
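Principal Investigator	Jianwu Dang (党建武), Japan Advanced Institute of Science and Technology, School of Information Science, Professor (80334796)
Co-Investigator(Kenkyū-buntansha)	Isao Tokuda (徳田功), Ritsumeikan University, College of Science and Engineering, Associate Professor (00261389); Atsuo Suemitsu (末光厚夫), Japan Advanced Institute of Science and Technology, School of Information Science, Assistant Professor (20422199)
Keywords	speech production / auditory-guided speech production function / learning process / articulatory measurement / altered auditory feedback / learning by repetition / language acquisition / Quantal Theory
Research Abstract	Human speech production involves processes ranging from higher-order brain functions to physiological and physical movements, including speech planning, motor control of the articulators, and auditory screening of one's own speech. Many questions related to human speech production remain unresolved, such as the acquisition of speech sounds and speech disorders, but technical and ethical constraints make them difficult to address experimentally. This project therefore builds a physiological neural computational model that extends a physiological model of the speech mechanism with higher-order brain mechanisms such as speech planning, speech motor control, and visual and auditory feedback, and it investigates the human speech production process and speech disorders by comparing model simulations with measured data. In fiscal year 2011 (Heisei 23), using the physiological model of the speech mechanism developed by our group and guided by observed data, we simulated how infants learn pronunciation, examined the mapping from the muscle activation space (motor commands) to the acoustic space, and studied control methods for the model. We also used the model to examine compensatory function when some articulatory muscles are partially impaired. Much remains unknown about the relationship between speech production and perception in second language acquisition. Focusing on vowel learning through repetition, we analyzed subjects' repeated utterances and introspective reports over a long period to examine which acoustic features are used in learning and how perceptual categories change during the learning process. The introspective reports from the repetition learning experiment showed that in some cases the perceptual category for the target vowel changed during learning and in others it did not. This suggests that the human perceptual and articulatory spaces contain both regions that are acoustically and articulatorily stable and regions that are unstable, and that learning may proceed by searching for the stable regions. These results reveal that the phenomenon described by Quantal Theory is present in the formation of articulatory targets.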
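Current Status of Research Progress	2: Research has progressed on the whole more than it was originally planned. Reason: The research plan is fundamentally sound, and the experiments are proceeding largely on schedule. We obtained several results on control methods for the physiological speech production model and on the learning process through repetition, and reported them in 3 journal articles and 9 conference presentations. In fiscal year 2012 (Heisei 24), we plan to focus on building the physiological neural computational model of speech production and on simulations with that model.
Strategy for Future Research Activity	Building on the results so far, we will construct a physiological neural computational model of speech production. We will set up a variety of conditions and vary the model parameters to examine the behavior of individual components and the function of the model as a whole. The model simulations will cover the speech production process, the speech sound acquisition process, and speech disorders. We will then compare the simulation results with physiological data to examine the mechanism of human speech production. This will clarify the functional roles, from higher-order brain functions to the peripheral organs, within the speech chain.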

Research Products
(12 results)

All Journal Article (3 results) (of which Peer Reviewed: 3 results) Presentation (9 results)

[Journal Article] Voice Activity Detection Based On An Unsupervised Learning Framework (2011)
- Author(s)
  D.Ying, Y.Yan, J.Dang, F.Soong
- Journal Title
  
  IEEE Trans. Audio, Speech and Language Processing
  
  Volume: 19 Pages: 2624-2633
- DOI
  10.1109/TASL.2011.2125953
- Peer Reviewed
[Journal Article] Investigation of Auditory-Guided Speech Production while Learning Unfamiliar Speech Sounds (2011)
- Author(s)
  Kazuya Fujii, Qiang Fang, Jianwu Dang
- Journal Title
  
  Journal of Signal Processing
  
  Volume: 15 Pages: 287-290
- DOI
  http://hdl.handle.net/10119/10276
- Peer Reviewed
[Journal Article] Study of Control Strategy Mimicking Speech Motor Learning for a Physiological Articulatory Model (2011)
- Author(s)
  Xiyu Wu, Jianguo Wei, Jianwu Dang
- Journal Title
  
  Journal of Signal Processing
  
  Volume: 15 Pages: 295-298
- DOI
  http://hdl.handle.net/10119/10277
- Peer Reviewed
[Presentation] Noise Estimation Using a Constrained Sequential HMM In Log-Spectral Domain (2012)
- Author(s)
  D. Ying, X. Lu, J. Li, Y. Yan, J. Dang, F. Soong
- Organizer
  ICASSP 2012
- Place of Presentation
  Kyoto, Japan
- Year and Date
  2012-03-27
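[Presentation] Construction of a Personalized Speech Mechanism Model Based on Morphological Information (2012)
- Author(s)
  Nana Nishimura, Shin-ichi Kawamoto, Jianwu Dang
- Organizer
  Proceedings of the 2012 Spring Meeting of the Acoustical Society of Japan
- Place of Presentation
  Kanagawa University (Kanagawa)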
- Year and Date
  2012-03-13
[Presentation] Morphological personalization according to human mechanism using MR images (2012)
- Author(s)
  Nana Nishimura, Shin-ichi Kawamoto, Jianwu Dang
- Organizer
  Proc. 2012 RISP International Workshop on Nonlinear Circuits, Communications and Signal Processing (NCSP'12)
- Place of Presentation
  Hawaii, USA
- Year and Date
  2012-03-05
[Presentation] Investigation of perceptual effects during learning process via vowel imitation (2012)
- Author(s)
  Kazuya Fujii, Atsuo Suemitsu, Jianwu Dang
- Organizer
  Proc. 2012 RISP International Workshop on Nonlinear Circuits, Communications and Signal Processing (NCSP'12)
- Place of Presentation
  Hawaii, USA
- Year and Date
  2012-03-05
[Presentation] Noise Power Estimation Based on a Sequential Hidden Markov Model (2011)
- Author(s)
  Dongwen YING, Yonghong YAN, Jianwu DANG, Frank K. SOONG
- Organizer
  the 8th International Conference on Information, Communications and Signal Processing (ICICS 2011)
- Place of Presentation
  Singapore
- Year and Date
  2011-12-15
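[Presentation] A Study of Speech Perception in the Vowel Learning Process through Repetition (2011)
- Author(s)
  Kazuya Fujii, Atsuo Suemitsu, Jianwu Dang
- Organizer
  Technical Report of the Auditory Research Meeting (聴覚研究会資料), vol. 41, no. 9, pp. 689-694
- Place of Presentation
  Prefectural University of Kumamoto (Kumamoto)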
- Year and Date
  2011-12-11
[Presentation] Comparative Electromagnetic Articulographic (EMA) study of English rhythm as produced by native and non-native speakers (2011)
- Author(s)
  Erickson, D., Suemitsu, A., Shibuya, Y., Lee, S., Tanaka, Y.
- Organizer
  Proc. 162nd Meeting of the Acoustical Society of America (ASA), CA.130.4.2.2, pp. 2567
- Place of Presentation
  San Diego, USA
- Year and Date
  2011-11-04
[Presentation] Emotional intonation in a tone language: experimental evidence from Chinese (2011)
- Author(s)
  Li, A., Fang, Q., Dang, J.
- Organizer
  ICPhS'2011
- Place of Presentation
  Hong Kong, China
- Year and Date
  2011-08-18
[Presentation] Rhythm and emphasis in American English: Comparison of native and non-native speakers' productions (2011)
- Author(s)
  Erickson, D., Shibuya, Y., Suemitsu, A.
- Organizer
  Proc. 2011 International Seminar on Speech Production (ISSP'11), pp. 345-352
- Place of Presentation
  Montreal, Canada
- Year and Date
  2011-06-22
