2021 Fiscal Year Research-status Report

Study on the Neural Mechanisms of Speech Production and Perception Using EEG and Eye Movements

Research Project

Project/Area Number	20K11883
Research Institution	Japan Advanced Institute of Science and Technology
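Principal Investigator	党 建武 (Dang Jianwu), Japan Advanced Institute of Science and Technology, Graduate School of Advanced Science and Technology, Professor (80334796)
Co-Investigator(Kenkyū-buntansha)	赤木 正人 (Akagi Masato), Japan Advanced Institute of Science and Technology, Graduate School of Advanced Science and Technology, Professor (20242571)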
Project Period (FY)	2020-04-01 – 2023-03-31
Keywords	speech production / speech comprehension / brain network / dynamic characteristics of brain activity / neurological model of speech production
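Outline of Annual Research Achievements	To examine the neural mechanisms underlying continuous speech comprehension, participants listened to a recorded story and to its time-reversed version while EEG signals were recorded. After reducing EEG noise with functional hyperalignment, we reconstructed cortical sources from the EEG and estimated temporal response functions (TRFs) from the speech input to each source region. We built brain networks from the correlations among the TRFs and examined their properties using community detection. The brain regions active during comprehension of the normal story were consistent with those previously reported in fMRI studies. In contrast, for the time-reversed speech of the same story (meaningless sentences), activity in the regions associated with lexical cognition disappeared. To clarify the neural mechanisms of speech production, we recorded EEG and eye movements while participants read continuous sentences aloud, reconstructed sources from the EEG, and examined the brain's speech-processing mechanisms within a framework that unifies the spatial, temporal, and frequency aspects. We found that speech comprehension involves frequent interaction between the amodal semantic center in the anterior temporal lobe (ATL) and the modality-specific sensorimotor systems along the ventral pathway, as well as active engagement of the motor system along the dorsal pathway. Likewise, we confirmed that the auditory system is actively involved in speech production for prediction and auditory feedback correction. Hierarchical linguistic structure is handled by neural oscillations in different frequency bands, and interaction among linguistic units of different scales is realized through cross-frequency coupling. Based on these results, we built a prototype neurological functional model of speech production and comprehension. In doing so, we extended existing models from the lexical level to the sentence level and incorporated bottom-up and top-down functions into a dynamically organized system. The model can account for the neural oscillation phenomena of communication within brain networks and is expected to be applicable to deeper investigation of the neural mechanisms underlying speech functions.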
Current Status of Research Progress	2: Research has progressed on the whole more than it was originally planned. Reason: Owing to the COVID-19 pandemic, some of the EEG experiments on speech production and speech perception (repetition) could not be carried out as planned.
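Strategy for Future Research Activity	We will continue working to carry out the EEG experiments on repetition of continuous sentences. In parallel, we will reanalyze the portions of the previously recorded continuous-sentence reading data that relate to auditory feedback, examine the interaction between speech production and speech perception, and complete the neurological model of speech production that we have constructed.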
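
The TRF-based network analysis mentioned in the outline can be illustrated with a minimal sketch. The report does not state which estimator was used, so ridge regression on a time-lagged stimulus matrix is assumed here as one common choice. The function names (`lagged_design`, `estimate_trf`, `trf_correlation_network`), the lag count, the ridge value, and the synthetic data are all hypothetical and serve only to show the idea. Community detection would then be run on the resulting adjacency matrix; it is omitted here.

```python
import numpy as np

def lagged_design(stimulus, n_lags):
    """Stack delayed copies of a 1-D stimulus; column k is the stimulus delayed by k samples."""
    n = len(stimulus)
    X = np.zeros((n, n_lags))
    for k in range(n_lags):
        X[k:, k] = stimulus[:n - k]
    return X

def estimate_trf(stimulus, responses, n_lags, ridge=1.0):
    """Ridge-regression TRF (an assumed estimator): one column of lag weights per source."""
    X = lagged_design(stimulus, n_lags)
    gram = X.T @ X + ridge * np.eye(n_lags)
    return np.linalg.solve(gram, X.T @ responses)  # shape (n_lags, n_sources)

def trf_correlation_network(trfs):
    """Correlate TRFs across sources to obtain a weighted adjacency matrix."""
    adjacency = np.corrcoef(trfs.T)
    np.fill_diagonal(adjacency, 0.0)  # no self-connections
    return adjacency

# Synthetic demonstration: sources 0 and 1 share a kernel, source 2 uses its reverse.
rng = np.random.default_rng(0)
envelope = rng.standard_normal(5000)          # stand-in for a speech envelope
true_kernel = np.array([0.0, 1.0, 0.5, -0.3, 0.0])
X_demo = lagged_design(envelope, 5)
sources = np.column_stack([
    X_demo @ true_kernel,
    X_demo @ true_kernel,
    X_demo @ true_kernel[::-1],
]) + 0.1 * rng.standard_normal((5000, 3))     # additive measurement noise

trfs = estimate_trf(envelope, sources, n_lags=5, ridge=1.0)
adjacency = trf_correlation_network(trfs)
```

In this toy setup, sources that respond to the stimulus with similar kernels end up strongly connected, which is the property a community detection step would then exploit.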

Research Products
(11 results)

Int'l Joint Research (2 results) / Journal Article (4 results) (of which Int'l Joint Research: 4 results, Peer Reviewed: 4 results, Open Access: 4 results) / Presentation (5 results) (of which Int'l Joint Research: 5 results)

[Int'l Joint Research] Tianjin University / Tianjin University of Technology (China)
- Country Name
  CHINA
- Counterpart Institution
  Tianjin University / Tianjin University of Technology
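[Int'l Joint Research] National University of Singapore / Nanyang Technological University (Singapore)
- Country Name
  SINGAPORE
- Counterpart Institution
  National University of Singapore / Nanyang Technological University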
[Journal Article] Learning affective representations based on magnitude and dynamic relative phase information for speech emotion recognition (2022)
- Author(s)
  Guo Lili, Wang Longbiao, Dang Jianwu, Chng Eng Siong, Nakagawa Seiichi
- Journal Title
  Speech Communication
  Volume: 136 Pages: 118-127
- DOI
  10.1016/j.specom.2021.11.005
- Peer Reviewed / Open Access / Int'l Joint Research
[Journal Article] Constructing Accurate and Efficient Deep Spiking Neural Networks With Double-Threshold and Augmented Schemes (2022)
- Author(s)
  Yu Qiang, Ma Chenxiang, Song Shiming, Zhang Gaoyan, Dang Jianwu, Tan Kay Chen
- Journal Title
  IEEE Transactions on Neural Networks and Learning Systems
  Volume: 33 Pages: 1-13
- DOI
  10.1109/TNNLS.2020.3043415
- Peer Reviewed / Open Access / Int'l Joint Research
[Journal Article] Improving low-resource Tibetan end-to-end ASR by multilingual and multilevel unit modeling (2022)
- Author(s)
  Qin Siqing, Wang Longbiao, Li Sheng, Dang Jianwu, Pan Lixin
- Journal Title
  EURASIP Journal on Audio, Speech, and Music Processing
  Volume: 2022 Pages: 1-10
- DOI
  10.1186/s13636-021-00233-4
- Peer Reviewed / Open Access / Int'l Joint Research
[Journal Article] Weighted RSA: An Improved Framework on the Perception of Audio-visual Affective Speech in Left Insula and Superior Temporal Gyrus (2021)
- Author(s)
  J. Xu, H. Dong, N. Li, Z. Wang, F. Guo, J. Wei and J. Dang
- Journal Title
  Neuroscience
  Volume: 469 Pages: 46-58
- DOI
  10.1016/j.neuroscience.2021.06.002
- Peer Reviewed / Open Access / Int'l Joint Research
[Presentation] Multi-Modal Emotion Recognition Based On Deep Learning Of EEG And Audio Signals (2021)
- Author(s)
  Zhongjie Li, Gaoyan Zhang, Jianwu Dang, Longbiao Wang, Jianguo Wei
- Organizer
  2021 International Joint Conference on Neural Networks (IJCNN)
- Int'l Joint Research
[Presentation] CONSK-GCN: Conversational Semantic- and Knowledge-Oriented Graph Convolutional Network for Multimodal Emotion Recognition (2021)
- Author(s)
  Y Fu, S Okada, L Wang, L Guo, Y Song, J Liu, J Dang
- Organizer
  2021 IEEE International Conference on Multimedia and Expo (ICME)
- Int'l Joint Research
[Presentation] Multimodal Emotion Recognition with Capsule Graph Convolutional Based Representation Fusion (2021)
- Author(s)
  J Liu, S Chen, L Wang, Z Liu, Y Fu, L Guo, J Dang
- Organizer
  ICASSP 2021
- Int'l Joint Research
[Presentation] Representation Learning with Spectro-Temporal-Channel Attention for Speech Emotion Recognition (2021)
- Author(s)
  L Guo, L Wang, C Xu, J Dang, ES Chng, H Li
- Organizer
  ICASSP 2021
- Int'l Joint Research
[Presentation] Robust Voice Activity Detection Using a Masked Auditory Encoder Based Convolutional Neural Network (2021)
- Author(s)
  N Li, L Wang, M Unoki, S Li, R Wang, M Ge, J Dang
- Organizer
  ICASSP 2021
- Int'l Joint Research
