2017 Fiscal Year Annual Research Report

Understanding the Mechanism of Shitsukan Recognition in Auditory Perception Based on the Concept of Amplitude Modulation

Publicly Offered Research

Project Area	Understanding human recognition of material properties for innovation in SHITSUKAN science and technology
Project/Area Number	16H01669
Research Institution	Japan Advanced Institute of Science and Technology
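Principal Investigator	Masashi Unoki (鵜木祐史), Japan Advanced Institute of Science and Technology, Graduate School of Advanced Science and Technology, Professor (00343187)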
Project Period (FY)	2016-04-01 – 2018-03-31
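Keywords	shitsukan recognition / modulation spectrum / modulation transfer function / modulation filterbank / amplitude modulation
Outline of Annual Research Achievements	This research project aims to clarify, one by one and in light of an overall discussion, (1) which physical quantities underlie "roughness" in the recognition of sound shitsukan (perceived material quality), (2) whether the quality of the transmission system, and not only the sound source, is involved in recognizing sound shitsukan, and (3) whether humans separate the sound source from the transmission system when recognizing shitsukan. This fiscal year, we treated the non-linguistic information in speech (emotion and speaker individuality) as one form of shitsukan recognition and addressed questions 2 and 3. First, focusing on the amplitude envelope information in a noise vocoder (a noise-driven speech analysis/synthesis system), we investigated how the sound environment (noise and reverberation) affects the perception of non-linguistic information (speaker individuality and emotion) in the sound source (the speech signal). Next, by modeling the effects of room background noise and reverberation with the modulation transfer function, we controlled them independently of the modulation of the sound source and examined how the modulation introduced by the transmission system affects the shitsukan of the source. The experiments assumed reverberant environments generated with Schroeder's room impulse response (reverberation times of 0.1, 0.2, 0.5, 1.0, and 2.0 s) and noisy environments generated with white Gaussian noise (SNRs of 20, 15, 10, 5, 0, and -5 dB); the sound sources were the same speech data used in our earlier experiments. In the reverberant environments, for both speaker and emotion recognition, only the 2.0 s reverberation time differed significantly from the other conditions. In the noisy environments, speaker recognition differed significantly from the other conditions only at an SNR of -5 dB, whereas emotion recognition differed significantly only at SNRs of 0 dB or below. Finally, taking an overall view of these results, we found that in everyday sound environments, excluding extremely adverse conditions, the perception of non-linguistic information (speaker individuality and emotion) is largely unaffected by reverberation and noise. It follows that, if the perception of non-linguistic information is regarded as one form of shitsukan perception, listeners can be interpreted as recognizing the shitsukan of sound by separating the sound source (non-linguistic information) from the transmission path (reverberation and background noise).
Research Progress Status	Not entered, because FY2017 (Heisei 29) is the final fiscal year.
Strategy for Future Research Activity	Not entered, because FY2017 (Heisei 29) is the final fiscal year.
Remarks	Hokkoku Shimbun, published Monday, January 22, 2018: article on the Japan Sea Innovation Conference [Japan Advanced Institute of Science and Technology program] held on Sunday, January 21, 2018 (organized by the institute and the Hokkoku Shimbun); Hokkoku Shimbun, published Tuesday, February 27, 2018: FY2017 Japan Advanced Institute of Science and Technology program "Japan Sea Innovation Conference", "News worth hearing: toward better hearing by making use of the characteristics of human hearing"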
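
The reverberant and noisy conditions described in the Outline of Annual Research Achievements can be illustrated with a short Python sketch. It is not the project's code: it assumes the commonly cited closed-form modulation transfer function for exponentially decaying reverberation plus stationary noise, and a Schroeder-style impulse response modeled as exponentially decaying white Gaussian noise. The function names, the 16 kHz sampling rate, and the 4 Hz modulation frequency are hypothetical choices made for illustration.

```python
import math
import random


def mtf_reverb_noise(fm_hz, t60_s, snr_db):
    """Modulation index reduction at modulation frequency fm_hz (assumed model).

    reverb term: 1 / sqrt(1 + (2*pi*fm*T60 / 13.8)^2)
    noise term:  1 / (1 + 10^(-SNR/10))
    Pass snr_db=math.inf for a noise-free (reverberation-only) condition.
    """
    reverb = 1.0 / math.sqrt(1.0 + (2.0 * math.pi * fm_hz * t60_s / 13.8) ** 2)
    noise = 1.0 / (1.0 + 10.0 ** (-snr_db / 10.0))
    return reverb * noise


def schroeder_rir(t60_s, fs=16000, seed=0):
    """Schroeder-style RIR sketch: white Gaussian noise with an exponential
    envelope that has decayed by about 60 dB at t = t60_s, scaled to unit energy."""
    rng = random.Random(seed)
    n = int(round(t60_s * fs))
    h = [rng.gauss(0.0, 1.0) * math.exp(-6.9 * i / (fs * t60_s)) for i in range(n)]
    energy = sum(v * v for v in h)
    return [v / math.sqrt(energy) for v in h]


def add_white_noise(x, snr_db, seed=0):
    """Add white Gaussian noise to x, scaled so the realized SNR equals snr_db."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in x]
    p_sig = sum(v * v for v in x) / len(x)
    p_noise = sum(v * v for v in noise) / len(noise)
    gain = math.sqrt(p_sig / (p_noise * 10.0 ** (snr_db / 10.0)))
    return [s + gain * v for s, v in zip(x, noise)]


if __name__ == "__main__":
    fm = 4.0  # hypothetical modulation frequency in Hz
    # Condition values taken from the report's experimental description.
    for t60 in (0.1, 0.2, 0.5, 1.0, 2.0):
        m = mtf_reverb_noise(fm, t60, math.inf)
        print(f"T60 = {t60:3.1f} s  ->  m({fm:.0f} Hz) = {m:.3f}")
    for snr in (20, 15, 10, 5, 0, -5):
        m = mtf_reverb_noise(fm, 0.0, snr)
        print(f"SNR = {snr:3d} dB ->  m = {m:.3f}")
```

Under this model the reverberation term depends on the modulation frequency while the noise term does not, which is one reason the two kinds of degradation can be controlled separately from the modulation of the source.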

Research Products
(26 results)

Journal Article (7 results) (of which Peer Reviewed: 7 results, Open Access: 2 results) / Presentation (18 results) (of which Int'l Joint Research: 5 results) / Remarks (1 result)

[Journal Article] Contributions of Temporal Cue on the Perception of Speaker Individuality and Vocal Emotion for Noise-Vocoded Speech (2018)
- Author(s)
  Zhi Zhu, Yukiko Araki, Ryota Miyauchi and Masashi Unoki
- Journal Title
  Acoustical Science and Technology
  Volume: 39(3) Pages: -
- Peer Reviewed / Open Access
[Journal Article] Speech Emotion Recognition Using MPCRNN based on Gammatone auditory filterbank (2017)
- Author(s)
  Zhichao Peng, Zhi Zhu, Masashi Unoki, Jianwu Dang, Masato Akagi
- Journal Title
  Proc. APSIPA2017
  Volume: - Pages: -
- DOI
  10.1109/APSIPA.2017.8282316
- Peer Reviewed
[Journal Article] Important role of temporal cues in speaker identification for simulated cochlear implants (2017)
- Author(s)
  Zhi Zhu, Ryota Miyauchi, Yukiko Araki, and Masashi Unoki
- Journal Title
  Proc. 1st International Workshop on Challenges in Hearing Assistive Technology (CHAT-2017)
  Volume: - Pages: 51-55
- Peer Reviewed
[Journal Article] Feasibility of Vocal Emotion Conversion on Modulation Spectrogram for Simulated Cochlear Implants (2017)
- Author(s)
  Zhi Zhu, Ryota Miyauchi, Yukiko Araki, and Masashi Unoki
- Journal Title
  Proc. EUSIPCO2017
  Volume: - Pages: -
- DOI
  10.23919/EUSIPCO.2017.8081526
- Peer Reviewed
[Journal Article] Method of Blindly Estimating Speech Transmission Index in Noisy Reverberant Environments (2017)
- Author(s)
  Masashi Unoki, Akikazu Miyazaki, Shota Morita, and Masato Akagi
- Journal Title
  Journal of Information Hiding and Multimedia Signal Processing
  Volume: 8(6) Pages: 1430-1445
- Peer Reviewed / Open Access
[Journal Article] Study on method for protecting speech privacy by actively controlling speech transmission index in simulated room (2017)
- Author(s)
  Masashi Unoki, Yuta Kashihara, Maori Kobayashi, and Masato Akagi
- Journal Title
  Proc. APSIPA2017
  Volume: - Pages: -
- DOI
  10.1109/APSIPA.2017.8282212
- Peer Reviewed
[Journal Article] Robust method for estimating F0 of complex tone based on pitch perception of amplitude modulated signal (2017)
- Author(s)
  Kenichiro Miwa and Masashi Unoki
- Journal Title
  Proc. Interspeech2017
  Volume: - Pages: 2311-2315
- DOI
  10.21437/Interspeech.2017-1061
- Peer Reviewed
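[Presentation] Analysis of the transfer characteristics of bone-conducted speech using uttered speech (2018)
- Author(s)
  鳥谷輝樹, Peter Birkholz, Masashi Unoki
- Organizer
  IEICE Technical Committee on Engineering Acoustics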
[Presentation] Study of a noise suppression method focusing on the modulation spectrum (2018)
- Author(s)
  磯山拓都, Masashi Unoki
- Organizer
  IEICE Technical Committee on Engineering Acoustics
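[Presentation] Study of the effects of reverberation on the perception of speaker individuality and emotion in noise-vocoded speech (2018)
- Author(s)
  Zhi Zhu, 関谷伸一, Masashi Unoki
- Organizer
  Acoustical Society of Japan 2018 Spring Meeting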
[Presentation] Study of a noise reduction method using the modulation spectrum (2018)
- Author(s)
  磯山拓都, Masashi Unoki
- Organizer
  Acoustical Society of Japan 2018 Spring Meeting
[Presentation] End-to-end speech emotion recognition using 3-d convolutional recurrent neural networks based on modulation spectral features (2018)
- Author(s)
  Zhichao Peng, Zhi Zhu, Masashi Unoki, Jianwu Dang, Masato Akagi
- Organizer
  Acoustical Society of Japan 2018 Spring Meeting
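[Presentation] Effects of noisy environments on the perception of speaker individuality and emotion in noise-vocoded speech (2018)
- Author(s)
  川村美帆, Zhi Zhu, Masashi Unoki
- Organizer
  Acoustical Society of Japan, Auditory Research Meeting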
[Presentation] Study on the relationship between modulation spectral features and the perception of nonlinguistic information with noise-vocoded speech (2017)
- Author(s)
  Zhi Zhu, Ryota Miyauchi, Yukiko Araki, Masashi Unoki
- Organizer
  Acoustical Society of Japan, Auditory Research Meeting
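[Presentation] Study on feature analysis of the modulation spectra of speech and non-speech using higher-order statistics (2017)
- Author(s)
  磯山拓都, Masashi Unoki
- Organizer
  32nd Signal Processing (SIP) Symposium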
[Presentation] Study on feature analysis of the modulation spectra of speech and non-speech (2017)
- Author(s)
  磯山拓都, Masashi Unoki
- Organizer
  IEICE Technical Committee on Electroacoustics
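[Presentation] Study of linguistic and non-linguistic perception of noise-vocoded speech and the effects of room acoustic characteristics (2017)
- Author(s)
  関谷伸一, Zhi Zhu, Masashi Unoki
- Organizer
  Acoustical Society of Japan, Auditory Research Meeting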
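[Presentation] Relationship between the modulation spectrum of the sound source and the modulation transfer function of the sound environment in intelligibility (2017)
- Author(s)
  Maori Kobayashi, Masashi Unoki, Masato Akagi
- Organizer
  Acoustical Society of Japan, Auditory Research Meeting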
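[Presentation] Study of the modulation frequency components that contribute to the perception of speaker individuality in noise-vocoded speech (2017)
- Author(s)
  Zhi Zhu, Ryota Miyauchi, Yukiko Araki, Masashi Unoki
- Organizer
  Acoustical Society of Japan 2017 Autumn Meeting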
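[Presentation] Effects of the relationship between the modulation spectrum of speech and the modulation transfer function of the sound environment on intelligibility (2017)
- Author(s)
  Maori Kobayashi, Masashi Unoki, Masato Akagi
- Organizer
  Acoustical Society of Japan 2017 Autumn Meeting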
[Presentation] Speech Emotion Recognition Using MPCRNN based on Gammatone auditory filterbank (2017)
- Author(s)
  Zhichao Peng, Zhi Zhu, Masashi Unoki, Jianwu Dang, Masato Akagi
- Organizer
  Asia-Pacific Signal and Information Processing Association Annual Summit and Conference 2017
- Int'l Joint Research
[Presentation] Feasibility of Vocal Emotion Conversion on Modulation Spectrogram for Simulated Cochlear Implants (2017)
- Author(s)
  Zhi Zhu, Ryota Miyauchi, Yukiko Araki, and Masashi Unoki
- Organizer
  25th European Signal Processing Conference (EUSIPCO 2017)
- Int'l Joint Research
[Presentation] The role of spectral and temporal cues for vocal emotion recognition by cochlear implant simulations (2017)
- Author(s)
  Zhi Zhu, Ryota Miyauchi, Yukiko Araki, and Masashi Unoki
- Organizer
  173rd Meeting of the Acoustical Society of America and the 8th Forum Acusticum
- Int'l Joint Research
[Presentation] Study on method for protecting speech privacy by actively controlling speech transmission index in simulated room (2017)
- Author(s)
  Masashi Unoki, Yuta Kashihara, Maori Kobayashi, and Masato Akagi
- Organizer
  Asia-Pacific Signal and Information Processing Association Annual Summit and Conference 2017
- Int'l Joint Research
[Presentation] Robust method for estimating F0 of complex tone based on pitch perception of amplitude modulated signal (2017)
- Author(s)
  Kenichiro Miwa and Masashi Unoki
- Organizer
  Conference of the International Speech Communication Association 2017
- Int'l Joint Research
[Remarks] 多元質感知 (Innovative SHITSUKAN Science and Technology), Publicly Offered Research D01-7
- URL
  http://shitsukan.jp/ISST/advertise/index.html
