Speech Communication with Emotion Based on Cognitive Models and a Common-Sense Knowledge Base

Research Project

Project/Area Number	08F08049
Research Category	Grant-in-Aid for JSPS Fellows
Allocation Type	Single-year Grants
Section	Foreign
Research Field	Perception information processing/Intelligent robotics
Research Institution	The University of Tokyo
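Principal Investigator	HIROSE Keikichi (広瀬啓吉), The University of Tokyo, Graduate School of Information Science and Technology, Professor
Co-Investigator (Kenkyū-buntansha)	SHAIKH Mostafa Al Masum, The University of Tokyo, Graduate School of Information Science and Technology, JSPS Research Fellow (foreign researcher)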
Project Period (FY)	2008 – 2009
Project Status	Completed (Fiscal Year 2009)
Budget Amount	¥1,600,000 (Direct Cost: ¥1,600,000); Fiscal Year 2009: ¥800,000 (Direct Cost: ¥800,000); Fiscal Year 2008: ¥800,000 (Direct Cost: ¥800,000)
Keywords	Emotion and affect / Cognitive model / Speech synthesis / Prosody / Fundamental frequency / Speech rate / Emotion classification / Life Logging / Support Vector Machine
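Research Abstract	In the previous fiscal year, we worked on expressing the degree of emotion in a sentence as a numerical value and extracting indicators of the emotion it contains. In this fiscal year, we refined that method and focused on reflecting the resulting indicators in synthesized speech, achieving the results below.
1. For news sentences, we developed a method that scores the degree of positivity/negativity of each phrase, focusing on its verb, and then assigns a positivity/negativity score to the whole sentence from the relations between phrases, such as additive and adversative connections. We used this score to control the prosody of the MARY speech synthesis system, a freely available synthesizer for English. The basic rule raises the fundamental frequency and speech rate when the content is positive, as in news about a festival, and lowers them when it is negative, as in news about an accident. This yielded synthesized speech suited to the content of the sentence.
2. From the standpoint of cognitive models, we formulated emotions such as joy and sadness along axes such as positive/negative and excitation/inhibition, and developed a method to extract the affective information contained in the content of a sentence. We controlled the prosody of the MARY speech synthesis system with the positive/negative and excitation/inhibition values, and confirmed in listening experiments with the synthesized speech that the extracted emotions were reflected appropriately.
3. For extracting emotion and affect from speech, we built the acoustic component using features describing how the spectrum changes along the frequency and time axes, together with prosodic features, and classified them with a Support Vector Machine and related methods. Although limited to fixed sentences, this achieved a 90% classification rate between positive and negative emotion.
4. We advanced the development of a method (Life Logging) that estimates human activities from the various sounds produced in daily life. We showed that HMMs using MFCCs, the features used in speech recognition, enable good sound recognition.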
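The scoring and prosody control described in result 1 can be illustrated with a minimal sketch. The abstract does not give the actual combination rule or the size of the prosodic changes, so everything here is an assumption made for illustration: the `Phrase` type, the function names, the 0.25/0.75 weighting for adversative links, and the 15% and 10% maximum shifts are all hypothetical. The sketch only shows the general idea that a later phrase joined by "but" can dominate the sentence score, and that a positive score raises pitch and rate while a negative one lowers them.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Phrase:
    score: float   # phrase valence in [-1, 1]; negative means negative content
    relation: str  # link to the previous phrase: "and" (additive) or "but" (adversative)


def sentence_valence(phrases: List[Phrase]) -> float:
    """Combine phrase scores into one sentence score (hypothetical rule)."""
    valence = 0.0
    for i, phrase in enumerate(phrases):
        s = max(-1.0, min(1.0, phrase.score))  # clamp to the assumed range
        if i == 0:
            valence = s  # the first phrase has no preceding link
        elif phrase.relation == "but":
            # Assumed weighting: the phrase after a contrast dominates.
            valence = 0.25 * valence + 0.75 * s
        else:
            # Assumed weighting: additive links average the two sides.
            valence = 0.5 * (valence + s)
    return valence


def prosody_factors(valence: float,
                    max_pitch_shift: float = 0.15,
                    max_rate_shift: float = 0.10) -> Tuple[float, float]:
    """Map a sentence score to multiplicative (pitch, rate) factors.

    Positive scores give factors above 1.0, negative scores below 1.0.
    The maximum shifts are illustrative values, not taken from the report.
    """
    return (1.0 + max_pitch_shift * valence,
            1.0 + max_rate_shift * valence)


if __name__ == "__main__":
    festival = [Phrase(0.8, "and"), Phrase(0.4, "and")]
    accident = [Phrase(0.5, "and"), Phrase(-1.0, "but")]
    for name, phrases in (("festival", festival), ("accident", accident)):
        v = sentence_valence(phrases)
        pitch, rate = prosody_factors(v)
        print(f"{name}: valence={v:.3f} pitch x{pitch:.3f} rate x{rate:.3f}")
```

The resulting factors would then be passed to whatever prosody controls the synthesizer exposes; how the project fed them to MARY is not stated in the abstract.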

Report

(2 results)

2009 Annual Research Report
2008 Annual Research Report

Research Products
(8 results)

Journal Article (6 results, of which Peer Reviewed: 6 results); Presentation (2 results)

[Journal Article] Easy Living in the Virtual World : A Noble Approach to Integrate Real World Activities to Virtual Worlds (2010)
- Author(s)
  Mostafa Al Masum Shaikh
- Journal Title
  
  International Journal of Web Intelligence and Agent Systems 1 (in press; acceptance confirmed)
- Related Report
  2009 Annual Research Report
- Peer Reviewed
[Journal Article] Improving TTS Synthesis for Emotional Expressivity by a Prosodic Parameterization of Affect based on Linguistic Analysis (2010)
- Author(s)
  Mostafa Al Masum Shaikh
- Journal Title
  
  Proceedings of INTERSPEECH 2009 1 (in press; acceptance confirmed)
- Related Report
  2009 Annual Research Report
- Peer Reviewed
[Journal Article] Emotional Speech Synthesis by Sensing Affective Information from Text (2009)
- Author(s)
  Mostafa Al Masum Shaikh
- Journal Title
  
  Proc. Int'l Conf. on Affective Computing and Intelligent Interaction 1
  
  Pages: 466-471
- Related Report
  2009 Annual Research Report
- Peer Reviewed
[Journal Article] Assigning suitable phrasal tones and pitch accents by sensing affective information from text to synthesize human-like speech (2008)
- Author(s)
  Mostafa Al Masum Shaikh
- Journal Title
  
  Proceedings of Interspeech 1(CD-ROM)
  
  Pages: 326-329
- Related Report
  2008 Annual Research Report
- Peer Reviewed
[Journal Article] An Approach for ambient communication by detecting real-world activities from environmental sound cues (2008)
- Author(s)
  Mostafa Al Masum Shaikh
- Journal Title
  
  Proceedings of Internet/WWW 1(CD-ROM)
  
  Pages: 504-507
- Related Report
  2008 Annual Research Report
- Peer Reviewed
[Journal Article] Automatic life-logging : A novel approach to sense real-world activities by environmental sound cues and common sense (2008)
- Author(s)
  Mostafa Al Masum Shaikh
- Journal Title
  
  Proceedings of 11th International Conference on Computer and Information Technology 1(CD-ROM)
  
  Pages: 294-299
- Related Report
  2008 Annual Research Report
- Peer Reviewed
[Presentation] How to Improve TTS Systems for Emotional Expressivity (2009)
- Author(s)
  Antonio Rui Ferreira Rebordao
- Organizer
  INTERSPEECH 2009
- Place of Presentation
  Brighton Center, Brighton, U.K.
- Year and Date
  2009-09-07
- Related Report
  2009 Annual Research Report
[Presentation] Affective speech based interaction in pervasive applications (2008)
- Author(s)
  Mostafa Al Masum Shaikh
- Organizer
  Acoustical Society of Japan
- Place of Presentation
  Kyushu University
- Year and Date
  2008-09-10
- Related Report
  2008 Annual Research Report
