2016 Fiscal Year Annual Research Report

A speech interaction system created by "speech"

Research Project

Project/Area Number	26540083
Research Institution	Nagoya Institute of Technology
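Principal Investigator	徳田恵一 (Keiichi Tokuda), Nagoya Institute of Technology, Graduate School of Engineering, Professor (20217483)
Co-Investigator(Kenkyū-buntansha)	李晃伸 (Akinobu Lee), Nagoya Institute of Technology, Graduate School of Engineering, Professor (80332766); 南角吉彦 (Yoshihiko Nankaku), Nagoya Institute of Technology, Graduate School of Engineering, Associate Professor (80397497); 山本大介 (Daisuke Yamamoto), Nagoya Institute of Technology, Graduate School of Engineering, Associate Professor (00402470)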
Project Period (FY)	2014-04-01 – 2017-03-31
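Keywords	speech synthesis / speech recognition / spoken dialogue / speech interface
Outline of Annual Research Achievements	The aim of this research is to establish a method for building interfaces that let content creators produce speech-interface content by "speaking," using the information carried in their own speech. The research tasks needed to achieve this aim fall into three categories: (1) acquiring various kinds of information from speech, (2) reflecting the acquired information in content, and (3) conducting field experiments and verifying effectiveness. In this fiscal year we continued tasks (1) and (2) from the previous year and also worked on task (3). For tasks (1) and (2), we investigated methods based on deep neural networks, which have recently shown high performance in many fields. We worked on the research and development of a framework that extracts from speech abstracted information about the expressive characteristics of an utterance, such as emotion and emphasis, and uses that information to generate synthetic speech. For task (3), we released a prototype system on a limited basis and gathered feedback from general users. Over the course of the project, tasks (1) and (2) evolved into a framework based on deep neural networks, which had not been anticipated at the time of the proposal; this made it possible to acquire more abstract information and to generate higher-quality synthetic speech. For task (3), prototyping the proposed system and gathering opinions from general users as well as from within the research group yielded many insights. Through these results, we built a lively, interactive spoken dialogue system, which is one of the "attractions" unique to speech interfaces and is difficult to achieve with conventional interfaces based mainly on manual operation.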

Research Products
(31 results)

Presentation (26 results, of which Int'l Joint Research: 8 results), Remarks (5 results)

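[Presentation] A study of voice conversion based on a DNN-GMM hybrid model (DNN-GMMハイブリッドモデルに基づく声質変換の検討), 2017
- Author(s)
  市川裕詞, 橋本佳, 大浦圭一郎, 南角吉彦, 徳田恵一
- Organizer
  Acoustical Society of Japan 2017 Spring Meeting
- Place of Presentation
  Meiji University (Kanagawa)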
- Year and Date
  2017-03-15 – 2017-03-17
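[Presentation] A study of acoustic feature extraction conditions in neural network-based speech synthesis (ニューラルネットワークに基づく音声合成における音響特徴量抽出条件の検討), 2017
- Author(s)
  村瀬栞, 橋本佳, 大浦圭一郎, 南角吉彦, 徳田恵一
- Organizer
  Acoustical Society of Japan 2017 Spring Meeting
- Place of Presentation
  Meiji University (Kanagawa)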
- Year and Date
  2017-03-15 – 2017-03-17
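[Presentation] Implementation of a two-pass speech recognition decoder using a neural network language model (ニューラルネットワーク言語モデルを用いた2パス型音声認識デコーダの実装), 2017
- Author(s)
  後藤良介, 李晃伸
- Organizer
  Acoustical Society of Japan 2017 Spring Meeting
- Place of Presentation
  Meiji University (Kanagawa)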
- Year and Date
  2017-03-15 – 2017-03-17
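[Presentation] Gaining perceived interactivity through system-initiated talk in a spoken dialogue system: relation to talk content and psychological traits (音声対話システムからの話しかけによる対話性認知の獲得－話しかけ内容および心理特性との関連－), 2017
- Author(s)
  村上拓也, 李晃伸
- Organizer
  Proceedings of the Acoustical Society of Japan 2017 Spring Meeting
- Place of Presentation
  Meiji University (Kanagawa)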
- Year and Date
  2017-03-15 – 2017-03-17
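[Presentation] Development of a BLE beacon for spoken dialogue that can interact with the surrounding environment (周辺環境とインタラクション可能な音声対話用BLEビーコンの開発), 2017
- Author(s)
  佐野敦志, 堤修平, 山本大介, 高橋直久
- Organizer
  DEIM2017
- Place of Presentation
  Takayama Green Hotel (Gifu)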
- Year and Date
  2017-03-06 – 2017-03-08
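[Presentation] A voice route guidance system with adjustable guidance granularity and reverse-travel detection (逆進検知機能を有する案内粒度変更可能な音声経路案内システム), 2017
- Author(s)
  浮田弥, 山本大介, 高橋直久
- Organizer
  DEIM2017
- Place of Presentation
  Takayama Green Hotel (Gifu)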
- Year and Date
  2017-03-06 – 2017-03-08
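[Presentation] A spoken dialogue content editing system based on simplified execution history (簡単化された実行履歴に基づく音声対話コンテンツ編集システム), 2017
- Author(s)
  山口大介, 堤修平, 山本大介, 高橋直久
- Organizer
  DEIM2017
- Place of Presentation
  Takayama Green Hotel (Gifu)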
- Year and Date
  2017-03-06 – 2017-03-08
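[Presentation] A study of frequent-pattern analysis methods for usage histories of spoken dialogue content (音声対話コンテンツの利用履歴における頻出パターン解析手法の検討), 2017
- Author(s)
  田中佑太朗, 堤修平, 山本大介, 高橋直久
- Organizer
  DEIM2017
- Place of Presentation
  Takayama Green Hotel (Gifu)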
- Year and Date
  2017-03-06 – 2017-03-08
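[Presentation] Simultaneous modeling of acoustic feature sequences and their temporal structure in DNN-based speech synthesis (DNN音声合成における音響特徴量系列とその時間構造の同時モデル化), 2017
- Author(s)
  橋本佳, 大浦圭一郎, 南角吉彦, 徳田恵一
- Organizer
  Speech Research Meeting (音声研究会)
- Place of Presentation
  University of Tokyo (Tokyo)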
- Year and Date
  2017-01-21 – 2017-01-21
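[Presentation] A study of linguistic features for expressive speech synthesis using audiobooks (オーディオブックを用いた表現豊かな音声合成のための言語特徴の検討), 2017
- Author(s)
  浅井千明, 沢田慶, 橋本佳, 大浦圭一郎, 南角吉彦, 徳田恵一
- Organizer
  Speech Research Meeting (音声研究会)
- Place of Presentation
  University of Tokyo (Tokyo)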
- Year and Date
  2017-01-21 – 2017-01-21
[Presentation] Spoken keyword detection using recurrent neural network language model, 2016
- Author(s)
  Shuhei Koike, Akinobu Lee
- Organizer
  5th Joint Meeting Acoustical Society of America and Acoustical Society of Japan
- Place of Presentation
  Hawaii (USA)
- Year and Date
  2016-11-28 – 2016-12-02
- Int'l Joint Research
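[Presentation] An investigation of the relationship between expressing shared environment and knowledge and ease of talking to a spoken dialogue system (音声対話システムにおける環境および知識の共有表出と話しかけやすさの関連調査), 2016
- Author(s)
  興梠斗吾, 李晃伸
- Organizer
  Japanese Society for Artificial Intelligence, Special Interest Group on Spoken Language Understanding and Dialogue Processing (SIG-SLUD)
- Place of Presentation
  Waseda University (Tokyo)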
- Year and Date
  2016-10-05 – 2016-10-06
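[Presentation] An investigation of the relevance of psychological traits in human-human dialogue, toward an easy-to-talk-to spoken dialogue system (話しやすい音声対話システム実現のための対人対話における心理特性の関連性調査), 2016
- Author(s)
  佐藤翔平, 李晃伸
- Organizer
  Japanese Society for Artificial Intelligence, Special Interest Group on Spoken Language Understanding and Dialogue Processing (SIG-SLUD)
- Place of Presentation
  Waseda University (Tokyo)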
- Year and Date
  2016-10-05 – 2016-10-06
[Presentation] The NITech text-to-speech system for the Blizzard Challenge 2016, 2016
- Author(s)
  Kei Sawada, Chiaki Asai, Kei Hashimoto, Keiichiro Oura, and Keiichi Tokuda
- Organizer
  Blizzard Challenge 2016 Workshop
- Place of Presentation
  California (USA)
- Year and Date
  2016-09-16 – 2016-09-16
- Int'l Joint Research
[Presentation] Temporal modeling in neural network based statistical parametric speech synthesis, 2016
- Author(s)
  Keiichi Tokuda, Kei Hashimoto, Keiichiro Oura, and Yoshihiko Nankaku
- Organizer
  9th ISCA Speech Synthesis Workshop
- Place of Presentation
  California (USA)
- Year and Date
  2016-09-13 – 2016-09-15
- Int'l Joint Research
[Presentation] Redefining the linguistic context feature set for HMM and DNN TTS through position and parsing, 2016
- Author(s)
  Rasmus Dall, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, and Keiichi Tokuda
- Organizer
  Interspeech 2016
- Place of Presentation
  California (USA)
- Year and Date
  2016-09-08 – 2016-09-12
- Int'l Joint Research
[Presentation] Singing voice synthesis based on deep neural networks, 2016
- Author(s)
  Masanari Nishimura, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, and Keiichi Tokuda
- Organizer
  Interspeech 2016
- Place of Presentation
  California (USA)
- Year and Date
  2016-09-08 – 2016-09-12
- Int'l Joint Research
[Presentation] A hierarchical predictor of synthetic speech naturalness using neural networks, 2016
- Author(s)
  Takenori Yoshimura, Gustav Eje Henter, Oliver Watts, Mirjam Wester, Junichi Yamagishi, and Keiichi Tokuda
- Organizer
  Interspeech 2016
- Place of Presentation
  California (USA)
- Year and Date
  2016-09-08 – 2016-09-12
- Int'l Joint Research
[Presentation] Voice conversion based on trajectory model training of neural networks considering global variance, 2016
- Author(s)
  Naoki Hosaka, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, and Keiichi Tokuda
- Organizer
  Interspeech 2016
- Place of Presentation
  California (USA)
- Year and Date
  2016-09-08 – 2016-09-12
- Int'l Joint Research
[Presentation] Related Word Recommendation Mechanism for Speech Dialogue System, 2016
- Author(s)
  Yuto Ishida, Takahiro Uchiya, Kouhei Yamamoto, Daisuke Yamamoto, Ryota Nishimura, Ichi Takumi
- Organizer
  NBiS2016
- Place of Presentation
  Ostrava (Czech Republic)
- Year and Date
  2016-09-07 – 2016-09-09
- Int'l Joint Research
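[Presentation] Module specifications and management methods for making spoken dialogue systems open-content (音声対話システムのオープンコンテンツ化実現のためのモジュール仕様および管理手法), 2016
- Author(s)
  山西元樹, 船谷内泰斗, 李晃伸
- Organizer
  Information Processing Society of Japan, Special Interest Group on Spoken Language Processing
- Place of Presentation
  Tendo Onsen (Yamagata)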
- Year and Date
  2016-07-28 – 2016-07-30
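[Presentation] A study of a system speech-rate control method based on user speech rate and utterance content, toward a user-friendly spoken dialogue system (ユーザフレンドリィな音声対話システム実現のためのユーザ話速および発話内容に基づくシステム話速制御手法の検討), 2016
- Author(s)
  三原寛哉, 李晃伸
- Organizer
  Information Processing Society of Japan, Special Interest Group on Spoken Language Processing
- Place of Presentation
  Tendo Onsen (Yamagata)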
- Year and Date
  2016-07-28 – 2016-07-30
[Presentation] A spoken dialogue content editing system for Android devices using execution history (Android端末のための実行履歴を用いた音声対話コンテンツ編集システム), 2016
- Author(s)
  山口大介, 堤修平, 山本大介, 高橋直久
- Organizer
  DICOMO2016
- Place of Presentation
  Toba Seaside Hotel (Mie)
- Year and Date
  2016-07-06 – 2016-07-08
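[Presentation] A method for updating spoken dialogue scenarios based on a region graph and the user's location (領域グラフと利用者の位置に基づく音声対話シナリオ更新手法), 2016
- Author(s)
  田中亮佑, 堤修平, 山本大介, 高橋直久
- Organizer
  DICOMO2016
- Place of Presentation
  Toba Seaside Hotel (Mie)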
- Year and Date
  2016-07-06 – 2016-07-08
[Presentation] A children's speech collection system that takes the burden on children into account (子供の負担を考慮した子供音声収集システム), 2016
- Author(s)
  河原誠斗, 堤修平, 山本大介, 高橋直久
- Organizer
  DICOMO2016
- Place of Presentation
  Toba Seaside Hotel (Mie)
- Year and Date
  2016-07-06 – 2016-07-08
[Presentation] Improvement of a campus tour support system using BLE beacons (BLEビーコンを用いた学内見学支援システムの改善), 2016
- Author(s)
  佐藤清隆, 打矢隆弘, 山本大介, 内匠逸
- Organizer
  DICOMO2016
- Place of Presentation
  Toba Seaside Hotel (Mie)
- Year and Date
  2016-07-06 – 2016-07-08
[Remarks] MMDAgent, a toolkit for building spoken dialogue systems
- URL
  http://www.mmdagent.jp/
[Remarks] HTS, an HMM-based speech synthesis toolkit
- URL
  http://hts.sp.nitech.ac.jp/
[Remarks] SPTK, a speech signal processing toolkit
- URL
  http://sp-tk.sourceforge.net/
[Remarks] hts_engine API, an HMM-based speech synthesis engine
- URL
  http://hts-engine.sourceforge.net/
[Remarks] Open JTalk, a Japanese text-to-speech system
- URL
  http://open-jtalk.sourceforge.net/
