2016 Fiscal Year Annual Research Report

Automatic Improvement of Acoustic and Language Models of Automatic Speech Recognition through Spoken Dialogue

Research Project

Project/Area Number	15K16051
Research Institution	Osaka University
Principal Investigator	Ryu Takeda, Osaka University, The Institute of Scientific and Industrial Research, Assistant Professor (20749527)
Project Period (FY)	2015-04-01 – 2017-03-31
Keywords	Spoken dialogue / Acoustic model / Language model / Maintenance-free
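Outline of Annual Research Achievements	This project worked toward building a spoken dialogue system whose speech recognition models are maintenance-free. This fiscal year we addressed the topics set out for the project: a) building a spoken dialogue system on a robot, and developing fundamental technologies for b) acoustic models and c) language models. The three main results are: 1) a memory-efficient, fast acoustic model based on deep neural networks (DNNs); 2) unsupervised adaptation of DNN-based sound source localization; and 3) an unsupervised method for segmenting phoneme sequences of spoken language into words. For 1), combining quantization of the DNN parameters with node pruning achieved a 95% reduction in memory and a fourfold speedup, even for computation on a CPU. This makes it possible to use DNNs efficiently even in resource-limited environments. For 2), aiming to improve localization performance in acoustic environments not seen during training, we worked on a technique that adapts without ground-truth labels. The findings from this work, such as the analysis of robustness with respect to sound source position and parameter adaptation in the frequency domain, can be applied to acoustic model adaptation. These two results are fundamental technologies indispensable for "building a spoken dialogue system on a robot" and "improving acoustic model accuracy". For 3), to extract unknown words from spoken language without supervision, we extended a language model based on the hidden semi-Markov model, which is one kind of Bayesian language model. Incorporating the number of phonemes into the model as a transition probability improved the convergence speed. Used together with the Bayesian language model obtained last fiscal year, this enables the detection of unknown words in spoken language and the computation of their probabilities, which is a major step toward "improving language model accuracy".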
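As a rough illustration of how the two techniques named in 1) reduce memory when combined, the sketch below prunes output nodes of a toy layer and then quantizes the remaining weights. It is not the project's training method (the published work trains with a node-wise weight boundary model). The helper names, the 8-bit width, the L1-norm pruning criterion, and the layer size are all assumptions chosen for illustration, and the resulting figure applies only to this toy layer.

```python
import random


def quantize(weights, bits):
    """Uniformly quantize floats to signed integer codes of `bits` bits.

    Returns (codes, scale) so that code * scale approximates each weight.
    """
    levels = 2 ** (bits - 1) - 1               # e.g. 127 for 8 bits
    max_abs = max(abs(w) for w in weights) or 1.0
    scale = max_abs / levels
    codes = [round(w / scale) for w in weights]
    return codes, scale


def prune_nodes(matrix, keep):
    """Keep the `keep` rows (output nodes) with the largest L1 norm.

    This criterion is an assumption; the report does not state which one is used.
    """
    ranked = sorted(range(len(matrix)),
                    key=lambda i: sum(abs(w) for w in matrix[i]),
                    reverse=True)
    kept = sorted(ranked[:keep])               # preserve the original node order
    return [matrix[i] for i in kept]


def memory_bytes(n_rows, n_cols, bits):
    """Storage for the weights alone (per-row scale factors are ignored)."""
    return n_rows * n_cols * bits / 8


random.seed(0)
rows, cols = 8, 16                             # hypothetical toy layer
layer = [[random.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

pruned = prune_nodes(layer, keep=4)            # drop half of the nodes
quantized = [quantize(row, bits=8) for row in pruned]

before = memory_bytes(rows, cols, 32)          # 32-bit floats
after = memory_bytes(len(pruned), cols, 8)     # 8-bit codes
reduction = 1 - after / before
print(f"{before:.0f} bytes -> {after:.0f} bytes ({reduction:.1%} smaller)")
```

In this toy setting, halving the node count and moving from 32-bit to 8-bit weights multiplies the two savings (1/2 × 1/4), which shows why the combination can reach reductions that neither technique achieves alone.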

Research Products
(8 results)

Journal Article (2 results) (of which Peer Reviewed: 2 results, Acknowledgement Compliant: 2 results, Open Access: 1 result) Presentation (6 results) (of which Int'l Joint Research: 4 results)

[Journal Article] Acoustic model training based on node-wise weight boundary model for fast and small-footprint deep neural networks (2017)
- Author(s)
  Ryu Takeda, Kazuhiro Nakadai, Kazunori Komatani
- Journal Title
  Computer Speech & Language
  Volume: In press Pages: In press
- DOI
  10.1016/j.csl.2017.02.002
- Peer Reviewed / Acknowledgement Compliant
[Journal Article] Noise-robust MUSIC-based Sound Source Localization using Steering Vector Transformation for Small Humanoids (2017)
- Author(s)
  Ryu Takeda, Kazunori Komatani
- Journal Title
  Journal of Robotics and Mechatronics
  Volume: 29 Pages: 26-36
- DOI
  10.20965/jrm.2017.p0026
- Peer Reviewed / Open Access / Acknowledgement Compliant
[Presentation] Unsupervised Adaptation of Deep Neural Networks for Sound Source Localization using Entropy Minimization (2017)
- Author(s)
  Ryu Takeda, Kazunori Komatani
- Organizer
  IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- Place of Presentation
  New Orleans, Louisiana, USA
- Year and Date
  2017-03-07
- Int'l Joint Research
[Presentation] Discriminative Multiple Sound Source Localization based on Deep Neural Networks using Independent Location Model (2016)
- Author(s)
  Ryu Takeda, Kazunori Komatani
- Organizer
  IEEE Workshop on Spoken Language Technology (SLT)
- Place of Presentation
  San Diego, California, USA
- Year and Date
  2016-12-16
- Int'l Joint Research
[Presentation] Bayesian Language Model based on Mixture of Segmental Contexts for Spontaneous Utterances with Unexpected Words (2016)
- Author(s)
  Ryu Takeda, Kazunori Komatani
- Organizer
  International Conference on Computational Linguistics (COLING)
- Place of Presentation
  Osaka, Japan
- Year and Date
  2016-12-13
- Int'l Joint Research
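[Presentation] Acoustic Model Training Based on a Bounded Weight Model for Quantized Deep Neural Networks (in Japanese) (2016)
- Author(s)
  Ryu Takeda, Kazuhiro Nakadai, Kazunori Komatani
- Organizer
  46th Meeting of the Special Interest Group on AI Challenges (AIチャレンジ研究会)
- Place of Presentation
  Tokyo, Japan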
- Year and Date
  2016-11-09
[Presentation] Toward Lexical Acquisition during Dialogues through Implicit Confirmation for Closed-Domain Chatbots (2016)
- Author(s)
  Kohei Ono, Ryu Takeda, Eric Nichols, Mikio Nakano and Kazunori Komatani
- Organizer
  Second Workshop on Chatbots and Conversational Agent Technologies (WOCHAT)
- Place of Presentation
  Los Angeles, California, USA
- Year and Date
  2016-09-20
- Int'l Joint Research
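[Presentation] Discriminative Sound Source Localization Based on Deep Neural Networks Using Direction-Dependent Activation Functions (in Japanese) (2016)
- Author(s)
  Ryu Takeda, Kazunori Komatani
- Organizer
  112th Meeting of the Special Interest Group on Spoken Language Processing (音声言語情報処理研究会)
- Place of Presentation
  Yamagata Prefecture, Japan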
- Year and Date
  2016-07-30
