Automatic Improvement of Acoustic and Language Models of Automatic Speech Recognition through Spoken Dialogue

Research Project

Project/Area Number	15K16051
Research Category	Grant-in-Aid for Young Scientists (B)
Allocation Type	Multi-year Fund
Research Field	Intelligent informatics
Research Institution	Osaka University
Principal Investigator	Takeda Ryu, Osaka University, The Institute of Scientific and Industrial Research, Assistant Professor (20749527)
Project Period (FY)	2015-04-01 – 2017-03-31
Project Status	Completed (Fiscal Year 2016)
Budget Amount	¥3,900,000 (Direct Cost: ¥3,000,000, Indirect Cost: ¥900,000); Fiscal Year 2016: ¥1,300,000 (Direct Cost: ¥1,000,000, Indirect Cost: ¥300,000); Fiscal Year 2015: ¥2,600,000 (Direct Cost: ¥2,000,000, Indirect Cost: ¥600,000)
Keywords	Spoken dialogue / Speech recognition / Acoustic model / Language model / Maintenance-free
Outline of Final Research Achievements	The goal of this project was to develop a spoken dialogue system whose speech recognition models are maintenance-free. The main issues were the development of spoken dialogue systems on robots and of the essential acoustic-model and language-model technologies. The main outcomes are: 1) a DNN-based acoustic model and its adaptation method; 2) unsupervised segmentation of phoneme sequences for spontaneous utterances; 3) a dialogue strategy for acquiring unknown words using implicit confirmation requests; and 4) a DNN-based sound source localization method for human-robot interaction.

Report

(3 results)

2016 Annual Research Report
Final Research Report (PDF)
2015 Research-status Report

Research Products
(11 results)

Journal Article (2 results; Peer Reviewed: 2, Acknowledgement Compliant: 2, Open Access: 1) / Presentation (9 results; Int'l Joint Research: 7)

[Journal Article] Noise-Robust MUSIC-Based Sound Source Localization Using Steering Vector Transformation for Small Humanoids (2017)
- Author(s)
  Ryu Takeda, Kazunori Komatani
- Journal Title
  Journal of Robotics and Mechatronics
  Volume: 29 Issue: 1 Pages: 26-36
- DOI
  10.20965/jrm.2017.p0026
- NAID
  130007519770
- ISSN
  0915-3942, 1883-8049
- Year and Date
  2017-02-20
- Related Report
  2016 Annual Research Report
- Peer Reviewed / Open Access / Acknowledgement Compliant
[Journal Article] Acoustic model training based on node-wise weight boundary model for fast and small-footprint deep neural networks (2017)
- Author(s)
  Ryu Takeda, Kazuhiro Nakadai, Kazunori Komatani
- Journal Title
  Computer Speech & Language
  Volume: in press Pages: 461-480
- DOI
  10.1016/j.csl.2017.02.002
- Related Report
  2016 Annual Research Report
- Peer Reviewed / Acknowledgement Compliant
[Presentation] Unsupervised Adaptation of Deep Neural Networks for Sound Source Localization using Entropy Minimization (2017)
- Author(s)
  Ryu Takeda, Kazunori Komatani
- Organizer
  IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- Place of Presentation
  New Orleans, Louisiana, USA
- Year and Date
  2017-03-07
- Related Report
  2016 Annual Research Report
- Int'l Joint Research
[Presentation] Discriminative Multiple Sound Source Localization based on Deep Neural Networks using Independent Location Model (2016)
- Author(s)
  Ryu Takeda, Kazunori Komatani
- Organizer
  IEEE Workshop on Spoken Language Technology (SLT)
- Place of Presentation
  San Diego, California, USA
- Year and Date
  2016-12-16
- Related Report
  2016 Annual Research Report
- Int'l Joint Research
[Presentation] Bayesian Language Model based on Mixture of Segmental Contexts for Spontaneous Utterances with Unexpected Words (2016)
- Author(s)
  Ryu Takeda, Kazunori Komatani
- Organizer
  International Conference on Computational Linguistics (COLING)
- Place of Presentation
  Osaka, Japan
- Year and Date
  2016-12-13
- Related Report
  2016 Annual Research Report
- Int'l Joint Research
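[Presentation] Acoustic Model Training Based on a Bounded Weight Model for Quantized Deep Neural Networks (2016)
- Author(s)
  Ryu Takeda, Kazuhiro Nakadai, Kazunori Komatani
- Organizer
  46th AI Challenge Workshop (AIチャレンジ研究会)
- Place of Presentation
  Tokyo, Japan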
- Year and Date
  2016-11-09
- Related Report
  2016 Annual Research Report
[Presentation] Toward Lexical Acquisition during Dialogues through Implicit Confirmation for Closed-Domain Chatbots (2016)
- Author(s)
  Kohei Ono, Ryu Takeda, Eric Nichols, Mikio Nakano and Kazunori Komatani
- Organizer
  Second Workshop on Chatbots and Conversational Agent Technologies (WOCHAT)
- Place of Presentation
  Los Angeles, California, USA
- Year and Date
  2016-09-20
- Related Report
  2016 Annual Research Report
- Int'l Joint Research
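[Presentation] Discriminative Sound Source Localization Based on Deep Neural Networks Using Direction-Dependent Activation Functions (2016)
- Author(s)
  Ryu Takeda, Kazunori Komatani
- Organizer
  112th Workshop on Spoken Language Information Processing (音声言語情報処理研究会)
- Place of Presentation
  Yamagata Prefecture, Japan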
- Year and Date
  2016-07-30
- Related Report
  2016 Annual Research Report
[Presentation] Sound Source Localization based on Deep Neural Networks with Directional Activate Function Exploiting Phase Information (2016)
- Author(s)
  Ryu Takeda, Kazunori Komatani
- Organizer
  IEEE International Conference on Acoustics, Speech and Signal Processing
- Place of Presentation
  Shanghai, China
- Year and Date
  2016-03-23
- Related Report
  2015 Research-status Report
- Int'l Joint Research
[Presentation] Acoustic Model Training based on Node-wise Weight Boundary Model Increasing Speed of Discrete Neural Networks (2015)
- Author(s)
  Ryu Takeda, Kazuhiro Nakadai, Kazunori Komatani
- Organizer
  IEEE Automatic Speech Recognition and Understanding Workshop
- Place of Presentation
  Scottsdale, Arizona, USA
- Year and Date
  2015-12-14
- Related Report
  2015 Research-status Report
- Int'l Joint Research
[Presentation] Performance Comparison of MUSIC-based Sound Localization Methods on Small Humanoid under Low SNR Conditions (2015)
- Author(s)
  Ryu Takeda, Kazunori Komatani
- Organizer
  IEEE-RAS International Conference on Humanoid Robots
- Place of Presentation
  Seoul, Korea
- Year and Date
  2015-11-04
- Related Report
  2015 Research-status Report
- Int'l Joint Research
