2013 Fiscal Year Final Research Report

Study on a spoken language understanding framework integrating knowledge among multiple layers

Research Project

Project/Area Number	21300066
Research Category	Grant-in-Aid for Scientific Research (B)
Allocation Type	Single-year Grants
Section	General
Research Field	Perception information processing/Intelligent robotics
Research Institution	Nagoya Institute of Technology
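Principal Investigator	LEE Akinobu, Nagoya Institute of Technology, Graduate School of Engineering, Associate Professor (80332766)
Co-Investigator(Kenkyū-buntansha)	KOMATANI Kazunori, Nagoya University, Graduate School of Engineering, Associate Professor (40362579); NANJO Hiroaki, Ryukoku University, Faculty of Science and Technology, Assistant Professor (50388162); NISIMURA Ryuuichi, Wakayama University, Faculty of Systems Engineering, Assistant Professor (00379611); NISHIDA Masafumi, Doshisha University, Faculty of Science and Engineering, Associate Professor (80361442); SHINOZAKI Takahiro, Tokyo Institute of Technology, Interdisciplinary Graduate School of Science and Engineering, Associate Professor (80447903); AKITA Yuya, Kyoto University, Joint-Use Facilities, Assistant Professor (90402742)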
Project Period (FY)	2009-04-01 – 2014-03-31
Keywords	speech recognition / spoken language understanding / spoken dialogue / speech signal processing
Research Abstract	This study develops a framework that integrates multiple knowledge layers, from speech signal processing to spoken language understanding, directly into the speech recognition process in a statistical manner. Statistical models at the language model, acoustic model, and dialogue model layers were widely investigated. For integration, the study investigated speech decoding based on Bayes-risk minimization, in which all constraints can be expressed as Bayes risk, and several integration methods that use speech information for dialogue management and turn-taking. Part of the results is publicly available as part of the open-source voice interaction toolkit MMDAgent and the speech recognition engine Julius.

Research Products
(22 results)

Journal Article (6 results) (of which Peer Reviewed: 6 results) Presentation (16 results) (of which Invited: 5 results)

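[Journal Article] Implementation and Evaluation of Bayes Risk Minimization in the Open-Source Speech Recognition Engine Julius (2013)
- Author(s)
  南條浩輝, 古谷遼, 西田昌史
- Journal Title
  IEICE Transactions, Japanese Edition (電子情報通信学会論文誌)
  Volume: Vol.J96-D, No.10 Pages: 2530-2539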
- Peer Reviewed
[Journal Article] Automatic Estimation of Word Significance for Bayes-Risk-Minimizing Speech Recognition in Voice-Input Information Search (2013)
- Author(s)
  古谷遼, 七里崇, 南條浩輝
- Journal Title
  IPSJ Journal (情報処理学会論文誌)
  Volume: Vol.54, No.7 Pages: 1967-1977
- Peer Reviewed
[Journal Article] Automatic Comma Insertion for Lectures Based on Multiple Annotations (2013)
- Author(s)
  秋田祐哉
- Journal Title
  IPSJ Journal (情報処理学会論文誌)
  Volume: Vol.54, No.2 Pages: 463-470
- Peer Reviewed
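[Journal Article] Sentence Boundary Detection Using Word Confidence from a Language Model Trained on Sentence-Segmented Text (2011)
- Author(s)
  鈴木伸尚, 西田昌史, 山本誠一
- Journal Title
  Proceedings of the 10th Forum on Information Technology (FIT)
  Volume: Part 2 Pages: 35-38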
- Peer Reviewed
[Journal Article] Statistical transformation of language and pronunciation models for spontaneous speech recognition (2010)
- Author(s)
  Yuya Akita
- Journal Title
  IEEE Trans. Audio, Speech & Language Process
  Volume: 18 Pages: 1539-1549
- Peer Reviewed
[Journal Article] An Efficient Prosody Application to HMM-based Speech Synthesis (2010)
- Author(s)
  Hosan Kamiyama, Takahiro Shinozaki, Koji Iwano and Sadaoki Furui
- Journal Title
  Proc. Asia-Pacific Signal and Information Processing Association (APSIPA)
  Volume: 1 Pages: 82-85
- Peer Reviewed
[Presentation] This Is Where "Speech Recognition" Is Headed! (2014)
- Author(s)
  篠崎隆宏
- Organizer
  SIG-SLP 100th Meeting Commemorative Symposium
- Place of Presentation
  Hotel Sun Valley Fujimi, Izu-Nagaoka Onsen
- Year and Date
  2014-01-31 – 2014-02-01
- Invited
[Presentation] Hierarchical Utterance Understanding for Robust Human-Robot Spoken Dialogues (2014)
- Author(s)
  Kazunori Komatani
- Organizer
  International Workshop on Spoken Dialogue Systems (IWSDS2014)
- Place of Presentation
  Napa, California, US
- Year and Date
  2014-01-18 – 2014-01-20
- Invited
[Presentation] Restoring Incorrectly Segmented Keywords and Turn-Taking Caused by Short Pauses (2014)
- Author(s)
  Kazunori Komatani, Naoki Hotta, Satoshi Sato
- Organizer
  International Workshop on Spoken Dialogue Systems (IWSDS2014)
- Place of Presentation
  Napa, California, US
- Year and Date
  2014-01-18 – 2014-01-20
[Presentation] MMDAgent - A Fully Open-Source Toolkit for Voice Interaction Systems (2013)
- Author(s)
  Akinobu Lee, Keiichiro Oura, Keiichi Tokuda
- Organizer
  IEEE ICASSP2013
- Place of Presentation
  Vancouver, BC, Canada
- Year and Date
  2013-05-26 – 2013-05-31
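[Presentation] Development, Installation, and Operation of a User-Participatory Interactive Voice Guidance Digital Signage System: A Case Study (2013)
- Author(s)
  徳田恵一
- Organizer
  Acoustical Society of Japan (ASJ) Meeting
- Place of Presentation
  Tokyo University of Technology
- Year and Date
  2013-03-13 – 2013-03-15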
- Invited
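[Presentation] What Is Needed for Spoken Dialogue Systems to Spread Further? (2013)
- Author(s)
  李晃伸
- Organizer
  95th Meeting of the Special Interest Group on Spoken Language Processing, SIG-SLP (3rd Dialogue System Symposium), panel discussion
- Place of Presentation
  Atami, Shizuoka Prefecture
- Year and Date
  2013-02-01 – 2013-02-02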
- Invited
[Presentation] Pipeline Decomposition of Speech Decoders and Their Implementation Based on Delayed Evaluation (2012)
- Author(s)
  Takahiro Shinozaki
- Organizer
  APSIPA Annual Summit and Conference 2012
- Place of Presentation
  Hollywood, California, US
- Year and Date
  2012-12-03 – 2012-12-06
[Presentation] Detecting child speaker based on auditory feature vectors for VTL estimation (2012)
- Author(s)
  Ryuichi Nisimura
- Organizer
  APSIPA Annual Summit and Conference 2012
- Place of Presentation
  Hollywood, California, US
- Year and Date
  2012-12-03 – 2012-12-06
[Presentation] Automatic transcription of lecture speech using language model based on speaking-style transformation of proceeding texts (2012)
- Author(s)
  Yuya Akita
- Organizer
  INTERSPEECH 2012
- Place of Presentation
  Portland, Oregon, US
- Year and Date
  2012-09-09 – 2012-09-13
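[Presentation] Current State and Challenges of Spoken Dialogue System Technology (2012)
- Author(s)
  駒谷和範
- Organizer
  Tokai-Section Joint Conference on Electrical, Electronics, Information, and Related Engineering
- Place of Presentation
  Shizuoka University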
- Year and Date
  2012-09-25
- Invited
[Presentation] Developing a method to build Japanese speech recognition system based on 3-gram language model expansion with Google database (2011)
- Author(s)
  Toshiaki Shimada, Ryuichi Nisimura, Masayasu Tanaka, Hideki Kawahara, Toshio Irino
- Organizer
  ICISS2011 (2011 IEEE International Conference on Intelligent Computing and Integrated Systems)
- Place of Presentation
  Guilin, China
- Year and Date
  2011-10-26
[Presentation] Topic-Dependent Language Modeling for VoiceWeb Systems (2009)
- Author(s)
  Kentaro Suzuta, Ryuichi Nisimura, et al.
- Organizer
  WESPAC X 2009
- Place of Presentation
  Beijing, China
- Year and Date
  2009-09-23
[Presentation] Ranking Help Message Candidates Based on Robust Grammar Verification Results and Utterance History in Spoken Dialogue Systems (2009)
- Author(s)
  Kazunori Komatani and 4 others
- Organizer
  10th Annual SIGDIAL Meeting on Discourse and Dialogue
- Place of Presentation
  London, UK
- Year and Date
  2009-09-12
[Presentation] Automatic Transcription System for Meetings of the Japanese National Congress (2009)
- Author(s)
  Yuya Akita
- Organizer
  ISCA Interspeech 2009
- Place of Presentation
  Brighton Centre, Brighton, UK
- Year and Date
  2009-09-07
[Presentation] Automatic Transcription System for Meetings of the Japanese National Congress (2009)
- Author(s)
  Yuya Akita
- Organizer
  ISCA Interspeech 2009
- Place of Presentation
  Brighton Centre, Brighton, UK
- Year and Date
  2009-09-07
[Presentation] Topic-Dependent Language Modeling for VoiceWeb Systems (2009)
- Author(s)
  Ryuichi Nisimura
- Organizer
  HCI International 2009
- Place of Presentation
  San Diego, CA, US
- Year and Date
  2009-07-22
