
Improvement of large vocabulary speech recognition performance based on high-precision lexical prosody prediction

Research Project

Project/Area Number 25540064
Research Category

Grant-in-Aid for Challenging Exploratory Research

Allocation Type Multi-year Fund
Research Field Perceptual information processing
Research Institution The University of Tokyo

Principal Investigator

Minematsu Nobuaki  The University of Tokyo, Graduate School of Engineering, Professor (90273333)

Project Period (FY) 2013-04-01 – 2016-03-31
Project Status Completed (Fiscal Year 2015)
Budget Amount
¥3,770,000 (Direct Cost: ¥2,900,000, Indirect Cost: ¥870,000)
Fiscal Year 2014: ¥1,820,000 (Direct Cost: ¥1,400,000, Indirect Cost: ¥420,000)
Fiscal Year 2013: ¥1,950,000 (Direct Cost: ¥1,500,000, Indirect Cost: ¥450,000)
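Keywords speech recognition / prosodic features / accent phrase boundary / accent nucleus position / re-ranking / Average perceptron / CRF / structural representation / hypothesis search / accent nucleus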
Outline of Final Research Achievements

In Japanese, the lexical prosody of words often changes when words are combined. In speech recognition, re-ranking is commonly used to re-evaluate the multiple hypotheses generated by a recognizer and select the final one. Re-ranking is expected to improve when the lexical prosody predicted from each hypothesis is compared with the lexical prosody estimated from the input utterance. We successfully implemented 1) lexical prosody prediction from hypotheses and 2) re-ranking of hypotheses based on lexical prosody, but it proved extremely difficult to build a module that estimates lexical prosody precisely from the utterance alone. We therefore turned to another strategy, applying quasi-prosody to re-ranking. In this strategy, structural features are predicted from the hypotheses and also estimated from the input utterance. Experiments showed that structural re-ranking is highly effective.
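
The re-ranking idea described above can be illustrated with a minimal sketch. It is not the project's implementation: the `Hypothesis` class, the per-mora high/low labels, the `prosody_mismatch` measure, and the `weight` parameter are all assumptions introduced here for illustration. The sketch only shows the general shape of the approach, in which each hypothesis carries a recognizer score and a predicted prosodic pattern, and hypotheses whose prediction disagrees with the pattern estimated from the utterance are penalized.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hypothesis:
    """One recognition hypothesis (hypothetical structure for illustration)."""
    text: str
    recognizer_score: float        # log-score from the recognizer, higher is better
    predicted_prosody: List[int]   # assumed per-mora labels: 1 = high, 0 = low


def prosody_mismatch(predicted: List[int], observed: List[int]) -> float:
    """Fraction of positions where the two label sequences disagree.

    Positions present in only one sequence count as disagreements.
    """
    n = max(len(predicted), len(observed))
    if n == 0:
        return 0.0
    diff = sum(p != o for p, o in zip(predicted, observed))
    diff += abs(len(predicted) - len(observed))
    return diff / n


def rerank(hypotheses: List[Hypothesis],
           observed_prosody: List[int],
           weight: float = 1.0) -> List[Hypothesis]:
    """Sort hypotheses by recognizer score minus a weighted prosody penalty."""
    def combined(h: Hypothesis) -> float:
        penalty = prosody_mismatch(h.predicted_prosody, observed_prosody)
        return h.recognizer_score - weight * penalty
    return sorted(hypotheses, key=combined, reverse=True)


# Toy data (invented): A has the better recognizer score, but B's predicted
# prosody matches the pattern estimated from the utterance.
n_best = [
    Hypothesis("hypothesis A", -10.0, [0, 1, 1, 0]),
    Hypothesis("hypothesis B", -10.2, [1, 0, 0, 0]),
]
observed = [1, 0, 0, 0]
best = rerank(n_best, observed)[0]
print(best.text)
```

In this toy case the prosody penalty moves hypothesis B ahead of hypothesis A. A fixed weight is used only to keep the sketch short; the keywords of this record mention the averaged perceptron, which suggests the weights in the actual work were learned discriminatively.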

Report

(4 results)
  • 2015 Annual Research Report, Final Research Report (PDF)
  • 2014 Research-status Report
  • 2013 Research-status Report

Research Products

(10 results)

Journal Article (6 results; of which Peer Reviewed: 6, Open Access: 1), Presentation (4 results)

  • [Journal Article] Discriminative re-ranking for automatic recognition by leveraging invariant structures (2015)

    • Author(s)
      M. Suzuki, G. Kurata, M. Nishimura, N. Minematsu
    • Journal Title

      Speech Communication

      Volume: 72 Pages: 208-217

    • DOI

      10.1016/j.specom.2015.06.007

    • Related Report
      2015 Annual Research Report
    • Peer Reviewed / Open Access
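  • [Journal Article] Automatic estimation of model parameters of the fundamental frequency contour generation process model and its application to HMM-based speech synthesis (2015)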

    • Author(s)
      橋本浩弥,齋藤大輔,峯松信明,広瀬啓吉
    • Journal Title

      IEICE Transactions (Japanese Edition)

      Volume: J98-D Pages: 481-491

    • Related Report
      2014 Research-status Report
    • Peer Reviewed
  • [Journal Article] Leveraging phonetic context dependent invariant structure for continuous speech recognition (2014)

    • Author(s)
      C. Zhang, M. Suzuki, G. Kurata, M. Nishimura, N. Minematsu
    • Journal Title

      Proc. IEEE China Summit & International Conference on Signal and Information Processing

      Volume: 1 Pages: 52-56

    • DOI

      10.1109/chinasip.2014.6889200

    • Related Report
      2014 Research-status Report
    • Peer Reviewed
  • [Journal Article] Semi-supervised noise dictionary adaptation for exemplar-based noise robust speech recognition (2014)

    • Author(s)
      Y. Luan, D. Saito, Y. Kashiwagi, N. Minematsu, K. Hirose
    • Journal Title

      Proc. ICASSP

      Volume: 1 Pages: 1764-1767

    • Related Report
      2014 Research-status Report
    • Peer Reviewed
  • [Journal Article] Discriminative piecewise linear transformation based on deep learning for noise robust automatic speech recognition (2013)

    • Author(s)
      Y. Kashiwagi, D. Saito, N. Minematsu, K. Hirose
    • Journal Title

      Proc. ASRU

      Volume: 1 Pages: 350-355

    • Related Report
      2013 Research-status Report
    • Peer Reviewed
  • [Journal Article] Automatic estimation of accent sandhi in Tokyo-dialect Japanese using conditional random fields (2013)

    • Author(s)
      鈴木雅之,黒岩龍,印南佳祐,小林俊平,清水信哉,峯松信明,広瀬啓吉
    • Journal Title

      IEICE Transactions

      Volume: J96-D Pages: 644-654

    • NAID

      110009593032

    • Related Report
      2013 Research-status Report
    • Peer Reviewed
  • [Presentation] A study of inter-distribution distance estimation by a discriminative approach and its application to language identification (2015)

    • Author(s)
      柏木陽祐,齋藤大輔,峯松信明,広瀬啓吉
    • Organizer
      IEICE Technical Committee on Speech, technical report
    • Place of Presentation
      Katakura Suwako Hotel (Suwa, Nagano)
    • Year and Date
      2015-07-16
    • Related Report
      2015 Annual Research Report
  • [Presentation] Speaker-normalized training of neural network acoustic models by joint estimation of constrained speaker codes (2014)

    • Author(s)
      柏木陽佑,齋藤大輔,峯松信明,広瀬啓吉
    • Organizer
      Acoustical Society of Japan
    • Place of Presentation
      Hokkai-Gakuen University (Sapporo, Hokkaido)
    • Year and Date
      2014-09-03
    • Related Report
      2014 Research-status Report
  • [Presentation] Improving CRF-based estimation of accent changes in Tokyo-dialect Japanese (2014)

    • Author(s)
      橋本浩弥,峯松信明,広瀬啓吉
    • Organizer
      Acoustical Society of Japan, Spring Meeting
    • Place of Presentation
      Nihon University, Tokyo
    • Related Report
      2013 Research-status Report
  • [Presentation] Speech recognition in noisy environments by clean-speech state identification based on deep learning (2013)

    • Author(s)
      柏木陽佑,齋藤大輔,峯松信明,広瀬啓吉
    • Organizer
      Acoustical Society of Japan, Autumn Meeting
    • Place of Presentation
      Toyohashi University of Technology, Aichi
    • Related Report
      2013 Research-status Report


Published: 2014-07-25   Modified: 2019-07-29  
