
Constraint Free Training of Speech Recognition Systems Based on Full Bayes Modeling

Research Project

Project/Area Number 17K20001
Research Category

Grant-in-Aid for Challenging Research (Exploratory)

Allocation Type Multi-year Fund
Research Field Human informatics and related fields
Research Institution Tokyo Institute of Technology

Principal Investigator

Shinozaki Takahiro  Tokyo Institute of Technology, School of Engineering, Associate Professor (80447903)

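Co-Investigator (Kenkyū-buntansha) Mochihashi Daichi (持橋 大地)  The Institute of Statistical Mathematics, Department of Statistical Inference and Mathematics, Associate Professor (80418508)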
Project Period (FY) 2017-06-30 – 2020-03-31
Project Status Completed (Fiscal Year 2019)
Budget Amount
¥6,240,000 (Direct Cost: ¥4,800,000, Indirect Cost: ¥1,440,000)
Fiscal Year 2018: ¥3,250,000 (Direct Cost: ¥2,500,000, Indirect Cost: ¥750,000)
Fiscal Year 2017: ¥2,990,000 (Direct Cost: ¥2,300,000, Indirect Cost: ¥690,000)
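Keywords speech recognition / unsupervised learning / semi-supervised learning / reinforcement learning / nonparametric Bayesian methods / pronunciation dictionary / recognition of speech and similar signals / machine learning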
Outline of Final Research Achievements

Dependence on supervised learning from paired data is a major bottleneck of current speech recognition systems. The goal of this research is to make system training more flexible by using unpaired data. We proposed a method that automatically extends the pronunciation dictionary from unpaired phoneme data and text data by applying nonparametric Bayesian methods and weighted finite-state transducers. We also worked on reinforcement learning of speech recognition systems by formulating the whole encoder-decoder based system as a policy function, and showed that the proposed reinforcement learning methods significantly improve learning efficiency.
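The idea of treating a recognizer as a policy function and training it from rewards alone can be illustrated with a toy sketch. The code below is not the project's implementation: it assumes a hypothetical task in which the input is a class index, a tabular softmax policy stands in for the encoder-decoder system, and the environment returns a reward of 1 for a correct output and 0 otherwise. The function names, the learning rate, and the running-average baseline are all illustrative assumptions; the update itself is the standard REINFORCE policy-gradient rule.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_reinforce(num_classes=4, steps=5000, lr=0.5, seed=0):
    """Train a tabular softmax policy from scalar rewards only (REINFORCE).

    Hypothetical toy task: the input is a class index x, the policy emits a
    label y, and the reward is 1.0 if y == x, else 0.0. The correct label is
    never shown to the learner, only the reward.
    """
    rng = random.Random(seed)
    # theta[x][y] is the logit of emitting label y for input x.
    theta = [[0.0] * num_classes for _ in range(num_classes)]
    baseline = 0.0  # running average reward, reduces gradient variance
    for _ in range(steps):
        x = rng.randrange(num_classes)
        probs = softmax(theta[x])
        # Sample an output from the current policy.
        y = rng.choices(range(num_classes), weights=probs)[0]
        reward = 1.0 if y == x else 0.0
        advantage = reward - baseline
        # Gradient of log pi(y|x) w.r.t. the logits is onehot(y) - probs.
        for k in range(num_classes):
            grad_log = (1.0 if k == y else 0.0) - probs[k]
            theta[x][k] += lr * advantage * grad_log
        baseline += 0.01 * (reward - baseline)
    return theta


def greedy_accuracy(theta):
    """Fraction of inputs whose most probable output is the correct label."""
    hits = sum(1 for x, row in enumerate(theta) if row.index(max(row)) == x)
    return hits / len(theta)
```

Calling `greedy_accuracy(train_reinforce())` gives a quick check that the policy has learned the mapping from rewards alone. A real speech recognizer would replace the table with a neural encoder-decoder and the exact-match reward with a task-specific signal.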

Academic Significance and Societal Importance of the Research Achievements

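Humans have an excellent capacity for language learning: while growing up they learn, almost unconsciously, more than five words a day on average. Current speech recognition systems, in contrast, rely on supervised learning. This makes system development very labor-intensive, and the systems lack the ability to automatically learn the new words coined every day or expressions used only within small communities. Aiming at natural spoken dialogue between humans and machines, this research worked on realizing autonomous learning techniques. We proposed training methods based on unsupervised learning and reinforcement learning as alternatives to conventional supervised learning, and demonstrated their effectiveness experimentally.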

Report

(4 results)
  • 2019 Annual Research Report   Final Research Report ( PDF )
  • 2018 Research-status Report
  • 2017 Research-status Report
  • Research Products

    (45 results)


All Int'l Joint Research (2 results) Journal Article (18 results) (of which Int'l Joint Research: 3 results,  Peer Reviewed: 12 results,  Open Access: 5 results) Presentation (23 results) (of which Int'l Joint Research: 2 results) Book (1 results) Remarks (1 results)

  • [Int'l Joint Research] JHU (USA)

    • Related Report
      2018 Research-status Report
  • [Int'l Joint Research] Johns Hopkins University/Carnegie Mellon University/MERL (USA)

    • Related Report
      2017 Research-status Report
  • [Journal Article] The Present and Future of Speech Recognition (2020)

    • Author(s)
      篠崎隆宏
    • Journal Title

      シミュレーション

      Volume: 39

    • Related Report
      2019 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Effective and Stable Neuron Model Optimization Based on Aggregated CMA-ES (2019)

    • Author(s)
      Xu Han, Shinozaki Takahiro, Kobayashi Ryota
    • Journal Title

      ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

      Volume: ICASSP 2019 Pages: 1264-1268

    • DOI

      10.1109/icassp.2019.8682825

    • Related Report
      2019 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Cross-Domain Speaker Recognition using Cycle-Consistent Adversarial Networks (2019)

    • Author(s)
      Liu Yi, Zhuang Bairong, Li Zhiyu, Shinozaki Takahiro
    • Journal Title

      Proc. APSIPA

      Volume: - Pages: 2070-2074

    • DOI

      10.1109/apsipaasc47483.2019.9023042

    • Related Report
      2019 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Efficient Free Keyword Detection Based on CNN and End-to-End Continuous DP-Matching (2019)

    • Author(s)
      Tanaka Tomohiro, Shinozaki Takahiro
    • Journal Title

      Proc. ASRU

      Volume: - Pages: 637-644

    • DOI

      10.1109/asru46091.2019.9004021

    • Related Report
      2019 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Effective and Stable Neuron Model Optimization Based on Aggregated CMA-ES (2019)

    • Author(s)
      Xu Han, Takahiro Shinozaki, Ryota Kobayashi
    • Journal Title

      Proc. IEEE ICASSP

      Volume: - Pages: 1264-1268

    • Related Report
      2018 Research-status Report
    • Peer Reviewed
  • [Journal Article] Investigation of Attention-Based Multimodal Fusion and Maximum Mutual Information Objective for DSTC7 Track3 (2019)

    • Author(s)
      Bairong Zhuang, Wenbo Wang, Takahiro Shinozaki
    • Journal Title

      Proc. DSTC7

      Volume: -

    • Related Report
      2018 Research-status Report
    • Peer Reviewed / Open Access
  • [Journal Article] Automatic Speech Recognition Technology and English Education: Mechanisms, Research Trends, and What It Can and Cannot Do Today (2019)

    • Author(s)
      篠崎 隆宏
    • Journal Title

      英語教育

      Volume: 67 Pages: 40-41

    • Related Report
      2018 Research-status Report
  • [Journal Article] Evolution-Strategy-Based Automation of System Development for High-Performance Speech Recognition (2018)

    • Author(s)
      Takafumi Moriya, Tomohiro Tanaka, Takahiro Shinozaki, Shinji Watanabe, Kevin Duh
    • Journal Title

      IEEE/ACM Transactions on Audio, Speech, and Language Processing

      Volume: 27 Issue: 1 Pages: 77-88

    • DOI

      10.1109/taslp.2018.2871755

    • Related Report
      2018 Research-status Report
    • Peer Reviewed / Int'l Joint Research
  • [Journal Article] Reward Only Training of Encoder-Decoder Digit Recognition Systems Based on Policy Gradient Methods (2018)

    • Author(s)
      Yilong Peng, Hayato Shibata, Takahiro Shinozaki
    • Journal Title

      Proc. APSIPA

      Volume: - Pages: 1934-1939

    • Related Report
      2018 Research-status Report
    • Peer Reviewed / Open Access
  • [Journal Article] F-Measure Based End-To-End Optimization of Neural Network Keyword Detectors (2018)

    • Author(s)
      Tomohiro Tanaka, Takahiro Shinozaki
    • Journal Title

      Proc. APSIPA

      Volume: - Pages: 1456-1461

    • Related Report
      2018 Research-status Report
    • Peer Reviewed / Open Access
  • [Journal Article] Reinforcement Learning of Speech Recognition System Based on Policy Gradient and Hypothesis Selection (2018)

    • Author(s)
      Taku Kato, Takahiro Shinozaki
    • Journal Title

      Proc. IEEE ICASSP

      Volume: - Pages: 5759-5763

    • Related Report
      2018 Research-status Report
    • Peer Reviewed
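  • [Journal Article] A Study of Bayesian Semi-Supervised Pronunciation Dictionary Learning Using Speech Recognition Hypotheses (2018)

    • Author(s)
      池下裕紀, 篠崎隆宏
    • Journal Title

      Proceedings of the 2018 Spring Meeting of the Acoustical Society of Japan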

      Volume: - Pages: 123-124

    • Related Report
      2017 Research-status Report
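  • [Journal Article] Reinforcement Learning of DNN Speech Recognition Systems Based on Policy Gradient and Hypothesis Selection (2018)

    • Author(s)
      加藤拓, 篠崎隆宏
    • Journal Title

      Proceedings of the 2018 Spring Meeting of the Acoustical Society of Japan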

      Volume: - Pages: 15-16

    • Related Report
      2017 Research-status Report
  • [Journal Article] A Comparative Study of Word Embeddings in End-to-End Neural Dialogue Models (2018)

    • Author(s)
      鄭 崇輝,李 知雨,王 文博,庄 佰融,篠崎 隆宏
    • Journal Title

      Proceedings of the 2018 Spring Meeting of the Acoustical Society of Japan

      Volume: - Pages: 125-126

    • Related Report
      2017 Research-status Report
  • [Journal Article] Evolution Strategy Based Automatic Tuning of Neural Machine Translation Systems (2017)

    • Author(s)
      Hao Qin, Takahiro Shinozaki, Kevin Duh
    • Journal Title

      Proc. International Workshop on Spoken Language Translation (IWSLT)

      Volume: - Pages: 120-128

    • Related Report
      2017 Research-status Report
    • Peer Reviewed / Open Access / Int'l Joint Research
  • [Journal Article] Comparative Analysis of Word Embedding Methods for DSTC6 End-to-End Conversation Modeling Track (2017)

    • Author(s)
      Zhuang Bairong, Wang Wenbo, Li Zhiyu, Zheng Chonghui, Takahiro Shinozaki
    • Journal Title

      Proc. Dialog System Technology Challenges (DSTC6)

      Volume: - Pages: 1-5

    • Related Report
      2017 Research-status Report
    • Peer Reviewed / Open Access
  • [Journal Article] A Study of DNN Speech Recognition Systems for Automatic Evaluation of English Learners' Utterances (2017)

    • Author(s)
      加藤 拓, 篠崎 隆宏
    • Journal Title

      IPSJ SIG Technical Reports

      Volume: Vol.2017-SLP-119 Pages: 1-4

    • Related Report
      2017 Research-status Report
  • [Journal Article] Applying Semi-Supervised Learning Based on Bayesian Inference to Japanese (2017)

    • Author(s)
      池下裕紀, 篠崎隆宏, 渡部晋治, 持橋大地, Graham Neubig
    • Journal Title

      IPSJ SIG Technical Reports

      Volume: Vol.2017-SLP-118 Pages: 1-4

    • Related Report
      2017 Research-status Report
    • Int'l Joint Research
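  • [Presentation] Optimization of Speech Recognition Systems Using a Dual-Inheritance Evolution Strategy (2020)

    • Author(s)
      日野 健人,木村 友祐,Dong Yue,篠崎 隆宏
    • Organizer
      2020 Spring Meeting of the Acoustical Society of Japan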
    • Related Report
      2019 Annual Research Report
  • [Presentation] Fast End-to-End Continuous DP Matching Using a CNN Front End (2020)

    • Author(s)
      田中 智宏,篠崎 隆宏
    • Organizer
      2020 Spring Meeting of the Acoustical Society of Japan
    • Related Report
      2019 Annual Research Report
  • [Presentation] Robust Multichannel End-to-End Speech Recognition Based on Multi-Output Densenet (2020)

    • Author(s)
      Zheng Chonghui, Shinozaki Takahiro
    • Organizer
      IPSJ Special Interest Group on Spoken Language Processing
    • Related Report
      2019 Annual Research Report
  • [Presentation] Model-Structure-Free Unsupervised Sound Source Localization Using Input Image Gradients (2019)

    • Author(s)
      田中 智宏, 篠﨑隆宏
    • Organizer
      2019 Autumn Meeting of the Acoustical Society of Japan
    • Related Report
      2019 Annual Research Report
  • [Presentation] Speeding Up End-to-End Continuous DP Matching with a CNN Front End (2019)

    • Author(s)
      田中 智宏, 篠﨑 隆宏
    • Organizer
      Special Interest Group on Spoken Language Processing (SLP)
    • Related Report
      2019 Annual Research Report
  • [Presentation] End-to-End DP Matching Using a 2D-RNN for Continuous Word Detection (2019)

    • Author(s)
      田中智宏, 篠崎隆宏
    • Organizer
      2019 Spring Meeting of the Acoustical Society of Japan
    • Related Report
      2018 Research-status Report
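  • [Presentation] A Study of Unsupervised Object Segmentation and Association Learning from Spoken Video Using a Continuous Correspondence Detection Network (2019)

    • Author(s)
      田中智宏, 篠崎隆宏
    • Organizer
      2019 Spring Meeting of the Acoustical Society of Japan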
    • Related Report
      2018 Research-status Report
  • [Presentation] Toward Unsupervised Reinforcement Learning of Large-Scale End-to-End Speech Recognition Systems (2019)

    • Author(s)
      Peng Yilong, 篠崎隆宏
    • Organizer
      2019 Spring Meeting of the Acoustical Society of Japan
    • Related Report
      2018 Research-status Report
  • [Presentation] Analysis of Attention-Based Multimodal Fusion and Maximum Mutual Information Objective for DSTC7 Audio Visual Scene-Aware Dialog Track (2019)

    • Author(s)
      王 文博,庄 佰融,篠崎 隆宏
    • Organizer
      2019 Spring Meeting of the Acoustical Society of Japan
    • Related Report
      2018 Research-status Report
  • [Presentation] I-vector Domain Adaptation Using Cycle-Consistent Adversarial Networks for Speaker Recognition (2019)

    • Author(s)
      Yi Liu, Takahiro Shinozaki
    • Organizer
      IPSJ SLP-126
    • Related Report
      2018 Research-status Report
  • [Presentation] End-to-End Word Detection Free of Start and End Points Using a 2D-RNN with Multi-Gate GRU Units (2018)

    • Author(s)
      田中智宏, 篠崎隆宏
    • Organizer
      IPSJ SLP-125
    • Related Report
      2018 Research-status Report
  • [Presentation] Improving the audio visual scene-aware dialog system in DSTC7 by using attentional multimodal fusion and MMI objective (2018)

    • Author(s)
      Wenbo Wang, Bairong Zhuang, Takahiro Shinozaki
    • Organizer
      IPSJ SLP-125
    • Related Report
      2018 Research-status Report
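  • [Presentation] A Study of the Effects of Rewards and Reward Noise in Unsupervised Reinforcement Learning of Speech Recognition Systems (2018)

    • Author(s)
      Peng Yilong, 柴田駿人, 篠崎隆宏
    • Organizer
      2018 Autumn Meeting of the Acoustical Society of Japan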
    • Related Report
      2018 Research-status Report
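  • [Presentation] A Training Method for Word Detectors Using Word Detection Performance as the Objective Function (2018)

    • Author(s)
      田中智宏, 篠崎隆宏
    • Organizer
      2018 Autumn Meeting of the Acoustical Society of Japan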
    • Related Report
      2018 Research-status Report
  • [Presentation] Training End-to-End Recognition Systems Using Only Rewards via Reinforcement Learning (2018)

    • Author(s)
      柴田駿人, Peng Yilong, 篠崎隆宏
    • Organizer
      2018 Autumn Meeting of the Acoustical Society of Japan
    • Related Report
      2018 Research-status Report
  • [Presentation] A Study of Reinforcement Learning for End-to-End Speech Recognition Systems (2018)

    • Author(s)
      Peng Yilong, 柴田駿人, 篠崎隆宏
    • Organizer
      IPSJ SLP-123
    • Related Report
      2018 Research-status Report
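  • [Presentation] A Study of Bayesian Semi-Supervised Pronunciation Dictionary Learning Using Speech Recognition Hypotheses (2018)

    • Author(s)
      池下 裕紀
    • Organizer
      Spring Meeting of the Acoustical Society of Japan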
    • Related Report
      2017 Research-status Report
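  • [Presentation] Reinforcement Learning of DNN Speech Recognition Systems Based on Policy Gradient and Hypothesis Selection (2018)

    • Author(s)
      加藤 拓
    • Organizer
      Spring Meeting of the Acoustical Society of Japan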
    • Related Report
      2017 Research-status Report
  • [Presentation] A Comparative Study of Word Embeddings in End-to-End Neural Dialogue Models (2018)

    • Author(s)
      鄭 崇輝
    • Organizer
      Spring Meeting of the Acoustical Society of Japan
    • Related Report
      2017 Research-status Report
  • [Presentation] Evolution Strategy Based Automatic Tuning of Neural Machine Translation Systems (2017)

    • Author(s)
      Hao Qin
    • Organizer
      International Workshop on Spoken Language Translation
    • Related Report
      2017 Research-status Report
    • Int'l Joint Research
  • [Presentation] Comparative Analysis of Word Embedding Methods for DSTC6 End-to-End Conversation Modeling Track (2017)

    • Author(s)
      Zhuang Bairong
    • Organizer
      Dialog System Technology Challenges (DSTC6)
    • Related Report
      2017 Research-status Report
    • Int'l Joint Research
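  • [Presentation] A Study of DNN Speech Recognition Systems for Automatic Evaluation of English Learners' Utterances (2017)

    • Author(s)
      加藤 拓
    • Organizer
      IPSJ Special Interest Group on Spoken Language Processing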
    • Related Report
      2017 Research-status Report
  • [Presentation] Applying Semi-Supervised Learning Based on Bayesian Inference to Japanese (2017)

    • Author(s)
      池下 裕紀
    • Organizer
      IPSJ Special Interest Group on Spoken Language Processing
    • Related Report
      2017 Research-status Report
  • [Book] Automated Development of DNN Based Spoken Language Systems Using Evolutionary Algorithms (2020)

    • Author(s)
      Takahiro Shinozaki, Shinji Watanabe, Kevin Duh
    • Total Pages
      33
    • Publisher
      Springer
    • Related Report
      2019 Annual Research Report
  • [Remarks] Shinozaki Laboratory

    • Related Report
      2019 Annual Research Report

URL: 

Published: 2017-07-21   Modified: 2022-02-21  
