A study on acoustic model adaptation for deep-learning-based speech recognition

Research Project

Project/Area Number 16K00227
Research Category

Grant-in-Aid for Scientific Research (C)

Allocation Type Multi-year Fund
Section General
Research Field Perceptual information processing
Research Institution Yamagata University

Principal Investigator

Kosaka Tetsuo  Yamagata University, Graduate School of Science and Engineering, Professor (50359569)

Research Collaborator KATO Masaharu  
Project Period (FY) 2016-04-01 – 2019-03-31
Project Status Completed (Fiscal Year 2018)
Budget Amount
¥4,550,000 (Direct Cost: ¥3,500,000, Indirect Cost: ¥1,050,000)
Fiscal Year 2018: ¥650,000 (Direct Cost: ¥500,000, Indirect Cost: ¥150,000)
Fiscal Year 2017: ¥1,560,000 (Direct Cost: ¥1,200,000, Indirect Cost: ¥360,000)
Fiscal Year 2016: ¥2,340,000 (Direct Cost: ¥1,800,000, Indirect Cost: ¥540,000)
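Keywords speech recognition / acoustic model / deep neural network / adaptation techniques / spontaneous speech / emotional speech / voice activity detection / deep learning / emotional speech recognition / neural network / speaker adaptation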
Outline of Final Research Achievements

Although deep-learning-based speech recognition has advanced greatly in recent years, recognition of spontaneous speech has not yet reached sufficient accuracy. The major causes of performance degradation are variation in speaker characteristics, acoustic environments, and speaking styles. To address these problems, I developed techniques centered on acoustic-model adaptation. As a result, recognition performance improved for both spontaneous and emotional speech, and the performance of voice-activity detection also improved.

Academic Significance and Societal Importance of the Research Achievements

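This research achieved (1) improved adaptation accuracy in spontaneous speech recognition, (2) improved accuracy of voice activity detection under noisy conditions, and (3) improved performance of emotional speech recognition. The adaptation method in (1) is a highly versatile technique that can be applied not only to spontaneous speech recognition but also in other fields. A multimodal dialogue corpus has been built using the results of (2), which should benefit researchers in the field. The results of (3) can likewise be used in a variety of areas, such as conversation between robots and humans. The techniques developed in this research therefore have wide-ranging effects and are considered to be of high academic and societal significance.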

Report

(4 results)
  • 2018 Annual Research Report   Final Research Report ( PDF )
  • 2017 Research-status Report
  • 2016 Research-status Report
  • Research Products

    (24 results)

All Journal Article (6 results) (of which Peer Reviewed: 6 results,  Open Access: 6 results) Presentation (13 results) (of which Int'l Joint Research: 1 results) Remarks (5 results)

  • [Journal Article] Unsupervised Cross Adaptation Using Deep Neural Networks in Speech Recognition Systems 2018

    • Author(s)
      冨田 健斗、高木 瑛、加藤 正治、小坂 哲夫
    • Journal Title

      電子情報通信学会論文誌D 情報・システム

      Volume: J101-D Issue: 8 Pages: 1190-1199

    • DOI

      10.14923/transinfj.2017JDP7076

    • ISSN
      1880-4535, 1881-0225
    • Year and Date
      2018-08-01
    • Related Report
      2018 Annual Research Report
    • Peer Reviewed / Open Access
  • [Journal Article] Acoustic Model Adaptation for Emotional Speech Recognition Using Twitter-Based Emotional Speech Corpus 2018

    • Author(s)
      Kosaka Tetsuo, Aizawa Yoshitaka, Kato Masaharu, Nose Takashi
    • Journal Title

      Proc. of APSIPA ASC 2018

      Volume: - Pages: 1747-1751

    • DOI

      10.23919/apsipa.2018.8659756

    • Related Report
      2018 Annual Research Report
    • Peer Reviewed / Open Access
  • [Journal Article] Improving Voice Activity Detection for Multimodal Movie Dialogue Corpus 2018

    • Author(s)
      Kosaka Tetsuo, Suga Ikumi, Inoue Masashi
    • Journal Title

      2018 IEEE 7th Global Conference on Consumer Electronics (GCCE)

      Volume: - Pages: 481-484

    • DOI

      10.1109/gcce.2018.8574730

    • Related Report
      2018 Annual Research Report
    • Peer Reviewed / Open Access
  • [Journal Article] Large-scale multimodal movie dialogue corpus 2016

    • Author(s)
      Ryu Yasuhara, Masashi Inoue, Ikumi Suga and Tetsuo Kosaka
    • Journal Title

      Proc. of the 18th ACM International Conference on Multimodal Interaction

      Volume: - Pages: 414-415

    • DOI

      10.1145/2993148.2998523

    • Related Report
      2016 Research-status Report
    • Peer Reviewed / Open Access
  • [Journal Article] Many-to-many voice conversion using hidden Markov model-based speech recognition and synthesis 2016

    • Author(s)
      Y. Aizawa, M. Kato and T. Kosaka
    • Journal Title

      The Journal of the Acoustical Society of America

      Volume: 140 Issue: 4_Supplement Pages: 2964-2964

    • DOI

      10.1121/1.4969167

    • Related Report
      2016 Research-status Report
    • Peer Reviewed / Open Access
  • [Journal Article] Voice activity detection in movies using multi-class deep neural networks 2016

    • Author(s)
      I. Suga, R. Yasuhara, M. Inoue and T. Kosaka
    • Journal Title

      The Journal of the Acoustical Society of America

      Volume: 140 Issue: 4_Supplement Pages: 3116-3116

    • DOI

      10.1121/1.4969758

    • Related Report
      2016 Research-status Report
    • Peer Reviewed / Open Access
  • [Presentation] A Basic Study of Emotion Recognition on the Japanese Emotional Speech Corpus JTES 2019

    • Author(s)
      羽田優花,加藤正治,小坂哲夫
    • Organizer
      情報処理学会東北支部研究会
    • Related Report
      2018 Annual Research Report
  • [Presentation] Improving Emotional Speech Recognition and Prosody-Controlled Voice Conversion Through Language Model Refinement 2019

    • Author(s)
      佐伯和哉,加藤正治,小坂哲夫
    • Organizer
      情報処理学会東北支部研究会
    • Related Report
      2018 Annual Research Report
  • [Presentation] Acoustic Model Adaptation for Emotional Speech Recognition and Its Application to Voice Conversion 2018

    • Author(s)
      小坂哲夫,相澤佳孝,加藤正治,能勢隆
    • Organizer
      日本音響学会秋季講演論文集
    • Related Report
      2018 Annual Research Report
  • [Presentation] Performance Evaluation of Unsupervised Cross Adaptation Using DNNs 2018

    • Author(s)
      冨田建斗,加藤正治,小坂哲夫
    • Organizer
      情報処理学会東北支部研究会
    • Related Report
      2017 Research-status Report
  • [Presentation] A Study of Training Data for Emotion Recognition Using Spontaneous Dialogue Speech 2018

    • Author(s)
      真壁大介,加藤正治,小坂哲夫
    • Organizer
      情報処理学会東北支部研究会
    • Related Report
      2017 Research-status Report
  • [Presentation] Building a Multimodal Dialogue Corpus from Movies 2017

    • Author(s)
      井上雅史,安原龍,菅郁巳,小坂哲夫
    • Organizer
      人工知能学会全国大会
    • Related Report
      2017 Research-status Report
  • [Presentation] A Study of DNN-HMM Acoustic Model Adaptation for Emotional Speech Recognition Using the Emotional Speech Database JTES 2017

    • Author(s)
      相澤佳孝,小坂哲夫,加藤正治,能勢隆
    • Organizer
      日本音響学会秋季講演論文集
    • Related Report
      2017 Research-status Report
  • [Presentation] A Study of Class Categorization in DNN-Based Voice Activity Detection for Movies 2017

    • Author(s)
      菅郁巳,小坂哲夫,井上雅史
    • Organizer
      日本音響学会秋季講演論文集
    • Related Report
      2017 Research-status Report
  • [Presentation] A Study on Improving Model Adaptation Performance for Emotional Speech Recognition Using the Emotional Speech Database JTES 2017

    • Author(s)
      相澤佳孝,小坂哲夫,加藤正治,能勢隆
    • Organizer
      情報処理学会研究報告
    • Related Report
      2017 Research-status Report
  • [Presentation] A Study of Voice Conversion for Emotional Speech Using DNN-Based Speech Recognition 2017

    • Author(s)
      笹田拓臣,相澤佳孝, 小坂哲夫
    • Organizer
      情報処理学会東北支部研究会
    • Place of Presentation
      Yamagata University
    • Related Report
      2016 Research-status Report
  • [Presentation] Evaluation of Unsupervised Cross Adaptation Using a High-Accuracy Initial Model 2016

    • Author(s)
      冨田健斗, 高木瑛, 加藤正治, 小坂哲夫
    • Organizer
      日本音響学会秋季講演論文集
    • Place of Presentation
      University of Toyama
    • Year and Date
      2016-09-14
    • Related Report
      2016 Research-status Report
  • [Presentation] Improving Voice Conversion of Emotional Speech Based on HMM Recognition and Synthesis 2016

    • Author(s)
      相澤佳孝, 中川由暁, 加藤正治, 小坂哲夫
    • Organizer
      日本音響学会秋季講演論文集
    • Place of Presentation
      University of Toyama
    • Year and Date
      2016-09-14
    • Related Report
      2016 Research-status Report
  • [Presentation] Voice Conversion of emotional speech using hidden Markov model-based speech recognition and synthesis 2016

    • Author(s)
      Tetsuo Kosaka, Yoshiaki Nakagawa and Masaharu Kato
    • Organizer
      Proc. of 22nd International Congress on Acoustics
    • Place of Presentation
      Buenos Aires, Argentina
    • Year and Date
      2016-09-05
    • Related Report
      2016 Research-status Report
    • Int'l Joint Research
  • [Remarks] Kosaka Laboratory

    • URL

      https://speech-lab.yz.yamagata-u.ac.jp/

    • Related Report
      2018 Annual Research Report
  • [Remarks] Movie Dialogue Corpus

    • URL

      http://www.ice.tohtech.ac.jp/~inoue/moviedialcorpus/index.html

    • Related Report
      2018 Annual Research Report
  • [Remarks] Kosaka Laboratory

    • URL

      http://speech-lab.yz.yamagata-u.ac.jp/

    • Related Report
      2017 Research-status Report
  • [Remarks] Kosaka Laboratory

    • URL

      http://speech-lab.yz.yamagata-u.ac.jp/index.html

    • Related Report
      2016 Research-status Report
  • [Remarks] Movie Dialogue Corpus

    • URL

      http://i.yz.yamagata-u.ac.jp/moviedialcorpus/

    • Related Report
      2016 Research-status Report

Published: 2016-04-21   Modified: 2020-03-30  