
Accurate speech recognition system with deep neural network introducing human auditory characteristic in real environments

Research Project

Project/Area Number 15K00233
Research Category

Grant-in-Aid for Scientific Research (C)

Allocation Type Multi-year Fund
Section General
Research Field Perceptual information processing
Research Institution Chubu University (2017)
Toyohashi University of Technology (2015-2016)

Principal Investigator

YAMAMOTO Kazumasa  Chubu University, College of Engineering, Associate Professor (40324230)

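Co-Investigator (Kenkyū-buntansha) NAKAGAWA Seiichi  Toyohashi University of Technology, Leading Graduate School Education Promotion Organization, Specially Appointed Professor (20115893)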
Project Period (FY) 2015-04-01 – 2018-03-31
Project Status Completed (Fiscal Year 2017)
Budget Amount
¥4,810,000 (Direct Cost: ¥3,700,000, Indirect Cost: ¥1,110,000)
Fiscal Year 2017: ¥1,040,000 (Direct Cost: ¥800,000, Indirect Cost: ¥240,000)
Fiscal Year 2016: ¥1,170,000 (Direct Cost: ¥900,000, Indirect Cost: ¥270,000)
Fiscal Year 2015: ¥2,600,000 (Direct Cost: ¥2,000,000, Indirect Cost: ¥600,000)
Keywords Speech recognition / Deep learning / Deep Neural Network / Auditory characteristics / Acoustic features / Filterbank / Speaker adaptation
Outline of Final Research Achievements

Deep learning has been introduced into speech recognition, and the technology is gradually coming into practical use, but recognition performance is still insufficient in noisy environments and for distant-talking speech. The purpose of this research is to improve speech recognition accuracy by combining a DNN (Deep Neural Network) acoustic model with human auditory characteristics.
In this research, we proposed a method that uses deep learning to automatically learn the feature-extraction filterbank at the bottom of a DNN acoustic model while taking human auditory characteristics into account. The method improved accuracy for speaker-independent speech recognition. It also improved speaker-adapted recognition accuracy even when only a small amount of adaptation data was available. These results show the effectiveness of the proposed method.
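
As a rough illustration of what a learnable filterbank at the bottom of an acoustic model might look like, the sketch below computes log filterbank features from a power spectrum using filters described by two trainable numbers each (center frequency and bandwidth). The Gaussian filter shape, the mel-scale initialization, and the names `init_filterbank` and `filterbank_layer` are assumptions made for this example; the outline above does not state the parameterization used in the project. Only the forward pass is shown; in a trained system the centers and bandwidths would be updated by backpropagation together with the DNN weights.

```python
import math


def hz_to_mel(f):
    """Convert frequency in Hz to the mel scale (a common auditory approximation)."""
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def init_filterbank(num_filters, sample_rate):
    """Return initial (center_hz, bandwidth_hz) pairs spaced evenly on the mel scale.

    Assumption: mel spacing stands in for "human auditory characteristics".
    """
    max_mel = hz_to_mel(sample_rate / 2.0)
    edges = [mel_to_hz(max_mel * i / (num_filters + 1))
             for i in range(num_filters + 2)]
    # Center is the middle edge; bandwidth is half the span of the outer edges.
    return [(edges[i + 1], (edges[i + 2] - edges[i]) / 2.0)
            for i in range(num_filters)]


def filterbank_layer(power_spectrum, params, sample_rate):
    """Forward pass: Gaussian-weighted sum over spectrum bins, then log compression."""
    num_bins = len(power_spectrum)
    bin_hz = [k * (sample_rate / 2.0) / (num_bins - 1) for k in range(num_bins)]
    features = []
    for center, bandwidth in params:
        energy = sum(
            p * math.exp(-0.5 * ((f - center) / bandwidth) ** 2)
            for p, f in zip(power_spectrum, bin_hz)
        )
        features.append(math.log(energy + 1e-10))  # small floor avoids log(0)
    return features
```

For example, `filterbank_layer([1.0] * 257, init_filterbank(4, 16000), 16000)` returns four log-energy features, one per filter. Because each filter is a smooth function of its center and bandwidth, gradients of the recognition loss with respect to those two parameters are well defined, which is what makes joint training with the network possible in principle.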

Report

(4 results)
  • 2017 Annual Research Report   Final Research Report ( PDF )
  • 2016 Research-status Report
  • 2015 Research-status Report
  • Research Products

    (22 results)

All Journal Article (2 results) (of which Peer Reviewed: 2 results, Open Access: 1 result) Presentation (20 results) (of which Int'l Joint Research: 11 results)

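  • [Journal Article] Speech Recognition of Short Time Utterance Based on Speaker Clustering (2017)

    • Author(s)
      Hiroshi Seki, 榎並大介, 朱発強, Kazumasa Yamamoto, Seiichi Nakagawa
    • Journal Title

      IEICE Transactions on Information and Systems (Japanese Edition) (電子情報通信学会論文誌D 情報・システム)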

      Volume: J100-D Issue: 1 Pages: 81-92

    • DOI

      10.14923/transinfj.2016JDP7063

    • ISSN
      1880-4535, 1881-0225
    • Year and Date
      2017-01-01
    • Related Report
      2016 Research-status Report
    • Peer Reviewed / Open Access
  • [Journal Article] Chat-like Spoken Dialogue System with Multiple Agents (2016)

    • Author(s)
      藤堂祐樹, 西村良太, Kazumasa Yamamoto, Seiichi Nakagawa
    • Journal Title

      IEICE Transactions on Information and Systems (Japanese Edition) (電子情報通信学会論文誌D 情報・システム)

      Volume: J99-D Issue: 2 Pages: 188-200

    • DOI

      10.14923/transinfj.2015JDP7010

    • ISSN
      1880-4535, 1881-0225
    • Year and Date
      2016-02-01
    • Related Report
      2015 Research-status Report
    • Peer Reviewed
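  • [Presentation] A study of speaker-class adaptation by retraining DNN-based filterbanks (2017)

    • Author(s)
      Hiroshi Seki, Kazumasa Yamamoto, Seiichi Nakagawa
    • Organizer
      Acoustical Society of Japan 2017 Spring Meeting
    • Place of Presentation
      Meiji University, Ikuta Campus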
    • Year and Date
      2017-03-15
    • Related Report
      2016 Research-status Report
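  • [Presentation] A study of labeling and recognition methods that consider context information of speech emotion (2017)

    • Author(s)
      Masashi Takebe, Kazumasa Yamamoto, Seiichi Nakagawa
    • Organizer
      Acoustical Society of Japan 2017 Spring Meeting
    • Place of Presentation
      Meiji University, Ikuta Campus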
    • Year and Date
      2017-03-15
    • Related Report
      2016 Research-status Report
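  • [Presentation] A study of a chat-oriented spoken dialogue system with inter-domain transitions (2017)

    • Author(s)
      Yuma Shibahara, Kazumasa Yamamoto, Seiichi Nakagawa
    • Organizer
      Acoustical Society of Japan 2017 Spring Meeting
    • Place of Presentation
      Meiji University, Ikuta Campus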
    • Year and Date
      2017-03-15
    • Related Report
      2016 Research-status Report
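  • [Presentation] A study of an automatic explanation-spot estimation method targeting text and figures in lecture slides (2017)

    • Author(s)
      Shoko Tsujimura, Kazumasa Yamamoto, Seiichi Nakagawa
    • Organizer
      Acoustical Society of Japan 2017 Spring Meeting
    • Place of Presentation
      Meiji University, Ikuta Campus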
    • Year and Date
      2017-03-15
    • Related Report
      2016 Research-status Report
  • [Presentation] A deep neural network integrated with filterbank learning for speech recognition (2017)

    • Author(s)
      Hiroshi Seki, Kazumasa Yamamoto, Seiichi Nakagawa
    • Organizer
      IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2017)
    • Place of Presentation
      New Orleans, Louisiana, USA
    • Year and Date
      2017-03-05
    • Related Report
      2016 Research-status Report
    • Int'l Joint Research
  • [Presentation] Lyric recognition in monophonic singing using pitch-dependent DNN (2017)

    • Author(s)
      Dairoku Kawai, Kazumasa Yamamoto, Seiichi Nakagawa
    • Organizer
      IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2017)
    • Place of Presentation
      New Orleans, Louisiana, USA
    • Year and Date
      2017-03-05
    • Related Report
      2016 Research-status Report
    • Int'l Joint Research
  • [Presentation] Robust lecture speech translation for speech misrecognition and its rescoring effect from multiple candidates (2017)

    • Author(s)
      Sahashi Koya, Goto Norioki, Seki Hiroshi, Yamamoto Kazumasa, Akiba Tomoyoshi, Nakagawa Seiichi
    • Organizer
      4th International Conference on Advanced Informatics: Concepts, Theory and Applications (ICAICTA 2017)
    • Related Report
      2017 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Automatic Explanation Spot Estimation Method Targeted at Text and Figures in Lecture Slides (2017)

    • Author(s)
      Tsujimura Shoko, Yamamoto Kazumasa, Nakagawa Seiichi
    • Organizer
      INTERSPEECH 2017
    • Related Report
      2017 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Detection of overlapping acoustic events based on NMF with shared basis vectors (2017)

    • Author(s)
      Yamamoto Kazumasa, Ishikawa Chikara, Sahashi Koya, Nakagawa Seiichi
    • Organizer
      IEEE 6th Global Conference on Consumer Electronics (GCCE 2017)
    • Related Report
      2017 Annual Research Report
    • Int'l Joint Research
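  • [Presentation] Evaluation of DNN-based filterbank learning using the large-scale CSJ database (2017)

    • Author(s)
      Hiroshi Seki, Kazumasa Yamamoto, Tomoyoshi Akiba, Seiichi Nakagawa
    • Organizer
      Acoustical Society of Japan 2017 Autumn Meeting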
    • Related Report
      2017 Annual Research Report
  • [Presentation] Investigation of glottal features and annotation procedure for speech emotion recognition (2016)

    • Author(s)
      Masashi Takebe, Kazumasa Yamamoto, Seiichi Nakagawa
    • Organizer
      2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC 2016)
    • Place of Presentation
      Jeju, Korea
    • Year and Date
      2016-12-13
    • Related Report
      2016 Research-status Report
    • Int'l Joint Research
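  • [Presentation] A study of DNN-based filterbank learning for speech recognition (2016)

    • Author(s)
      Hiroshi Seki, Kazumasa Yamamoto, Seiichi Nakagawa
    • Organizer
      Acoustical Society of Japan 2016 Autumn Meeting
    • Place of Presentation
      University of Toyama, Gofuku Campus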
    • Year and Date
      2016-09-14
    • Related Report
      2016 Research-status Report
  • [Presentation] Effect of sympathetic relation and unsympathetic relation in multi-agent spoken dialogue system (2016)

    • Author(s)
      Yuma Shibahara, Kazumasa Yamamoto, Seiichi Nakagawa
    • Organizer
      International Conference on Advanced Informatics: Concepts, Theory and Applications (ICAICTA 2016)
    • Place of Presentation
      Jeju, Korea
    • Year and Date
      2016-08-17
    • Related Report
      2016 Research-status Report
    • Int'l Joint Research
  • [Presentation] Speech analysis of sung-speech and lyric recognition in monophonic singing (2016)

    • Author(s)
      Dairoku Kawai, Kazumasa Yamamoto, Seiichi Nakagawa
    • Organizer
      IEEE International Conference on Acoustics, Speech, and Signal Processing
    • Place of Presentation
      Shanghai, China
    • Year and Date
      2016-03-20
    • Related Report
      2015 Research-status Report
    • Int'l Joint Research
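  • [Presentation] A study of unsupervised sequential adaptation of convolutional neural networks (2016)

    • Author(s)
      Hiroshi Seki, Kazumasa Yamamoto, Seiichi Nakagawa
    • Organizer
      Acoustical Society of Japan
    • Place of Presentation
      Toin University of Yokohama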
    • Year and Date
      2016-03-09
    • Related Report
      2015 Research-status Report
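  • [Presentation] Recognition of speech overlapped with arbitrary music using NMF (2016)

    • Author(s)
      Naoaki Hashimoto, Kazumasa Yamamoto, Seiichi Nakagawa
    • Organizer
      Acoustical Society of Japan
    • Place of Presentation
      Toin University of Yokohama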
    • Year and Date
      2016-03-09
    • Related Report
      2015 Research-status Report
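  • [Presentation] Feature analysis of singing voice and a study of lyric recognition considering pitch features (2016)

    • Author(s)
      Dairoku Kawai, Kazumasa Yamamoto, Seiichi Nakagawa
    • Organizer
      Acoustical Society of Japan
    • Place of Presentation
      Toin University of Yokohama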
    • Year and Date
      2016-03-09
    • Related Report
      2015 Research-status Report
  • [Presentation] Speech recognition based on Itakura-Saito divergence and dynamics / sparseness constraints from mixed sound of speech and music by non-negative matrix factorization (2015)

    • Author(s)
      Naoaki Hashimoto, Kazumasa Yamamoto, Seiichi Nakagawa
    • Organizer
      Asia-Pacific Signal and Information Processing Association Annual Summit and Conference
    • Place of Presentation
      Hong Kong
    • Year and Date
      2015-12-16
    • Related Report
      2015 Research-status Report
    • Int'l Joint Research
  • [Presentation] Deep neural network based acoustic model using speaker-class information for short time utterance (2015)

    • Author(s)
      Hiroshi Seki, Kazumasa Yamamoto, Seiichi Nakagawa
    • Organizer
      Asia-Pacific Signal and Information Processing Association Annual Summit and Conference
    • Place of Presentation
      Hong Kong
    • Year and Date
      2015-12-16
    • Related Report
      2015 Research-status Report
    • Int'l Joint Research
  • [Presentation] Robust speech recognition using DNN-HMM acoustic model combining noise-aware training with spectral subtraction (2015)

    • Author(s)
      Akihiro Abe, Kazumasa Yamamoto, Seiichi Nakagawa
    • Organizer
      INTERSPEECH
    • Place of Presentation
      Dresden, Germany
    • Year and Date
      2015-09-06
    • Related Report
      2015 Research-status Report
    • Int'l Joint Research

Published: 2015-04-16   Modified: 2019-03-29  
