2018 Fiscal Year Final Research Report

A study on acoustic model adaptation for deep-learning-based speech recognition

Research Project

Project/Area Number 16K00227
Research Category

Grant-in-Aid for Scientific Research (C)

Allocation Type Multi-year Fund
Section General
Research Field Perceptual information processing
Research Institution Yamagata University

Principal Investigator

Kosaka Tetsuo  Yamagata University, Graduate School of Science and Engineering, Professor (50359569)

Research Collaborator KATO Masaharu  
Project Period (FY) 2016-04-01 – 2019-03-31
Keywords speech recognition / acoustic model / deep neural network / adaptation techniques / spontaneous speech / emotional speech / voice activity detection
Outline of Final Research Achievements

Although deep-learning-based speech recognition has made great progress in recent years, recognition of spontaneous speech has not yet reached sufficient performance. The major factors that degrade recognition performance include variation in speaker characteristics, acoustic environments, and speaking styles. To address these problems, I developed techniques centered on acoustic-model adaptation to improve speech-recognition performance. As a result, recognition performance improved for both spontaneous and emotional speech. The performance of voice-activity detection also improved.

Free Research Field

Speech information processing

Academic Significance and Societal Importance of the Research Achievements

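This research achieved 1) improved adaptation accuracy in spontaneous speech recognition, 2) improved accuracy of voice activity detection in noisy conditions, and 3) improved performance in emotional speech recognition. The adaptation method in 1) is not limited to spontaneous speech recognition; it can be applied in other fields as well and is a highly versatile technique. A multimodal dialogue corpus has been prepared using the results of 2), which should be useful to researchers in the field. The results of 3) can also be used in a variety of fields, such as conversation between robots and humans. In sum, the techniques developed in this research have broad ripple effects and are considered to have high academic and societal significance.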

Published: 2020-03-30  
