
Development of high-accuracy system for recognizing spontaneous speech

Research Project

Project/Area Number 22500144
Research Category

Grant-in-Aid for Scientific Research (C)

Allocation Type Single-year Grants
Section General
Research Field Perception information processing/Intelligent robotics
Research Institution Yamagata University

Principal Investigator

KOSAKA Tetsuo  Yamagata University, Graduate School of Science and Engineering, Professor (50359569)

Co-Investigator (Renkei-kenkyūsha) KATO Masaharu  Yamagata University, Graduate School of Science and Engineering, Assistant Professor (10250953)
Project Period (FY) 2010 – 2012
Project Status Completed (Fiscal Year 2012)
Budget Amount
¥3,900,000 (Direct Cost: ¥3,000,000, Indirect Cost: ¥900,000)
Fiscal Year 2012: ¥650,000 (Direct Cost: ¥500,000, Indirect Cost: ¥150,000)
Fiscal Year 2011: ¥1,430,000 (Direct Cost: ¥1,100,000, Indirect Cost: ¥330,000)
Fiscal Year 2010: ¥1,820,000 (Direct Cost: ¥1,400,000, Indirect Cost: ¥420,000)
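Keywords speech recognition / spontaneous speech / acoustic model / language model / speaker adaptation / spontaneous speech recognition / unsupervised speaker adaptation / word graph combination / cross-validation / speaker indexing / speaker vector / cross adaptation / context-dependent phone model / speaker-class acoustic model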
Research Abstract

In this research, we aimed to improve the performance of systems for recognizing spontaneous speech, which is considered more difficult to recognize than read speech. We focused on three technical issues: (1) acoustic and language models, (2) system combination techniques, and (3) speaker indexing. To improve the acoustic models, we investigated a discrete-mixture hidden Markov model based on discriminative training, a speaker-class model, quinphone models, and a reverberation-class model. We investigated several system combination techniques, such as the combination of continuous and discrete models, the combination of various quinphones, and the combination of reverberation-class models. For language models, we proposed cross adaptation and cross-validation adaptation techniques. In addition, we improved the performance of speaker-vector-based speaker indexing, which is required when performing speaker adaptation.

Report

(4 results)
  • 2012 Annual Research Report   Final Research Report ( PDF )
  • 2011 Annual Research Report
  • 2010 Annual Research Report
  • Research Products

    (52 results)


Journal Article (22 results, of which Peer Reviewed: 20 results), Presentation (25 results), Book (4 results), Remarks (1 result)

  • [Journal Article] A time-synchronous histogram equalization for noise robust speech recognition (2013)

    • Author(s)
      Fumiya Takahashi, Masaharu Kato and Tetsuo Kosaka
    • Journal Title

      Proc. of ICA

      Volume: Accepted for publication Pages: 5-5

    • Related Report
      2012 Final Research Report
    • Peer Reviewed
  • [Journal Article] An investigation of vowel substitution rules in the automatic evaluation system of English pronunciation (2013)

    • Author(s)
      Kei Sato, Masaharu Kato and Tetsuo Kosaka
    • Journal Title

      Proc. of ICA

      Volume: Accepted for publication Pages: 5-5

    • Related Report
      2012 Final Research Report
    • Peer Reviewed
  • [Journal Article] Speech Recognition Using Discrete-Mixture HMMs Based on Discriminative Training (2013)

    • Author(s)
      小坂哲夫,加藤正治
    • Journal Title

      IPSJ Journal

      Volume: Vol. 54 No. 2 Pages: 436-442

    • NAID

      110009537036

    • URL

      https://ipsj.ixsq.nii.ac.jp/ej/index.php?active_action=repository_view_main_item_detail&item_id=90262&item_no=1&page_id=13&block_id=8

    • Related Report
      2012 Final Research Report
    • Peer Reviewed
  • [Journal Article] Speech Recognition Using Discrete-Mixture HMMs Based on Discriminative Training (2013)

    • Author(s)
      小坂哲夫,加藤正治
    • Journal Title

      IPSJ Journal

      Volume: 54 Pages: 436-442

    • NAID

      110009537036

    • Related Report
      2012 Annual Research Report
    • Peer Reviewed
  • [Journal Article] An investigation of vowel substitution rules in the automatic evaluation system of English pronunciation (2013)

    • Author(s)
      Kei Sato, Masaharu Kato and Tetsuo Kosaka
    • Journal Title

      Proc. of International Congress on Acoustics 2013

      Volume: 1 Pages: 1-5

    • Related Report
      2012 Annual Research Report
    • Peer Reviewed
  • [Journal Article] A time-synchronous histogram equalization for noise robust speech recognition (2013)

    • Author(s)
      Fumiya Takahashi, Masaharu Kato and Tetsuo Kosaka
    • Journal Title

      Proc. of International Congress on Acoustics 2013

      Volume: 1 Pages: 1-5

    • Related Report
      2012 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Unsupervised Cross-Adaptation Approach for Speech Recognition by Combined Language Model and Acoustic Model Adaptation (2011)

    • Author(s)
      Tetsuo Kosaka, Taro Miyamoto and Masaharu Kato
    • Journal Title

      Proc. of APSIPA ASC 2011, Thu-PM

      Pages: 4-4

    • URL

      http://www.apsipa.org/proceedings_2011/pdf/APSIPA177.pdf

    • Related Report
      2012 Final Research Report
    • Peer Reviewed
  • [Journal Article] Speaker Vector-Based Verification by Phonetic Class-Based Modeling (2011)

    • Author(s)
      Tetsuo Kosaka, Naoki Tadokoro, Masaharu Kato and Masaki Kohda
    • Journal Title

      Journal of Information Assurance and Security

      Volume: Vol. 6, No.3 Pages: 186-194

    • Related Report
      2012 Final Research Report
    • Peer Reviewed
  • [Journal Article] Lecture Speech Recognition Using Discrete-Mixture HMMs (2011)

    • Author(s)
      Tetsuo Kosaka, Akiyoshi Yamamoto, Takuya Kumakura, Masaharu Kato and Masaki Kohda
    • Journal Title

      IEEJ Transactions on Electrical and Electronic Engineering

      Volume: Vol. 6 No. 1 Issue: 1 Pages: 23-29

    • DOI

      10.1002/tee.20602

    • NAID

      10027629753

    • Related Report
      2012 Final Research Report
    • Peer Reviewed
  • [Journal Article] Unsupervised Cross-Adaptation Approach for Speech Recognition by Combined Language Model and Acoustic Model Adaptation (2011)

    • Author(s)
      Tetsuo Kosaka, Taro Miyamoto, Masaharu Kato
    • Journal Title

      Proc. of APSIPA ASC 2011

      Volume: (CD-ROM)

    • Related Report
      2011 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Lecture Speech Recognition Using Discrete-Mixture HMMs (2011)

    • Author(s)
      Tetsuo Kosaka, Akiyoshi Yamamoto, Takuya Kumakura, Masaharu Kato, Masaki Kohda
    • Journal Title

      IEEJ Transactions on Electrical and Electronic Engineering

      Volume: Vol. 6, No. 1 Pages: 23-29

    • NAID

      10027629753

    • Related Report
      2010 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Speaker Vector-Based Verification by Phonetic Class-Based Modeling (2011)

    • Author(s)
      Tetsuo Kosaka, Naoki Tadokoro, Masaharu Kato, Masaki Kohda
    • Journal Title

      Journal of Information Assurance and Security

      Volume: Vol. 6, No. 3 Pages: 186-194

    • Related Report
      2010 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Performance Improvement in Automatic Evaluation System of English Pronunciation by Using Various Normalization Methods (2010)

    • Author(s)
      Masaru Kusumi, Masaharu Kato, Tetsuo Kosaka and Itaru Matsunaga
    • Journal Title

      Proc. of International Congress on Acoustics 2010

      Volume: 257 Pages: 6-6

    • URL

      http://www.acoustics.asn.au/conference_proceedings/ICA2010/cdrom-ICA2010/papers/p257.pdf

    • Related Report
      2012 Final Research Report
    • Peer Reviewed
  • [Journal Article] Speech Recognition in Noise by Using Word Graph Combinations (2010)

    • Author(s)
      Shunsuke Kuramata, Masaharu Kato and Tetsuo Kosaka
    • Journal Title

      Proc. of International Congress on Acoustics 2010

      Volume: 341 Pages: 6-6

    • URL

      http://www.acoustics.asn.au/conference_proceedings/ICA2010/cdrom-ICA2010/papers/p341.pdf

    • Related Report
      2012 Final Research Report
    • Peer Reviewed
  • [Journal Article] Speaker Adaptation Based on System Combination Using Speaker-Class Models (2010)

    • Author(s)
      Tetsuo Kosaka, Takashi Ito, Masaharu Kato and Masaki Kohda
    • Journal Title

      Proc. of Interspeech 2010

      Pages: 546-549

    • URL

      http://www.isca-speech.org/archive/interspeech_2010/i10_0546.html

    • Related Report
      2012 Final Research Report
    • Peer Reviewed
  • [Journal Article] Lecture Speech Recognition by Combining Word Graphs of Various Acoustic Models (2010)

    • Author(s)
      Tetsuo Kosaka, Keisuke Goto, Takashi Ito and Masaharu Kato
    • Journal Title

      Proc. of Interspeech 2010

      Pages: 2978-2981

    • URL

      http://www.isca-speech.org/archive/interspeech_2010/i10_2978.html

    • Related Report
      2012 Final Research Report
    • Peer Reviewed
  • [Journal Article] Unsupervised Speaker Adaptation Using Speaker-Class Models for Lecture Speech Recognition (2010)

    • Author(s)
      Tetsuo Kosaka, Yuui Takeda, Takashi Ito, Masaharu Kato, Masaki Kohda
    • Journal Title

      IEICE Transactions on Information and Systems

      Volume: Vol. E93-D, No. 9 Pages: 2363-2369

    • NAID

      10027640196

    • Related Report
      2010 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Speech Recognition in Noise by Using Word Graph Combinations (2010)

    • Author(s)
      Shunsuke Kuramata, Masaharu Kato, Tetsuo Kosaka
    • Journal Title

      Proc. of International Congress on Acoustics 2010

      Volume: CD-ROM

    • Related Report
      2010 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Speaker Adaptation Based on System Combination Using Speaker-Class Models (2010)

    • Author(s)
      Tetsuo Kosaka, Takashi Ito, Masaharu Kato, Masaki Kohda
    • Journal Title

      Proc. of Interspeech 2010

      Volume: CD-ROM Pages: 546-549

    • Related Report
      2010 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Lecture Speech Recognition by Combining Word Graphs of Various Acoustic Models (2010)

    • Author(s)
      Tetsuo Kosaka, Keisuke Goto, Takashi Ito, Masaharu Kato
    • Journal Title

      Proc. of Interspeech 2010

      Volume: CD-ROM Pages: 2978-2981

    • Related Report
      2010 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Lecture Speech Recognition Based on Word Graph Combination Using Quinphone HM-Nets (2010)

    • Author(s)
      加藤正治, 小坂哲夫, 伊藤彰則, 牧野正三
    • Journal Title

      IEICE Technical Report

      Volume: SP2010-28 Pages: 37-42

    • NAID

      110007969989

    • Related Report
      2010 Annual Research Report
  • [Journal Article] Speech Recognition in Various Noise Environments Using Word Graph Combination (2010)

    • Author(s)
      倉又俊輔, 加藤正治, 小坂哲夫
    • Journal Title

      IEICE Technical Report

      Volume: SP2010-41 Pages: 37-42

    • NAID

      110007890249

    • Related Report
      2010 Annual Research Report
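  • [Presentation] Optimization of Various Parameters in Unsupervised Language Adaptation by Cross-Validation (2013)

    • Author(s)
      高木瑛,加藤正治,小坂哲夫
    • Organizer
      IPSJ Tohoku Branch Research Meeting
    • Place of Presentation
      Faculty of Engineering, Yamagata University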
    • Year and Date
      2013-03-11
    • Related Report
      2012 Final Research Report
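  • [Presentation] HMM-Based Speech Synthesis Using Prosodic Information of Input Speech (2013)

    • Author(s)
      栗原大樹,加藤正治,小坂哲夫
    • Organizer
      IPSJ Tohoku Branch Research Meeting
    • Place of Presentation
      Faculty of Engineering, Yamagata University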
    • Year and Date
      2013-03-11
    • Related Report
      2012 Final Research Report
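  • [Presentation] A Study of Various Clustering Methods for Lecture Speech Recognition Using Speaker-Class Acoustic Models (2012)

    • Author(s)
      今野和樹,大山拓也,加藤正治,小坂哲夫
    • Organizer
      Technical Report on Spoken Language Processing
    • Place of Presentation
      Tokyo Institute of Technology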
    • Year and Date
      2012-12-21
    • Related Report
      2012 Annual Research Report 2012 Final Research Report
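  • [Presentation] A Study of Error Rules in Automatic Pronunciation Evaluation of English Spoken by Japanese Speakers (2012)

    • Author(s)
      佐藤慶,加藤正治,小坂哲夫
    • Organizer
      Proceedings of the Acoustical Society of Japan
    • Place of Presentation
      Shinshu University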
    • Year and Date
      2012-09-21
    • Related Report
      2012 Annual Research Report 2012 Final Research Report
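  • [Presentation] A Study of Frame-Weighted Histogram Equalization for Speech Recognition in Noise (2012)

    • Author(s)
      高橋郁也,加藤正治,小坂哲夫
    • Organizer
      Proceedings of the Acoustical Society of Japan
    • Place of Presentation
      Shinshu University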
    • Year and Date
      2012-09-19
    • Related Report
      2012 Annual Research Report 2012 Final Research Report
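  • [Presentation] A Study of Speech Recognition under Reverberation Using Word Graph Combination (2012)

    • Author(s)
      倉又俊輔,加藤正治,小坂哲夫
    • Organizer
      Proceedings of the Acoustical Society of Japan
    • Place of Presentation
      Kanagawa University, Yokohama Campus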
    • Year and Date
      2012-03-13
    • Related Report
      2012 Final Research Report
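  • [Presentation] A Study of Speech Recognition under Reverberation Using Word Graph Combination (2012)

    • Author(s)
      倉又俊輔, 加藤正治, 小坂哲夫
    • Organizer
      Proceedings of the Acoustical Society of Japan
    • Place of Presentation
      Kanagawa University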
    • Year and Date
      2012-03-13
    • Related Report
      2011 Annual Research Report
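  • [Presentation] Optimization of Various Parameters in Unsupervised Speaker Adaptation (2012)

    • Author(s)
      今野聡介,加藤正治,小坂哲夫
    • Organizer
      IPSJ Tohoku Branch Research Meeting
    • Place of Presentation
      Faculty of Engineering, Yamagata University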
    • Year and Date
      2012-03-09
    • Related Report
      2012 Final Research Report
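  • [Presentation] A Study of Vowel Substitution Rules in Automatic Pronunciation Evaluation (2012)

    • Author(s)
      佐藤慶,加藤正治,小坂哲夫
    • Organizer
      IPSJ Tohoku Branch Research Meeting
    • Place of Presentation
      Faculty of Engineering, Yamagata University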
    • Year and Date
      2012-03-09
    • Related Report
      2012 Final Research Report
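  • [Presentation] Improvement of Histogram Equalization for Speech Recognition in Noise (2012)

    • Author(s)
      高橋郁也,加藤正治,小坂哲夫
    • Organizer
      IPSJ Tohoku Branch Research Meeting
    • Place of Presentation
      Faculty of Engineering, Yamagata University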
    • Year and Date
      2012-03-09
    • Related Report
      2012 Final Research Report
  • [Presentation] Optimization of Various Parameters in Unsupervised Speaker Adaptation (2012)

    • Author(s)
      今野聡介, 加藤正治, 小坂哲夫
    • Organizer
      IPSJ Tohoku Branch Research Meeting
    • Place of Presentation
      Yamagata University
    • Year and Date
      2012-03-09
    • Related Report
      2011 Annual Research Report
  • [Presentation] A Study of Vowel Substitution Rules in Automatic Pronunciation Evaluation (2012)

    • Author(s)
      佐藤慶, 加藤正治, 小坂哲夫
    • Organizer
      IPSJ Tohoku Branch Research Meeting
    • Place of Presentation
      Yamagata University
    • Year and Date
      2012-03-09
    • Related Report
      2011 Annual Research Report
  • [Presentation] Improvement of Histogram Equalization for Speech Recognition in Noise (2012)

    • Author(s)
      高橋郁也, 加藤正治, 小坂哲夫
    • Organizer
      IPSJ Tohoku Branch Research Meeting
    • Place of Presentation
      Yamagata University
    • Year and Date
      2012-03-09
    • Related Report
      2011 Annual Research Report
  • [Presentation] A Study of Histogram Equalization with a Small Amount of Data (2011)

    • Author(s)
      湊竜一,加藤正治,小坂哲夫
    • Organizer
      Proceedings of the Acoustical Society of Japan
    • Place of Presentation
      Shimane University, Matsue Campus
    • Year and Date
      2011-09-20
    • Related Report
      2012 Final Research Report
  • [Presentation] A Study of Histogram Equalization with a Small Amount of Data (2011)

    • Author(s)
      湊竜一, 加藤正治, 小坂哲夫
    • Organizer
      Proceedings of the Acoustical Society of Japan
    • Place of Presentation
      Shimane University
    • Year and Date
      2011-09-20
    • Related Report
      2011 Annual Research Report
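  • [Presentation] Performance Improvement of Unsupervised Acoustic and Language Model Adaptation (2011)

    • Author(s)
      宮本太郎,加藤正治,小坂哲夫
    • Organizer
      Proceedings of the Acoustical Society of Japan
    • Place of Presentation
      Waseda University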
    • Year and Date
      2011-03-10
    • Related Report
      2012 Final Research Report 2010 Annual Research Report
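  • [Presentation] A Study on Improving Accuracy in Automatic Pronunciation Evaluation of English Spoken by Japanese Speakers (2011)

    • Author(s)
      久住大,加藤正治,小坂哲夫
    • Organizer
      Proceedings of the Acoustical Society of Japan
    • Place of Presentation
      Waseda University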
    • Year and Date
      2011-03-10
    • Related Report
      2012 Final Research Report 2010 Annual Research Report
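  • [Presentation] A Study of Distances between Phoneme Models of English Spoken by Japanese and American Speakers (2010)

    • Author(s)
      久住大,加藤正治,小坂哲夫
    • Organizer
      Proceedings of the Acoustical Society of Japan
    • Place of Presentation
      Kansai University, Senriyama Campus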
    • Year and Date
      2010-09-16
    • Related Report
      2012 Final Research Report
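  • [Presentation] A Study of Distances between Phoneme Models of English Spoken by Japanese and American Speakers (2010)

    • Author(s)
      久住大, 加藤正治, 小坂哲夫
    • Organizer
      Proceedings of the Acoustical Society of Japan
    • Place of Presentation
      Kansai University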
    • Year and Date
      2010-09-16
    • Related Report
      2010 Annual Research Report
  • [Presentation] Lecture Speech Recognition Based on Quinphone HM-Nets (2010)

    • Author(s)
      加藤正治,小坂哲夫,伊藤彰則,牧野正三
    • Organizer
      Proceedings of the Acoustical Society of Japan
    • Place of Presentation
      Kansai University, Senriyama Campus
    • Year and Date
      2010-09-14
    • Related Report
      2012 Final Research Report
  • [Presentation] Lecture Speech Recognition Based on Quinphone HM-Nets (2010)

    • Author(s)
      加藤正治, 小坂哲夫, 伊藤彰則, 牧野正三
    • Organizer
      Proceedings of the Acoustical Society of Japan
    • Place of Presentation
      Kansai University
    • Year and Date
      2010-09-14
    • Related Report
      2010 Annual Research Report
  • [Presentation] Speech Recognition in Various Noise Environments Using Word Graph Combination (2010)

    • Author(s)
      倉又俊輔,加藤正治,小坂哲夫
    • Organizer
      IEICE Technical Report
    • Place of Presentation
      Akiu Onsen, Sendai
    • Year and Date
      2010-07-23
    • Related Report
      2012 Final Research Report
  • [Presentation] Lecture Speech Recognition Based on Word Graph Combination Using Quinphone HM-Nets (2010)

    • Author(s)
      加藤正治,小坂哲夫,伊藤彰則,牧野正三
    • Organizer
      IEICE Technical Report
    • Place of Presentation
      Kyushu University, Chikushi Campus
    • Year and Date
      2010-06-18
    • Related Report
      2012 Final Research Report
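  • [Presentation] HMM-Based Speech Synthesis Using Prosodic Information of Input Speech

    • Author(s)
      栗原大樹, 加藤正治, 小坂哲夫
    • Organizer
      IPSJ Tohoku Branch Research Meeting
    • Place of Presentation
      Faculty of Engineering, Yamagata University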
    • Related Report
      2012 Annual Research Report
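  • [Presentation] Optimization of Various Parameters in Unsupervised Language Adaptation by Cross-Validation

    • Author(s)
      高木瑛, 加藤正治, 小坂哲夫
    • Organizer
      IPSJ Tohoku Branch Research Meeting
    • Place of Presentation
      Faculty of Engineering, Yamagata University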
    • Related Report
      2012 Annual Research Report
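  • [Book] IEICE Knowledge Base, Group 2: Image, Sound, and Language, Part 7: Speech Recognition and Synthesis, "2-4 Speaker and Environment Adaptation" (原島博 et al., eds.) (2011)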

    • Author(s)
      小坂哲夫
    • Total Pages
      3
    • Publisher
      IEICE (The Institute of Electronics, Information and Communication Engineers)
    • Related Report
      2012 Final Research Report
  • [Book] "Improvement of Lecture Speech Recognition by Using Unsupervised Adaptation," E-Activity and Intelligent Web Construction: Effects of Social Design (2011)

    • Author(s)
      Tetsuo Kosaka, Takashi Kusama, Masaharu Kato and Masaki Kohda (T. Matsuo and T. Fujimoto, eds.)
    • Publisher
      Information Science Reference
    • Related Report
      2012 Final Research Report
  • [Book] E-Activity and Intelligent Web Construction, "Improvement of Lecture Speech Recognition by Using Unsupervised Adaptation" (Chapter 16) (2011)

    • Author(s)
      T. Matsuo et al. (eds.)
    • Publisher
      IGI Global
    • Related Report
      2011 Annual Research Report
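  • [Book] IEICE Knowledge Base, Group 2: Image, Sound, and Language, Part 7: Speech Recognition and Synthesis, "2-4 Speaker and Environment Adaptation," 小坂哲夫 (contributing author) (2011)

    • Author(s)
      原島博 et al. (eds.)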
    • Total Pages
      4
    • Publisher
      IEICE (The Institute of Electronics, Information and Communication Engineers)
    • Related Report
      2010 Annual Research Report
  • [Remarks] Kosaka Laboratory

    • URL

      http://eieweb.yz.yamagata-u.ac.jp/~kosaka/

    • Related Report
      2012 Annual Research Report

Published: 2010-08-23   Modified: 2019-07-29  
