Algorithm of Spontaneous Speech Recognition Based on A* Search

Research Project

Project/Area Number	07680379
Research Category	Grant-in-Aid for Scientific Research (C)
Allocation Type	Single-year Grants
Section	General
Research Field	Intelligent informatics
Research Institution	Yamagata University
Principal Investigator	KOHDA Masaki Yamagata University, Faculty of Engineering, Professor (00205337)
Co-Investigator(Kenkyū-buntansha)	KATOH Masaharu Yamagata University, Faculty of Engineering, Assistant (10250953)
Project Period (FY)	1995 – 1996
Project Status	Completed (Fiscal Year 1996)
Budget Amount	¥1,700,000 (Direct Cost: ¥1,700,000) Fiscal Year 1996: ¥300,000 (Direct Cost: ¥300,000) Fiscal Year 1995: ¥1,400,000 (Direct Cost: ¥1,400,000)
Keywords	Speech Recognition / Acoustic Model / Language Model / Hidden Markov Model / Hidden Markov Net / N-gram / Phoneme Decision Tree / Likelihood Normalization / Search Method / Word Preselection
Research Abstract	Spontaneous speech recognition can be regarded as a graph search problem subject to the constraints imposed by the acoustic model, the lexicon, the language model and so on. To reduce the computational cost of recognition without degrading recognition performance, several key technologies for spontaneous speech recognition were investigated.
(1) Acoustic model and speaker adaptation. The important issues in context-dependent acoustic modeling with a limited training data set are how to tie the model parameters and how to handle unseen contexts. We proposed a decision tree-based successive state splitting algorithm and showed that an HM-Net generated with this algorithm had high accuracy and could represent any context. Speaker adaptation of acoustic model parameters based on MAP estimation was also investigated.
(2) Fast matching and likelihood normalization. For large-vocabulary word recognition, fast preselection of word candidates was investigated. Phoneme recognition was carried out on the input speech to obtain an optimal phoneme sequence. Word candidates were selected by DP matching against this optimal phoneme sequence, and the candidates were then verified by Viterbi scoring between the input speech and HMM-based word models. A normalization technique for word likelihood in spontaneous speech recognition was also investigated.
(3) Language model and task adaptation. N-gram language models were constructed from the EDR corpus, a 5-million-word Japanese corpus. The models were evaluated under various conditions of training text size, vocabulary and cutoff. The experiments clarified the optimum condition for a given training text size. We also carried out experiments on task adaptation: an N-gram model built from a dialog was mixed with the N-gram model from the EDR corpus, which reduced perplexity by about 60%.
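The word preselection step in (2) can be illustrated with a minimal sketch. The abstract does not state the DP cost function or the lexicon format, so everything below is an assumption for illustration: unit-cost Levenshtein distance stands in for the DP matching, the function names `phoneme_edit_distance` and `preselect_words` are hypothetical, and the three-word lexicon is a toy. The Viterbi verification of the selected candidates is not shown.

```python
from typing import Dict, List, Sequence, Tuple


def phoneme_edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Unit-cost Levenshtein distance between two phoneme sequences (DP)."""
    prev = list(range(len(b) + 1))  # distance from empty prefix of a
    for i, pa in enumerate(a, start=1):
        curr = [i]
        for j, pb in enumerate(b, start=1):
            cost = 0 if pa == pb else 1
            curr.append(min(prev[j] + 1,          # delete pa
                            curr[j - 1] + 1,      # insert pb
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def preselect_words(recognized: Sequence[str],
                    lexicon: Dict[str, List[str]],
                    top_n: int) -> List[Tuple[str, int]]:
    """Return the top_n lexicon words closest to the recognized phonemes."""
    scored = sorted(
        (phoneme_edit_distance(recognized, phones), word)
        for word, phones in lexicon.items()
    )
    return [(word, dist) for dist, word in scored[:top_n]]


# Toy lexicon (hypothetical); each word maps to its phoneme sequence.
LEXICON = {
    "yamagata": ["y", "a", "m", "a", "g", "a", "t", "a"],
    "yamanashi": ["y", "a", "m", "a", "n", "a", "sh", "i"],
    "nagasaki": ["n", "a", "g", "a", "s", "a", "k", "i"],
}

# Phoneme recognizer output with one substitution error (g -> k).
recognized = ["y", "a", "m", "a", "k", "a", "t", "a"]
print(preselect_words(recognized, LEXICON, top_n=2))
# → [('yamagata', 1), ('yamanashi', 3)]
```

Only the short list returned by `preselect_words` would then need the more expensive HMM-based scoring, which is the point of a fast match: the DP over phoneme symbols is far cheaper than scoring every word model against the input frames.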

Report

(3 results)

1996 Annual Research Report Final Research Report Summary
1995 Annual Research Report

Research Products
(24 results)

All Publications (24 results)

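[Publications] A.Ito: "Language Modeling by Kana and Kanji String N-gram" Trans.IEICE. 79-D-II,12. 2062-2069 (1996)
- Description
  From the "Summary of Research Results Report (Japanese)"
- Related Report
  1996 Final Research Report Summary
[Publications] A.Ito: "Language modelling by string pattern N-gram for Japanese speech recognition" International Conference on Spoken Language Processing (ICSLP). Vol.1. 490-493 (1996)
- Description
  From the "Summary of Research Results Report (Japanese)"
- Related Report
  1996 Final Research Report Summary
[Publications] A.Ito: "Task Adaptation of a Stochastic Language Model for Dialogue Speech Recognition" Technical Report of IEICE. SP96-81. 25-32 (1996)
- Description
  From the "Summary of Research Results Report (Japanese)"
- Related Report
  1996 Final Research Report Summary
[Publications] T.Hori: "A Study on Improvement of HM-Nets Using Decision Tree-based Successive State Splitting" Technical Report of IEICE. SP96-80. 17-24 (1996)
- Description
  From the "Summary of Research Results Report (Japanese)"
- Related Report
  1996 Final Research Report Summary
[Publications] M.Katoh: "A Study on Word Preselection Utilizing Optimal Phoneme Sequence" Technical Report of IEICE. SP96-13. 9-14 (1996)
- Description
  From the "Summary of Research Results Report (Japanese)"
- Related Report
  1996 Final Research Report Summary
[Publications] A.Ito: "N-gram Estimation from Japanese Large Corpus and Task Adaptation of N-gram" Technical Report of IPSJ. SLP-11-5. 25-30 (1996)
- Description
  From the "Summary of Research Results Report (Japanese)"
- Related Report
  1996 Final Research Report Summary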
[Publications] A.Ito, M.Kohda: "Language Modeling by Kana and Kanji String N-gram" Trans.IEICE. Vol.J79-D-II,No.12. 2062-2069 (1996)
- Description
  From the "Summary of Research Results Report (English)"
- Related Report
  1996 Final Research Report Summary
[Publications] A.Ito, M.Kohda: "Language Modelling by String Pattern N-gram for Japanese Speech Recognition" Proc.International Conference on Spoken Language Processing. Vol.1. 490-493 (1996)
- Description
  From the "Summary of Research Results Report (English)"
- Related Report
  1996 Final Research Report Summary
[Publications] A.Ito, M.Kohda: "Task Adaptation of a Stochastic Language Model for Dialogue Speech Recognition" Technical Report of IEICE. SP96-81. 25-32 (1996)
- Description
  From the "Summary of Research Results Report (English)"
- Related Report
  1996 Final Research Report Summary
[Publications] T.Hori, M.Katoh, A.Ito, M.Kohda: "A Study on Improvement of HM-Nets Using Decision Tree-based Successive State Splitting" Technical Report of IEICE. SP96-80. 17-24 (1996)
- Description
  From the "Summary of Research Results Report (English)"
- Related Report
  1996 Final Research Report Summary
[Publications] M.Katoh, A.Ito, M.Kohda: "A Study on Word Preselection Utilizing Optimal Phoneme Sequence" Technical Report of IEICE. SP96-13. 9-14 (1996)
- Description
  From the "Summary of Research Results Report (English)"
- Related Report
  1996 Final Research Report Summary
[Publications] A.Ito, N.Daishima, A.Maruyama, M.Katoh, M.Kohda: "N-gram Estimation from Japanese Large Corpus and Task Adaptation of N-gram" Technical Report of IPSJ. SLP-11-5. 25-30 (1996)
- Description
  From the "Summary of Research Results Report (English)"
- Related Report
  1996 Final Research Report Summary
[Publications] A.Ito: "Language Modeling by Kana and Kanji String N-gram" Trans.IEICE. 79-D-II,12. 2062-2069 (1996)
- Related Report
  1996 Annual Research Report
[Publications] A.Ito: "Language modelling by string pattern N-gram for Japanese speech recognition" International Conference on Spoken Language Processing (ICSLP). Vol.1. 490-493 (1996)
- Related Report
  1996 Annual Research Report
[Publications] A.Ito: "Task Adaptation of a Stochastic Language Model for Dialogue Speech Recognition" Technical Report of IEICE. SP96-81. 25-32 (1996)
- Related Report
  1996 Annual Research Report
[Publications] T.Hori: "A Study on Improvement of HM-Nets Using Decision Tree-based Successive State Splitting" Technical Report of IEICE. SP96-80. 17-24 (1996)
- Related Report
  1996 Annual Research Report
[Publications] M.Katoh: "A Study on Word Preselection Utilizing Optimal Phoneme Sequence" Technical Report of IEICE. SP96-13. 9-14 (1996)
- Related Report
  1996 Annual Research Report
[Publications] A.Ito: "N-gram Estimation from Japanese Large Corpus and Task Adaptation of N-gram" Technical Report of IPSJ. SLP-11-5. 25-30 (1996)
- Related Report
  1996 Annual Research Report
[Publications] M.Katoh: "A Study of Viterbi Best-First Search in HMM-Based Word Spotting" IPSJ Tohoku Branch Workshop Materials. 94-4-4. 1-6 (1995)
- Related Report
  1995 Annual Research Report
[Publications] M.Ozeki (大関雅和): "A Study of Frame-Synchronous Beam Search in HMM-LR Phrase Speech Recognition Using a Stochastic Context-Free Grammar" IPSJ Tohoku Branch Workshop Materials. 94-4-5. 1-8 (1995)
- Related Report
  1995 Annual Research Report
[Publications] A.Ito: "A Study of a Phrase Model Based on String Pattern N-gram" Technical Report of IEICE. SP95-96. 19-24 (1995)
- Related Report
  1995 Annual Research Report
[Publications] T.Hori: "A Study of HM-Nets Generated by Phoneme Decision Tree-based Successive State Splitting" Proc. Meeting of the Acoustical Society of Japan. 1. 145-146 (1996)
- Related Report
  1995 Annual Research Report
[Publications] M.Katoh: "A Study on Word Preselection Utilizing Optimal Phoneme Sequence" Proc. Meeting of the Acoustical Society of Japan. 1. 79-80 (1996)
- Related Report
  1995 Annual Research Report
[Publications] N.Daishima: "N-gram Estimation from Japanese Large Corpus and Task Adaptation of N-gram" IPSJ Tohoku Branch Workshop Materials. 95-5-17. 1-7 (1996)
- Related Report
  1995 Annual Research Report
