1996 Fiscal Year Final Research Report Summary

Algorithm of Spontaneous Speech Recognition Based on A* Search

Research Project

Project/Area Number	07680379
Research Category	Grant-in-Aid for Scientific Research (C)
Allocation Type	Single-year Grants
Section	General
Research Field	Intelligent informatics
Research Institution	Yamagata University
Principal Investigator	KOHDA Masaki Yamagata University, Faculty of Engineering, Professor (00205337)
Co-Investigator(Kenkyū-buntansha)	KATOH Masaharu Yamagata University, Faculty of Engineering, Assistant (10250953)
Project Period (FY)	1995 – 1996
Keywords	Speech Recognition / Acoustic Model / Language Model / Hidden Markov Model / Hidden Markov Net / N-gram / Phoneme Decision Tree / Likelihood Normalization
Research Abstract	Spontaneous speech recognition can be regarded as a graph search problem subject to constraints imposed by the acoustic model, the lexicon, the language model, and so on. To reduce the computational cost of recognition without degrading recognition performance, we investigated several key technologies for spontaneous speech recognition. (1) Acoustic model and speaker adaptation: The important issues in context-dependent acoustic modeling with a limited training data set are how to tie the model parameters and how to handle unseen contexts. We proposed a decision tree-based successive state splitting algorithm and showed that the HM-Net generated with this algorithm is highly accurate and can represent any context. Speaker adaptation of acoustic model parameters based on MAP estimation was also investigated. (2) Fast matching and likelihood normalization: For large-vocabulary word recognition, we investigated fast preselection of word candidates. Phoneme recognition is performed on the input speech to obtain an optimal phoneme sequence. Word candidates are selected by DP matching against this optimal phoneme sequence and are then verified by Viterbi scoring of the input speech against HMM-based word models. A technique for normalizing word likelihoods in spontaneous speech recognition was also investigated. (3) Language model and task adaptation: N-gram language models were constructed from the EDR corpus, a 5-million-word Japanese corpus. The models were evaluated under various conditions of training text size, vocabulary, and cutoff. The experiments clarified the optimum conditions for a given training text size. We also carried out experiments on task adaptation: mixing an N-gram model trained on a dialog with the N-gram model from the EDR corpus reduced perplexity by about 60%.
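The word preselection step in item (2) of the abstract can be illustrated with a minimal sketch. It assumes that DP matching is a unit-cost edit distance between the recognized phoneme sequence and each lexicon entry; the abstract does not state the actual cost function, so this is an assumption. The lexicon, the phoneme symbols, and the function names `edit_distance` and `preselect` are hypothetical, and the subsequent Viterbi verification against HMM-based word models is not shown.

```python
from typing import Dict, List, Sequence, Tuple


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Unit-cost DP (Levenshtein) distance between two phoneme sequences.

    Assumption: equal costs for insertion, deletion, and substitution.
    """
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, pa in enumerate(a, start=1):
        curr = [i]
        for j, pb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete pa
                curr[j - 1] + 1,           # insert pb
                prev[j - 1] + (pa != pb),  # substitute, or match at cost 0
            ))
        prev = curr
    return prev[-1]


def preselect(recognized: Sequence[str],
              lexicon: Dict[str, List[str]],
              top_n: int) -> List[Tuple[str, int]]:
    """Return the top_n lexicon words closest to the recognized phonemes."""
    scored = sorted(
        (edit_distance(recognized, phonemes), word)
        for word, phonemes in lexicon.items()
    )
    return [(word, dist) for dist, word in scored[:top_n]]


# Hypothetical toy lexicon: word -> phoneme sequence.
LEXICON = {
    "yamagata": ["y", "a", "m", "a", "g", "a", "t", "a"],
    "yamanashi": ["y", "a", "m", "a", "n", "a", "sh", "i"],
    "nagano": ["n", "a", "g", "a", "n", "o"],
    "kanagawa": ["k", "a", "n", "a", "g", "a", "w", "a"],
}

# Phoneme recognizer output with one error ("g" recognized as "k").
recognized = ["y", "a", "m", "a", "k", "a", "t", "a"]
candidates = preselect(recognized, LEXICON, top_n=2)
print(candidates)
```

Only the surviving candidates would then be passed to the more expensive Viterbi scoring, which is where the reduction in computation comes from.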

Research Products
(12 results)

All Publications (12 results)

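[Publications] ITO Akinori: "Language Modeling by Kana and Kanji String N-gram" Transactions of the IEICE. 79-D-II,12. 2062-2069 (1996)
- Description
  From the Summary of Research Results Report (Japanese)
[Publications] ITO Akinori: "Language modelling by string pattern N-gram for Japanese speech recognition" International Conference on Spoken Language Processing (ICSLP). Vol.1. 490-493 (1996)
- Description
  From the Summary of Research Results Report (Japanese)
[Publications] ITO Akinori: "Task Adaptation of a Stochastic Language Model for Dialogue Speech Recognition" IEICE Technical Report. SP96-81. 25-32 (1996)
- Description
  From the Summary of Research Results Report (Japanese)
[Publications] HORI Takaaki: "A Study on Improvement of HM-Nets Using Decision Tree-based Successive State Splitting" IEICE Technical Report. SP96-80. 17-24 (1996)
- Description
  From the Summary of Research Results Report (Japanese)
[Publications] KATOH Masaharu: "A Study on Word Preselection Utilizing Optimal Phoneme Sequence" IEICE Technical Report. SP96-13. 9-14 (1996)
- Description
  From the Summary of Research Results Report (Japanese)
[Publications] ITO Akinori: "N-gram Estimation from Japanese Large Corpus and Task Adaptation of N-gram" IPSJ SIG on Spoken Language Processing. SLP-11-5. 25-30 (1996)
- Description
  From the Summary of Research Results Report (Japanese)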
[Publications] A.Ito, M.Kohda: "Language Modeling by Kana and Kanji String N-gram" Trans.IEICE. Vol.J79-D-II,No.12. 2062-2069 (1996)
- Description
  From the Summary of Research Results Report (English)
[Publications] A.Ito, M.Kohda: "Language Modelling by String Pattern N-gram for Japanese Speech Recognition" Proc.International Conference on Spoken Language Processing. Vol.1. 490-493 (1996)
- Description
  From the Summary of Research Results Report (English)
[Publications] A.Ito, M.Kohda: "Task Adaptation of a Stochastic Language Model for Dialogue Speech Recognition" Technical Report of IEICE. SP96-81. 25-32 (1996)
- Description
  From the Summary of Research Results Report (English)
[Publications] T.Hori, M.Katoh, A.Ito, M.Kohda: "A Study on Improvement of HM-Nets Using Decision Tree-based Successive State Splitting" Technical Report of IEICE. SP96-80. 17-24 (1996)
- Description
  From the Summary of Research Results Report (English)
[Publications] M.Katoh, A.Ito, M.Kohda: "A Study on Word Preselection Utilizing Optimal Phoneme Sequence" Technical Report of IEICE. SP96-13. 9-14 (1996)
- Description
  From the Summary of Research Results Report (English)
[Publications] A.Ito, N.Daishima, A.Maruyama, M.Katoh, M.Kohda: "N-gram Estimation from Japanese Large Corpus and Task Adaptation of N-gram" Technical Report of IPSJ. SLP-11-5. 25-30 (1996)
- Description
  From the Summary of Research Results Report (English)
