Large Vocabulary Spoken Word Recognition System Using Phonemic Segmentation Units

Research Project

Project/Area Number	05555102
Research Category	Grant-in-Aid for Developmental Scientific Research (B)
Allocation Type	Single-year Grants
Research Field	Information and Communication Engineering
Research Institution	Tokyo Institute of Technology
Principal Investigator	IMAI Satoshi Tokyo Institute of Technology, P & I Laboratory, Professor (50016763)
Co-Investigator(Kenkyū-buntansha)	TANIGUCHI Ichiro Tokyo Institute of Technology, P & I Laboratory, Research Assoc. (10242314)
Project Period (FY)	1993 – 1994
Project Status	Completed (Fiscal Year 1994)
Budget Amount	¥9,900,000 (Direct Cost: ¥9,900,000) Fiscal Year 1994: ¥700,000 (Direct Cost: ¥700,000) Fiscal Year 1993: ¥9,200,000 (Direct Cost: ¥9,200,000)
Keywords	Spoken Word Recognition / Phonemic Segmentation / Phoneme Labeling / Large Vocabulary / Multiple Reference Pattern / Parallel Phoneme Labeling Method / Segment Lattice / Multiple Segmentation Method / Vocabulary Enlargement
Research Abstract	In this research project, we demonstrated that speech recognition based on phonemic segmentation and phoneme labeling is very effective for large vocabulary spoken word recognition, and we developed a high-performance large vocabulary spoken word recognition system using phonemic segmentation units. The system is composed of the following subsystems: an acoustic analysis subsystem, phonemic segmentation units, a phoneme labeling subsystem, and a word matching subsystem. One problem with this word recognition system is errors in phonemic segmentation and phoneme labeling, so we worked to improve both stages. Through this research project, we obtained the following results. (1) We realized a high-performance automatic phonemic segmentation unit for a speaker- and context-independent Japanese speech recognition system, and confirmed that this segmentation unit is effective for large vocabulary word recognition. (2) We developed a higher-performance large vocabulary spoken word recognition system using the phonemic segmentation unit and the phoneme labeling system. Experiments were carried out with dictionaries of 1845 words and 4915 words to evaluate the system. The word recognition rates for the first candidate were 96.5% and 94.5% for the 1845-word and 4915-word dictionaries, respectively. The estimated recognition rate for a 20000-word dictionary was approximately 90%. (3) We proposed a parallel phonemic segmentation method in order to achieve a higher word recognition rate. Using the parallel phonemic segmentation unit, we obtained a recognition rate 1 to 2% higher for the 4915-word dictionary. We also proposed a parallel phoneme labeling method and confirmed that it is very effective for achieving a higher recognition rate.
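
The last stage of the pipeline named in the abstract (word matching over labeled phoneme segments) can be illustrated with a minimal sketch. The record does not describe the project's actual matching algorithm, so everything here is an assumption made for illustration: plain Levenshtein distance stands in for the matcher, the toy dictionary and phoneme strings are hypothetical, and several labelings of one utterance stand in loosely for the parallel segmentation and labeling idea.

```python
def edit_distance(a, b):
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, start=1):
        curr = [i]
        for j, pb in enumerate(b, start=1):
            cost = 0 if pa == pb else 1
            curr.append(min(prev[j] + 1,          # delete a phoneme from a
                            curr[j - 1] + 1,      # insert a phoneme into a
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def match_word(hypotheses, dictionary):
    """Return (word, distance) with the smallest distance over all hypotheses.

    Ties keep the first candidate found.
    """
    best = None
    for labels in hypotheses:
        for word, phonemes in dictionary.items():
            d = edit_distance(labels, phonemes)
            if best is None or d < best[1]:
                best = (word, d)
    return best


# Hypothetical toy dictionary: word -> reference phoneme sequence.
DICTIONARY = {
    "tokyo": ["t", "o", "k", "y", "o"],
    "kyoto": ["k", "y", "o", "t", "o"],
    "osaka": ["o", "s", "a", "k", "a"],
}

# Two hypothetical labelings of the same utterance, each with one error.
HYPOTHESES = [
    ["t", "o", "k", "o"],       # segmentation missed "y"
    ["t", "o", "g", "y", "o"],  # "k" mislabeled as "g"
]

print(match_word(HYPOTHESES, DICTIONARY))
```

The sketch only shows why a matcher has to tolerate segmentation and labeling errors; it says nothing about how the reported recognition rates were obtained.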

Report

(3 results)

1994 Annual Research Report; Final Research Report Summary
1993 Annual Research Report

Research Products
(25 results)

All Publications (25 results)

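[Publications] Y.Suzuki: "Spoken Japanese Sentence Recognition Using Dependency Relationship with Systematical Semantic Category" (in Japanese). Trans. IEICE. J76-D-II. 2264-2273 (1993)
- Description
  From the "Summary of Research Results Report (Japanese)"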
- Related Report
  1994 Final Research Report Summary
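[Publications] S.Imai: "Speaker-Independent Spoken Word Recognition System Based on Parallel Phoneme Labeling Method" (in Japanese). Trans. IEICE. J77-A. 143-152 (1994)
- Description
  From the "Summary of Research Results Report (Japanese)"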
- Related Report
  1994 Final Research Report Summary
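[Publications] T.Kanno: "On the Use of a priori Information in Generalized Cepstral Modeling of Degraded Speech" (in Japanese). Trans. IEICE. J77-A. 945-953 (1994)
- Description
  From the "Summary of Research Results Report (Japanese)"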
- Related Report
  1994 Final Research Report Summary
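[Publications] C.Furuichi: "Automatic Phonemic Segmentation System of English Continuous Speech by Using Speaker-Independent Features" (in Japanese). Trans. IEICE. J78-A. 295-304 (1995)
- Description
  From the "Summary of Research Results Report (Japanese)"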
- Related Report
  1994 Final Research Report Summary
[Publications] Ming-Shen Wang: "A New Approach of Parsing and Speech Based on the Divide and Conquer Strategy for Continuous Speech Recognition" IEICE Trans. on Information and Systems. E78-D. (1995)
- Description
  From the "Summary of Research Results Report (Japanese)"
- Related Report
  1994 Final Research Report Summary
[Publications] L.Hu: "Tone Recognition for Continuous Mandarin Speech" (in Japanese). Trans. IEICE. J78-A. (1995)
- Description
  From the "Summary of Research Results Report (Japanese)"
- Related Report
  1994 Final Research Report Summary
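[Publications] S.Imai: "Signal Processing Technology" (in Japanese). Corona Publishing Co., Ltd. (edited by the Institute of Television Engineers of Japan), 201 (1993)
- Description
  From the "Summary of Research Results Report (Japanese)"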
- Related Report
  1994 Final Research Report Summary
[Publications] S.Imai: "Speech Recognition" (in Japanese). Kyoritsu Shuppan Co., Ltd., 200 (1995)
- Description
  From the "Summary of Research Results Report (Japanese)"
- Related Report
  1994 Final Research Report Summary
[Publications] Y.Suzuki, C.Furuichi, S.Imai: "Spoken Japanese Sentence Recognition Using Dependency Relationship with Systematical Semantic Category" (in Japanese). Trans.IEICE. J76-D-II[11]. 2264-2273 (1993)
- Description
  From the "Summary of Research Results Report (English)"
- Related Report
  1994 Final Research Report Summary
[Publications] S.Imai, I.Taniguchi, C.Furuichi, T.Kawasaki and H.Doi: "Speaker-Independent Spoken Word Recognition System Based on Parallel Phoneme Labeling Method" (in Japanese). Trans.IEICE. J77-A[2]. 143-152 (1994)
- Description
  From the "Summary of Research Results Report (English)"
- Related Report
  1994 Final Research Report Summary
[Publications] T.Kanno, T.Kobayashi and S.Imai: "On the Use of a priori Information in Generalized Cepstral Modeling of Degraded Speech" (in Japanese). Trans.IEICE. J77-A[7]. 945-953 (1994)
- Description
  From the "Summary of Research Results Report (English)"
- Related Report
  1994 Final Research Report Summary
[Publications] C.Furuichi, K.Aizawa and S.Imai: "Automatic Phonemic Segmentation System of English Continuous Speech by Using Speaker-Independent Features" (in Japanese). Trans.IEICE. J78-A[3]. 295-304 (1995)
- Description
  From the "Summary of Research Results Report (English)"
- Related Report
  1994 Final Research Report Summary
[Publications] M.S.Wang and S.Imai: "A New Approach of Parsing and Search Based on the Divide and Conquer Strategy for Continuous Speech Recognition". IEICE Trans. Inf.& Syst. E78-D[7]. (1995)
- Description
  From the "Summary of Research Results Report (English)"
- Related Report
  1994 Final Research Report Summary
[Publications] L.Hu and S.Imai: "Tone Recognition for Continuous Mandarin Speech" (in Japanese). Trans.IEICE. J78-A[7]. (1995)
- Description
  From the "Summary of Research Results Report (English)"
- Related Report
  1994 Final Research Report Summary
[Publications] S.Imai: Signal Processing Technology (in Japanese). Corona Publishing Co., Ltd., 201 (1993)
- Description
  From the "Summary of Research Results Report (English)"
- Related Report
  1994 Final Research Report Summary
[Publications] S.Imai: Speech Recognition (in Japanese). Kyoritsu Shuppan Co., Ltd, 200 (1995)
- Description
  From the "Summary of Research Results Report (English)"
- Related Report
  1994 Final Research Report Summary
[Publications] T.Kanno: "On the Use of a priori Information in Generalized Cepstral Modeling of Degraded Speech" (in Japanese). Trans. IEICE. J77-A. 945-953 (1994)
- Related Report
  1994 Annual Research Report
[Publications] K.Tokuda: "Speech Coding Based on Adaptive Mel-Cepstral Analysis and Its Evaluation" (in Japanese). Trans. IEICE. J77-A. 1443-1452 (1994)
- Related Report
  1994 Annual Research Report
[Publications] C.Furuichi: "Automatic Phonemic Segmentation System of English Continuous Speech by Using Speaker-Independent Features" (in Japanese). Trans. IEICE. J78-A. (1995)
- Related Report
  1994 Annual Research Report
[Publications] Ming-Shen WANG: "A New Approach of Parsing and Speech Based on the Divide and Conquer Strategy for Continuous Speech Recognition" IEICE Trans. on Information and Systems. E78-D. (1995)
- Related Report
  1994 Annual Research Report
[Publications] L.Hu: "Tone Recognition for Continuous Mandarin Speech" (in Japanese). Trans. IEICE. J78-A. (1995)
- Related Report
  1994 Annual Research Report
[Publications] T.Kanno: "Generalized Cepstral Modeling of Degraded Speech and its Application to Speech Enhancement" IEICE Trans. Fundamentals. E76-A. 1300-1307 (1993)
- Related Report
  1993 Annual Research Report
[Publications] Y.Suzuki: "Spoken Japanese Sentence Recognition Using Dependency Relationship with Systematical Semantic Category" (in Japanese). Trans. IEICE D-II. J76-D-II. 2264-2273 (1993)
- Related Report
  1993 Annual Research Report
[Publications] S.Imai: "Speaker-Independent Spoken Word Recognition System Based on Parallel Phoneme Labeling Method" (in Japanese). Trans. IEICE A. J77-A. 143-152 (1994)
- Related Report
  1993 Annual Research Report
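[Publications] S.Imai: "Signal Processing Technology" (in Japanese). Corona Publishing Co., Ltd. (edited by the Institute of Television Engineers of Japan), 201 (1993)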
- Related Report
  1993 Annual Research Report
