Phonemic Segmentation and Word Recognition for Dialogue Level Continuous Speech

Research Project

Project/Area Number	01420028
Research Category	Grant-in-Aid for General Scientific Research (A)
Allocation Type	Single-year Grants
Research Field	Electronic and communication systems engineering
Research Institution	Tokyo Institute of Technology
Principal Investigator	IMAI Satoshi, Tokyo Institute of Technology, Research Laboratory of Precision Machinery and Electronics, Professor (50016763)
Co-Investigator (Kenkyū-buntansha)	FURUICHI Chieko, Tokyo Institute of Technology, Research Laboratory of Precision Machinery and Electronics, Research Associate (90016783)
Project Period (FY)	1989 – 1990
Project Status	Completed (Fiscal Year 1990)
Budget Amount	¥17,400,000 (Direct Cost: ¥17,400,000); Fiscal Year 1990: ¥500,000 (Direct Cost: ¥500,000); Fiscal Year 1989: ¥16,900,000 (Direct Cost: ¥16,900,000)
Keywords	Continuous speech recognition / Word speech recognition / Phoneme recognition / Phonemic segmentation / Large vocabulary / Word spotting / Speech database / Dialogue level / Conversational speaking rate
Research Abstract	Through this research project, we demonstrated that phoneme-level segmentation, phoneme labeling, and context-independent word recognition are very effective for automatic continuous speech recognition. The main results are as follows. (1) We realized a high-performance automatic phonemic segmentation system for speaker- and context-independent recognition of continuous Japanese speech. The segmentation algorithm combines hierarchical segmentation with broad category classification, using selected segmentation parameters and acoustic-phonetic knowledge of continuous Japanese speech. It successfully segments continuous utterances spoken at reading rate, as well as phonetically balanced word utterances covering various phonetic environments, into phonemic units. (2) We developed a high-performance speaker-dependent Japanese phoneme recognition system based on phonemic segmentation and labeling. The system was evaluated with one female and one male speaker, using 600 polysyllabic words from an unspecified vocabulary. Phoneme recognition accuracy was 84.0% for the female speaker and 81.6% for the male speaker. (3) We developed a context-independent spoken word recognition system. Recognition proceeds in three steps: phonemic segmentation, construction of a phoneme lattice with a degree of confidence, and word recognition. Words are recognized by matching the phoneme lattice of the unknown input speech against the phonemic symbol sequences in the word dictionary. The recognition rate for the first candidate was 99.0% for the female speaker and 95.3% for the male speaker. (4) We are now investigating a word spotting system. Word spotting is performed by continuous DP matching of the phoneme lattice of the unknown speech against the phonemic symbol sequences in the word dictionary. The method yields fairly good results.
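
The matching step in result (3) can be illustrated with a minimal sketch. The abstract does not give the lattice format or the cost function, so everything concrete here is an assumption made for illustration: the lattice is modelled as a list of segments, each mapping candidate phonemes to a confidence in [0, 1]; the substitution cost is taken as one minus the confidence; and a fixed `gap` penalty covers inserted or deleted segments. The names `match_cost` and `recognize`, and the toy dictionary, are hypothetical and are not taken from the project.

```python
from typing import Dict, List, Tuple

# Assumed lattice format: one dict per segment, phoneme -> confidence in [0, 1].
Lattice = List[Dict[str, float]]


def match_cost(lattice: Lattice, phonemes: List[str], gap: float = 1.0) -> float:
    """Edit-distance-style DP cost of aligning lattice segments to a phoneme sequence."""
    n, m = len(lattice), len(phonemes)
    # cost[i][j]: best cost of aligning the first i segments with the first j phonemes.
    cost = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i * gap  # segments with no dictionary phoneme to match
    for j in range(1, m + 1):
        cost[0][j] = j * gap  # dictionary phonemes with no segment to match
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Assumed substitution cost: low when the segment is confident
            # about the dictionary phoneme, 1.0 when it is not a candidate.
            sub = 1.0 - lattice[i - 1].get(phonemes[j - 1], 0.0)
            cost[i][j] = min(
                cost[i - 1][j - 1] + sub,  # match segment i with phoneme j
                cost[i - 1][j] + gap,      # skip a spurious segment
                cost[i][j - 1] + gap,      # skip a missing phoneme
            )
    return cost[n][m]


def recognize(lattice: Lattice, dictionary: Dict[str, List[str]]) -> List[Tuple[str, float]]:
    """Rank dictionary words by matching cost, best candidate first."""
    scored = sorted((match_cost(lattice, seq), word) for word, seq in dictionary.items())
    return [(word, c) for c, word in scored]


# Toy data (hypothetical): a six-segment lattice and a three-word dictionary.
lattice = [
    {"s": 0.9, "sh": 0.3},
    {"a": 0.8},
    {"k": 0.6, "t": 0.5},
    {"u": 0.7, "o": 0.4},
    {"r": 0.8},
    {"a": 0.9},
]
dictionary = {
    "sakura": ["s", "a", "k", "u", "r", "a"],
    "sakana": ["s", "a", "k", "a", "n", "a"],
    "kasa": ["k", "a", "s", "a"],
}
ranking = recognize(lattice, dictionary)
print(ranking[0][0])  # first candidate
```

The word spotting in result (4) could in principle reuse the same recurrence with free start and end points along the segment axis, but that variant is not shown here and the abstract does not specify how the project implemented it.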

Report

(3 results)

1990 Annual Research Report; Final Research Report Summary
1989 Annual Research Report

Research Products
(26 results)

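[Publications] FURUICHI Chieko, IMAI Satoshi: "Segmentation of Phonemic Units in Various Phonetic Environments" Transactions of the Institute of Electronics, Information and Communication Engineers D-II. J72-D-II. 1221-1227 (1989)
- Description
  From the Final Research Report Summary (Japanese)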
- Related Report
  1990 Final Research Report Summary
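[Publications] FURUICHI Chieko, IMAI Satoshi: "Phoneme Recognition of Speaker-Dependent Continuous Speech with Unspecified Vocabulary" Transactions of the Institute of Electronics, Information and Communication Engineers D-II. J73-D-II. 501-511 (1990)
- Description
  From the Final Research Report Summary (Japanese)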
- Related Report
  1990 Final Research Report Summary
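[Publications] TOKUDA Keiichi, KOBAYASHI Takao, SHIOMOTO Shoji, IMAI Satoshi: "Adaptive Cepstral Analysis --- Adaptive Filter with Cepstrum as Coefficients" Transactions of the Institute of Electronics, Information and Communication Engineers A. J73-A. 1207-1215 (1990)
- Description
  From the Final Research Report Summary (Japanese)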
- Related Report
  1990 Final Research Report Summary
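[Publications] FURUICHI Chieko, TANIGUCHI Ichiro, IMAI Satoshi: "Recognition of Arbitrary Spoken Words Using Phonemes as Units" Transactions of the Institute of Electronics, Information and Communication Engineers D-II.
- Description
  From the Final Research Report Summary (Japanese)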
- Related Report
  1990 Final Research Report Summary
[Publications] K. Tokuda, T. Kobayashi, S. Shiomoto, S. Imai: "Adaptive Filtering Based on Cepstral Representation --- Adaptive Cepstral Analysis of Speech" Proc. of ICASSP-90: 1990 International Conference on Acoustics, Speech, and Signal Processing. S7.2. 377-380 (1990)
- Description
  From the Final Research Report Summary (Japanese)
- Related Report
  1990 Final Research Report Summary
[Publications] S. Imai, C. Furuichi: "Automatic Segmentation of Continuous Japanese Speech into Phonemic Units" Proc. of EUSIPCO-90: Fifth European Signal Processing Conference. 1355-1358 (1990)
- Description
  From the Final Research Report Summary (Japanese)
- Related Report
  1990 Final Research Report Summary
[Publications] FURUICHI Chieko and IMAI Satoshi: "Phonemic Units Segmentation in Various Phonetic Environments" Transactions of the Institute of Electronics, Information and Communication Engineers. J72-D-II. 1221-1227 (1989)
- Description
  From the Final Research Report Summary (English)
- Related Report
  1990 Final Research Report Summary
[Publications] FURUICHI Chieko and IMAI Satoshi: "Speaker-Independent Phoneme Recognition of Unspecified Vocabulary Japanese Speech" Transactions of the Institute of Electronics, Information and Communication Engineers. J73-D-II. 501-511 (1990)
- Description
  From the Final Research Report Summary (English)
- Related Report
  1990 Final Research Report Summary
[Publications] TOKUDA Keiichi, KOBAYASHI Takao, SHIOMOTO Shoji and IMAI Satoshi: "Adaptive Cepstral Analysis --- Adaptive Filtering Based on Cepstral Representation ---" Transactions of the Institute of Electronics, Information and Communication Engineers. J73-A. 1207-1215 (1990)
- Description
  From the Final Research Report Summary (English)
- Related Report
  1990 Final Research Report Summary
[Publications] FURUICHI Chieko, TANIGUCHI Ichiro and IMAI Satoshi: "Context-Independent Word Recognition Based on Phonemic Unit" Transactions of the Institute of Electronics, Information and Communication Engineers. (J74-D-II). (1991)
- Description
  From the Final Research Report Summary (English)
- Related Report
  1990 Final Research Report Summary
[Publications] IMAI Satoshi and FURUICHI Chieko: "Automatic Segmentation of Continuous Japanese Speech into Phonemic Units" Signal Processing V : Theories and Applications --- Proceedings of EUSIPCO-90 : 1990 European Signal Processing Conference. 1355-1358 (1990)
- Description
  From the Final Research Report Summary (English)
- Related Report
  1990 Final Research Report Summary
[Publications] TOKUDA Keiichi, KOBAYASHI Takao, SHIOMOTO Shoji and IMAI Satoshi: "Adaptive Filtering Based on Cepstral Representation --- Adaptive Cepstral Analysis of Speech" Proceedings of the ICASSP-90 : 1990 International Conference on Acoustics, Speech, and Signal Processing. 377-380 (1990)
- Description
  From the Final Research Report Summary (English)
- Related Report
  1990 Final Research Report Summary
[Publications] TOKUDA Keiichi, KOBAYASHI Takao and IMAI Satoshi: "Generalized Cepstral Analysis of Speech --- Unified Approach to LPC and Cepstral Method" Proceedings of ICSLP-90 : 1990 International Conference on Spoken Language Processing. 1. 37-40 (1990)
- Description
  From the Final Research Report Summary (English)
- Related Report
  1990 Final Research Report Summary
[Publications] HU Zhi-ping and IMAI Satoshi: "Chinese Continuous Speech Recognition System Using the State Transition Models Both of Phonemes and Words" Proceedings of ICSLP-90 : 1990 International Conference on Spoken Language Processing. 2. 1201-1204 (1990)
- Description
  From the Final Research Report Summary (English)
- Related Report
  1990 Final Research Report Summary
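[Publications] FURUICHI Chieko, IMAI Satoshi: "Segmentation of Phonemic Units in Various Phonetic Environments" Transactions of the Institute of Electronics, Information and Communication Engineers D-II. J72-D-II. 1221-1227 (1989)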
- Related Report
  1990 Annual Research Report
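[Publications] FURUICHI Chieko, IMAI Satoshi: "Phoneme Recognition of Speaker-Dependent Continuous Speech with Unspecified Vocabulary" Transactions of the Institute of Electronics, Information and Communication Engineers D-II. J73-D-II. 501-511 (1990)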
- Related Report
  1990 Annual Research Report
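[Publications] TOKUDA Keiichi, KOBAYASHI Takao, SHIOMOTO Shoji, IMAI Satoshi: "Adaptive Cepstral Analysis --- Adaptive Filter with Cepstrum as Coefficients" Transactions of the Institute of Electronics, Information and Communication Engineers A. J73-A. 1207-1215 (1990)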
- Related Report
  1990 Annual Research Report
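[Publications] FURUICHI Chieko, TANIGUCHI Ichiro, IMAI Satoshi: "Recognition of Arbitrary Spoken Words Using Phonemes as Units" Transactions of the Institute of Electronics, Information and Communication Engineers D-II.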
- Related Report
  1990 Annual Research Report
[Publications] K. Tokuda, T. Kobayashi, S. Shiomoto, S. Imai: "Adaptive Filtering Based on Cepstral Representation --- Adaptive Cepstral Analysis of Speech" Proc. of ICASSP-90: 1990 International Conference on Acoustics, Speech, and Signal Processing. (S7.2). 377-380 (1990)
- Related Report
  1990 Annual Research Report
[Publications] S. Imai, C. Furuichi: "Automatic Segmentation of Continuous Japanese Speech into Phonemic Units" Proc. of EUSIPCO-90: Fifth European Signal Processing Conference. 1355-1358 (1990)
- Related Report
  1990 Annual Research Report
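[Publications] TOKUDA Keiichi, KOBAYASHI Takao, IMAI Satoshi, SAITO Hironori (斎藤博徳): "Spectral Estimation of Speech Using the Mel-Cepstrum as Parameters" Technical Report of the Institute of Electronics, Information and Communication Engineers. DSP89-17. 67-74 (1989)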
- Related Report
  1989 Annual Research Report
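[Publications] FURUICHI Chieko, IMAI Satoshi: "Segmentation of Phonemic Units in Various Phonetic Environments" Transactions of the Institute of Electronics, Information and Communication Engineers. J72-D-II. 1221-1227 (1989)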
- Related Report
  1989 Annual Research Report
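[Publications] TOKUDA Keiichi, KOBAYASHI Takao, FUKADA Toshiaki (深田俊明), IMAI Satoshi: "Speech Signal Processing Based on Adaptive Mel-Cepstral Analysis" Technical Report of the Institute of Electronics, Information and Communication Engineers. SP89-61. 49-56 (1989)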
- Related Report
  1989 Annual Research Report
[Publications] K. Tokuda, T. Kobayashi, S. Shiomoto, S. Imai: "Adaptive Filtering Based on Cepstral Representation --- Adaptive Cepstral Analysis of Speech" Proc. ICASSP-90 (IEEE-sponsored International Conference on Acoustics, Speech, and Signal Processing). 4 pages (1990)
- Related Report
  1989 Annual Research Report
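[Publications] FURUICHI Chieko, IMAI Satoshi: "Phoneme Recognition of Speaker-Dependent Continuous Speech with Unspecified Vocabulary" Transactions of the Institute of Electronics, Information and Communication Engineers. J73-D-II. 11 pages (1990)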
- Related Report
  1989 Annual Research Report
[Publications] Satoshi IMAI, Chieko FURUICHI: "Automatic Segmentation of Continuous Japanese Speech into Phonemic Units." Proc. EUSIPCO-90 (international conference sponsored by the European signal processing society). 4 pages (1990)
- Related Report
  1989 Annual Research Report
