Speech recognition accepting utterances including out-of-vocabularies

Research Project

Project/Area Number	14380168
Research Category	Grant-in-Aid for Scientific Research (B)
Allocation Type	Single-year Grants
Section	General
Research Field	Intelligent informatics
Research Institution	Waseda University
Principal Investigator	SAGISAKA Yoshinori Waseda University, Graduate School of Global Information and Telecommunication Studies, Professor (70339737)
Co-Investigator(Kenkyū-buntansha)	SHIRAI Katsuhiko Waseda University, School of Science and Engineering, Professor (10063702); KOBAYASI Tsunori Waseda University, School of Science and Engineering, Professor (30162001); YAMAMOTO Hirofumi Advanced Telecommunications Research Institute International, Senior Researcher (00395013)
Project Period (FY)	2002 – 2005
Project Status	Completed (Fiscal Year 2005)
Budget Amount	¥14,000,000 (Direct Cost: ¥14,000,000); Fiscal Year 2005: ¥3,300,000 (Direct Cost: ¥3,300,000); Fiscal Year 2004: ¥3,400,000 (Direct Cost: ¥3,400,000); Fiscal Year 2003: ¥3,400,000 (Direct Cost: ¥3,400,000); Fiscal Year 2002: ¥3,900,000 (Direct Cost: ¥3,900,000)
Keywords	statistical language model / out of vocabulary (OOV) / hierarchical language model / continuous speech recognition / task-free speech recognition / unregistered word / word class model / phonotactic characteristics
Research Abstract	We studied a speech recognition scheme that accepts utterances containing out-of-vocabulary (OOV) words. We proposed a new hierarchical statistical language model to cope with OOVs and carried out speech recognition experiments to confirm its effectiveness. The model describes two kinds of statistics independently: the word-neighboring characteristics of unregistered expressions and the phonotactic constraints on their constituents. The upper layer consists of inter-word statistics expressed by multi-dimensional composite word N-grams, and the lower layer expresses intra-word statistical phonotactics using multi-dimensional composite sub-word units. A series of speech recognition experiments showed that this language model makes effective use of the independent statistics and achieves high recognition performance for utterances including OOVs. By expanding the lower-layer model from single words such as personal names and city names to much longer named entities such as book titles and movie titles, we showed that the modeling is also valid for unregistered expressions consisting of multiple words. This result suggests that the proposed language model handles OOVs independently of the task, and that a task-free statistical language model could be built by integrating different statistical constraints independently. In the speech recognition experiments, long unregistered expressions for movie titles were expressed by multi-dimensional composite word N-grams as a lower-layer model. The recognition accuracy of the proposed model almost matched the theoretical upper limit obtained by registering all OOVs in the recognition lexicon. Furthermore, multiple Markov models were obtained automatically by splitting OOV characteristics into multiple lower-layer models. Word-class intrinsic models and automatically derived unsupervised models proved useful for general, unspecified OOVs, which gives a guideline for building statistical language models according to the size and quality of the available language data.
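The two-layer idea in the abstract can be illustrated with a minimal sketch. It is not the project's implementation: plain add-one smoothed bigrams stand in for the multi-dimensional composite N-grams, characters stand in for sub-word units, and every name here (HierarchicalLM, train_bigram, the <OOV> token, the toy data) is hypothetical. The upper layer scores word sequences in which each unregistered word is replaced by a class token, and the lower layer independently scores the spelling of that word.

```python
import math
from collections import Counter

OOV = "<OOV>"            # hypothetical class token for unregistered words
BOS, EOS = "<s>", "</s>"  # sequence boundary symbols


def train_bigram(sequences, vocab_size):
    """Return a log-probability function for an add-one smoothed bigram model."""
    pair_counts, context_counts = Counter(), Counter()
    for seq in sequences:
        padded = [BOS] + list(seq) + [EOS]
        for prev, cur in zip(padded, padded[1:]):
            pair_counts[(prev, cur)] += 1
            context_counts[prev] += 1

    def logprob(prev, cur):
        return math.log((pair_counts[(prev, cur)] + 1)
                        / (context_counts[prev] + vocab_size))

    return logprob


def sequence_logprob(logprob, seq):
    """Sum bigram log-probabilities over a boundary-padded sequence."""
    padded = [BOS] + list(seq) + [EOS]
    return sum(logprob(p, c) for p, c in zip(padded, padded[1:]))


class HierarchicalLM:
    def __init__(self, sentences, lexicon, oov_examples):
        self.lexicon = set(lexicon)
        # Upper layer: word bigrams with out-of-lexicon words mapped to OOV.
        mapped = [[w if w in self.lexicon else OOV for w in s] for s in sentences]
        self.upper = train_bigram(mapped, len(self.lexicon) + 2)  # + OOV, EOS
        # Lower layer: character bigrams over example words of the OOV class.
        chars = {c for w in oov_examples for c in w}
        self.lower = train_bigram(oov_examples, len(chars) + 2)  # + EOS, unseen

    def score(self, sentence):
        """Log-score: upper layer per word, plus lower layer for each OOV."""
        total, prev = 0.0, BOS
        for w in sentence:
            if w in self.lexicon:
                total += self.upper(prev, w)
                prev = w
            else:
                total += self.upper(prev, OOV) + sequence_logprob(self.lower, w)
                prev = OOV
        return total + self.upper(prev, EOS)


sentences = [["i", "live", "in", "tokyo"], ["i", "live", "in", "osaka"],
             ["i", "live", "in", "nagoya"]]
model = HierarchicalLM(sentences, {"i", "live", "in", "tokyo"},
                       ["osaka", "nagoya", "nagano", "okayama"])
print(model.score(["i", "live", "in", "nagasaka"]))
print(model.score(["i", "live", "in", "zzqxzzqx"]))
```

Because the two layers are trained on separate data, the lower layer can be swapped (for example, one model per word class) without retraining the upper layer, which is the independence the abstract emphasizes.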

Report

(5 results)

2005 Annual Research Report Final Research Report Summary
2004 Annual Research Report
2003 Annual Research Report
2002 Annual Research Report

Research Products
(30 results)

All Journal Article (26 results) Publications (4 results)

[Journal Article] Speech recognition of a named entity (2005)
- Author(s)
  Tatsuhiko Tomita, Yoshiyuki Okimoto, Hirofumi Yamamoto, Yoshinori Sagisaka
- Journal Title
  Proc. ICASSP2005 I
  Pages: 1057-1060
- Description
  From the Final Research Report Summary (Japanese)
- Related Report
  2005 Final Research Report Summary
[Journal Article] Speech Recognition of OOV Expressions and Words (2005)
- Author(s)
  Tatsuhiko Tomita, Yoshiyuki Okimoto, Hirofumi Yamamoto, Yoshinori Sagisaka
- Journal Title
  Proc. SNLP2005 I
  Pages: 273-278
- Description
  From the Final Research Report Summary (Japanese)
- Related Report
  2005 Final Research Report Summary
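[Journal Article] Speech recognition of utterances including unknown named entities (2005)
- Author(s)
  Tatsuhiko Tomita, Yoshiyuki Okimoto, Hirofumi Yamamoto, Yoshinori Sagisaka
- Journal Title
  IPSJ SIG Technical Reports
  Pages: 117-122
- NAID
  110002952523
- Description
  From the Final Research Report Summary (Japanese)
- Related Report
  2005 Final Research Report Summary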
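[Journal Article] Speech recognition of utterances including unregistered named entities and unregistered words (2005)
- Author(s)
  Tatsuhiko Tomita, Yoshiyuki Okimoto, Hirofumi Yamamoto, Yoshinori Sagisaka
- Journal Title
  Proceedings of the 2005 Autumn Meeting of the Acoustical Society of Japan
  Pages: 45-46
- Description
  From the Final Research Report Summary (Japanese)
- Related Report
  2005 Final Research Report Summary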
[Journal Article] Speech recognition of a named entity (2005)
- Author(s)
  Tatsuhiko Tomita, Yoshiyuki Okimoto, Hirofumi Yamamoto, Yoshinori Sagisaka
- Journal Title
  Proc. ICASSP2005 I
  Pages: 1057-1060
- Description
  From the Final Research Report Summary (English)
- Related Report
  2005 Final Research Report Summary
[Journal Article] Speech Recognition of OOV Expressions and Words (2005)
- Author(s)
  Tatsuhiko Tomita, Yoshiyuki Okimoto, Hirofumi Yamamoto, Yoshinori Sagisaka
- Journal Title
  Proc. SNLP2005 I
  Pages: 273-278
- Description
  From the Final Research Report Summary (English)
- Related Report
  2005 Final Research Report Summary
[Journal Article] Speech recognition of unregistered expressions (2005)
- Author(s)
  Tatsuhiko Tomita, Yoshiyuki Okimoto, Hirofumi Yamamoto, Yoshinori Sagisaka
- Journal Title
  IPSJ SIG Technical Reports
  Pages: 117-122
- NAID
  110002952523
- Description
  From the Final Research Report Summary (English)
- Related Report
  2005 Final Research Report Summary
[Journal Article] Speech recognition of OOV expressions and OOV words (2005)
- Author(s)
  Tatsuhiko Tomita, Yoshiyuki Okimoto, Hirofumi Yamamoto, Yoshinori Sagisaka
- Journal Title
  2005 Autumn Meeting Acoustical Society of Japan
  Pages: 45-46
- Description
  From the Final Research Report Summary (English)
- Related Report
  2005 Final Research Report Summary
[Journal Article] Speech recognition of a named entity (2005)
- Author(s)
  T. Tomita, Y. Okimoto, H. Yamamoto, Y. Sagisaka
- Journal Title
  Proc. ICASSP2005 Vol.1
  Pages: 1057-1060
- Related Report
  2005 Annual Research Report
[Journal Article] Speech Recognition of OOV Expressions and Words (2005)
- Author(s)
  T. Tomita, Y. Okimoto, H. Yamamoto, Y. Sagisaka
- Journal Title
  Proc. SNLP2005 Vol.I
  Pages: 273-278
- Related Report
  2005 Annual Research Report
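[Journal Article] Unregistered word recognition with a hierarchical language model using multiple Markov models (2004)
- Author(s)
  Hirofumi Yamamoto, Hiroaki Kokubo, Genichiro Kikui, Yoshihiko Ogawa, Yoshinori Sagisaka
- Journal Title
  Transactions of the Institute of Electronics, Information and Communication Engineers (D-II) 12
  Pages: 2104-2111
- Description
  From the Final Research Report Summary (Japanese)
- Related Report
  2005 Final Research Report Summary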
[Journal Article] Mis-recognized utterance detection using hierarchical language model (2004)
- Author(s)
  Hirofumi Yamamoto, Genichiro Kikui, Yoshinori Sagisaka
- Journal Title
  Proc. ICSLP2004 (International Conference on Spoken Language Processing)
  Pages: 1025-1028
- Description
  From the Final Research Report Summary (Japanese)
- Related Report
  2005 Final Research Report Summary
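[Journal Article] Speech recognition of utterances including unknown named entities (2004)
- Author(s)
  Tatsuhiko Tomita, Yoshiyuki Okimoto, Hirofumi Yamamoto, Yoshinori Sagisaka
- Journal Title
  Proceedings of the 2004 Autumn Meeting of the Acoustical Society of Japan I
  Pages: 59-60
- NAID
  110002952523
- Description
  From the Final Research Report Summary (Japanese)
- Related Report
  2005 Final Research Report Summary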
[Journal Article] Out-of-Vocabulary Word Recognition with a Hierarchical Language Model Using Multiple Markov Model (2004)
- Author(s)
  Hirofumi Yamamoto, Hiroaki Kokubo, Genichiro Kikui, Yoshihiko Ogawa, Yoshinori Sagisaka
- Journal Title
  The Journal of The Institute of Electronics, Information and Communication Engineers Vol.87
  Pages: 2104-2111
- NAID
  110003203161
- Description
  From the Final Research Report Summary (English)
- Related Report
  2005 Final Research Report Summary
[Journal Article] Mis-recognized utterance detection using hierarchical language model (2004)
- Author(s)
  Hirofumi Yamamoto, Genichiro Kikui, Yoshinori Sagisaka
- Journal Title
  Proc. ICSLP2004
  Pages: 1025-1028
- Description
  From the Final Research Report Summary (English)
- Related Report
  2005 Final Research Report Summary
[Journal Article] Speech recognition for unregistered expression of a class (2004)
- Author(s)
  Tatsuhiko Tomita, Yoshiyuki Okimoto, Hirofumi Yamamoto, Yoshinori Sagisaka
- Journal Title
  2004 Autumn Meeting Acoustical Society of Japan I
  Pages: 59-60
- Description
  From the Final Research Report Summary (English)
- Related Report
  2005 Final Research Report Summary
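[Journal Article] Speech recognition of utterances including unknown named entities (2004)
- Author(s)
  Tatsuhiko Tomita, Yoshinori Sagisaka, Yoshiyuki Okimoto
- Journal Title
  Proceedings of the 2004 Autumn Meeting of the Acoustical Society of Japan Vol.I, 2-1-2
  Pages: 59-60
- NAID
  110002952523
- Related Report
  2004 Annual Research Report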
[Journal Article] Mis-recognized utterance detection using hierarchical language model (2004)
- Author(s)
  Hirofumi Yamamoto, Genichiro Kikui, Yoshinori Sagisaka
- Journal Title
  Proc. ICSLP 2004 (International Conference on Spoken Language Processing) Vol.2
  Pages: 1025-1028
- Related Report
  2004 Annual Research Report
[Journal Article] Spoken language processing as computational human modeling (2004)
- Author(s)
  Yoshinori Sagisaka
- Journal Title
  TECHNOLOGY AND PROCESSING SYSTEMS and Oriental COCOSDA-2004 Vol.2
  Pages: 161-166
- Related Report
  2004 Annual Research Report
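[Journal Article] Unregistered word recognition with a hierarchical language model using multiple Markov models (2004)
- Author(s)
  Hirofumi Yamamoto, Hiroaki Kokubo, Genichiro Kikui, Yoshihiko Ogawa, Yoshinori Sagisaka
- Journal Title
  Transactions of the Institute of Electronics, Information and Communication Engineers D-2, J87-D-2 No.12
  Pages: 2104-2111
- NAID
  110003203161
- Related Report
  2004 Annual Research Report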
[Journal Article] Word Class Modeling for Speech Recognition with Out-of-Task Words Using a Hierarchical Language Model (2003)
- Author(s)
  Yoshihiko Ogawa, Hirofumi Yamamoto, Yoshinori Sagisaka
- Journal Title
  Proc. Eurospeech2003
  Pages: 221-224
- Description
  From the Final Research Report Summary (Japanese)
- Related Report
  2005 Final Research Report Summary
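[Journal Article] A structured class language model for out-of-task vocabulary (2003)
- Author(s)
  Yoshihiko Ogawa, Hirofumi Yamamoto, Yoshinori Sagisaka, Hiroaki Kokubo, Genichiro Kikui
- Journal Title
  Proceedings of the 2003 Autumn Meeting of the Acoustical Society of Japan I
  Pages: 83-84
- Description
  From the Final Research Report Summary (Japanese)
- Related Report
  2005 Final Research Report Summary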
[Journal Article] Word Class Modeling for Speech Recognition with Out-of-Task Words Using a Hierarchical Language Model (2003)
- Author(s)
  Yoshihiko Ogawa, Hirofumi Yamamoto, Yoshinori Sagisaka
- Journal Title
  Proc. Eurospeech2003
  Pages: 221-224
- Description
  From the Final Research Report Summary (English)
- Related Report
  2005 Final Research Report Summary
[Journal Article] Word Class Modeling for Speech Recognition with Out-of-Task Words Using a Hierarchical Language Model (2003)
- Author(s)
  Yoshihiko Ogawa, Hirofumi Yamamoto, Yoshinori Sagisaka, Hiroaki Kokubo, Genichiro Kikui
- Journal Title
  2003 Autumn Meeting Acoustical Society of Japan I
  Pages: 83-84
- Description
  From the Final Research Report Summary (English)
- Related Report
  2005 Final Research Report Summary
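[Journal Article] Speech recognition of utterances including out-of-task vocabulary (2002)
- Author(s)
  Yoshihiko Ogawa, Shuntaro Isogai, Yoshinori Sagisaka, Shigehiko Onishi, Hirofumi Yamamoto, Genichiro Kikui
- Journal Title
  Proceedings of the 2002 Autumn Meeting of the Acoustical Society of Japan I
  Pages: 143-144
- Description
  From the Final Research Report Summary (Japanese)
- Related Report
  2005 Final Research Report Summary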
[Journal Article] Speech recognition for out of vocabularies (2002)
- Author(s)
  Yoshihiko Ogawa, Shuntaro Isogai, Yoshinori Sagisaka, Shigehiko Onishi, Hirofumi Yamamoto, Genichiro Kikui
- Journal Title
  2002 Autumn Meeting Acoustical Society of Japan I
  Pages: 143-144
- Description
  From the Final Research Report Summary (English)
- Related Report
  2005 Final Research Report Summary
[Publications] S. Onishi, H. Yamamoto, G. Kikui, Y. Sagisaka: "A statistical word model using word-class specific constraints for handling out-of-vocabulary words in speech recognition" Proceedings of SNLP-Oriental COCOSDA 2002. 37-42 (2002)
- Related Report
  2002 Annual Research Report
[Publications] Yoshinori Sagisaka: "Speech technology as a cognitive computational model" IEICE Technical Report. SP2002-29. 31-36 (2002)
- Related Report
  2002 Annual Research Report
[Publications] Hirofumi Yamashita, Shigehiko Onishi, Hiroaki Kokubo, Yoshinori Sagisaka: "A structured language model and its implementation" IEICE Technical Report. SP2003-32. 49-54 (2002)
- Related Report
  2002 Annual Research Report
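[Publications] Yoshihiko Ogawa, Shuntaro Isogai, Yoshinori Sagisaka, Shigehiko Onishi, Hirofumi Yamamoto, Genichiro Kikui: "Speech recognition of utterances including out-of-task vocabulary" Proceedings of the 2002 Autumn Meeting of the Acoustical Society of Japan. 3-9-7. 143-144 (2002)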
- Related Report
  2002 Annual Research Report
