Japanese text dictation system for official reports

Research Project

Project/Area Number	07558042
Research Category	Grant-in-Aid for Scientific Research (B)
Allocation Type	Single-year Grants
Section	Developmental Research
Research Field	Intelligent informatics
Research Institution	Tohoku University
Principal Investigator	MAKINO Shozo Tohoku Univ., Computer Center, Prof. (00089806)
Co-Investigator(Kenkyū-buntansha)	NIYADA Katsuyuki Matsushita Technology Institute Co., Information Network Research Laboratory, Researcher; CHEN Guo yue Tohoku Univ., Computer Center, Research Assoc. (20282014); KUDOH Junichi Tohoku Univ., Computer Center, Assoc. Prof. (40186408); KOHATA Minoru (木幡稔) Tohoku Univ., Graduate School of Engineering, Assoc. Prof. (30186720)
Project Period (FY)	1995 – 1997
Project Status	Completed (Fiscal Year 1997)
Budget Amount	¥5,300,000 (Direct Cost: ¥5,300,000); Fiscal Year 1997: ¥600,000 (Direct Cost: ¥600,000); Fiscal Year 1996: ¥1,000,000 (Direct Cost: ¥1,000,000); Fiscal Year 1995: ¥3,700,000 (Direct Cost: ¥3,700,000)
Keywords	dictation system / Phoneme recognition / acquisition of language model / official reports / continuous speech recognition / model speech method / language model / HMnet / bunsetsu automaton / autopsy findings / discriminative training
Research Abstract	Experts often prepare official reports, such as real estate appraisals and medico-legal autopsy reports, and writing them is time-consuming. If speech input could be converted automatically into sentences, the burden of preparing such documents would be reduced. Current speech input systems still have difficulty recognizing continuously spoken general sentences automatically. The task becomes easier, however, when it is restricted as follows: 1) the user is limited to a specific speaker; 2) the wording of the document is largely fixed and the sentence structure is comparatively simple; and 3) the vocabulary of official reports is about 3,000 to 4,000 words and shrinks further when the document is divided into parts. These restrictions lighten the load on the recognizer and make the system easier to develop. In this research, we developed a sentence recognition system for autopsy reports. First, we built an automaton with the ECGI method to represent sentence structure. We then defined a distance between the states of the automaton to strengthen the correspondence with the words expected to appear. Based on this definition, we developed a method for revising and generalizing the automaton. For phoneme recognition, we used the model speech method developed by Niyada. Combining these methods, we built a speech input system for autopsy findings. The system recognizes speech without time delay. Because recognition accuracy is not yet sufficient, we will continue to improve the system.
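
The idea of constraining recognition with a word automaton can be illustrated with a minimal sketch. This is not the ECGI method or the project's system: it is a plain prefix-tree automaton built from example sentences, and the class name `WordAutomaton`, its methods, and the English sample sentences are all hypothetical choices made for illustration (the project itself targeted Japanese autopsy reports).

```python
from collections import defaultdict


class WordAutomaton:
    """Prefix-tree word automaton built from example sentences.

    Illustrative assumption only: this is NOT the ECGI method.
    """

    def __init__(self):
        # state -> {word: next_state}; state 0 is the start state
        self.transitions = defaultdict(dict)
        self.final_states = set()
        self._next_id = 1

    def add_sentence(self, words):
        """Add one training sentence, creating states as needed."""
        state = 0
        for word in words:
            if word not in self.transitions[state]:
                self.transitions[state][word] = self._next_id
                self._next_id += 1
            state = self.transitions[state][word]
        self.final_states.add(state)

    def _walk(self, words):
        """Follow the words from the start state; None if no path exists."""
        state = 0
        for word in words:
            state = self.transitions.get(state, {}).get(word)
            if state is None:
                return None
        return state

    def expected_words(self, prefix):
        """Words the automaton allows after the given prefix."""
        state = self._walk(prefix)
        if state is None:
            return set()
        return set(self.transitions.get(state, {}))

    def accepts(self, words):
        """True if the word sequence ends in a final state."""
        return self._walk(words) in self.final_states


automaton = WordAutomaton()
automaton.add_sentence(["the", "liver", "is", "enlarged"])
automaton.add_sentence(["the", "liver", "is", "normal"])
automaton.add_sentence(["the", "heart", "is", "normal"])

# Only a few words are candidates at each point, which shrinks the search.
print(automaton.expected_words(["the", "liver", "is"]))
# An unseen combination is rejected because the prefix tree does not generalize.
print(automaton.accepts(["the", "heart", "is", "enlarged"]))
```

In this sketch the recognizer would only need to compare a handful of candidate words at each position. The rejected last sentence hints at why the abstract describes revising and generalizing the automaton: a structure built only from observed sentences cannot cover plausible new ones.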

Report

(4 results)

1997 Annual Research Report / Final Research Report Summary
1996 Annual Research Report
1995 Annual Research Report

Research Products
(24 results)

All Publications (24 results)

[Publications] H. Mori, H. Aso, S. Makino: "Japanese Document Recognition Based on Interpolated n-gram Model of Character" Proc. of Third Inter. Conf. on Document Analysis and Recognition. 274-277 (1995)
- Description
  From the Final Research Report Summary (Japanese)
- Related Report
  1997 Final Research Report Summary
[Publications] T. OTSUKI, A. ITO, S. MAKINO, T. OHTOMO: "The Performance Prediction on Sentence Recognition Using a Finite State Word Automaton" IEICE Trans. on Information and Systems. E79-D,6・1. 47-53 (1996)
- Description
  From the Final Research Report Summary (Japanese)
- Related Report
  1997 Final Research Report Summary
[Publications] M. Suzuki, S. Makino, H. Aso: "Acquisition of language model" Jour. Acoust. Soc. America. 100, 4. 2757-2757 (1996)
- Description
  From the Final Research Report Summary (Japanese)
- Related Report
  1997 Final Research Report Summary
[Publications] S. MAKINO, M. SUZUKI: "Automatic Acquisition of Language Model using HMnet" Proc. Inter. Conf. on Speech Processing. I. 47-54 (1997)
- Description
  From the Final Research Report Summary (Japanese)
- Related Report
  1997 Final Research Report Summary
[Publications] H. Mori, H. Aso, S. Makino: "Japanese Document Recognition Based on Interpolated n-gram Model of Character" Proc. of Third Inter. Conf. on Document Analysis and Recognition. 274-277 (1995)
- Description
  From the Final Research Report Summary (English)
- Related Report
  1997 Final Research Report Summary
[Publications] T. OTSUKI, A. ITO, S. MAKINO, T. OHTOMO: "The Performance Prediction on Sentence Recognition Using a Finite State Word Automaton" IEICE Trans. on Information and Systems. E79-D, 1. 47-53 (1996)
- Description
  From the Final Research Report Summary (English)
- Related Report
  1997 Final Research Report Summary
[Publications] M. Suzuki, S. Makino, H. Aso: "Acquisition of language model" Jour. Acoust. Soc. America. 100, 4. 2757-2757 (1996)
- Description
  From the Final Research Report Summary (English)
- Related Report
  1997 Final Research Report Summary
[Publications] S. MAKINO, M. SUZUKI, A. HARADA: "Automatic Acquisition of Language Model using HMnet" Proc. Inter. Conf. on Speech Processing. I. 47-54 (1997)
- Description
  From the Final Research Report Summary (English)
- Related Report
  1997 Final Research Report Summary
[Publications] S. MAKINO, M. SUZUKI, A. HARADA: "Automatic Acquisition of Language Model using HMnet" Proc. Int. Conf. Speech Processing '97. I. 47-54 (1997)
- Related Report
  1997 Annual Research Report
[Publications] Harada, Suzuki, Makino: "Acquisition of a bunsetsu (phrase) model from newspaper articles using discrete HMnet" (in Japanese) IEICE Technical Report. SP97-24. 45-50 (1997)
- Related Report
  1997 Annual Research Report
[Publications] Abe, Suzuki, Makino, Aso: "A speaker adaptation method based on phoneme-wise speaker clustering" (in Japanese) IEICE Technical Report. SP97-74. 41-46 (1997)
- Related Report
  1997 Annual Research Report
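[Publications] Mori, Aso, Makino: "A statistical language model based on character strings that takes reproducibility into account" (in Japanese) IEICE Technical Report. NLC97-47. 29-34 (1997)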
- Related Report
  1997 Annual Research Report
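[Publications] Suzuki, Aso, Makino: "Speaker-independent phoneme recognition using HMnet based on SSS-free" (in Japanese) Proc. Meeting of the Acoustical Society of Japan. Spring issue. 143-144 (1996)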
- Related Report
  1996 Annual Research Report
[Publications] Osaka, Makino: "Phoneme recognition using phoneme duration prediction based on speaking rate" (in Japanese) IEICE Technical Report. Vol. 96 No. 93. 1-6 (1996)
- Related Report
  1996 Annual Research Report
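[Publications] Okimoto, Makino: "Phoneme recognition using variable-length patterns and discriminative training" (in Japanese) IEICE Technical Report. Vol. 96 No. 93. 7-14 (1996)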
- Related Report
  1996 Annual Research Report
[Publications] Y. Okimoto, S. Makino: "Phoneme Recognition using reference patterns constructed with discriminative training and DP matching" The Journal of the Acoustical Society of America. Vol. 100 No. 4. 2757-2757 (1996)
- Related Report
  1996 Annual Research Report
[Publications] M. Suzuki, S. Makino: "Acquisition of language models based on HMnet" The Journal of the Acoustical Society of America. Vol. 100 No. 4. 2791-2791 (1996)
- Related Report
  1996 Annual Research Report
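[Publications] MAKINO Shozo: "The Tohoku University-Matsushita word speech database" (in Japanese) Jinbungaku to Joho Shori (Humanities and Information Processing). No. 12. 56-59 (1996)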
- Related Report
  1996 Annual Research Report
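[Publications] Koga, Makino, Kido: "Effects of the time window and lifter on isolated vowel recognition using local peaks" (in Japanese) Journal of the Acoustical Society of Japan. 51. 130-132 (1995)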
- Related Report
  1995 Annual Research Report
[Publications] Ito, Makino: "Word preselection for continuous speech recognition using the extended RHA method" (in Japanese) IEICE Transactions D-II. J-78-D-II. 400-408 (1995)
- Related Report
  1995 Annual Research Report
[Publications] M. SUZUKI, S. MAKINO, H. ASO, H. SHIMODAIRA: "A New HMnet Construction Algorithm Requiring No Contextual Factors" IEICE Trans. Inf. & Syst. E-78-D. 662-668 (1995)
- Related Report
  1995 Annual Research Report
[Publications] Suzuki, Makino, Aso: "Application of discrete HMnet to language modeling" (in Japanese) IEICE Technical Report. SP95-33. 65-72 (1995)
- Related Report
  1995 Annual Research Report
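[Publications] Okimoto, Makino, Sone: "Phoneme segmentation using DP matching with a probabilistic measure" (in Japanese) Proc. Meeting of the Acoustical Society of Japan. I. 165-166 (1995)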
- Related Report
  1995 Annual Research Report
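[Publications] Osaka, Makino, Sone: "Effect of duration prediction based on preliminary recognition results on phoneme recognition" (in Japanese) Proc. Meeting of the Acoustical Society of Japan. I. 55-56 (1995)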
- Related Report
  1995 Annual Research Report
