Studies on Speech Recognition, Closed Caption and Summarization of Broadcast News

Research Project

Project/Area Number	09480064
Research Category	Grant-in-Aid for Scientific Research (B)
Allocation Type	Single-year Grants
Section	General
Research Field	Intelligent informatics
Research Institution	Toyohashi University of Technology
Principal Investigator	NAKAGAWA Seiichi, Toyohashi University of Technology, Faculty of Engineering, Professor (20115893)
Co-Investigator(Kenkyū-buntansha)	KAI Atsuhiko, Shizuoka University, Faculty of Engineering, Assistant Professor (60283496); MINEMATSU Nobuaki, Toyohashi University of Technology, Faculty of Engineering, Research Assistant (90273333); MASUYAMA Sigeru, Toyohashi University of Technology, Faculty of Engineering, Professor (60173762); ANDO Akio, NHK Laboratory, Sub-Head of Human-Interface Department
Project Period (FY)	1997 – 1999
Project Status	Completed (Fiscal Year 1999)
Budget Amount	¥13,100,000 (Direct Cost: ¥13,100,000); Fiscal Year 1999: ¥4,800,000 (Direct Cost: ¥4,800,000); Fiscal Year 1998: ¥3,100,000 (Direct Cost: ¥3,100,000); Fiscal Year 1997: ¥5,200,000 (Direct Cost: ¥5,200,000)
Keywords	speech recognition / acoustic model / closed caption / dictation / language model / summarization / broadcast news / large-vocabulary continuous speech recognition / news text
Research Abstract	It is well known that HMMs with only the basic structure cannot adequately capture the correlation among successive frames. In our previous work, we introduced segmental unit HMMs to address this problem and showed their effectiveness; we also found that integrating the Δ cepstrum and ΔΔ cepstrum into the segmental unit HMMs improved recognition performance. In this project, we first compared frame-based models with segment-based models, and the results showed that using segmental features as input vectors is effective. Second, we compared syllable-based HMMs with triphone-based HMMs, and recognition experiments showed that syllable-based HMMs are suitable for Japanese. Next, we developed a method for constructing language models that uses a task adaptation strategy and the idiomatic expressions found in news articles. We investigated three things: (1) the effect of adapting an N-gram language model to the task using a limited amount of target articles; (2) the effect of adapting the language model using the latest articles; and (3) the effect of treating idiomatic expressions as morpheme units, since certain fixed and idiomatic expressions occur frequently in news articles. All three proposed methods proved effective for constructing N-gram language models. Finally, we proposed and evaluated a method for summarizing each sentence in Japanese TV news texts. Selecting important sentences is not an appropriate way to abstract a news text, because a news text consists of only a few long sentences. Instead, we reduced the redundant parts of each sentence, such as modifiers. We used a simple parsing method specialized for news texts so that the syntactic structure of each sentence was preserved. We evaluated this summarization method through questionnaires given to 32 subjects.
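As a rough illustration of the segmental input idea mentioned in the abstract, the sketch below appends Δ and ΔΔ coefficients to each cepstral frame and then concatenates several successive frames into one input vector. It is not the project's implementation: the function names, the regression formula used for the Δ coefficients, the window width of 2, and the segment span of 4 frames are all assumptions chosen for illustration, and any dimensionality reduction applied to the stacked vectors is omitted.

```python
from typing import List

Frame = List[float]


def delta(frames: List[Frame], width: int = 2) -> List[Frame]:
    """Regression-based delta coefficients (assumed formula).

    Frames beyond either end are approximated by repeating the
    first or last frame.
    """
    n = len(frames)
    denom = 2 * sum(k * k for k in range(1, width + 1))
    out: List[Frame] = []
    for t in range(n):
        row: Frame = []
        for d in range(len(frames[t])):
            num = 0.0
            for k in range(1, width + 1):
                nxt = frames[min(t + k, n - 1)][d]
                prv = frames[max(t - k, 0)][d]
                num += k * (nxt - prv)
            row.append(num / denom)
        out.append(row)
    return out


def stack_segments(frames: List[Frame], span: int = 4) -> List[Frame]:
    """Concatenate `span` successive frames into one segmental vector."""
    return [
        [x for frame in frames[t:t + span] for x in frame]
        for t in range(len(frames) - span + 1)
    ]


def segmental_features(cepstra: List[Frame], span: int = 4,
                       width: int = 2) -> List[Frame]:
    """Append delta and delta-delta to each frame, then stack frames."""
    d1 = delta(cepstra, width)
    d2 = delta(d1, width)
    joined = [c + a + b for c, a, b in zip(cepstra, d1, d2)]
    return stack_segments(joined, span)
```

With six one-dimensional frames, `segmental_features` yields three segmental vectors of 12 values each (3 coefficients per frame times 4 frames). A frame-based model would instead consume the `joined` vectors one frame at a time.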

Report

(4 results)

1999 Annual Research Report, Final Research Report Summary
1998 Annual Research Report
1997 Annual Research Report

Research Products
(34 results)

[Publications] K. Hanai, K. Yamamoto, N. Minematsu and S. Nakagawa: "Continuous speech recognition using segmental unit input HMMs with mixture of probability density functions and context dependency" Proc. 5th Int. Conf. Spoken Language Processing. 2935-2938 (1998)
- Description
  From the Final Research Report Summary (Japanese)
- Related Report
  1999 Final Research Report Summary
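[Publications] 中川聖一, 赤松裕隆, 西崎博光: "A Task Adaptation Method and Use of Idiomatic Expressions in Language Models for Speech Recognition" Journal of Natural Language Processing (in Japanese). Vol. 6, No. 2. 97-115 (1999)
- Description
  From the Final Research Report Summary (Japanese)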
- Related Report
  1999 Final Research Report Summary
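[Publications] 甲斐充彦, 廣瀬良文, 中川聖一: "Dealing with Out-of-vocabulary Words and Filled Pauses in a Word N-gram Based Speech Recognition System" Trans. IPSJ (in Japanese). Vol. 40, No. 4. 1385-1394 (1999)
- Description
  From the Final Research Report Summary (Japanese)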
- Related Report
  1999 Final Research Report Summary
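[Publications] 三上真, 増山繁, 中川聖一: "A Summarization Method by Reducing Redundancy of Each Sentence for Making Captions of Newscasting" Journal of Natural Language Processing (in Japanese). Vol. 6, No. 6. 65-81 (1999)
- Description
  From the Final Research Report Summary (Japanese)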
- Related Report
  1999 Final Research Report Summary
[Publications] H. Nishizaki and S. Nakagawa: "A Retrieval System of Broadcast News Speech Documents Through Key Word and Voice" Proc. Int. Workshop on Text, Speech and Dialogue, in Lecture Notes in Artificial Intelligence. 286-289 (1999)
- Description
  From the Final Research Report Summary (Japanese)
- Related Report
  1999 Final Research Report Summary
[Publications] 中川聖一: "A Survey on Automatic Speech Recognition" IEICE Trans. (in Japanese). Vol. J83-D-II, No. 2. 433-457 (2000)
- Description
  From the Final Research Report Summary (Japanese)
- Related Report
  1999 Final Research Report Summary
[Publications] 中川聖一: "Iwanami-Shoten" Chapter 5, Speech Recognition, in "Speech" (田窪, 前川, 本多, 白井, 中川) (in Japanese). 177-229 (1998)
- Description
  From the Final Research Report Summary (Japanese)
- Related Report
  1999 Final Research Report Summary
[Publications] 中川聖一: "Maruzen" Pattern Information Processing (in Japanese). 310 (1999)
- Description
  From the Final Research Report Summary (Japanese)
- Related Report
  1999 Final Research Report Summary
[Publications] K. Hanai, K. Yamamoto, N. Minematsu and S. Nakagawa: "Continuous Speech Recognition Using Segmental Unit Input HMMs with a Mixture of Probability Density Functions and Context Dependency" Proc. 5th Int. Conf. Spoken Language Processing. 2935-2938 (1998)
- Description
  From the Final Research Report Summary (English)
- Related Report
  1999 Final Research Report Summary
[Publications] S. Nakagawa, H. Akamatsu and H. Nishizaki: "A Task Adaptation Method and Use of Idiomatic Expressions in Stochastic Language Models for Speech Recognition" Journal of Natural Language Processing. Vol. 6, No. 2 (in Japanese). 97-115 (1999)
- Description
  From the Final Research Report Summary (English)
- Related Report
  1999 Final Research Report Summary
[Publications] A. Kai, Y. Hirose and S. Nakagawa: "Dealing with Out-of-vocabulary Words and Filled Pauses in a Word N-gram Based Speech Recognition System" Trans. IPSJ. Vol. 40, No. 4 (in Japanese). 1385-1394 (1999)
- Description
  From the Final Research Report Summary (English)
- Related Report
  1999 Final Research Report Summary
[Publications] M. Mikami, S. Masuyama and S. Nakagawa: "A Summarization Method by Reducing Redundancy of Each Sentence for Making Captions of Newscasting" Journal of Natural Language Processing. Vol. 6, No. 6 (in Japanese). 65-81 (1999)
- Description
  From the Final Research Report Summary (English)
- Related Report
  1999 Final Research Report Summary
[Publications] H. Nishizaki and S. Nakagawa: "A Retrieval System of Broadcast News Speech Documents Through Key Word and Voice" Proc. Int. Workshop on Text, Speech and Dialogue, in Lecture Notes in Artificial Intelligence, V. Matousek, P. Mautner, J. Ocelikova and P. Sojka (eds.), Springer. 286-289 (1999)
- Description
  From the Final Research Report Summary (English)
- Related Report
  1999 Final Research Report Summary
[Publications] S. Nakagawa: "A Survey on Automatic Speech Recognition" IEICE Trans. Vol. J83-D-II, No. 2 (in Japanese). 433-457 (2000)
- Description
  From the Final Research Report Summary (English)
- Related Report
  1999 Final Research Report Summary
[Publications] Seiichi Nakagawa: "Speech Recognition" (Chapter 5), in "Speech" Iwanami-Shoten (in Japanese). (1998)
- Description
  From the Final Research Report Summary (English)
- Related Report
  1999 Final Research Report Summary
[Publications] Seiichi Nakagawa: "Pattern Information Processing" Maruzen (in Japanese). (1999)
- Description
  From the Final Research Report Summary (English)
- Related Report
  1999 Final Research Report Summary
[Publications] K. Hanai, K. Yamamoto, N. Minematsu and S. Nakagawa: "Continuous speech recognition using segmental unit input HMMs with mixture of probability density functions and context dependency" Proc. 5th Int. Conf. Spoken Language Processing. 2935-2938 (1998)
- Related Report
  1999 Annual Research Report
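[Publications] 中川聖一, 赤松裕隆, 西崎博光: "A Task Adaptation Method and Use of Idiomatic Expressions in Language Models for Speech Recognition" Journal of Natural Language Processing (in Japanese). Vol. 6, No. 2. 97-115 (1999)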
- Related Report
  1999 Annual Research Report
[Publications] 甲斐充彦, 廣瀬良文, 中川聖一: "Dealing with Out-of-vocabulary Words and Filled Pauses in a Word N-gram Based Speech Recognition System" Trans. IPSJ (in Japanese). Vol. 40, No. 4. 1385-1394 (1999)
- Related Report
  1999 Annual Research Report
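[Publications] 三上真, 増山繁, 中川聖一: "A Summarization Method by Reducing Redundancy of Each Sentence for Making Captions of Newscasting" Journal of Natural Language Processing (in Japanese). Vol. 6, No. 6. 65-81 (1999)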
- Related Report
  1999 Annual Research Report
[Publications] H. Nishizaki and S. Nakagawa: "A Retrieval System of Broadcast News Speech Documents Through Key Word and Voice" Proc. Int. Workshop on Text, Speech and Dialogue, in Lecture Notes in Artificial Intelligence. 286-289 (1999)
- Related Report
  1999 Annual Research Report
[Publications] 中川聖一: "A Survey on Automatic Speech Recognition" IEICE Trans. (in Japanese). Vol. J83-D-II, No. 2. 433-457 (2000)
- Related Report
  1999 Annual Research Report
[Publications] 中川聖一: "Iwanami-Shoten" Chapter 5, Speech Recognition, in "Speech" (田窪, 前川, 本多, 白井, 中川) (in Japanese). 177-229 (1998)
- Related Report
  1999 Annual Research Report
[Publications] 中川聖一: "Maruzen" Pattern Information Processing (in Japanese). 310 (1999)
- Related Report
  1999 Annual Research Report
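[Publications] 甲斐充彦: "A Study of a Large-Vocabulary Continuous Speech Recognition System Using N-gram Language Models and an Efficient Search Method" IEICE Technical Report, Speech (in Japanese). SP97-99. 31-38 (1998)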
- Related Report
  1998 Annual Research Report
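[Publications] 山崎邦子: "News Text Summarization by Paraphrasing for Generating Captions for the Hearing Impaired" Proc. 4th Annual Meeting of the Association for Natural Language Processing (in Japanese). 646-649 (1998)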
- Related Report
  1998 Annual Research Report
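[Publications] 三上真: "News Summarization for Generating Captions for the Hearing Impaired by Combining Extraction of Important Sentence Parts with Paraphrasing" Workshop co-located with the 4th Annual Meeting of the Association for Natural Language Processing (in Japanese). (1998)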
- Related Report
  1998 Annual Research Report
[Publications] 赤松裕隆: "Evaluation of a Large-Vocabulary Continuous Speech Recognition System on Newspaper and News Tasks" IPSJ 57th National Convention (in Japanese). 6C-10. 2-35-2-36 (1998)
- Related Report
  1998 Annual Research Report
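[Publications] 中川聖一: "A Task Adaptation Method and Use of Idiomatic Expressions in Language Models for Speech Recognition" Journal of Natural Language Processing (in Japanese). Vol. 6, No. 2. 97-115 (1999)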
- Related Report
  1998 Annual Research Report
[Publications] 三上真: "Caption Generation by Summarization Using Recognition Results of News Speech" IPSJ 58th National Convention (in Japanese). 3W-5. (1999)
- Related Report
  1998 Annual Research Report
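[Publications] 甲斐充彦: "A Study of a Large-Vocabulary Continuous Speech Recognition System Using N-gram Language Models and an Efficient Search Method" IEICE Technical Report, Speech (in Japanese). SP97-99. 31-38 (1998)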
- Related Report
  1997 Annual Research Report
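[Publications] 西崎博光: "A Study of Language Models Using Idiomatic Expressions for Speech Recognition" Proc. 4th Annual Meeting of the Association for Natural Language Processing (in Japanese). (1998)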
- Related Report
  1997 Annual Research Report
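[Publications] 山崎邦子: "News Text Summarization by Paraphrasing for Generating Captions for the Hearing Impaired" Proc. 4th Annual Meeting of the Association for Natural Language Processing (in Japanese). (1998)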
- Related Report
  1997 Annual Research Report
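[Publications] 三上真: "News Text Summarization for Generating Captions for the Hearing Impaired by Combining Extraction of Important Sentence Parts with Paraphrasing" Workshop co-located with the 4th Annual Meeting of the Association for Natural Language Processing (in Japanese). (1998)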
- Related Report
  1997 Annual Research Report
