2015 Fiscal Year Annual Research Report

Caption Generation Based on Integrated Modeling of Speech Recognition and Automatic Formatting

Research Project

Project/Area Number	25730112
Research Institution	Kyoto University
Principal Investigator	Yuya Akita, Kyoto University, Graduate School of Economics, Lecturer (90402742)
Project Period (FY)	2013-04-01 – 2016-03-31
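Keywords	speech recognition / automatic formatting / spoken language / captions
Outline of Annual Research Achievements	Spoken language in lectures and talks contains redundant and colloquial expressions. When speech recognition is used for captioning, the usual approach is therefore to have the recognizer cover expressions specific to spoken language, and then to format the redundant and colloquial expressions in the recognition result into readable text. This research treats these two steps as a transformation into spoken style and its inverse, builds a model of the characteristics of spoken language, and realizes speech recognition and automatic formatting on the basis of that model. Targeting captions for lectures and talks, the research builds a caption generation and delivery system based on this processing chain and evaluates its performance. So far, data have been collected to verify and refine these processes (models). In relation to speech recognition, language model adaptation methods were also examined, because adaptation to the topic is essential for the lectures and talks targeted here. For the caption generation and delivery system, various improvements were made, including to the speech recognition and automatic formatting models and methods, the editing environment, and system responsiveness, and the system began operating as a production system. In FY2015, operation and evaluation on real speech continued, and the system, which had so far centered on offline processing, was extended to real-time captioning, with methods and a framework for this extension examined. The system was also used to provide real-time captions at an actual academic conference venue. Parts of this work have been presented at international conferences and domestic workshops.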

Research Products
(3 results)


All Presentation (3 results) (of which Int'l Joint Research: 2 results)

[Presentation] Automatic classification of usability of ASR result for real-time captioning of lectures (2015)
- Author(s)
  Yuya Akita, Nobuhiro Kuwahara, Tatsuya Kawahara
- Organizer
  APSIPA ASC
- Place of Presentation
  Hong Kong, China
- Year and Date
  2015-12-16 – 2015-12-19
- Int'l Joint Research
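[Presentation] A caption creation and editing system for lectures and talks using speech recognition (2015)
- Author(s)
  Yuya Akita, Masato Mimura, Tatsuya Kawahara
- Organizer
  IPSJ Special Interest Group on Spoken Language Processing (情報処理学会音声言語情報処理研究会)
- Place of Presentation
  Waseda University (Shinjuku, Tokyo)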
- Year and Date
  2015-10-30
[Presentation] Language model adaptation for academic lectures using character recognition result of presentation slides (2015)
- Author(s)
  Yuya Akita, Yizheng Tong, Tatsuya Kawahara
- Organizer
  IEEE-ICASSP
- Place of Presentation
  Brisbane, Australia
- Year and Date
  2015-04-19 – 2015-04-24
- Int'l Joint Research
