2015 Fiscal Year Annual Research Report

Research on an Electronic Note-Taking Support System Using Speech Processing and Language Processing Technologies

Research Project

Project/Area Number	26282049
Research Institution	University of Yamanashi
Principal Investigator	Hiromitsu Nishizaki, University of Yamanashi, Graduate Faculty of Interdisciplinary Research, Assistant Professor (40362082)
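Co-Investigator(Kenkyū-buntansha)	Tomoyosi Akiba, Toyohashi University of Technology, Graduate School of Engineering, Associate Professor (00356346); Norihide Kitaoka, Tokushima University, Institute of Technology and Science, Professor (10333501); Seiichi Nakagawa, Toyohashi University of Technology, Institute for the Promotion of Leading Graduate School Education, Specially Appointed Professor (20115893); Takehito Utsuro, University of Tsukuba, Graduate School of Systems and Information Engineering, Professor (90263433)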
Project Period (FY)	2014-04-01 – 2017-03-31
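Keywords	electronic notes / speech recognition in noise / speaker-independent speech recognition / keyword detection in speech / user interface / language model training support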
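Outline of Annual Research Achievements	This research aims to develop an electronic note-taking support system for classroom use, and to demonstrate both its support for note-taking and the learning effect of using the resulting electronic note content as e-learning material. The research comprises three items: (1) developing an electronic note-taking support system that can comprehensively record audio and video (still images); (2) developing fundamental speech processing and language processing technologies for recording lecture speech and making effective use of the created note content; and (3) demonstrating experimentally the learning effect of the system as e-learning support. In FY2015 we worked on items (1) and (2). For item (1), we improved the prototype of the electronic note-taking support system built in FY2014, based on its evaluation results. The evaluation carried out at the beginning of FY2015 confirmed the system's usefulness but showed that the user interface remained hard to use, so we mainly improved the interface, specifically the placement of buttons and images. For item (2), we worked on advancing the speech processing. First, as a speech recognition technique for noisy environments, we tried speech recognition that uses the shape of the lips in face images, which is unaffected by acoustic noise. Introducing a new feature that captures the lip shape yielded a large improvement in recognition rate over the conventional method. We also developed a speech recognition method for unspecified speakers of any age and gender: using speaker information as auxiliary information improved the acoustic model, and we showed that the method outperforms conventional ones. In addition, we developed a framework that lets a teacher semi-automatically build a language model suited to his or her own lectures, and showed that it substantially improves speech recognition performance. These component technologies are indispensable for the user interface of the electronic note-taking support system and for increasing the learning effect obtained with it.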
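Current Status of Research Progress	2: Research has progressed on the whole more than it was originally planned. Reason: The research is progressing largely on schedule. This is because we have largely completed an electronic note-taking support system that actually operates in conjunction with lectures, and because development of the fundamental speech and language processing technologies is advancing, including improved speech recognition rates in noisy conditions for teachers of various ages and genders, and the construction of speech recognition models suited to lecture content. A further reason is that using deep learning improved the performance of the technique for searching speech for specific keywords. The effect of these technologies has not yet been measured in actual lectures, but we demonstrated the effectiveness of each on evaluation speech data. These results have been presented at domestic and international conferences and in academic journals. We are also preparing to submit the FY2015 results to peer-reviewed international conferences and journals.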
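Strategy for Future Research Activity	In FY2016 (the final year), we will (1) continue developing and improving the fundamental speech processing technologies, and (2) confirm the learning effect by using the electronic note-taking support system equipped with these technologies in actual lectures and in lecture-video viewing. For (1), we plan to further advance the fundamental technologies developed through FY2015 (speech recognition in noise, speaker-independent speech recognition, support for training lecture-specific language models, spoken term detection (keyword detection), and spoken document retrieval), incorporating the latest deep learning techniques. For (2), we will first evaluate the electronic note-taking support system, largely completed by the end of FY2015, in subject experiments focused on the user interface, including ease of use. We also plan to incorporate the latest speech and language processing technologies developed so far into the system to make it more capable and easier to use. Once the evaluation and implementation are complete, we plan to verify the learning effect of the system by having subjects use it in lectures and study with lecture videos from e-learning materials.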
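Causes of Carryover	Because we were able to make efficient use of the existing research environment (equipment such as computers and mobile devices, data, reference materials, and so on), spending on goods was kept low. Because the subject experiments were carried over to FY2016, personnel spending for them was also lower. In addition, costs for submitted papers (publication fees and English proofreading fees) will be paid from the FY2016 budget, so spending in the "other" category was also lower.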
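Expenditure Plan for Carryover Budget	We intend to use the carryover for the following items: (1) accelerating the research through equipment investment, such as purchasing tablet devices for the subject experiments; (2) disseminating research results (domestic and international conferences and paper submissions); and (3) organizing the collected audio and video data.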

Research Products
(17 results)

Journal Article (2 results) (of which Peer Reviewed: 2 results, Open Access: 1 result, Acknowledgement Compliant: 1 result) / Presentation (12 results) (of which Int'l Joint Research: 7 results) / Book (2 results) / Remarks (1 result)

[Journal Article] A medical record creation support system using voice memos recorded on mobile devices (in Japanese) (2016)
- Author(s)
  西崎博光，胡桃澤圭佑，西崎香苗，池上仁志
- Journal Title
  IEICE Transactions on Information and Systems (Japanese Edition)
  Volume: J99-D Pages: 358-366
- DOI
  10.14923/transinfj.2015JDP7074
- Peer Reviewed / Acknowledgement Compliant
[Journal Article] Spoken Term Detection Using Spoken Document Index Based on Keyword Corrected from Automatic Speech Recognition Result (2015)
- Author(s)
  Kentaro Domoto, Takehito Utsuro, Naoki Sawada, Hiromitsu Nishizaki
- Journal Title
  International Journal of Signal Processing Systems
  Volume: 4 Pages: 282-288
- DOI
  10.18178/ijsps.4.4.282-288
- Peer Reviewed / Open Access
[Presentation] Design of a user interface for language model training for lecture speech recognition (in Japanese) (2016)
- Author(s)
  山田一星，西崎博光
- Organizer
  Proceedings of the Acoustical Society of Japan Spring Meeting
- Place of Presentation
  Toin University of Yokohama (Yokohama, Kanagawa)
- Year and Date
  2016-03-11 – 2016-03-11
[Presentation] A study of unsupervised sequential adaptation of convolutional neural networks (in Japanese) (2016)
- Author(s)
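  Hiroshi Seki, Kazumasa Yamamoto, Seiichi Nakagawa
- Organizer
  Proceedings of the Acoustical Society of Japan Spring Meeting
- Place of Presentation
  Toin University of Yokohama (Yokohama, Kanagawa)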
- Year and Date
  2016-03-11 – 2016-03-11
[Presentation] A graph-based document expansion method for vocabulary-independent spoken content retrieval (in Japanese) (2016)
- Author(s)
  川崎祥, 秋葉友良
- Organizer
  Proceedings of the Acoustical Society of Japan Spring Meeting
- Place of Presentation
  Toin University of Yokohama (Yokohama, Kanagawa)
- Year and Date
  2016-03-09 – 2016-03-09
[Presentation] Score Normalization using Phoneme-based Entropy for Spoken Term Detection (2015)
- Author(s)
  Hiromitsu Nishizaki, Naoki Sawada
- Organizer
  Proceedings of the 7th Asia-Pacific Signal and Information Processing Association Annual Summit and Conference 2015
- Place of Presentation
  Hong Kong
- Year and Date
  2015-12-19 – 2015-12-19
- Int'l Joint Research
[Presentation] Audio-visual speech recognition using deep bottleneck features and high-performance lipreading (2015)
- Author(s)
  Satoshi Tamura, Hiroshi Ninomiya, Norihide Kitaoka, Shin Osuga, Yurie Iribe, Kazuya Takeda, Satoru Hayamizu
- Organizer
  Proceedings of the 7th Asia-Pacific Signal and Information Processing Association Annual Summit and Conference 2015
- Place of Presentation
  Hong Kong
- Year and Date
  2015-12-19 – 2015-12-19
- Int'l Joint Research
[Presentation] Deep neural network based acoustic model using speaker-class information for short time utterance (2015)
- Author(s)
  H. Seki, K. Yamamoto, S. Nakagawa
- Organizer
  Proceedings of the 7th Asia-Pacific Signal and Information Processing Association Annual Summit and Conference 2015
- Place of Presentation
  Hong Kong
- Year and Date
  2015-12-19 – 2015-12-19
- Int'l Joint Research
[Presentation] A study of spoken term detection based on phoneme error patterns (in Japanese) (2015)
- Author(s)
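  Naoki Sawada, Hiromitsu Nishizaki
- Organizer
  Proceedings of the Acoustical Society of Japan Autumn Meeting
- Place of Presentation
  University of Aizu (Aizuwakamatsu, Fukushima)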
- Year and Date
  2015-09-16 – 2015-09-16
[Presentation] Best-matching STD using a classifier with keyword sets generated from recognition results (in Japanese) (2015)
- Author(s)
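  Kentaro Domoto, Takehito Utsuro, Naoki Sawada, Hiromitsu Nishizaki
- Organizer
  Proceedings of the Acoustical Society of Japan Autumn Meeting
- Place of Presentation
  University of Aizu (Aizuwakamatsu, Fukushima)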
- Year and Date
  2015-09-16 – 2015-09-16
[Presentation] Robust speech recognition using DNN-HMM acoustic model combining noise-aware training with spectral subtraction (2015)
- Author(s)
  A. Abe, K. Yamamoto, S. Nakagawa
- Organizer
  Proceedings of the 16th Annual Conference of the International Speech Communication Association (INTERSPEECH2015)
- Place of Presentation
  Dresden, Germany
- Year and Date
  2015-09-09 – 2015-09-09
- Int'l Joint Research
[Presentation] Two-Step Spoken Term Detection using SVM Classifier Trained with Pre-Indexed Keywords based on ASR Result (2015)
- Author(s)
  Kentaro Domoto, Takehito Utsuro, Naoki Sawada, Hiromitsu Nishizaki
- Organizer
  Proceedings of the 16th Annual Conference of the International Speech Communication Association (INTERSPEECH2015)
- Place of Presentation
  Dresden, Germany
- Year and Date
  2015-09-07 – 2015-09-07
- Int'l Joint Research
[Presentation] Integration of Deep Bottleneck Features for Audio-Visual Speech Recognition (2015)
- Author(s)
  Hiroshi Ninomiya, Norihide Kitaoka, Satoshi Tamura, Yurie Iribe, Kazuya Takeda
- Organizer
  Proceedings of the 16th Annual Conference of the International Speech Communication Association (INTERSPEECH2015)
- Place of Presentation
  Dresden, Germany
- Year and Date
  2015-09-07 – 2015-09-07
- Int'l Joint Research
[Presentation] Incorporating Text Information on Presentation Slides for Spoken Lecture Retrieval (2015)
- Author(s)
  Kosuke Yamauchi, Tomoyosi Akiba
- Organizer
  Proceedings of International Conference on Advanced Informatics: Concepts, Theory and Applications (ICAICTA2015)
- Place of Presentation
  Chonburi Province, Thailand
- Year and Date
  2015-08-19 – 2015-08-19
- Int'l Joint Research
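[Book] Acoustics Keyword Book, edited by the Acoustical Society of Japan (in Japanese) (2016)
- Author(s)
  Acoustical Society of Japan (ed.), Tomoyosi Akiba, and many others
- Total Pages
  494
- Publisher
  Corona Publishing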
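[Book] Life-Support Robot Technologies Required for the Coming Super-Aging Society (in Japanese) (2015)
- Author(s)
  Kazuhiko Terashima (supervising ed.), Seiichi Nakagawa, and many others
- Total Pages
  609
- Publisher
  Johokiko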
[Remarks] http://www.alps-lab.org/
