2016 Fiscal Year Annual Research Report

Study on an Electronic Note-Taking Support System using Speech and Language Processing Technologies

Research Project

Project/Area Number	26282049
Research Institution	University of Yamanashi
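Principal Investigator	Hiromitsu Nishizaki, University of Yamanashi, Graduate Faculty of Interdisciplinary Research, Associate Professor (40362082)
Co-Investigator(Kenkyū-buntansha)	Tomoyosi Akiba, Toyohashi University of Technology, Graduate School of Engineering, Associate Professor (00356346); Norihide Kitaoka, Tokushima University, Graduate School of Technology, Industrial and Social Sciences, Professor (10333501); Seiichi Nakagawa, Toyohashi University of Technology, Leading Graduate School Education Promotion Organization, Specially Appointed Professor (20115893); Takehito Utsuro, University of Tsukuba, Graduate School of Systems and Information Engineering, Professor (90263433)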
Project Period (FY)	2014-04-01 – 2017-03-31
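Keywords	electronic note-taking support system / learning support / speech recognition / noise reduction / speech recognition error correction / spoken document retrieval / spoken term detection
Outline of Annual Research Achievements	This research aims to develop an electronic note-taking support system and to demonstrate its learning effect. The specific research items are: (1) development of an electronic note-taking support system that can comprehensively record audio and video (still images); (2) development of fundamental speech and language processing technologies for making effective use of recorded lecture audio and note content; and (3) demonstration experiments on the learning effect of the system. In the final year, we continued to improve the technologies for (1) and (2) and carried out demonstration experiments with the completed electronic notebook. For (1), we improved the prototype electronic note-taking support system built by FY2015 on the basis of its evaluation results. We also implemented the fundamental technologies of (2) and fixed bugs in preparation for the demonstration experiments. For (2), we developed a deep-learning-based noise reduction method for speech recognition in noisy conditions. Because the proposed method was confirmed to halve speech recognition errors, we implemented it in the electronic note system. For further robustness to noise, we developed a method that uses deep learning to integrate features extracted separately from video and audio, which raised speech recognition accuracy. To advance speech recognition itself, we developed a deep-learning-based filtering technique for speech processing, including a speaker-dependent filter that can be trained on a small amount of data, and confirmed that it improves the speech recognition rate. We also developed a deep-learning-based technique for automatically correcting speech recognition errors and achieved high recognition accuracy (a phoneme recognition rate of 98%). For note-taking support, we extended conventional spoken term detection into a spoken content retrieval method that uses detection results in order of confidence and presents retrieval results progressively. Finally, we used the developed system in actual lectures and ran demonstration experiments on its learning effect. The evaluation showed that, compared with conventional note-taking on paper, exam scores were better and the learned material was more readily retained in memory.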
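Research Progress Status	Not entered, because FY2016 (Heisei 28) is the final year.
Strategy for Future Research Activity	Not entered, because FY2016 (Heisei 28) is the final year.
Causes of Carryover	Not entered, because FY2016 (Heisei 28) is the final year.
Expenditure Plan for Carryover Budget	Not entered, because FY2016 (Heisei 28) is the final year.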

Research Products
(20 results)

Journal Article (4 results) (of which Peer Reviewed: 3 results, Acknowledgement Compliant: 2 results) Presentation (15 results) (of which Int'l Joint Research: 7 results, Invited: 1 result) Remarks (1 result)

[Journal Article] Re-Ranking Approach of Spoken Term Detection using Conditional Random Fields-based Triphone Detection (2016)
- Author(s)
  Naoki Sawada, Hiromitsu Nishizaki
- Journal Title
  IEICE Transactions on Information & Systems
  Volume: E99-D Pages: 2518-2527
- DOI
  10.1587/transinf.2016SLP0012
- Peer Reviewed / Acknowledgement Compliant
[Journal Article] Spoken Term Detection using SVM-based Classifier Trained with Pre-indexed Keywords (2016)
- Author(s)
  Kentaro Domoto, Takehito Utsuro, Naoki Sawada, Hiromitsu Nishizaki
- Journal Title
  IEICE Transactions on Information & Systems
  Volume: E99-D Pages: 2528-2538
- DOI
  10.1587/transinf.2016SLP0017
- Peer Reviewed / Acknowledgement Compliant
[Journal Article] Investigation of DNN-based audio-visual speech recognition (2016)
- Author(s)
  Satoshi Tamura, Hiroshi Ninomiya, Norihide Kitaoka, Shin Osuga, Yurie Iribe, Kazuya Takeda
- Journal Title
  IEICE Transactions on Information & Systems
  Volume: E99-D Pages: 2444-2451
- DOI
  10.1587/transinf.2016SLP0019
- Peer Reviewed
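[Journal Article] Where Is Speech Research Headed? Speech Science, Speech Engineering, Speech Technology, and Speech Neuroscience (音声研究は今後どこに向かうのであろうか？－音声科学、音声工学、音声技術、そして音声脳科学－) (2016)
- Author(s)
  Seiichi Nakagawa
- Journal Title
  Journal of the Acoustical Society of Japan (日本音響学会誌)
  Volume: 72 Pages: 172-172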
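[Presentation] Verification of the Learning Effect of Using an Electronic Note-Taking Support System (電子ノート作成支援システムの利用が及ぼす学習効果の検証) (2017)
- Author(s)
  成田陽介, Hiromitsu Nishizaki
- Organizer
  Proceedings of the 79th National Convention of the Information Processing Society of Japan (IPSJ)
- Place of Presentation
  Nagoya University (Nagoya, Aichi)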
- Year and Date
  2017-03-16 – 2017-03-18
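[Presentation] Correct Phoneme Estimation using Bidirectional Recurrent Neural Networks for Spoken Term Detection (音声中の検索語検出のための双方向回帰結合ニューラルネットワークを用いた正解音素推定) (2017)
- Author(s)
  Naoki Sawada, Hiromitsu Nishizaki
- Organizer
  Acoustical Society of Japan 2017 Spring Meeting
- Place of Presentation
  Meiji University (Kawasaki, Kanagawa)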
- Year and Date
  2017-03-14 – 2017-03-16
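[Presentation] Spoken Content Retrieval with a Probabilistic Retrieval Model Using Acoustic and Linguistic Features of Spontaneously Spoken Queries (自然発話クエリの音響・言語特徴を利用した確率的検索モデルによる音声内容検索) (2017)
- Author(s)
  田崎広人, Tomoyosi Akiba
- Organizer
  Acoustical Society of Japan 2017 Spring Meeting
- Place of Presentation
  Meiji University (Kawasaki, Kanagawa)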
- Year and Date
  2017-03-14 – 2017-03-16
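[Presentation] A Study of Speaker-Class Adaptation by Retraining a DNN-based Filterbank (DNNに基づくフィルタバンクの再学習による話者クラス適応の検討) (2017)
- Author(s)
  Hiroshi Seki, Kazumasa Yamamoto, Tomoyosi Akiba, Seiichi Nakagawa
- Organizer
  Acoustical Society of Japan 2017 Spring Meeting
- Place of Presentation
  Meiji University (Kawasaki, Kanagawa)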
- Year and Date
  2017-03-14 – 2017-03-16
[Presentation] A deep neural network integrated with filterbank learning for speech recognition (2017)
- Author(s)
  Hiroshi Seki, Kazumasa Yamamoto, Seiichi Nakagawa
- Organizer
  ICASSP2017
- Place of Presentation
  New Orleans, USA
- Year and Date
  2017-03-05 – 2017-03-09
- Int'l Joint Research
[Presentation] Taking advantage of spontaneous speech for document retrieval: Lessons from the organization of evaluation tasks (2016)
- Author(s)
  Tomoyosi Akiba
- Organizer
  The 5th Joint meeting of ASA/ASJ
- Place of Presentation
  Honolulu, USA
- Year and Date
  2016-11-28 – 2016-12-02
- Int'l Joint Research / Invited
[Presentation] Correct phoneme sequence estimation using recurrent neural network for spoken term detection (2016)
- Author(s)
  Naoki Sawada, Hiromitsu Nishizaki
- Organizer
  The 5th Joint meeting of ASA/ASJ
- Place of Presentation
  Honolulu, USA
- Year and Date
  2016-11-28 – 2016-12-02
- Int'l Joint Research
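[Presentation] A Study of DNN-based Filterbank Learning for Speech Recognition (音声認識のためのDNNに基づくフィルタバンクの学習の検討) (2016)
- Author(s)
  Hiroshi Seki, Kazumasa Yamamoto, Seiichi Nakagawa
- Organizer
  Acoustical Society of Japan 2016 Autumn Meeting
- Place of Presentation
  University of Toyama (Toyama, Toyama)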
- Year and Date
  2016-09-14 – 2016-09-16
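[Presentation] Progressive Content Retrieval of Spoken Documents Based on Distance-Ordered Spoken Term Detection (距離順音声検索語検出に基づく音声ドキュメントの漸進的内容検索) (2016)
- Author(s)
  河谷浩志, 大野哲平, Tomoyosi Akiba
- Organizer
  Acoustical Society of Japan 2016 Autumn Meeting
- Place of Presentation
  University of Toyama (Toyama, Toyama)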
- Year and Date
  2016-09-14 – 2016-09-16
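[Presentation] A Study of Real-Time Keyword Spotting Using a Phoneme Transition Network (音素遷移ネットワークを用いたリアルタイムキーワードスポッティングの検討) (2016)
- Author(s)
  中村卓磨, Naoki Sawada, Hiromitsu Nishizaki
- Organizer
  Acoustical Society of Japan 2016 Autumn Meeting
- Place of Presentation
  University of Toyama (Toyama, Toyama)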
- Year and Date
  2016-09-14 – 2016-09-16
[Presentation] Recurrent Neural Network-based Phoneme Sequence Estimation using Multiple ASR Systems' Outputs for Spoken Term Detection (2016)
- Author(s)
  Naoki Sawada, Hiromitsu Nishizaki
- Organizer
  INTERSPEECH2016
- Place of Presentation
  San Francisco, USA
- Year and Date
  2016-09-08 – 2016-09-12
- Int'l Joint Research
[Presentation] Overview of the NTCIR-12 SpokenQuery&Doc-2 Task (2016)
- Author(s)
  Tomoyosi Akiba, Hiromitsu Nishizaki, Hiroaki Nanjo and Gareth J. F. Jones
- Organizer
  The 12th NTCIR Conference
- Place of Presentation
  National Center of Sciences (Chuo-ku, Tokyo)
- Year and Date
  2016-06-07 – 2016-06-10
- Int'l Joint Research
[Presentation] Evaluation of DNN-based Phoneme Estimation Approach on the NTCIR-12 SpokenQuery&Doc-2 SQ-STD Subtask (2016)
- Author(s)
  Naoki Sawada, Hiromitsu Nishizaki
- Organizer
  The 12th NTCIR Conference
- Place of Presentation
  National Center of Sciences (Chuo-ku, Tokyo)
- Year and Date
  2016-06-07 – 2016-06-10
- Int'l Joint Research
[Presentation] Graph-based Document Expansion and Robust SCR Models for False Positives: Experiments at the NTCIR-12 SpokenQuery&Doc-2 (2016)
- Author(s)
  Sho Kawasaki, Hiroshi Oshima, Tomoyosi Akiba
- Organizer
  The 12th NTCIR Conference
- Place of Presentation
  National Center of Sciences (Chuo-ku, Tokyo)
- Year and Date
  2016-06-07 – 2016-06-10
- Int'l Joint Research
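[Presentation] Correct Phoneme Estimation using Recurrent Neural Networks for Spoken Term Detection (音声中の検索語検出のための回帰結合ニューラルネットワークを用いた正解音素推定) (2016)
- Author(s)
  Naoki Sawada, Hiromitsu Nishizaki
- Organizer
  The 111th Meeting of the Special Interest Group on Spoken Language Processing (SIG-SLP)
- Place of Presentation
  Tokyo Institute of Technology (Meguro-ku, Tokyo)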
- Year and Date
  2016-05-16 – 2016-05-17
[Remarks] Electronic Note-Taking Support System
- URL
  http://www.alps-lab.org/kikimimi/dian_zinoto.html
