2016 Fiscal Year Annual Research Report

Research on Multimodal Silent Speech Recognition Technology

Research Project

Project/Area Number	16H03211
Research Institution	Kyushu Institute of Technology
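Principal Investigator	齊藤剛史 (Takeshi Saitoh), Kyushu Institute of Technology, Faculty of Computer Science and Systems Engineering, Associate Professor (10379654)
Co-Investigator(Kenkyū-buntansha)	田村哲嗣 (Satoshi Tamura), Gifu University, Faculty of Engineering, Assistant Professor (10402215); 桂田浩一 (Kouichi Katsurada), Tokyo University of Science, Faculty of Science and Technology, Department of Information Sciences, Associate Professor (80324490); 速水悟 (Satoru Hayamizu), Gifu University, Faculty of Engineering, Professor (90345794); 永井秀利 (Hidetoshi Nagai), Kyushu Institute of Technology, Faculty of Computer Science and Systems Engineering, Assistant Professor (60237485); 山崎敏正 (Toshimasa Yamazaki), Kyushu Institute of Technology, Faculty of Computer Science and Systems Engineering, Professor (50392163)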
Project Period (FY)	2016-04-01 – 2020-03-31
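Keywords	Human interface
Outline of Annual Research Achievements	1. In accordance with the implementation plan, the research group planned and ran a workshop (the 3rd Silent Speech Recognition Workshop) on Friday, October 14 and Saturday, October 15, 2016, in a rental conference room at the Fukuoka Asahi Building (Fukuoka, Fukuoka Prefecture). There were 17 presentations (including one special lecture) and 45 participants, and researchers from this group also attended and presented. As planned, an award scheme was set up, and one student received the Student Encouragement Award. The workshop closed with a business meeting to discuss future research directions. 2. In connection with this research, the principal investigator served as an organizer of the 13th Asian Conference on Computer Vision (ACCV2016) workshop on Multi-view Lip-reading/Audio-visual Challenges (MLAC2016) and held this international workshop. Prof. Gerasimos Potamianos of the University of Thessaly (Greece), a prominent researcher in audio-visual speech recognition, was invited to give an invited talk. Two members of this research group also presented their results. 3. The six researchers in this group pursued research on individual modalities (color images, audio, depth images, surface electromyography signals, and EEG) as well as on bimodal and other themes, producing 2 journal articles, 11 international conference presentations, and 19 domestic conference presentations. 4. The principal investigator is preparing the construction of a multimodal corpus.
Current Status of Research Progress	2: Research has progressed on the whole more than it was originally planned. Reason: The implementation plan covered only a domestic workshop, but the research group was also involved in an international workshop. Special lectures were given by an invited hearing-impaired speaker at the domestic workshop and by an invited researcher at the international workshop. The six researchers in this group advanced their respective research themes and presented results in Japan and abroad.
Strategy for Future Research Activity	The workshop will continue to be held. The FY2017 workshop is being planned for Gifu. As in FY2016, it will include a special lecture and a Student Encouragement Award. Construction of the multimodal corpus will proceed.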

Research Products
(35 results)


Int'l Joint Research (1 result); Journal Article (2 results, of which Peer Reviewed: 2 results); Presentation (30 results, of which Int'l Joint Research: 11 results); Remarks (1 result); Funded Workshop (1 result)

[Int'l Joint Research] University of Oulu (Finland)
- Country Name
  Finland
- Counterpart Institution
  University of Oulu
[Journal Article] Potential association between modular structures of brain functional connectivity networks and individual variability in foreign language ability (2017)
- Author(s)
  山﨑敏正，秋山暉佳，副島英子，山本宇彦
- Journal Title
  Cognitive Studies (認知科学)
  Volume: 24 Pages: 118-128
- Peer Reviewed
[Journal Article] Investigation of DNN-based audio-visual speech recognition (2016)
- Author(s)
  Satoshi Tamura, Hiroshi Ninomiya, Norihide Kitaoka, Shin Osuga, Yurie Iribe, Kazuya Takeda, Satoru Hayamizu
- Journal Title
  IEICE Transactions on Information and Systems
  Volume: E99-D Pages: 2444-2451
- DOI
  10.1587/transinf.2016SLP0019
- Peer Reviewed
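[Presentation] EEG analysis during Japanese syllable speech and recall (2017)
- Author(s)
  浅原康平，中根丈司，神崎卓丸，桂田浩一，杉本俊二，新田恒雄，堀川順生
- Organizer
  Proceedings of the 2017 Spring Meeting of the Acoustical Society of Japan
- Place of Presentation
  Meiji University, Kawasaki, Kanagawa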
- Year and Date
  2017-03-15 – 2017-03-17
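[Presentation] Comparison of Japanese short-syllable recognition from EEG during speech and recall (2017)
- Author(s)
  神崎卓丸，浅原康平，中根丈司，桂田浩一，杉本俊二，堀川順生，新田恒雄
- Organizer
  Proceedings of the 2017 Spring Meeting of the Acoustical Society of Japan
- Place of Presentation
  Meiji University, Kawasaki, Kanagawa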
- Year and Date
  2017-03-15 – 2017-03-17
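[Presentation] A study on optimizing keyword segmentation in fast STD using a suffix array (2017)
- Author(s)
  桂田浩一
- Organizer
  Proceedings of the 2017 Spring Meeting of the Acoustical Society of Japan
- Place of Presentation
  Meiji University, Kawasaki, Kanagawa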
- Year and Date
  2017-03-15 – 2017-03-17
[Presentation] Face-orientation-independent speech recognition using a symmetric 3D-AAM of facial images (2017)
- Author(s)
  渡辺拓也，桂田浩一，金澤靖
- Organizer
  IEICE Technical Report
- Place of Presentation
  Kyoto University, Kyoto, Kyoto
- Year and Date
  2017-01-19 – 2017-01-20
[Presentation] Dynamic generation of thesaurus from plaintext using deep learning (2016)
- Author(s)
  K. Matsushita and T. Yamazaki
- Organizer
  4th International Symposium on Applied Engineering and Sciences (SAES2016)
- Place of Presentation
  Kyushu Institute of Technology, Kitakyushu, Fukuoka
- Year and Date
  2016-12-17 – 2016-12-18
- Int'l Joint Research
[Presentation] Denoising of scalp-recorded EEGs using higher-order statistics for BCI (2016)
- Author(s)
  N. Toshima and T. Yamazaki
- Organizer
  4th International Symposium on Applied Engineering and Sciences (SAES2016)
- Place of Presentation
  Kyushu Institute of Technology, Kitakyushu, Fukuoka
- Year and Date
  2016-12-17 – 2016-12-18
- Int'l Joint Research
[Presentation] A method for constructing brain functional connectivity network based on synchronization likelihood of scalp-recorded EEGs (2016)
- Author(s)
  A. Watanabe and T. Yamazaki
- Organizer
  4th International Symposium on Applied Engineering and Sciences (SAES2016)
- Place of Presentation
  Kyushu Institute of Technology, Kitakyushu, Fukuoka
- Year and Date
  2016-12-17 – 2016-12-18
- Int'l Joint Research
[Presentation] EEG during Japanese syllable recall and speech tasks (2016)
- Author(s)
  Kohei Asahara, Jozi Nakane, Takumaru Kanzaki, Shunji Sugimoto, Kouichi Katsurada, Tsuneo Nitta, and Junsei Horikawa
- Organizer
  The 3rd Annual Meeting of the Society for Bioacoustics
- Place of Presentation
  Irago Sea-Park & Spa, Tahara, Aichi
- Year and Date
  2016-12-10 – 2016-12-11
- Int'l Joint Research
[Presentation] Japanese monosyllable recognition from EEG (2016)
- Author(s)
  Takumaru Kanzaki, Shunji Sugimoto, Kouichi Katsurada, Junsei Horikawa, and Tsuneo Nitta
- Organizer
  The 3rd Annual Meeting of the Society for Bioacoustics
- Place of Presentation
  Irago Sea-Park & Spa, Tahara, Aichi
- Year and Date
  2016-12-10 – 2016-12-11
- Int'l Joint Research
[Presentation] Lip Reading from Multi View Facial Images Using 3D-AAM (2016)
- Author(s)
  Takuya Watanabe, Kouichi Katsurada, and Yasushi Kanazawa
- Organizer
  ACCV2016 workshop: Multi-view Lip-reading/Audio-visual Challenges (MLAC2016)
- Place of Presentation
  TICC, Taipei, Taiwan
- Year and Date
  2016-11-20 – 2016-11-24
- Int'l Joint Research
[Presentation] Concatenated Frame Image based CNN for Visual Speech Recognition (2016)
- Author(s)
  Takeshi Saitoh, Ziheng Zhou, Guoying Zhao, and Matti Pietikainen
- Organizer
  ACCV2016 workshop: Multi-view Lip-reading/Audio-visual Challenges (MLAC2016)
- Place of Presentation
  TICC, Taipei, Taiwan
- Year and Date
  2016-11-20 – 2016-11-24
- Int'l Joint Research
[Presentation] Continuous Inaudible Recognition of Japanese Vowels Using Features Detected at Lips Shape Peaks based on Surface Electromyography (2016)
- Author(s)
  Nao Kurogi, Hidetoshi Nagai, Teigo Nakamura
- Organizer
  3rd International Conference on Systems and Informatics (ICSAI 2016)
- Place of Presentation
  Shanghai, China
- Year and Date
  2016-11-19 – 2016-11-21
- Int'l Joint Research
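[Presentation] Introduction to public databases for lip reading (2016)
- Author(s)
  窪川美智子，齊藤剛史
- Organizer
  6th Symposium on Biometrics, Recognition and Authentication (SBRA2016)
- Place of Presentation
  Shibaura Institute of Technology, Koto, Tokyo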
- Year and Date
  2016-11-16 – 2016-11-17
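[Presentation] Lip reading by CNN using concatenated frame images (2016)
- Author(s)
  齊藤剛史
- Organizer
  6th Symposium on Biometrics, Recognition and Authentication (SBRA2016)
- Place of Presentation
  Shibaura Institute of Technology, Koto, Tokyo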
- Year and Date
  2016-11-16 – 2016-11-17
[Presentation] Development of audio-visual speech corpus toward speaker-independent Japanese LVCSR (2016)
- Author(s)
  Kazuto Ukai, Satoshi Tamura and Satoru Hayamizu
- Organizer
  Oriental COCOSDA 2016
- Place of Presentation
  Grand Aston Benoa, Bali, Indonesia
- Year and Date
  2016-10-26
- Int'l Joint Research
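[Presentation] Continuous inaudible vowel recognition based on lip shape peaks with correlation parameters between consecutive sounds (2016)
- Author(s)
  黒木菜緒，永井秀利，中村貞吾
- Organizer
  3rd Silent Speech Recognition Workshop
- Place of Presentation
  Fukuoka Asahi Building, Fukuoka, Fukuoka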
- Year and Date
  2016-10-14 – 2016-10-15
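[Presentation] A proposal for an inaudible speech recognition method that exploits the characteristics of small-vocabulary domains (2016)
- Author(s)
  永井秀利，渡邉優実，中村貞吾
- Organizer
  3rd Silent Speech Recognition Workshop
- Place of Presentation
  Fukuoka Asahi Building, Fukuoka, Fukuoka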
- Year and Date
  2016-10-14 – 2016-10-15
[Presentation] Japanese monosyllable recognition from EEG signals (2016)
- Author(s)
  新田恒雄，神崎卓丸，堀川順生，杉本俊二，浅原康平，中根丈司，中澤香太，桂田浩一
- Organizer
  3rd Silent Speech Recognition Workshop
- Place of Presentation
  Fukuoka Asahi Building, Fukuoka, Fukuoka
- Year and Date
  2016-10-14 – 2016-10-15
[Presentation] Lip reading by CNN using concatenated frame images (2016)
- Author(s)
  齊藤剛史
- Organizer
  3rd Silent Speech Recognition Workshop
- Place of Presentation
  Fukuoka Asahi Building, Fukuoka, Fukuoka
- Year and Date
  2016-10-14 – 2016-10-15
[Presentation] Evaluation of face-orientation-independent lip reading using 3DAAM (2016)
- Author(s)
  渡辺拓也，桂田浩一，金澤靖
- Organizer
  3rd Silent Speech Recognition Workshop
- Place of Presentation
  Fukuoka Asahi Building, Fukuoka, Fukuoka
- Year and Date
  2016-10-14 – 2016-10-15
[Presentation] Data collection toward large-vocabulary multimodal speech recognition (2016)
- Author(s)
  鵜飼和渡，田村哲嗣，速水悟
- Organizer
  3rd Silent Speech Recognition Workshop
- Place of Presentation
  Fukuoka Asahi Building, Fukuoka, Fukuoka
- Year and Date
  2016-10-14 – 2016-10-15
[Presentation] Multimodal lip reading using lip images and depth information (2016)
- Author(s)
  宮崎晃一，田村哲嗣，速水悟
- Organizer
  3rd Silent Speech Recognition Workshop
- Place of Presentation
  Fukuoka Asahi Building, Fukuoka, Fukuoka
- Year and Date
  2016-10-14 – 2016-10-15
[Presentation] Voice activity detection with deep bottleneck features using image information (2016)
- Author(s)
  田村哲嗣，藤原正希，速水悟
- Organizer
  3rd Silent Speech Recognition Workshop
- Place of Presentation
  Fukuoka Asahi Building, Fukuoka, Fukuoka
- Year and Date
  2016-10-14 – 2016-10-15
[Presentation] Conference report (INTERSPEECH2016) (2016)
- Author(s)
  田村哲嗣
- Organizer
  3rd Silent Speech Recognition Workshop
- Place of Presentation
  Fukuoka Asahi Building, Fukuoka, Fukuoka
- Year and Date
  2016-10-14 – 2016-10-15
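[Presentation] Effect of introducing correlation parameters between consecutive sounds into continuous inaudible vowel recognition based on lip shape peaks (2016)
- Author(s)
  黒木菜緒，永井秀利，中村貞吾
- Organizer
  Proceedings of the 69th Joint Conference of Electrical, Electronics and Information Engineers in Kyushu
- Place of Presentation
  University of Miyazaki, Miyazaki, Miyazaki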
- Year and Date
  2016-09-29 – 2016-09-30
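[Presentation] EEG analysis during Japanese monosyllable speech and recall (2016)
- Author(s)
  浅原康平，中根丈司，神崎卓丸，中澤香太，桂田浩一，杉本俊二，新田恒雄，堀川順生
- Organizer
  Proceedings of the 2016 Autumn Meeting of the Acoustical Society of Japan
- Place of Presentation
  University of Toyama, Toyama, Toyama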
- Year and Date
  2016-09-14 – 2016-09-16
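[Presentation] A study of methods for Japanese monosyllable recognition from EEG (2016)
- Author(s)
  神崎卓丸，浅原康平，中根丈司，中澤香太，桂田浩一，杉本俊二，堀川順生，新田恒雄
- Organizer
  Proceedings of the 2016 Autumn Meeting of the Acoustical Society of Japan
- Place of Presentation
  University of Toyama, Toyama, Toyama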
- Year and Date
  2016-09-14 – 2016-09-16
[Presentation] Prognostic feature extraction in colorectal cancer by combining the gene expression data and the clinical data (2016)
- Author(s)
  A. Kitajima et al.
- Organizer
  15th European Conference on Computational Biology
- Place of Presentation
  World Forum Convention Center, The Hague, Netherlands
- Year and Date
  2016-09-03 – 2016-09-07
- Int'l Joint Research
[Presentation] Identification of distinct subtypes in colorectal cancer with the survival and the primary sites (2016)
- Author(s)
  Y. Hosokawa et al.
- Organizer
  15th European Conference on Computational Biology
- Place of Presentation
  World Forum Convention Center, The Hague, Netherlands
- Year and Date
  2016-09-03 – 2016-09-07
- Int'l Joint Research
[Presentation] Scene recognition by CNN using concatenated frame images (2016)
- Author(s)
  齊藤剛史，Ziheng Zhou，Iryna Anina，Guoying Zhao，Matti Pietikainen
- Organizer
  19th Meeting on Image Recognition and Understanding (MIRU2016)
- Place of Presentation
  Act City Hamamatsu, Hamamatsu, Shizuoka
- Year and Date
  2016-08-01 – 2016-08-04
[Remarks] 3rd Silent Speech Recognition Workshop
- URL
  http://www.slab.ces.kyutech.ac.jp/SSRW2016/
[Funded Workshop] ACCV2016 workshop on Multi-view Lip-reading/Audio-visual Challenges (MLAC2016) (2016)
- Place of Presentation
  TICC, Taipei, Taiwan
- Year and Date
  2016-11-21 – 2016-11-21
