2021 Fiscal Year Annual Research Report

Multi-Modal Speech Enhancement Using Mobile Device

Research Project

Project/Area Number	19K12905
Research Institution	Osaka Institute of Technology
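Principal Investigator	Kenji Matsui (松井謙二), Osaka Institute of Technology, Faculty of Robotics and Design, Professor (30613682)
Co-Investigator(Kenkyū-buntansha)	Yoshihisa Nakatoh (中藤良久), Kyushu Institute of Technology, Faculty of Engineering, Professor (10599955); Yumiko Kato (加藤弓子), St. Marianna University School of Medicine, School of Medicine, Researcher (10600463); Mitsunori Mizumachi (水町光徳), Kyushu Institute of Technology, Faculty of Engineering, Associate Professor (90380740)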
Project Period (FY)	2019-04-01 – 2022-03-31
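Keywords	Speech assistance / Artificial larynx / Lip image recognition / Mobile device / Depth image / Consonant recognition
Outline of Annual Research Achievements	We are developing a lip-reading-based speech assistance device for laryngectomees. By FY2020 we had developed a lip-reading algorithm for word and phrase recognition that recognized 19 of 20 words (95%) within the top three candidates. We also built a mobile app around this algorithm and evaluated its usability under conditions close to practical use, obtaining better results than with a conventional electrolarynx. However, in a recognition experiment with seven healthy speakers, accuracy varied widely and was only about 60% within the top six candidates. In FY2021, to improve this accuracy, we added consonant recognition to the vowel-only algorithm, aiming for a method that is more accurate and more stable than vowel recognition alone. For the vowel part, as before, features were extracted with a variational autoencoder and recognized with a CNN. For the consonant part, depth images were captured with an infrared 3D camera. As a first step we ran a single-syllable recognition experiment, which gave about 80% accuracy within the top six candidates and confirmed, to some extent, the benefit of using depth images. This suggests that a recognition method based on CV (consonant-vowel) units can give good results. For word and phrase recognition, on the other hand, accuracy has not yet improved over the earlier word recognition based on vowel recognition. The likely cause is that the earlier viseme-based method and the depth-image-based method have not yet been combined appropriately. Revising the word recognition algorithm to work on CV and VCV units is expected to yield good recognition accuracy.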
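
The accuracies above are all reported as "within the top N candidates". As a minimal sketch of how such a figure can be computed, the Python below counts an utterance as correct when its true label appears among the first N ranked candidates. The function name `top_n_accuracy`, the helper `make_ranking`, and the toy vocabulary are hypothetical and for illustration only; they are not taken from the project's software, and the toy data merely mirrors the 19-of-20 figure rather than reproducing any experimental result.

```python
from typing import Sequence


def top_n_accuracy(
    ranked_candidates: Sequence[Sequence[str]],
    truths: Sequence[str],
    n: int,
) -> float:
    """Return the fraction of utterances whose true label is in the first n candidates."""
    if len(ranked_candidates) != len(truths):
        raise ValueError("need exactly one candidate list per true label")
    if not truths:
        raise ValueError("need at least one utterance")
    if n < 1:
        raise ValueError("n must be at least 1")
    hits = sum(
        1 for cands, truth in zip(ranked_candidates, truths) if truth in cands[:n]
    )
    return hits / len(truths)


# Toy data (an assumption, not the project's data): a 20-word vocabulary.
vocab = [f"word{i:02d}" for i in range(20)]


def make_ranking(truth: str, position: int) -> list:
    """Build a 6-candidate ranking with the true word at the given 0-based position."""
    others = [w for w in vocab if w != truth][:5]
    return others[:position] + [truth] + others[position:]


# The first 19 words are ranked within the top three; the last one is ranked fifth.
rankings = [make_ranking(w, i % 3) for i, w in enumerate(vocab[:19])]
rankings.append(make_ranking(vocab[19], 4))

top3 = top_n_accuracy(rankings, vocab, 3)
top6 = top_n_accuracy(rankings, vocab, 6)
print(f"top-3 accuracy: {top3:.2f}, top-6 accuracy: {top6:.2f}")
```

Widening N can only keep or raise the score, so figures quoted at different N (top three versus top six) are not directly comparable with one another.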

Research Products
(4 results)

Journal Article (1 result; of which Int'l Joint Research: 1, Peer Reviewed: 1), Presentation (3 results; of which Int'l Joint Research: 2)

[Journal Article] Development of Mobile Device-Based Speech Enhancement System Using Lip-Reading (2022)
- Author(s)
  Fumiaki Eguchi, Kenji Matsui, Yoshihisa Nakatoh, Yumiko O. Kato, Alberto Rivas, Juan Manuel Corchado
- Journal Title
  Distributed Computing and Artificial Intelligence, Volume 1: 18th International Conference
  Volume: 1 Pages: 210-220
- DOI
  10.1007/978-3-030-86261-9
- Peer Reviewed / Int'l Joint Research
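[Presentation] A Study of Speech Assistance by Lip Recognition Using a Mobile Device (2021)
- Author(s)
  Fumiaki Eguchi, Kenji Matsui, Yoshihisa Nakatoh, Yumiko Kato
- Organizer
  2021 Autumn Meeting of the Acoustical Society of Japan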
[Presentation] Effective Selection Method of Microphones for Conversation Assistance in Noisy Environment (2021)
- Author(s)
  Mizuki Horii, Rin Hirakawa, Hideaki Kawano, Yoshihisa Nakatoh
- Organizer
  5th International Conference on Human Interaction and Emerging Technologies
- Int'l Joint Research
[Presentation] Speaker identification method using bone conduction and throat microphones (2021)
- Author(s)
  Takeshi Hashiguchi, Rin Hirakawa, Hideaki Kawano, Yoshihisa Nakatoh
- Organizer
  5th International Conference on Human Interaction and Emerging Technologies
- Int'l Joint Research
