
Multi-Modal Speech Enhancement Using Mobile Device

Research Project

Project/Area Number 19K12905
Research Category

Grant-in-Aid for Scientific Research (C)

Allocation Type: Multi-year Fund
Section: General
Review Section: Basic Section 90150: Medical assistive technology-related
Research Institution: Osaka Institute of Technology

Principal Investigator

MATSUI Kenji  Osaka Institute of Technology, Faculty of Robotics & Design, Professor (30613682)

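Co-Investigator (Kenkyū-buntansha)
NAKATOH Yoshihisa  Kyushu Institute of Technology, Faculty of Engineering, Professor (10599955)
KATO Yumiko  St. Marianna University School of Medicine, School of Medicine, Researcher (10600463)
MIZUMACHI Mitsunori  Kyushu Institute of Technology, Faculty of Engineering, Associate Professor (90380740)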
Project Period (FY) 2019-04-01 – 2022-03-31
Project Status Completed (Fiscal Year 2021)
Budget Amount
¥3,250,000 (Direct Cost: ¥2,500,000, Indirect Cost: ¥750,000)
Fiscal Year 2021: ¥780,000 (Direct Cost: ¥600,000, Indirect Cost: ¥180,000)
Fiscal Year 2020: ¥1,170,000 (Direct Cost: ¥900,000, Indirect Cost: ¥270,000)
Fiscal Year 2019: ¥1,300,000 (Direct Cost: ¥1,000,000, Indirect Cost: ¥300,000)
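Keywords: machine lip-reading / speech assistance / variational autoencoder / viseme / depth image / portable terminal / artificial larynx / lip image recognition / mobile device / consonant recognition / speech synthesis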
Outline of Research at the Start

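The aim of this research is to develop a voice amplification or speech assistance device that lets the user look no different from a person without a speech impairment and can be used without worrying about the gaze of others. We also aim for a device that is low-cost, lightweight, easy to use, and adaptable to the individual user. By making full use of a smartphone and lip image information, we will build a system that, unlike any conventional artificial larynx, can be used simply by installing an app on a smartphone. Specifically, we will realize speech enhancement and wireless voice amplification through a compact loudspeaker, mainly for users of esophageal speech, and a voice production function based on lip-reading followed by speech synthesis for users who rely mainly on an artificial larynx. The goal is a set of smartphone-based speech assistance functions that can adapt to diverse user characteristics.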

Outline of Final Research Achievements

We have been developing a speech enhancement device for laryngectomees. Our approach uses lip-reading technology to recognize Japanese words from lip images and to generate speech output on mobile devices. Each target word is translated into a sequence of registered visemes (36 in total) and converted into VAE (variational autoencoder) feature parameters; the corresponding word is then recognized by a CNN-based model. A PC-based prototype achieved more than 90% accuracy on 20 Japanese words with a single well-trained subject. We also developed a mobile-device-based prototype and ran a preliminary recognition experiment on 26 words with a single well-trained subject. It reached 95% accuracy when the correct word was counted anywhere among the 1st through 6th candidates, which is almost equivalent to the PC-based system. To improve consonant recognition, we introduced a depth camera and obtained slightly better accuracy; however, more careful algorithm tuning is necessary.
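The 95% figure above counts a trial as correct when the true word appears anywhere among the first six candidates, which is a top-k accuracy with k = 6. The following is a minimal sketch of that metric, not the project's evaluation code: the function name `top_k_accuracy`, the romanized example words, and the candidate rankings are all hypothetical and serve only to show how the count works.

```python
from typing import Sequence


def top_k_accuracy(ranked: Sequence[Sequence[str]],
                   truth: Sequence[str],
                   k: int) -> float:
    """Fraction of trials whose true word is among the first k candidates.

    ranked[i] is the recognizer's candidate list for trial i, best first.
    truth[i] is the word that was actually spoken in trial i.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    if len(ranked) != len(truth):
        raise ValueError("ranked and truth must have the same length")
    if not truth:
        raise ValueError("at least one trial is required")
    hits = sum(1 for cands, word in zip(ranked, truth) if word in cands[:k])
    return hits / len(truth)


# Hypothetical candidate rankings for four trials (illustrative data only).
ranked = [
    ["arigato", "ohayo", "sayonara"],
    ["ohayo", "arigato", "sayonara"],
    ["sayonara", "ohayo", "arigato"],
    ["arigato", "sayonara", "ohayo"],
]
truth = ["arigato", "arigato", "arigato", "ohayo"]

for k in (1, 2, 3):
    print(f"top-{k} accuracy: {top_k_accuracy(ranked, truth, k):.2f}")
```

Because a larger k can only add hits, top-6 accuracy is an upper bound on top-1 accuracy, so the 95% result is not directly comparable to a first-candidate score.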

Academic Significance and Societal Importance of the Research Achievements

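People who have difficulty speaking because of illness or accident, such as laryngectomees, use substitute voices such as an electrolarynx or esophageal speech. However, these draw attention during use and take time to master. Users in fact ask for a solution that "works with existing devices," "has an inconspicuous appearance," and "has an easy-to-use interface." For this reason, speech assistance based on machine lip-reading is being studied. The distinctive feature of this research is a machine lip-reading phrase recognition method that uses visemes and a variational autoencoder and makes word registration extremely easy; we also implemented it on a mobile device and examined its effectiveness and remaining issues. In addition, we used depth images to try to improve consonant recognition accuracy in machine lip-reading, which is of considerable significance toward field trials.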

Report

(4 results)
  • 2021 Annual Research Report   Final Research Report ( PDF )
  • 2020 Research-status Report
  • 2019 Research-status Report
  • Research Products

    (12 results)


Journal Article (2 results) (of which Int'l Joint Research: 2 results, Peer Reviewed: 2 results); Presentation (10 results) (of which Int'l Joint Research: 5 results)

  • [Journal Article] Development of Mobile Device-Based Speech Enhancement System Using Lip-Reading (2022)

    • Author(s)
      Fumiaki Eguchi, Kenji Matsui, Yoshihisa Nakatoh, Yumiko O. Kato, Alberto Rivas, Juan Manuel Corchado
    • Journal Title

      Distributed Computing and Artificial Intelligence, Volume 1: 18th International Conference

      Volume: 1 Pages: 210-220

    • Related Report
      2021 Annual Research Report
    • Peer Reviewed / Int'l Joint Research
  • [Journal Article] Mobile Device-based Speech Enhancement System Using Lip-reading (2020)

    • Author(s)
      Tomonori Nakahara, Kohei Fukuyama, Mitsuru Hamada, Kenji Matsui, Yoshihisa Nakatoh, Yumiko O. Kato, Alberto Rivas, Juan Manuel Corchado
    • Journal Title

      Advances in Intelligent Systems and Computing

      Volume: 1237 Pages: 159-167

    • DOI

      10.1007/978-3-030-53036-5_17

    • ISBN
      9783030530358, 9783030530365
    • Related Report
      2020 Research-status Report 2019 Research-status Report
    • Peer Reviewed / Int'l Joint Research
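  • [Presentation] A Study of Speech Assistance by Lip Recognition Using a Mobile Device (2021)

    • Author(s)
      Fumiaki Eguchi, Kenji Matsui, Yoshihisa Nakatoh, Yumiko O. Kato
    • Organizer
      Acoustical Society of Japan 2021 Autumn Meeting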
    • Related Report
      2021 Annual Research Report
  • [Presentation] Effective Selection Method of Microphones for Conversation Assistance in Noisy Environment (2021)

    • Author(s)
      Mizuki Horii, Rin Hirakawa, Hideaki Kawano, Yoshihisa Nakatoh
    • Organizer
      5th International Conference on Human Interaction and Emerging Technologies
    • Related Report
      2021 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Speaker identification method using bone conduction and throat microphones (2021)

    • Author(s)
      Takeshi Hashiguchi, Rin Hirakawa, Hideaki Kawano, Yoshihisa Nakatoh
    • Organizer
      5th International Conference on Human Interaction and Emerging Technologies
    • Related Report
      2021 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Speech Enhancement System Using SVM for Train Announcement (2021)

    • Author(s)
      Yuto Kinoshita, Rin Hirakawa, Hideaki Kawano, Kenichi Nakashi, Yoshihisa Nakatoh
    • Organizer
      The 39th IEEE International Conference on Consumer Electronics (IEEE ICCE 2021)
    • Related Report
      2020 Research-status Report
    • Int'l Joint Research
  • [Presentation] Speech Enhancement System Using Lip-reading (2020)

    • Author(s)
      Kenji Matsui, Kohei Fukuyama, Yoshihisa Nakatoh, Yumiko O. Kato
    • Organizer
      2nd IEEE International Conference on Artificial Intelligence in Engineering and Technology (IICAIET 2020)
    • Related Report
      2020 Research-status Report
    • Int'l Joint Research
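  • [Presentation] A Study of a Phrase Recognition Method Based on Viseme Sequences for Speech Assistance (2020)

    • Author(s)
      Tomonori Nakahara, Kohei Fukuyama, Kenji Matsui, Yoshihisa Nakatoh, Yumiko O. Kato
    • Organizer
      Acoustical Society of Japan 2020 Autumn Meeting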
    • Related Report
      2020 Research-status Report
  • [Presentation] Mobile Device-based Speech Enhancement System Using Lip-reading (2020)

    • Author(s)
      Tomonori Nakahara, Kohei Fukuyama, Mitsuru Hamada, Kenji Matsui, Yoshihisa Nakatoh, Yumiko O. Kato, Alberto Rivas, Juan Manuel Corchado
    • Organizer
      17th International Conference on Distributed Computing and Artificial Intelligence
    • Related Report
      2019 Research-status Report
    • Int'l Joint Research
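  • [Presentation] Development of a Speech Assistance Device Using Lip Information on a Mobile Device (2020)

    • Author(s)
      Mitsuru Hamada, Kohei Fukuyama, Kenji Matsui, Yoshihisa Nakatoh, Yumiko O. Kato
    • Organizer
      Acoustical Society of Japan 2020 Spring Meeting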
    • Related Report
      2019 Research-status Report
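  • [Presentation] A Study of Lip-Reading Methods for Speech Assistance (2020)

    • Author(s)
      Kohei Fukuyama, Kenji Matsui, Yoshihisa Nakatoh, Yumiko O. Kato
    • Organizer
      Acoustical Society of Japan 2020 Spring Meeting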
    • Related Report
      2019 Research-status Report
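  • [Presentation] A Study of a Speech Assistance Method Using a Mobile Device and Lip Information (2019)

    • Author(s)
      Kohei Fukuyama, Mitsuru Hamada, Kenji Matsui, Yoshihisa Nakatoh, Yumiko O. Kato
    • Organizer
      Acoustical Society of Japan 2019 Autumn Meeting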
    • Related Report
      2019 Research-status Report


Published: 2019-04-18   Modified: 2023-03-16  
