2023 Fiscal Year Final Research Report

Estimation of speech content from vocal movements by fusion of multiple sensors and its application to speech assistance devices

Project/Area Number	21K11941
Research Category	Grant-in-Aid for Scientific Research (C)
Allocation Type	Multi-year Fund
Section	General
Review Section	Basic Section 61010: Perceptual information processing-related
Research Institution	Nippon Institute of Technology
Principal Investigator	Ota Kenko, Nippon Institute of Technology, Faculty of Fundamental Engineering, Assistant Professor (50511911)
Project Period (FY)	2021-04-01 – 2024-03-31
Keywords	Silent speech recognition / Deep learning / Three-dimensional measurement
Outline of Final Research Achievements	Over the full research period, this study investigated systems that assist people who have difficulty speaking, for example after removal of the vocal cords, as well as systems that support existing speech recognition. As a result, we studied a deep-learning technique that recognizes sentences phoneme by phoneme without using audio information. We also studied a technique for estimating emotions using a camera and a sensor that measures the galvanic skin response of the fingers, and, as a means of assisting speech production, a technique for synthesizing speech from text.
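Free Research Field	Intelligent information processing
Academic Significance and Societal Importance of the Research Achievements	The academic significance of this research lies in its examination of data acquisition methods and deep neural networks for realizing phoneme-level sentence recognition with deep learning, in the setting of speech recognition that does not use audio information. Its societal significance lies in having worked on both speaker emotion estimation and speech synthesis, which allowed a basic study toward the development of speech assistance devices for people who have difficulty speaking and the identification of remaining challenges.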