Automatic speech recognition based on semi-autonomous learning for captioning lectures
Project/Area Number |
16H02847
|
Research Category |
Grant-in-Aid for Scientific Research (B)
|
Allocation Type | Single-year Grants |
Section | General |
Research Field |
Perceptual information processing
|
Research Institution | Kyoto University |
Principal Investigator |
|
Co-Investigator(Kenkyū-buntansha) |
Akita Yuya  Kyoto University, Graduate School of Economics, Associate Professor (90402742)
|
Research Collaborator |
Hirose Youko
|
Project Period (FY) |
2016-04-01 – 2019-03-31
|
Project Status |
Completed (Fiscal Year 2018)
|
Budget Amount |
¥16,250,000 (Direct Cost: ¥12,500,000, Indirect Cost: ¥3,750,000)
Fiscal Year 2018: ¥5,070,000 (Direct Cost: ¥3,900,000, Indirect Cost: ¥1,170,000)
Fiscal Year 2017: ¥5,070,000 (Direct Cost: ¥3,900,000, Indirect Cost: ¥1,170,000)
Fiscal Year 2016: ¥6,110,000 (Direct Cost: ¥4,700,000, Indirect Cost: ¥1,410,000)
|
Keywords | speech recognition / content archive / machine learning / captioning / information accessibility |
Outline of Final Research Achievements |
We proposed a new end-to-end speech recognition framework that directly converts a speech signal into a word sequence. It was shown to achieve higher accuracy at a drastically faster speed than conventional systems. We also developed a captioning system built on a server-based speech recognition system, as well as a speech recognition package for PCs that is integrated with IPtalk, a captioning software widely used in Japan. The software is freely available to the public.
|
Academic Significance and Societal Importance of the Research Achievements |
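With the enforcement of the Act for Eliminating Discrimination against Persons with Disabilities, information accessibility for deaf and hard-of-hearing people, namely captioning, is now required for lectures and talks, but at present it falls short in both quantity and quality. We carried out research and development of speech recognition technology to support this need. Introducing a new deep-learning-based model yielded large improvements in both recognition accuracy and speed. In addition to a server-based system that adds captions to audio files (http://caption.ist.i.kyoto-u.ac.jp/), we integrated speech recognition into IPtalk, which is commonly used for PC-based summary transcription, and released it to the public. We also held a symposium titled "Captioning Technology for Deaf and Hard-of-Hearing People".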
|
Report
(4 results)
Research Products
(29 results)