FY2019 Research-status Report

Next generation multilingual End-to-End speech recognition (from G30 to G200)

Research Project

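Project/Area Number	19K24376
Research Institution	National Institute of Information and Communications Technology
Principal Investigator	Sheng Li, National Institute of Information and Communications Technology, Advanced Speech Translation Research and Development Promotion Center, Advanced Speech Technology Laboratory, Researcher (70840940)
Project Period (FY)	2019-08-30 – 2021-03-31
Keywords	speech recognition / end-to-end / language identification / disordered speech / speaker diarization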
Outline of Research Achievements	In FY2019, I focused on algorithm optimization. I found that the proposed method is broadly applicable to a wide variety of tasks (multilingual mispronunciation detection, multilingual speech recognition, disordered speech recognition, language identification, and speaker diarization). Two papers were accepted at ICASSP 2020 (one as joint first author and one as corresponding author), one at ICME 2020 (co-authored with equal contribution), and two at Speaker Odyssey 2020 (one as first author and one as co-author). Three domestic presentations were also given at ASJ 2020. In FY2020, we will investigate the core problem of this research topic, namely massive language modeling ability. Whether the languages involved belong to the same language family or to different language families has a strong influence.
Current Status of Research Progress	1: Research has progressed more than originally planned. Reason: I found that the proposed method is broadly applicable to various tasks (speech recognition, disordered speech recognition, language identification, and speaker diarization). The extensive experiments reported in the Odyssey 2020 paper also revealed the limits of the proposed method. Both findings are important for future research.
Strategy for Future Research Activity	In the next step, we will address the core problem of this research topic, namely massive language modeling ability. Three candidate tasks are accented speech recognition, cross-language-family speech recognition, and inter-language-family speech recognition.
Causes of Carryover	In FY2020, COVID-19 affected my research activities. I canceled all on-site conference attendance for the accepted papers and focused on experiments and new algorithm design. The remaining 515,867 JPY will be used to purchase one more GPU workstation and a proofreading service for manuscripts.

Research Products
(17 results)


All: Int'l Joint Research (1), Presentation (8, of which 4 at international conferences), Book (1), Patent (Industrial Property Rights) (4), Funded Workshop (3)

[Int'l Joint Research] Tianjin University (China)
- Country Name
  China
- Counterpart Institution
  Tianjin University
[Presentation] Joint Training End-to-End Speech Recognition Systems with Speaker Attributes (2020)
- Author(s)
  S. Li, X. Lu, R. Dabre, P. Shen and H. Kawai
- Organizer
  ISCA-Odyssey (The Speaker and Language Recognition Workshop)
- Int'l
[Presentation] Compensation on x-vector for short utterance spoken language identification (2020)
- Author(s)
  P. Shen, X. Lu, K. Sugiura, S. Li and H. Kawai
- Organizer
  ISCA-Odyssey (The Speaker and Language Recognition Workshop)
- Int'l
[Presentation] Voice-Indistinguishability: Protecting Voiceprint in Privacy Preserving Speech Data Release (2020)
- Author(s)
  Y. Han, S. Li, Y. Cao, Q. Ma and M. Yoshikawa
- Organizer
  IEEE-ICME
- Int'l
[Presentation] End-To-End Articulatory Modeling for Dysarthria Articulatory Attribute Detection (2020)
- Author(s)
  Y. Lin, L. Wang, J. Dang, S. Li, and C. Ding
- Organizer
  IEEE-ICASSP
- Int'l
[Presentation] Spectrograms Fusion with Minimum Difference Masks Estimation for Monaural Speech Dereverberation (2020)
- Author(s)
  H. Shi, L. Wang, M. Ge, S. Li, and J. Dang
- Organizer
  IEEE-ICASSP
[Presentation] End-to-End Articulatory Attribute Modeling for Low-resource Multilingual Speech Recognition (2020)
- Author(s)
  S. Li, C. Ding, X. Lu, P. Shen and H. Kawai
- Organizer
  Acoustical Society of Japan, Spring 2020
[Presentation] Joint Training End-to-End Systems for Speech and Speaker Recognition with Speaker Attributes (2020)
- Author(s)
  S. Li, X. Lu, R. Dabre, P. Shen and H. Kawai
- Organizer
  Acoustical Society of Japan, Spring 2020
[Presentation] Improvement of x-vector for short utterance spoken language identification (2020)
- Author(s)
  P. Shen, X. Lu, K. Sugiura, S. Li, H. Kawai
- Organizer
  Acoustical Society of Japan, Spring 2020
[Book] Automatic speech recognition (2020)
- Author(s)
  X. Lu, S. Li, M. Fujimoto
- Total Pages
  18
- Publisher
  Springer Singapore
- ISBN
  978-981-15-0595-9
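[Patent (Industrial Property Rights)] Inference device and training method for the inference device (2020)
- Inventor(s)
  Sheng Li, Xugang Lu, Hisashi Kawai
- Industrial Property Rights Holder
  National Institute of Information and Communications Technology
- Industrial Property Rights Type
  Patent
- Industrial Property Number
  Japanese Patent Application No. 2020-059962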
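[Patent (Industrial Property Rights)] Inference device, inference program, and training method (2019)
- Inventor(s)
  Sheng Li, Xugang Lu, Chenchen Ding, Tatsuya Kawahara, Hisashi Kawai
- Industrial Property Rights Holder
  National Institute of Information and Communications Technology
- Industrial Property Rights Type
  Patent
- Industrial Property Number
  Japanese Patent Application No. 2019-163555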
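[Patent (Industrial Property Rights)] Inference device, training method, and training program (2019)
- Inventor(s)
  Sheng Li, Xugang Lu, Raj Dabre, Hisashi Kawai
- Industrial Property Rights Holder
  National Institute of Information and Communications Technology
- Industrial Property Rights Type
  Patent
- Industrial Property Number
  Japanese Patent Application No. 2019-051008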
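[Patent (Industrial Property Rights)] Method and apparatus for training a language identification model, and computer program therefor (2019)
- Inventor(s)
  Peng Shen, Xugang Lu, Sheng Li, Hisashi Kawai
- Industrial Property Rights Holder
  National Institute of Information and Communications Technology
- Industrial Property Rights Type
  Patent
- Industrial Property Number
  Japanese Patent Application No. 2019-086005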
[Funded Workshop] Odyssey 2020: The Speaker and Language Recognition Workshop (2020)
[Funded Workshop] ICASSP 2020 (2020)
[Funded Workshop] ICME 2020 (2020)
