2019 Fiscal Year Research-status Report

Next generation multilingual End-to-End speech recognition (from G30 to G200)

Research Project

Project/Area Number 19K24376
Research Institution National Institute of Information and Communications Technology

Principal Investigator

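李 勝  National Institute of Information and Communications Technology, Advanced Speech Translation Research and Development Promotion Center, Advanced Speech Technology Laboratory, Researcher (70840940)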

Project Period (FY) 2019-08-30 – 2021-03-31
Keywords speech recognition / end-to-end / language identification / disordered speech / speaker diarization
Outline of Annual Research Achievements

In FY2019, I focused on algorithm optimization. I was pleased to find that the proposed method is applicable to a wide variety of tasks (multilingual mispronunciation detection, multilingual speech recognition, disordered speech recognition, language identification, and speaker diarization).

Two international papers were accepted at ICASSP2020 (one as joint first author and one as corresponding author), one paper co-authored with equal contribution was accepted at ICME2020, and two papers (one as first author and one as co-author) were accepted at Speaker Odyssey2020. Three domestic presentations were also given at ASJ2020.

In FY2020, we will investigate the core problem of this research topic, namely massive language modeling ability. Both cross-language-family and same-language-family relationships have a strong influence.

Current Status of Research Progress

1: Research has progressed more than originally planned.

Reason

I was pleased to find that the proposed method is applicable to various tasks (speech recognition, disordered speech recognition, language identification, and speaker diarization).

I also revealed the limits of the proposed method through the extensive study reported in the Odyssey2020 paper.

All of these findings are important for future research.

Strategy for Future Research Activity

In the next step, we will address the core problem of this research topic, namely massive language modeling ability.

Three possible tasks are accented speech recognition, cross-language-family speech recognition, and inter-language-family speech recognition.

Causes of Carryover

In FY2020, COVID-19 affected my research activities. I canceled all on-site meetings for the accepted papers and focused instead on experiments and new algorithm design.

With the remaining 515,867 JPY, I plan to purchase one more GPU workstation and a proofreading service for the manuscripts.

  • Research Products

    (17 results)

Int'l Joint Research (1 result), Presentation (8 results, of which Int'l Joint Research: 4 results), Book (1 result), Patent (Industrial Property Rights) (4 results), Funded Workshop (3 results)

  • [Int'l Joint Research] Tianjin University (China)

    • Country Name
      CHINA
    • Counterpart Institution
      Tianjin University
  • [Presentation] Joint Training End-to-End Speech Recognition Systems with Speaker Attributes. 2020

    • Author(s)
      S. Li, X. Lu, R. Dabre, P. Shen and H. Kawai
    • Organizer
      ISCA-Odyssey (The Speaker and Language Recognition Workshop)
    • Int'l Joint Research
  • [Presentation] Compensation on x-vector for short utterance spoken language identification. 2020

    • Author(s)
      P. Shen, X. Lu, K. Sugiura, S. Li and H. Kawai
    • Organizer
      ISCA-Odyssey (The Speaker and Language Recognition Workshop)
    • Int'l Joint Research
  • [Presentation] Voice-Indistinguishability: Protecting Voiceprint in Privacy Preserving Speech Data Release. 2020

    • Author(s)
      Y. Han, S. Li, Y. Cao, Q. Ma and M. Yoshikawa
    • Organizer
      IEEE-ICME
    • Int'l Joint Research
  • [Presentation] End-To-End Articulatory Modeling for Dysarthria Articulatory Attribute Detection. 2020

    • Author(s)
      Y. Lin, L. Wang, J. Dang, S. Li and C. Ding
    • Organizer
      IEEE-ICASSP
    • Int'l Joint Research
  • [Presentation] Spectrograms Fusion with Minimum Difference Masks Estimation for Monaural Speech Dereverberation. 2020

    • Author(s)
      H. Shi, L. Wang, M. Ge, S. Li and J. Dang
    • Organizer
      IEEE-ICASSP
  • [Presentation] End-to-End Articulatory Attribute Modeling for Low-resource Multilingual Speech Recognition. 2020

    • Author(s)
      S. Li, C. Ding, X. Lu, P. Shen and H. Kawai
    • Organizer
      Acoustical Society of Japan, spring, 2020
  • [Presentation] Joint Training End-to-End Systems for Speech and Speaker Recognition with Speaker Attributes. 2020

    • Author(s)
      S. Li, X. Lu, R. Dabre, P. Shen and H. Kawai
    • Organizer
      Acoustical Society of Japan, spring, 2020
  • [Presentation] Improvement of x-vector for short utterance spoken language identification. 2020

    • Author(s)
      P. Shen, X. Lu, K. Sugiura, S. Li and H. Kawai
    • Organizer
      Acoustical Society of Japan, spring, 2020
  • [Book] Automatic speech recognition. 2020

    • Author(s)
      X. Lu, S. Li, M. Fujimoto
    • Total Pages
      18
    • Publisher
      Springer Singapore
    • ISBN
      978-981-15-0595-9
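  • [Patent (Industrial Property Rights)] Inference device and training method for inference device. 2020

    • Inventor(s)
      S. Li, X. Lu, H. Kawai
    • Industrial Property Rights Holder
      National Institute of Information and Communications Technology
    • Industrial Property Rights Type
      Patent
    • Industrial Property Number
      Patent Application 2020-059962 (特願2020-059962)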
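  • [Patent (Industrial Property Rights)] Inference device, inference program, and training method. 2019

    • Inventor(s)
      S. Li, X. Lu, C. Ding, T. Kawahara, H. Kawai
    • Industrial Property Rights Holder
      National Institute of Information and Communications Technology
    • Industrial Property Rights Type
      Patent
    • Industrial Property Number
      Patent Application 2019-163555 (特願2019-163555)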
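  • [Patent (Industrial Property Rights)] Inference device, training method, and training program. 2019

    • Inventor(s)
      S. Li, X. Lu, R. Dabre, H. Kawai
    • Industrial Property Rights Holder
      National Institute of Information and Communications Technology
    • Industrial Property Rights Type
      Patent
    • Industrial Property Number
      Patent Application 2019-051008 (特願2019-051008)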
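  • [Patent (Industrial Property Rights)] Method and apparatus for training a language identification model, and computer program therefor. 2019

    • Inventor(s)
      P. Shen, X. Lu, S. Li, H. Kawai
    • Industrial Property Rights Holder
      National Institute of Information and Communications Technology
    • Industrial Property Rights Type
      Patent
    • Industrial Property Number
      Patent Application 2019-086005 (特願2019-086005)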
  • [Funded Workshop] Odyssey2020: The Speaker and Language Recognition Workshop. 2020

  • [Funded Workshop] ICASSP2020. 2020

  • [Funded Workshop] ICME2020. 2020

Published: 2021-01-27  
