2020 Fiscal Year Annual Research Report

Next generation multilingual End-to-End speech recognition (from G30 to G200)

Research Project

Project/Area Number 19K24376
Research Institution National Institute of Information and Communications Technology

Principal Investigator

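Sheng Li (李 勝), National Institute of Information and Communications Technology, Advanced Speech Translation Research and Development Promotion Center, Advanced Speech Technology Laboratory, Researcher (70840940)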

Project Period (FY) 2019-08-30 – 2021-03-31
Keywords multilingual modeling / low-resourced modeling / speech translation / multi-unit modeling / language identification / disordered speech / code-switched
Outline of Annual Research Achievements

In FY2020, I focused on accented speech recognition (English and Chinese) and cross-language-family speech recognition. Multilingual speech recognition technologies were also applied to language identification, speaker recognition, disordered speech recognition, and more complex tasks such as speech translation and adversarial attacks. The achievements are as follows:
1. This year's multilingual modeling technology was applied to speaker modeling (1 domestic presentation: IEICE-SP), low-resource transfer learning (1 Interspeech SLIMTS2020), speech translation (1 NLP2021 presentation), language identification (1 journal paper in IEEE-TASLP), and disordered speech recognition (1 Interspeech2020 paper with an ISCA travel grant, 1 O-COCOSDA).
2. I also found that acoustic modeling unit selection can improve single-language speech recognition through multi-unit modeling (1 invited full paper at Interspeech SLIMTS2020, 1 ICASSP2021) and code-switched speech synthesis (1 Interspeech SLIMTS2020, 1 ICONIP paper).
3. The following research also benefited from the multilingual modeling technologies: speech separation (1 Interspeech2020 paper with an ISCA travel grant), adversarial attack (1 IEEE-SLT demo paper), voice privacy (1 invited report at Interspeech SLIMTS2020, 1 Interspeech challenge, 1 ACM-CCS demo), voice activity detection (1 ICASSP2021), and Mandarin tone modeling (1 ICASSP2021).

Remarks

The paper URLs can be found on the pages listed in the Remarks entries under Research Products.

  • Research Products

    (23 results)

Int'l Joint Research (1 result), Journal Article (1 result; Peer Reviewed: 1), Presentation (16 results; Int'l Joint Research: 14, Invited: 4), Remarks (5 results)

  • [Int'l Joint Research] Tianjin University/Xinjiang University/Hithink RoyalFlush AI (China)

    • Country Name
      CHINA
    • Counterpart Institution
      Tianjin University/Xinjiang University/Hithink RoyalFlush AI
  • [Journal Article] Knowledge Distillation-based Representation Learning for Short-Utterance Spoken Language Identification (2020)

    • Author(s)
      P. Shen, X. Lu, S. Li, H. Kawai
    • Journal Title

      IEEE/ACM Trans. Audio, Speech & Language Process.

      Volume: 28 Pages: 2674 - 2683

    • DOI

      10.1109/TASLP.2020.3023627

    • Peer Reviewed
  • [Presentation] Robust voice activity detection using a masked auditory encoder based convolutional neural network (2021)

    • Author(s)
      N. Li, L. Wang, M. Unoki, S. Li, R. Wang, M. Ge, J. Dang
    • Organizer
      IEEE-ICASSP, 2021
    • Int'l Joint Research
  • [Presentation] An investigation of using hybrid modeling units for improving End-to-End speech recognition systems (2021)

    • Author(s)
      S. Chen, X. Hu, S. Li, X. Xu
    • Organizer
      IEEE-ICASSP, 2021
    • Int'l Joint Research
  • [Presentation] Encoder-Decoder based pitch tracking and joint model training for Mandarin tone classification (2021)

    • Author(s)
      H. Huang, K. Wang, Y. Hu, S. Li
    • Organizer
      IEEE-ICASSP, 2021
    • Int'l Joint Research
  • [Presentation] Comparison of End-to-End Models for Joint Speaker and Speech Recognition (2021)

    • Author(s)
      K. Soky, S. Li, M. Mimura, C. Chu, T. Kawahara
    • Organizer
      IEICE-SP, 2021
  • [Presentation] Phantom in the Opera: Effective Adversarial Music Attack on Keyword Spotting Systems (2020)

    • Author(s)
      H. Zhang, S. Li, X. Ma, Y. Zhao, Y. Cao, T. Kawahara
    • Organizer
      IEEE-SLT, 2021
    • Int'l Joint Research
  • [Presentation] Multilingual transformer training for Khmer automatic speech recognition (2020)

    • Author(s)
      K. Soky, S. Li, T. Kawahara, S. Seng
    • Organizer
      Interspeech 2020 Satellite Workshop (SLIMTS2020)
    • Int'l Joint Research / Invited
  • [Presentation] End-to-End Speech Translation with Cross-lingual Transfer Learning (2020)

    • Author(s)
      S. Shimizu, C. Chu, S. Li, S. Kurohashi
    • Organizer
      NLP, 2021
  • [Presentation] Effectively Synthesizing Code-switched Speech Using Highly Imbalanced Mix-lingual Data and mask embedding (2020)

    • Author(s)
      S. Guo, L. Wang, S. Li, J. Zhang, C. Gong, Y. Wang, J. Dang, K. Honda
    • Organizer
      Interspeech 2020 Satellite Workshop (SLIMTS2020)
    • Int'l Joint Research / Invited
  • [Presentation] A Mixture of Character and Word End-to-End System for Keyword Spotting (2020)

    • Author(s)
      H. Zhang, S. Ueno, M. Mimura, S. Li, W. Zhang, T. Kawahara
    • Organizer
      Interspeech 2020 Satellite Workshop (SLIMTS2020) (full paper)
    • Int'l Joint Research / Invited
  • [Presentation] Effectively Synthesizing Code-switched Speech Using Highly Imbalanced Mix-lingual Data (2020)

    • Author(s)
      S. Guo, L. Wang, S. Li, J. Zhang, C. Gong, Y. Wang, J. Dang, K. Honda
    • Organizer
      In Proc. ICONIP, 2020
    • Int'l Joint Research
  • [Presentation] Staged Knowledge Distillation for End-to-End Dysarthric Speech Recognition and Speech Attribute Transcription (2020)

    • Author(s)
      Y. Lin, L. Wang, S. Li, J. Dang, and C. Ding
    • Organizer
      In Proc. INTERSPEECH, 2020 (travel grant awarded by ISCA)
    • Int'l Joint Research
  • [Presentation] VOIS: The First Speech Therapy App in the World for Myanmar Hearing-Impaired Children (2020)

    • Author(s)
      A. Thida, N. Han, S. Oo, S. Li and C. Ding
    • Organizer
      In Proc. O-COCOSDA, 2020
    • Int'l Joint Research
  • [Presentation] Voice-Indistinguishability: Protecting Voiceprint in Privacy-Preserving Speech Data Release (2020)

    • Author(s)
      Y. Han, Y. Cao, S. Li, Q. Ma, M. Yoshikawa
    • Organizer
      Interspeech 2020 Satellite Workshop (SLIMTS2020) (invited report)
    • Int'l Joint Research / Invited
  • [Presentation] Voice-Indistinguishability: Protecting Voiceprint with Differential Privacy under an Untrusted Server (2020)

    • Author(s)
      Y. Han, Y. Cao, S. Li, Q. Ma, M. Yoshikawa
    • Organizer
      ACM Conference on Computer and Communications Security (CCS), demo, 2020
    • Int'l Joint Research
  • [Presentation] System Description for Voice Privacy Challenge (Kyoto Team) (2020)

    • Author(s)
      Y. Han, S. Li, Y. Cao, M. Yoshikawa
    • Organizer
      Special session of INTERSPEECH 2020 (VoicePrivacy Challenge 2020)
    • Int'l Joint Research
  • [Presentation] Singing Voice Extraction with Attention based Spectrograms Fusion (2020)

    • Author(s)
      H. Shi, L. Wang, S. Li, C. Ding, M. Ge, N. Li, J. Dang, and H. Seki
    • Organizer
      In Proc. INTERSPEECH, 2020 (travel grant awarded by ISCA)
    • Int'l Joint Research
  • [Remarks] Publication information on DBLP

    • URL

      https://dblp.dagstuhl.de/pid/23/3439-10.html

  • [Remarks] Google Scholar homepage

    • URL

      https://scholar.google.com/citations?hl=en&user=zHAhs0IAAAAJ

  • [Remarks] researchmap homepage

    • URL

      https://researchmap.jp/listen

  • [Remarks] NICT researcher's homepage

    • URL

      https://ast-astrec.nict.go.jp/aboutus/member/sheng-li/index.html

  • [Remarks] ResearchGate researcher's homepage

    • URL

      https://www.researchgate.net/profile/Sheng-Li-60

Published: 2021-12-27  