
2019 Fiscal Year Annual Research Report

Research for unsupervised acoustic pattern discovery with zero resources

Research Project

Project/Area Number 17K00237
Research Institution: Nara Institute of Science and Technology

Principal Investigator

Sakriani Sakti, Nara Institute of Science and Technology, Graduate School of Science and Technology, Specially Appointed Associate Professor (00395005)

Co-Investigator (Kenkyū-buntansha): Satoshi Nakamura, Nara Institute of Science and Technology, Data Science Center, Professor (30263429)
Project Period (FY) 2017-04-01 – 2020-03-31
Keywords: Speech recognition / Zero-resource speech technology / EEG / Speech translation
Outline of Annual Research Achievements

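As the Tokyo Olympic and Paralympic Games approach, the language barrier with visitors from overseas is becoming an increasingly serious problem. Current speech recognition and speech translation technology is already readily available for high-resource languages, so this project targets zero-resource speech processing, where neither language-specific knowledge nor transcribed data is available. In FY2018, we succeeded in building zero-resource models for Indonesian. Instead of a Dirichlet process Gaussian mixture model, the system was built on deep learning. It can (1) discover subword units and (2) synthesize speech, both without supervision. We also participated in the 2019 World Zero Resource Speech Challenge, where the proposed method ranked among the top results. In the brain analysis research, in FY2018 we conducted a study to reveal the synchronization between EEG oscillations during imagined speech and the envelope of the corresponding overt speech. In 2019, we went on to participate in the 2020 World Zero Resource Speech Challenge and improved the performance of the system. We also developed unsupervised speech-to-speech translation for unknown languages that requires no text transcription, and presented it at the IEEE Automatic Speech Recognition and Understanding (ASRU) Workshop. In addition, we established a cooperative relationship with UNESCO toward a world language consortium that supports language technology for all languages, all people, and all countries. This project is expected to continue over the ten years from 2022 to 2032 as part of the United Nations International Decade of Indigenous Languages.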
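The unsupervised subword unit discovery mentioned above can be pictured with a small vector-quantization sketch. The report itself gives no implementation details; the Interspeech 2019 presentation listed under Research Products names a VQ-VAE for unit discovery, so the example assumes only that discretization step. The codebook, the 2-D toy features, and the function names `quantize_frames` and `collapse_repeats` are hypothetical. A real system would learn the codebook jointly with a neural encoder and decoder rather than fix it by hand.

```python
from itertools import groupby


def quantize_frames(frames, codebook):
    """Map each feature frame to the index of its nearest codebook vector."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [
        min(range(len(codebook)), key=lambda k: sq_dist(frame, codebook[k]))
        for frame in frames
    ]


def collapse_repeats(codes):
    """Merge each run of identical codes into a single discrete unit."""
    return [code for code, _ in groupby(codes)]


# Toy 2-D "acoustic features" and a hand-picked 3-entry codebook
# (both are illustrative assumptions, not data from the project).
codebook = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)]
frames = [(0.1, -0.1), (0.0, 0.2), (0.9, 1.1), (1.2, 0.8), (1.9, 0.1)]

codes = quantize_frames(frames, codebook)  # frame-level code indices
units = collapse_repeats(codes)            # subword-like unit sequence

print(codes)  # [0, 0, 1, 1, 2]
print(units)  # [0, 1, 2]
```

The resulting unit sequence plays the role that phoneme labels would play in a supervised system: a synthesizer can be trained to map such units back to speech without any transcription.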

  • Research Products

    (26 results)


Journal Article (7 results; of which Int'l Joint Research: 5, Peer Reviewed: 7, Open Access: 3) / Presentation (18 results; of which Int'l Joint Research: 15) / Patent (Industrial Property Rights) (1 result)

  • [Journal Article] Machine Speech Chain (2020)

    • Author(s)
      Andros Tjandra, Sakriani Sakti, Satoshi Nakamura
    • Journal Title

      IEEE/ACM Transactions on Audio, Speech, and Language Processing

      Volume: 28 Pages: 976-989

    • DOI

      10.1109/TASLP.2020.2977776

    • Peer Reviewed / Open Access / Int'l Joint Research
  • [Journal Article] Leveraging Neural Caption Translation with Visually Grounded Paraphrase Augmentation (2020)

    • Author(s)
      Johanes Effendi, Katsuhito Sudoh, Sakriani Sakti, Satoshi Nakamura
    • Journal Title

      IEICE Transactions on Information and Systems

      Volume: E103-D, No. 3 Pages: 674-683

    • DOI

      10.1587/transinf.2019EDP7065

    • Peer Reviewed
  • [Journal Article] Recurrent Neural Network Compression based on Low-Rank Tensor Representation (2020)

    • Author(s)
      Andros Tjandra, Sakriani Sakti, Satoshi Nakamura
    • Journal Title

      IEICE Transactions on Information and Systems

      Volume: E103-D, No. 2 Pages: 435-449

    • DOI

      10.1587/transinf.2019EDP7040

    • Peer Reviewed
  • [Journal Article] End-to-End Speech Recognition Sequence Training with Reinforcement Learning (2019)

    • Author(s)
      Andros Tjandra, Sakriani Sakti, Satoshi Nakamura
    • Journal Title

      IEEE Access

      Volume: 7 Pages: 79758-79769

    • DOI

      10.1109/ACCESS.2019.2922617

    • Peer Reviewed / Open Access / Int'l Joint Research
  • [Journal Article] Positive Emotion Elicitation in Chat-Based Dialogue Systems (2019)

    • Author(s)
      Nurul Lubis, Sakriani Sakti, Koichiro Yoshino, Satoshi Nakamura
    • Journal Title

      IEEE/ACM Transactions on Audio, Speech, and Language Processing

      Volume: 27, No. 4 Pages: 866-877

    • DOI

      10.1109/TASLP.2019.2900910

    • Peer Reviewed / Int'l Joint Research
  • [Journal Article] Synchronization between overt speech envelope and EEG oscillations during imagined speech (2019)

    • Author(s)
      Hiroki Watanabe, Hiroki Tanaka, Sakriani Sakti, Satoshi Nakamura
    • Journal Title

      Neuroscience Research

      Volume: 153 Pages: 48-55

    • DOI

      10.1016/j.neures.2019.04.004

    • Peer Reviewed / Int'l Joint Research
  • [Journal Article] Neural Oscillation-Based Classification of Japanese Spoken Sentences During Speech Perception (2019)

    • Author(s)
      Hiroki Watanabe, Hiroki Tanaka, Sakriani Sakti, Satoshi Nakamura
    • Journal Title

      IEICE Transactions on Information and Systems

      Volume: E102-D, No. 2 Pages: 383-391

    • DOI

      10.1587/transinf.2018EDP7293

    • Peer Reviewed / Open Access / Int'l Joint Research
  • [Presentation] Neural Incremental Speech Recognition Through Attention Transfer (2020)

    • Author(s)
      Sashi Novitasari, Andros Tjandra, Sakriani Sakti, Satoshi Nakamura
    • Organizer
      ANLP
  • [Presentation] From Speech Chain to Multimodal Chain: Leveraging Cross-modal Data Augmentation for Semi-supervised Learning (2020)

    • Author(s)
      Johanes Effendi, Andros Tjandra, Sakriani Sakti, Satoshi Nakamura
    • Organizer
      ANLP
  • [Presentation] Speech-to-Speech Translation without Text (2020)

    • Author(s)
      Andros Tjandra, Sakriani Sakti, Satoshi Nakamura
    • Organizer
      ANLP
  • [Presentation] Neural Machine Translation with Acoustic Embedding (2019)

    • Author(s)
      Takatomo Kano, Sakriani Sakti, Satoshi Nakamura
    • Organizer
      IEEE Automatic Speech Recognition and Understanding (ASRU) Workshop
    • Int'l Joint Research
  • [Presentation] Zero-shot Code-switching ASR and TTS with Multilingual Machine Speech Chain (2019)

    • Author(s)
      Sahoko Nakayama, Andros Tjandra, Sakriani Sakti, Satoshi Nakamura
    • Organizer
      IEEE Automatic Speech Recognition and Understanding (ASRU) Workshop
    • Int'l Joint Research
  • [Presentation] Listening while Speaking: Improving ASR through Multimodal Chain (2019)

    • Author(s)
      Johanes Effendi, Andros Tjandra, Sakriani Sakti, Satoshi Nakamura
    • Organizer
      IEEE Automatic Speech Recognition and Understanding (ASRU) Workshop
    • Int'l Joint Research
  • [Presentation] Speech-to-speech Translation between Untranscribed Unknown Languages (2019)

    • Author(s)
      Andros Tjandra, Sakriani Sakti, Satoshi Nakamura
    • Organizer
      IEEE Automatic Speech Recognition and Understanding (ASRU) Workshop
    • Int'l Joint Research
  • [Presentation] Dialogue Model and Response Generation for Emotion Improvement Elicitation (2019)

    • Author(s)
      Nurul Lubis, Sakriani Sakti, Koichiro Yoshino, Satoshi Nakamura
    • Organizer
      The 3rd Conversational AI Workshop, NeurIPS 2019
    • Int'l Joint Research
  • [Presentation] Recognition and Translation of Code-switching Speech Utterances (2019)

    • Author(s)
      Sahoko Nakayama, Takatomo Kano, Andros Tjandra, Sakriani Sakti, Satoshi Nakamura
    • Organizer
      Oriental COCOSDA 2019
    • Int'l Joint Research
  • [Presentation] Phoneme Level Speaking Rate Variation on Waveform Generation using GAN-TTS (2019)

    • Author(s)
      Mayuko Okamoto, Sakriani Sakti, Satoshi Nakamura
    • Organizer
      Oriental COCOSDA 2019
    • Int'l Joint Research
  • [Presentation] Sequence-to-sequence Learning via Attention Transfer for Incremental Speech Recognition (2019)

    • Author(s)
      Sashi Novitasari, Andros Tjandra, Sakriani Sakti, Satoshi Nakamura
    • Organizer
      Interspeech 2019
    • Int'l Joint Research
  • [Presentation] VQVAE Unsupervised Unit Discovery and Multi-Scale Code2Spec Inverter for Zerospeech Challenge 2019 (2019)

    • Author(s)
      Andros Tjandra, Berrak Sisman, Mingyang Zhang, Sakriani Sakti, Haizhou Li, Satoshi Nakamura
    • Organizer
      Interspeech 2019
    • Int'l Joint Research
  • [Presentation] Neural iTTS: Toward Synthesizing Speech in Real-time with End-to-end Neural Text-to-Speech Framework (2019)

    • Author(s)
      Tomoya Yanagita, Sakriani Sakti, Satoshi Nakamura
    • Organizer
      Speech Synthesis Workshop (SSW)
    • Int'l Joint Research
  • [Presentation] Speech Quality Evaluation of Synthesized Japanese Speech Using EEG (2019)

    • Author(s)
      Ivan Halim Parmonangan, Hiroki Tanaka, Sakriani Sakti, Shinnosuke Takamichi, Satoshi Nakamura
    • Organizer
      Interspeech 2019
    • Int'l Joint Research
  • [Presentation] EEG Analysis towards Evaluating Synthesized Speech Quality (2019)

    • Author(s)
      Ivan Halim Parmonangan, Hiroki Tanaka, Sakriani Sakti, Shinnosuke Takamichi, Satoshi Nakamura
    • Organizer
      IEEE Engineering in Medicine and Biology Society
    • Int'l Joint Research
  • [Presentation] Cross-lingual speech-based ToBI label generation using bidirectional LSTM (2019)

    • Author(s)
      Marco Vetter, Sakriani Sakti, Satoshi Nakamura
    • Organizer
      IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
    • Int'l Joint Research
  • [Presentation] End-to-end feedback loss in speech chain framework via straight-through estimator (2019)

    • Author(s)
      Andros Tjandra, Sakriani Sakti, Satoshi Nakamura
    • Organizer
      IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
    • Int'l Joint Research
  • [Presentation] Speech Artifact Removal from EEG Recordings of Spoken Word Production with Tensor Decomposition (2019)

    • Author(s)
      Holy Lovenia, Hiroki Tanaka, Sakriani Sakti, Ayu Purwarianti, Satoshi Nakamura
    • Organizer
      IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
    • Int'l Joint Research
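  • [Patent (Industrial Property Rights)] Speech chain apparatus, computer program, and DNN speech recognition and synthesis mutual learning method (2018)

    • Inventor(s)
      Andros Tjandra, Sakriani Sakti, Satoshi Nakamura
    • Industrial Property Rights Holder
      Andros Tjandra, Sakriani Sakti, Satoshi Nakamura
    • Industrial Property Rights Type
      Patent
    • Patent Publication Number
      Japanese Unexamined Patent Application Publication No. 2019-120841 (特開2019-120841)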


Published: 2021-01-27  
