2017 Fiscal Year Research-status Report

Research on Unsupervised Acoustic Pattern Discovery with Zero Resources

Research Project

Project/Area Number	17K00237
Research Institution	Nara Institute of Science and Technology
Principal Investigator	Sakriani Sakti, Nara Institute of Science and Technology, Graduate School of Information Science, Project Associate Professor (00395005)
Co-Investigator(Kenkyū-buntansha)	Satoshi Nakamura, Nara Institute of Science and Technology, Data Science Center, Professor (30263429)
Project Period (FY)	2017-04-01 – 2020-03-31
Keywords	Speech recognition / Zero-resource speech technology / EEG
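Outline of Annual Research Achievements	As the 2020 Tokyo Olympic and Paralympic Games approach, the language barrier with visitors from overseas is becoming an increasingly serious problem. Current speech recognition and speech translation technologies are already readily available for high-resource languages, so this project targets zero-resource speech processing, where neither language-specific knowledge nor transcribed data is available. Techniques for unsupervised acoustic unit modeling and pattern discovery exist, but they have not yet been developed to the point of linking the discovered units to linguistic and semantic representations. This research therefore proposes a method that incorporates cognitive knowledge based on EEG analysis into zero-resource modeling as a means of connecting the speech of an unknown language with semantic representations, and aims to complete the framework and demonstrate its application to multiple languages. In FY2017, we surveyed the literature on natural language processing and cognitive science and, with respect to the cognition of language and speech, designed and built systems for zero-resource modeling of African languages (such as Xitsonga) and for EEG analysis. Centering on the Dirichlet process Gaussian mixture model (DPGMM), we clustered speech feature vectors into a dynamically sized set of classes. By regarding each class as an acoustic unit, speech can be represented as a sequence of class posteriors (a posteriorgram). This optimization was shown to greatly improve the quality of subword modeling. The approach was validated by our participation in the Zero Resource Speech Challenge, where it achieved the best performance and won the competition. We also conducted an experiment on discriminating Japanese sentences from EEG recordings. In this experiment, we used template matching and classifiers to investigate performance under a variety of settings. We further conducted experiments on speaker dependence. Finally, we ran experiments on multiple frequency bands, including theta, alpha, beta, low gamma, and the combination of all bands; the combination of multiple frequency bands gave the best results.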
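The step described above, in which each cluster is treated as an acoustic unit and speech is represented as a sequence of class posteriors, can be illustrated with a minimal sketch. This is not the project's implementation: the mixture parameters and frames below are hypothetical toy values, diagonal covariances are assumed, and the DPGMM fitting itself (which would determine the number of classes from the data) is omitted. Only the computation of a posteriorgram from an already fitted Gaussian mixture is shown.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at frame x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def posteriorgram(frames, weights, means, variances):
    """Return, for every frame, the posterior probability of each class."""
    result = []
    for x in frames:
        # log p(class) + log p(frame | class) for every class
        log_joint = [
            math.log(w) + log_gauss_diag(x, m, v)
            for w, m, v in zip(weights, means, variances)
        ]
        # Normalise with the log-sum-exp trick for numerical stability
        top = max(log_joint)
        unnorm = [math.exp(lj - top) for lj in log_joint]
        total = sum(unnorm)
        result.append([u / total for u in unnorm])
    return result


# Hypothetical 2-D mixture with three classes (assumed, not fitted to real speech)
WEIGHTS = [0.5, 0.3, 0.2]
MEANS = [[0.0, 0.0], [3.0, 3.0], [-3.0, 3.0]]
VARIANCES = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]

# Hypothetical feature frames, one per time step
FRAMES = [[0.1, -0.2], [2.9, 3.1], [-3.2, 2.8]]

POSTERIORS = posteriorgram(FRAMES, WEIGHTS, MEANS, VARIANCES)

# Most likely acoustic unit per frame, read off the posteriorgram
UNIT_SEQUENCE = [max(range(len(p)), key=p.__getitem__) for p in POSTERIORS]
```

Each row of `POSTERIORS` sums to one, and `UNIT_SEQUENCE` gives a discrete unit label per frame. In a real system the frames would be acoustic feature vectors and the mixture would come from the clustering step rather than being written by hand.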
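Current Status of Research Progress	2: Research has progressed on the whole more than it was originally planned. Reason: We succeeded in building the zero-resource modeling for an African language (Xitsonga) that was in the original plan. In addition, we won the Zero Resource Speech Challenge, which covered English, German, French, Chinese, and African languages. As for EEG analysis, the analysis of Japanese proceeded as scheduled. However, for low-resource languages such as African languages, it is difficult to find test subjects within Japan, and this remains an open issue.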
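Strategy for Future Research Activity	The following research activities will continue in 2018 and 2019. FY2018: continue building the zero-resource modeling and the EEG experiments, analyze cognitive knowledge sources, and study the design for integrating that knowledge into zero-resource modeling. FY2019: complete the proposed framework, evaluate its performance, and run demonstration experiments on its application to multiple languages. The final goal is to develop a system capable of speech translation from Xitsonga into Japanese/English.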

Research Products
(20 results)

All 2018 2017 Other

All Int'l Joint Research (1 results) Journal Article (10 results) (of which Int'l Joint Research: 10 results, Peer Reviewed: 10 results, Open Access: 4 results) Presentation (8 results) (of which Int'l Joint Research: 8 results) Patent(Industrial Property Rights) (1 results)

[Int'l Joint Research] University of Indonesia/Institute Technology Bandung(Indonesia)
- Country Name
  Indonesia
- Counterpart Institution
  University of Indonesia/Institute Technology Bandung
[Journal Article] Graph Regularized Tensor Factorization for Single-trial EEG Analysis (2018)
- Author(s)
  Hayato Maki, Hiroki Tanaka, Sakriani Sakti, Satoshi Nakamura
- Journal Title
  
  Proceeding of International Conference on Acoustic, Speech, and Signal Processing (ICASSP)
  
  Volume: Vol. 1 Pages: -
- Peer Reviewed / Int'l Joint Research
[Journal Article] Quality Prediction of Synthesized Speech Based on Tensor Structured EEG Signals (2018)
- Author(s)
  Hayato Maki, Hiroki Tanaka, Sakriani Sakti, Satoshi Nakamura
- Journal Title
  
  Transaction of PLOS One
  
  Volume: Vol. 1 Pages: -
- Peer Reviewed / Int'l Joint Research
[Journal Article] Subject-independent Classification of Japanese Spoken Sentences by Multiple Frequency Bands Phase Pattern of EEG Response during Speech Perception (2017)
- Author(s)
  Hiroki Watanabe, Hiroki Tanaka, Sakriani Sakti, Satoshi Nakamura
- Journal Title
  
  Proceeding of INTERSPEECH 2017
  
  Volume: Vol.1 Pages: pp. 2431-2435
- DOI
  10.21437/Interspeech.2017-854
- Peer Reviewed / Int'l Joint Research
[Journal Article] Speech Recognition Features Based On Deep Latent Gaussian Models (2017)
- Author(s)
  Andros Tjandra, Sakriani Sakti, Satoshi Nakamura
- Journal Title
  
  Proceeding of IEEE International Workshop on Machine Learning for Signal Processing (MLSP 2017)
  
  Volume: Vol.1 Pages: -
- DOI
  10.1109/MLSP.2017.8168174
- Peer Reviewed / Int'l Joint Research
[Journal Article] Local Monotonic Attention Mechanism for End-to-End Speech and Language Processing (2017)
- Author(s)
  Andros Tjandra, Sakriani Sakti, Satoshi Nakamura
- Journal Title
  
  Proceedings of the The 8th International Joint Conference on Natural Language Processing
  
  Volume: Vol. 1 Pages: pp. 431-440
- Peer Reviewed / Open Access / Int'l Joint Research
[Journal Article] End-to-End Speech Recognition with Local Monotonic Attention (2017)
- Author(s)
  Andros Tjandra, Sakriani Sakti, Satoshi Nakamura
- Journal Title
  
  Proceedings of NIPS Workshop on Machine Learning for Audio Signal Processing (ML4Audio)
  
  Volume: None Pages: -
- Peer Reviewed / Open Access / Int'l Joint Research
[Journal Article] Listening while Speaking: Speech Chain by Deep Learning (2017)
- Author(s)
  Andros Tjandra, Sakriani Sakti, Satoshi Nakamura
- Journal Title
  
  Proceedings of IEEE Automatic Speech Recognition and Understanding (ASRU)
  
  Volume: Vol. 1 Pages: -
- DOI
  10.1109/ASRU.2017.8268950
- Peer Reviewed / Open Access / Int'l Joint Research
[Journal Article] Attention-based Wav2Text with Feature Transfer Learning (2017)
- Author(s)
  Andros Tjandra, Sakriani Sakti, Satoshi Nakamura
- Journal Title
  
  Proceedings of IEEE Automatic Speech Recognition and Understanding (ASRU)
  
  Volume: Vol. 1 Pages: -
- DOI
  10.1109/ASRU.2017.8268951
- Peer Reviewed / Open Access / Int'l Joint Research
[Journal Article] Feature Optimized DPGMM Clustering for Unsupervised Subword Modeling: A Contribution to ZeroSpeech 2017 (2017)
- Author(s)
  Michael Heck, Sakriani Sakti, Satoshi Nakamura
- Journal Title
  
  Proceedings of IEEE Automatic Speech Recognition and Understanding (ASRU)
  
  Volume: Vol. 1 Pages: -
- DOI
  10.1109/ASRU.2017.8269011
- Peer Reviewed / Int'l Joint Research
[Journal Article] Learning Supervised Feature Transformations on Zero Resources for Improved Acoustic Unit Discovery (2017)
- Author(s)
  Michael Heck, Sakriani Sakti, Satoshi Nakamura
- Journal Title
  
  Transaction on Information and Systems
  
  Volume: Vol.E101-D Pages: -
- DOI
  10.1587/transinf.2017EDP7175
- Peer Reviewed / Int'l Joint Research
[Presentation] Graph Regularized Tensor Factorization for Single-trial EEG Analysis (2018)
- Author(s)
  Hayato Maki
- Organizer
  International Conference on Acoustic, Speech, and Signal Processing (ICASSP)
- Int'l Joint Research
[Presentation] Subject-independent Classification of Japanese Spoken Sentences by Multiple Frequency Bands Phase Pattern of EEG Response during Speech Perception (2017)
- Author(s)
  Hiroki Watanabe
- Organizer
  INTERSPEECH
- Int'l Joint Research
[Presentation] Speech Recognition Features Based On Deep Latent Gaussian Models (2017)
- Author(s)
  Andros Tjandra, Sakriani Sakti
- Organizer
  IEEE International Workshop on Machine Learning for Signal Processing (MLSP 2017)
- Int'l Joint Research
[Presentation] Local Monotonic Attention Mechanism for End-to-End Speech and Language Processing (2017)
- Author(s)
  Andros Tjandra
- Organizer
  the International Joint Conference on Natural Language Processing (IJCNLP 2017)
- Int'l Joint Research
[Presentation] End-to-End Speech Recognition with Local Monotonic Attention (2017)
- Author(s)
  Andros Tjandra
- Organizer
  NIPS Workshop on Machine Learning for Audio Signal Processing (ML4Audio)
- Int'l Joint Research
[Presentation] Listening while Speaking: Speech Chain by Deep Learning (2017)
- Author(s)
  Andros Tjandra, Sakriani Sakti, Satoshi Nakamura
- Organizer
  IEEE Automatic Speech Recognition and Understanding (ASRU)
- Int'l Joint Research
[Presentation] Attention-based Wav2Text with Feature Transfer Learning (2017)
- Author(s)
  Andros Tjandra, Sakriani Sakti, Satoshi Nakamura
- Organizer
  IEEE Automatic Speech Recognition and Understanding (ASRU)
- Int'l Joint Research
[Presentation] Feature Optimized DPGMM Clustering for Unsupervised Subword Modeling: A Contribution to ZeroSpeech 2017 (2017)
- Author(s)
  Michael Heck, Sakriani Sakti
- Organizer
  IEEE Automatic Speech Recognition and Understanding (ASRU)
- Int'l Joint Research
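[Patent(Industrial Property Rights)] National University Corporation Nara Institute of Science and Technology (2017)
- Inventor(s)
  Andros Tjandra, Sakriani Sakti, Satoshi Nakamura
- Industrial Property Rights Holder
  Andros Tjandra, Sakriani Sakti, Satoshi Nakamura
- Industrial Property Rights Type
  Patent
- Industrial Property Number
  Japanese Patent Application No. 2018-1538 (特願2018-1538)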
