2015 Fiscal Year Annual Research Report

Simultaneous speech translation methods for news and lectures in foreign languages

Research Project

Project/Area Number	24240032
Research Institution	Nara Institute of Science and Technology
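Principal Investigator	Satoshi Nakamura, Nara Institute of Science and Technology, Graduate School of Information Science, Professor (30263429)
Co-Investigator (Kenkyū-buntansha)	Yuji Matsumoto, Nara Institute of Science and Technology, Graduate School of Information Science, Professor (10211575); Sakriani Sakti, Nara Institute of Science and Technology, Graduate School of Information Science, Assistant Professor (00395005); Graham Neubig, Nara Institute of Science and Technology, Graduate School of Information Science, Assistant Professor (70633428); Kevin Duh, Nara Institute of Science and Technology, Graduate School of Information Science, Assistant Professor (80637322); Tomoki Toda, Nara Institute of Science and Technology, Graduate School of Information Science, Associate Professor (90403328)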
Project Period (FY)	2012-05-31 – 2017-03-31
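Keywords	Speech information processing
Outline of Annual Research Achievements	(1) Basic simultaneous interpretation methods: In FY2015 (Heisei 27), we built a model that uses part-of-speech information to predict whether word reordering will occur in the translation, and used it to try to improve simultaneous interpretation accuracy. We also implemented neural machine translation and showed that using it to rerank statistical machine translation output improves performance. (2) Communication evaluation: We measured speakers' reactions to substitution, insertion, and deletion errors in speech recognition. The results showed that cognitive load differs depending on the part of speech and the role of the misrecognized word. (3) Simultaneous interpretation corpus and prototype construction: We transcribed 9 hours of Japanese-to-English simultaneous interpretation (18 class A data items and 24 class B data items) and 7 hours of English-to-Japanese simultaneous interpretation (30 data items), and translated 10 lecture sessions from Japanese into English.
Current Status of Research Progress	2: Research has, on the whole, progressed more than originally planned. Reason: Simultaneous interpreters of different skill levels interpreted the same English lectures, and we analyzed the differences among their output. Using this simultaneous interpretation corpus, we also improved the accuracy of the simultaneous interpretation algorithm. Specifically, we proposed a method that uses morphological information to segment sentences in phrase-based statistical machine translation, which yielded further accuracy gains. It is noteworthy that the simultaneous interpretation system under development outperformed a professional interpreter with one year of experience. We integrated this machine translation module for simultaneous interpretation with multilingual speech recognition and speech synthesis to build a prototype of a simultaneous speech interpretation system.
Strategy for Future Research Activity	(1) Basic simultaneous interpretation methods: We will advance machine translation for simultaneous interpretation, and further improve and implement a speech recognition system that outputs intermediate results incrementally. We will begin research on speech translation that preserves the emphasis of the input utterance when generating target-language speech, and introduce it into the system. We will also begin examining translation methods based on deep learning. (2) Communication evaluation: We will continue the comparison with human simultaneous interpreters. We will also improve the accuracy of an evaluation measure built from importance judgments on the translation patterns of human translators. (3) News and lecture simultaneous interpretation corpus and prototype construction: In FY2014 (Heisei 26), we finished recording 21 hours of news and lectures. Part of the data has been transcribed; we will complete the transcription and annotation of the remaining data. We will also continue to expand the collected recordings of Japanese news and lecture speech and their simultaneous interpretation.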

Research Products
(16 results)

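[Presentation] A Study of Emotional Speech Recognition Using Bottleneck Features (2016)
- Author(s)
  Kohei Mukaihara, Sakriani Sakti, Koichiro Yoshino, Graham Neubig, Satoshi Nakamura
- Organizer
  Acoustical Society of Japan Spring Meeting
- Place of Presentation
  Toin University of Yokohama (Yokohama, Kanagawa, Japan)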
- Year and Date
  2016-03-09 – 2016-03-11
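[Presentation] Multi-task Learning of Speech and Environmental Sounds Using Deep Neural Networks (2016)
- Author(s)
  川西誠司, Sakriani Sakti, Koichiro Yoshino, Graham Neubig, Satoshi Nakamura
- Organizer
  Acoustical Society of Japan Spring Meeting
- Place of Presentation
  Toin University of Yokohama (Yokohama, Kanagawa, Japan)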
- Year and Date
  2016-03-09 – 2016-03-11
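[Presentation] Non-native Speech Recognition by Sequential Adaptation of the Pronunciation Dictionary and Acoustic Model Considering English Proficiency (2016)
- Author(s)
  Satoshi Tsujioka, Sakriani Sakti, Koichiro Yoshino, Graham Neubig, Satoshi Nakamura
- Organizer
  Acoustical Society of Japan Spring Meeting
- Place of Presentation
  Toin University of Yokohama (Yokohama, Kanagawa, Japan)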
- Year and Date
  2016-03-09 – 2016-03-11
[Presentation] The NAIST ASR for IWSLT: A Multi-architecture DNN System Combination Approach (2016)
- Author(s)
  Michael Heck, Quoc Truong Do, Sakriani Sakti, Graham Neubig, Satoshi Nakamura
- Organizer
  Acoustical Society of Japan Spring Meeting
- Place of Presentation
  Toin University of Yokohama (Yokohama, Kanagawa, Japan)
- Year and Date
  2016-03-09 – 2016-03-11
[Presentation] Word-level Emphasis Transfer in Speech-to-speech Translation (2016)
- Author(s)
  Do Truong, Shinnosuke Takamichi, Sakriani Sakti, Graham Neubig (NAIST), Tomoki Toda (NAIST/Nagoya University), Satoshi Nakamura
- Organizer
  Acoustical Society of Japan Spring Meeting
- Place of Presentation
  Toin University of Yokohama (Yokohama, Kanagawa, Japan)
- Year and Date
  2016-03-09 – 2016-03-11
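[Presentation] Self-Training of Syntactic Parsers Using Parallel Corpora (2016)
- Author(s)
  Makoto Morishita, Koichi Akabe, Yuto Hatakoshi, Graham Neubig, Koichiro Yoshino, Satoshi Nakamura
- Organizer
  The 22nd Annual Meeting of the Association for Natural Language Processing
- Place of Presentation
  Tohoku University (Sendai, Miyagi, Japan)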
- Year and Date
  2016-03-07 – 2016-03-11
[Presentation] A Study of Social-Affective Communication: Automatic Prediction of Emotion Triggers and Responses in Television Talk Shows (2015)
- Author(s)
  Nurul Lubis, Sakriani Sakti, Graham Neubig, Koichiro Yoshino, Tomoki Toda, Satoshi Nakamura
- Organizer
  2015 IEEE Automatic Speech Recognition and Understanding
- Place of Presentation
  Arizona (USA)
- Year and Date
  2015-12-13 – 2015-12-17
[Presentation] Parser Self-Training for Syntax-Based Machine Translation (2015)
- Author(s)
  Makoto Morishita, Koichi Akabe, Yuto Hatakoshi, Graham Neubig, Koichiro Yoshino, Satoshi Nakamura
- Organizer
  12th International Workshop on Spoken Language Translation (IWSLT)
- Place of Presentation
  Da Nang (Vietnam)
- Year and Date
  2015-12-03 – 2015-12-04
[Presentation] Improving Translation of Emphasis with Pause Prediction in Speech-to-speech Translation Systems (2015)
- Author(s)
  Quoc Truong Do, Sakriani Sakti, Graham Neubig, Tomoki Toda, Satoshi Nakamura
- Organizer
  12th International Workshop on Spoken Language Translation (IWSLT)
- Place of Presentation
  Da Nang (Vietnam)
- Year and Date
  2015-12-03 – 2015-12-04
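[Presentation] Non-native Speech Recognition by Data-driven Pronunciation Learning without Pronunciation Conversion Knowledge (2015)
- Author(s)
  Satoshi Tsujioka, Sakriani Sakti, Graham Neubig, Koichiro Yoshino, Satoshi Nakamura
- Organizer
  Technical Meeting on Spoken Language Processing (音声言語処理研究会)
- Place of Presentation
  Nagoya Institute of Technology (Nagoya, Aichi, Japan)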
- Year and Date
  2015-12-02 – 2015-12-03
[Presentation] Construction and Analysis of Social-Affective Interaction Corpus in English and Indonesian (2015)
- Author(s)
  Nurul Lubis, Sakriani Sakti, Graham Neubig, Tomoki Toda, Satoshi Nakamura
- Organizer
  Oriental COCOSDA 2015
- Place of Presentation
  Shanghai (China)
- Year and Date
  2015-10-28 – 2015-10-30
[Presentation] Preserving Word-level Emphasis in Speech-to-speech Translation using Linear Regression HSMMs (2015)
- Author(s)
  Quoc Truong Do, Shinnosuke Takamichi, Sakriani Sakti, Graham Neubig, Tomoki Toda, Satoshi Nakamura
- Organizer
  Interspeech 2015
- Place of Presentation
  Dresden (Germany)
- Year and Date
  2015-09-06 – 2015-09-10
[Presentation] Reinforcement Learning in Multi-Party Trading Dialog (2015)
- Author(s)
  Takuya Hiraoka, Kallirroi Georgila, Elnaz Nouri, David Traum, Satoshi Nakamura
- Organizer
  The 16th Annual SIGdial Meeting on Discourse and Dialogue (SIGDIAL)
- Place of Presentation
  Prague (Czech Republic)
- Year and Date
  2015-09-02 – 2015-09-04
[Presentation] Syntax-based Simultaneous Translation through Prediction of Unseen Syntactic Constituents (2015)
- Author(s)
  Yusuke Oda, Graham Neubig, Sakriani Sakti, Tomoki Toda, Satoshi Nakamura
- Organizer
  The 53rd Annual Meeting of the Association for Computational Linguistics (ACL)
- Place of Presentation
  Beijing (China)
- Year and Date
  2015-07-26 – 2015-07-31
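[Presentation] Acquisition of Pronunciation Dictionaries from Real Speech for Non-native Speech Recognition (2015)
- Author(s)
  Satoshi Tsujioka, Liang Lu (University of Edinburgh), Sakriani Sakti, Graham Neubig, Tomoki Toda, Satoshi Nakamura
- Organizer
  Technical Meeting on Spoken Language Processing (音声言語処理研究会)
- Place of Presentation
  Katakura Suwako Hotel, Kamisuwa Onsen (Suwa, Nagano, Japan)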
- Year and Date
  2015-07-16 – 2015-07-17
[Presentation] WFST-BASED STRUCTURAL CLASSIFICATION INTEGRATING DNN ACOUSTIC FEATURES AND RNN LANGUAGE FEATURES FOR SPEECH RECOGNITION (2015)
- Author(s)
  Quoc Truong Do, Satoshi Nakamura, Marc Delcroix, Takaaki Hori
- Organizer
  ICASSP 2015
- Place of Presentation
  Brisbane (Australia)
- Year and Date
  2015-04-19 – 2015-04-24