2023 Fiscal Year Annual Research Report

Automatic Model Adaptation to Users Based on Unified Modeling of Spoken Dialogue Systems

Research Project

Project/Area Number	23H03457
Allocation Type	Single-year Grants
Research Institution	Osaka University
Principal Investigator	Ryu Takeda (武田 龍), Osaka University, SANKEN (The Institute of Scientific and Industrial Research), Associate Professor (20749527)
Project Period (FY)	2023-04-01 – 2027-03-31
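Keywords	Unified modeling / Spoken dialogue system / Speech recognition model / Knowledge graph / User response prediction
Outline of Annual Research Achievements	This fiscal year, of the project's three tasks, we worked on (1) developing component technologies for unified modeling and (3) interactive learning, and we also advanced the construction of foundation models for spoken dialogue systems. In component technology development, we developed techniques for integrating two models. First, by applying missing-data techniques, we propagated the confidence of a speech enhancement model into the speech recognition model, which improved recognition accuracy in noisy environments. This technique can also be applied to integrating other pairs of models. Next, we interpreted and integrated a knowledge model (knowledge graph), a large language model, and an entity identification model within a generative-model framework, and developed a technique for completing unknown entities. These results were presented at the peer-reviewed international conferences APSIPA, PRICAI, and IJCKG, and received the Best Research Paper award at IJCKG. In interactive learning, we worked on predicting user responses and improving the accuracy of out-of-vocabulary (OOV) word recognition. As a first step, we addressed the situation in which the system asks the user to confirm an unknown word and learns it from the user. We modeled the patterns of user responses to system questions and used this model as a language prediction model during recognition, which improved OOV word detection accuracy. In addition, for the speech recognition and word segmentation models used in OOV word recognition, we improved OOV word detection accuracy by integrating multiple models with different characteristics. These results were presented at the peer-reviewed international conferences APSIPA and IWSDS. Finally, in foundation model construction, we built noise-robust speech recognition and voice activity detection models toward implementing a spoken dialogue system that operates in real environments. Using multiple speech and non-speech corpora, we performed multi-condition training of each model on more than 1,000 hours of data. Preparations for public release are under way.
Current Status of Research Progress	2: Research has progressed on the whole more than it was originally planned. Reason: Integrated modeling and interactive learning are progressing largely as planned. For the component models, we covered the full range from the speech enhancement model to the knowledge model. In interactive learning, we also began modeling user responses and presented the results at international conferences. Although limited to specific situations, we have also been collecting conversation data using a spoken dialogue system, and were thus able to prepare for the next fiscal year.
Strategy for Future Research Activity	We will proceed by generalizing this fiscal year's work. In integrated modeling, we will begin work on integrating three or more models and on predicting users' knowledge models, among other topics. In interactive learning, we will first restrict the situations and then advance data collection and extend the user response and dialogue models, interleaving the adaptation of vocabulary and knowledge models. Results will be submitted to peer-reviewed international conferences and other venues as appropriate.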
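As an illustration of the multi-condition training mentioned in the outline above, the following Python sketch creates noisy copies of one utterance by mixing in noise at randomly chosen signal-to-noise ratios (SNRs). It is not the project's pipeline, which the report does not describe: the function names `mix_at_snr` and `multi_condition_batch`, the 0 to 20 dB SNR range, and the number of copies are all assumptions made for illustration.

```python
import math
import random


def power(signal):
    """Mean squared amplitude of a sequence of samples."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech so that the mixture has the requested SNR in dB.

    The noise is looped if it is shorter than the speech.
    """
    looped = [noise[i % len(noise)] for i in range(len(speech))]
    noise_power = power(looped)
    if noise_power == 0.0:
        return list(speech)  # silent noise: nothing to mix in
    # Choose gain g so that power(speech) / power(g * noise) == 10 ** (snr_db / 10).
    gain = math.sqrt(power(speech) / (noise_power * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, looped)]


def multi_condition_batch(speech, noises, snr_range=(0.0, 20.0), copies=4, seed=0):
    """Create several noisy copies of one utterance under random conditions.

    The SNR range and number of copies are illustrative assumptions.
    """
    rng = random.Random(seed)
    batch = []
    for _ in range(copies):
        noise = rng.choice(noises)         # random noise type
        snr_db = rng.uniform(*snr_range)   # random noise level
        batch.append((snr_db, mix_at_snr(speech, noise, snr_db)))
    return batch
```

Training on such a batch exposes a model to the same utterance under several noise types and levels, which is the general idea behind multi-condition training. A real setup would operate on audio arrays and real noise corpora rather than Python lists.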

Research Products
(7 results)

Journal Article (5 results, of which peer reviewed: 5 results), Presentation (2 results)

[Journal Article] Toward OOV-word Acquisition during Spoken Dialogue using Syllable-based ASR and Word Segmentation (2024)
- Author(s)
  Ryu Takeda, Kazunori Komatani
- Journal Title
  Proceedings of International Workshop on Spoken Dialogue Systems (IWSDS)
  Volume: - Pages: -
- Peer Reviewed
[Journal Article] Link Prediction Based on Large Language Model and Knowledge Graph Retrieval under Open-World and Resource-Restricted Environment (2023)
- Author(s)
  Ryu Takeda, Hokuto Munakata, Kazunori Komatani
- Journal Title
  Proceedings of International Joint Conference on Knowledge Graphs (IJCKG)
  Volume: - Pages: -
- Peer Reviewed
[Journal Article] Flexible Evidence Model to Reduce Uncertainty Mismatch Between Speech Enhancement and ASR Based on Encoder-Decoder Architecture (2023)
- Author(s)
  Ryu Takeda, Yui Sudo, Kazunori Komatani
- Journal Title
  Proceedings of Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)
  Volume: - Pages: 1830-1837
- DOI
  10.1109/APSIPAASC58517.2023.10317247
- Peer Reviewed
[Journal Article] Out-Of-Vocabulary Word Detection in Spoken Dialogues Based on Joint Decoding with User Response Patterns (2023)
- Author(s)
  Miki Oshio, Hokuto Munakata, Ryu Takeda, Kazunori Komatani
- Journal Title
  Proceedings of Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)
  Volume: - Pages: 1753-1759
- DOI
  10.1109/APSIPAASC58517.2023.10317375
- Peer Reviewed
[Journal Article] Knowledge Graph Augmentation with Entity Identification for Improving Knowledge Graph Completion Performance (2023)
- Author(s)
  Shuichi Chikatsuji, Kenta Yamamoto, Ryu Takeda, Kazunori Komatani
- Journal Title
  Proceedings of Pacific Rim International Conference on Artificial Intelligence (PRICAI)
  Volume: - Pages: 480-487
- DOI
  10.1007/978-981-99-7019-3_43
- Peer Reviewed
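[Presentation] Identification of Knowledge Graph Entities Corresponding to Erroneous Syllable Recognition Results [誤りを含む音節認識結果に対応する知識グラフ内エンティティの同定] (2024)
- Author(s)
  平川巧人, Miki Oshio, Shuichi Chikatsuji, Ryu Takeda, Kazunori Komatani
- Organizer
  IPSJ National Convention (情報処理学会全国大会)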
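[Presentation] Construction of a Spoken Dialogue System with Out-of-Vocabulary Word Recognition and Data Collection [未知語認識機能を有する音声対話システムの構築とデータ収集] (2024)
- Author(s)
  Miki Oshio, Ryu Takeda, Kazunori Komatani
- Organizer
  IPSJ National Convention (情報処理学会全国大会)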
