Improving Speaker Recognition by Using Linguistic Information Inherent in Speech

Research Project

Project/Area Number	21K11967
Research Category	Grant-in-Aid for Scientific Research (C)
Allocation Type	Multi-year Fund
Section	General
Review Section	Basic Section 61010:Perceptual information processing-related
Research Institution	Yokohama City University
Principal Investigator	Koshinaka Takafumi, Yokohama City University, School of Data Science, Professor (60895928)
Project Period (FY)	2021-04-01 – 2024-03-31
Project Status	Completed (Fiscal Year 2023)
Budget Amount	¥4,160,000 (Direct Cost: ¥3,200,000, Indirect Cost: ¥960,000); Fiscal Year 2023: ¥1,040,000 (Direct Cost: ¥800,000, Indirect Cost: ¥240,000); Fiscal Year 2022: ¥1,040,000 (Direct Cost: ¥800,000, Indirect Cost: ¥240,000); Fiscal Year 2021: ¥2,080,000 (Direct Cost: ¥1,600,000, Indirect Cost: ¥480,000)
Keywords	Authorship recognition / Speaker recognition / Biometric authentication / Deep learning / Large language models / Generative AI / Deepfake detection / Neural networks / Natural language processing / Transformer / Adversarial attacks / Deepfake
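Outline of Research at the Start	The words people speak and the texts they write carry individual traits, such as choice of vocabulary, turns of phrase, and writing style. This research focuses on this individuality in language and develops techniques for inferring, from spoken or written text data, the person who spoke or wrote it. By leveraging recent neural network methods, it is expected to become possible to authenticate individuals from speech and messages, and to detect artificial messages generated by AI in order to prevent fraud.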
Outline of Final Research Achievements	Aiming to improve speaker recognition technology, which is not yet as widely deployed as face or vein recognition, we investigated methods for using the linguistic information contained in speech and demonstrated the effectiveness of recent deep learning models. We also examined how well the output of generative AI, including large language models, which have made remarkable progress in recent years, can be distinguished from that of humans. Furthermore, to clarify the capabilities and behavior of generative AI, we compared the quality of text produced by large language models and demonstrated that state-of-the-art models, such as GPT-4, have text generation capabilities equal to or exceeding those of humans. We also examined image captioning models, quantitatively measuring the capabilities of the latest models in image classification tasks.
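Academic Significance and Societal Importance of the Research Achievements	As digital society advances, there is growing demand for technology that verifies a user's identity safely and easily. This research improves the accuracy of voice authentication, which is expected to spread alongside face and vein authentication, and thereby contributes to a safer and more convenient society. In addition, by clarifying the properties of so-called generative AI, such as large language models, which has developed rapidly in recent years and attracts considerable public attention, it promotes the adoption of AI in society and contributes to the sound development of AI technology.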

Report

(4 results)

2023 Annual Research Report
Final Research Report (PDF)
2022 Research-status Report
2021 Research-status Report

Research Products
(8 results)

Presentation (7 results), Remarks (1 result)

[Presentation] Quality Evaluation of LLM-Generated Content from an SEO Perspective (2024)
- Author(s)
  益子怜, 木村賢, 越仲孝文
- Organizer
  Annual Meeting of the Association for Natural Language Processing (NLP2024)
- Related Report
  2023 Annual Research Report
[Presentation] Try-On Image Generation with Design Changes via Text Prompts (2024)
- Author(s)
  武本孝輔, 越仲孝文
- Organizer
  Annual Conference of the Japanese Society for Artificial Intelligence (JSAI2024)
- Related Report
  2023 Annual Research Report
[Presentation] Response Generation for Low-Rated Reviews Using Emotion Assignment (2023)
- Author(s)
  益子怜, 越仲孝文
- Organizer
  Annual Conference of the Japanese Society for Artificial Intelligence (JSAI2023)
- Related Report
  2023 Annual Research Report
[Presentation] Image Captioning Tells More Than the Image Itself (2023)
- Author(s)
  有働帆乃璃, 越仲孝文
- Organizer
  Annual Conference of the Japanese Society for Artificial Intelligence (JSAI2023)
- Related Report
  2023 Annual Research Report
[Presentation] Response Generation for Low-Rated Reviews Using Emotion Assignment (2023)
- Author(s)
  益子怜, 越仲孝文
- Organizer
  2023 Annual Conference of the Japanese Society for Artificial Intelligence (JSAI2023)
- Related Report
  2022 Research-status Report
[Presentation] Image Captioning Tells More Than the Image Itself (2023)
- Author(s)
  有働帆乃璃, 越仲孝文
- Organizer
  2023 Annual Conference of the Japanese Society for Artificial Intelligence (JSAI2023)
- Related Report
  2022 Research-status Report
[Presentation] Rating Prediction from E-Commerce Site Review Texts and Analysis of Buyer Evaluations (2022)
- Author(s)
  小林, 越仲
- Organizer
  2022 Annual Conference of the Japanese Society for Artificial Intelligence (JSAI2022)
- Related Report
  2021 Research-status Report
[Remarks] Koshinaka Laboratory
- URL
  https://sites.google.com/view/koshinak-lab/home-jp
- Related Report
  2023 Annual Research Report
