Project/Area Number |
24K15004
|
Research Category |
Grant-in-Aid for Scientific Research (C)
|
Allocation Type | Multi-year Fund |
Section | General |
Review Section |
Basic Section 61010: Perceptual information processing-related
|
Research Institution | National Institute of Information and Communications Technology |
Principal Investigator |
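沈 鵬 (SHEN Peng), National Institute of Information and Communications Technology, Universal Communication Research Institute, Advanced Speech Translation Research and Development Promotion Center, Senior Researcher (80773118)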
|
Co-Investigator(Kenkyū-buntansha) |
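LU Xugang, National Institute of Information and Communications Technology, Universal Communication Research Institute, Advanced Speech Translation Research and Development Promotion Center, Senior Researcher (20362022)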
|
Project Period (FY) |
2024-04-01 – 2027-03-31
|
Project Status |
Granted (Fiscal Year 2024)
|
Budget Amount |
¥4,680,000 (Direct Cost: ¥3,600,000, Indirect Cost: ¥1,080,000)
Fiscal Year 2026: ¥910,000 (Direct Cost: ¥700,000, Indirect Cost: ¥210,000)
Fiscal Year 2025: ¥1,430,000 (Direct Cost: ¥1,100,000, Indirect Cost: ¥330,000)
Fiscal Year 2024: ¥2,340,000 (Direct Cost: ¥1,800,000, Indirect Cost: ¥540,000)
|
Keywords | Latent representation / Speech recognition / Summarization / Meeting minutes system / Large speech model |
Outline of Research at the Start |
Motivated by the success of large language models, building large speech models (LSMs) that handle a variety of speech tasks has become an active research direction. Unlike text-based tasks, however, speech has no established set of meaningful autoregressive symbols that would allow a model to process speech signals iteratively. The core objective of this project is to build a recursive symbol system based on speech representations, so that an LSM can capture the multi-level logical relationships within speech signals.
|