Project/Area Number |
12480085
|
Research Category |
Grant-in-Aid for Scientific Research (B)
|
Allocation Type | Single-year Grants |
Section | General |
Research Field |
Intelligent informatics
|
Research Institution | KYOTO UNIVERSITY |
Principal Investigator |
KAWAHARA Tatsuya Kyoto University, Graduate School of Informatics, Associate Professor (00234104)
|
Co-Investigator(Kenkyū-buntansha) |
DOSHITA Shuji Ryukoku University, Faculty of Science and Technology, Professor (00025925)
IKEDA Katsuo Osaka Institute of Technology, Department of Information Science, Professor (30026009)
KUROHASHI Sadao The University of Tokyo, Graduate School of Information Science and Technology, Associate Professor (50263108)
OKUNO Hiroshi Kyoto University, Graduate School of Informatics, Professor (60318201)
SATO Satoshi Kyoto University, Graduate School of Informatics, Associate Professor (30205918)
|
Project Period (FY) |
2000 – 2002
|
Project Status |
Completed (Fiscal Year 2002)
|
Budget Amount |
¥7,200,000 (Direct Cost: ¥7,200,000)
Fiscal Year 2002: ¥1,500,000 (Direct Cost: ¥1,500,000)
Fiscal Year 2001: ¥1,800,000 (Direct Cost: ¥1,800,000)
Fiscal Year 2000: ¥3,900,000 (Direct Cost: ¥3,900,000)
|
Keywords | spoken language processing / speech recognition / spontaneous speech / acoustic model / language model / HMM / N-gram / speaker recognition |
Research Abstract |
This project addresses automatic transcription of lectures using the Corpus of Spontaneous Japanese, collected under a priority research project in Japan. First, we investigate the effect of speaking style and the amount of training data on acoustic modeling. Then, to supplement the training data for the language model, we incorporate other text corpora and optimize the mixture weights. We also implement a sequential decoding method that does not require prior segmentation of lecture recordings.
Next, we investigate acoustic, pronunciation, and language modeling to improve accuracy, focusing on the following issues:
(1) Speaking-rate-dependent decoding and adaptation of the acoustic model
(2) Statistical modeling of pronunciation variations and unsupervised adaptation of the language model
Furthermore, we study the following spoken language processing tasks:
(3) Automatic indexing of lecture audio by extracting topic-independent discourse markers
(4) Automatic transformation of lecture transcripts into document style using a statistical framework
(5) Extraction of important sentences from lectures using statistics of discourse markers and topic words
|
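The abstract mentions combining text corpora with optimized mixture weights but does not say how the optimization is done. One common approach, shown here only as an illustrative sketch and not as the project's actual method, is to linearly interpolate the component language models and estimate the weights by expectation-maximization (EM) on held-out text. The function name `estimate_mixture_weights` and the input layout (one row of per-model probabilities for each held-out token) are assumptions made for this example.

```python
from typing import List, Sequence


def estimate_mixture_weights(
    component_probs: Sequence[Sequence[float]],
    iterations: int = 50,
    tol: float = 1e-8,
) -> List[float]:
    """Estimate linear-interpolation weights with EM (hypothetical helper).

    component_probs[t][k] is the probability that component language
    model k assigns to held-out token t given its history.
    """
    num_models = len(component_probs[0])
    weights = [1.0 / num_models] * num_models  # start from uniform weights

    for _ in range(iterations):
        totals = [0.0] * num_models
        for probs in component_probs:
            mix = sum(w * p for w, p in zip(weights, probs))
            if mix <= 0.0:
                continue  # token has zero probability under every model
            # E-step: posterior responsibility of each model for this token
            for k in range(num_models):
                totals[k] += weights[k] * probs[k] / mix

        norm = sum(totals)
        if norm == 0.0:
            return weights  # nothing to learn from; keep current weights

        # M-step: new weights are the normalized responsibilities
        new_weights = [t / norm for t in totals]
        change = max(abs(a - b) for a, b in zip(new_weights, weights))
        weights = new_weights
        if change < tol:
            break

    return weights


if __name__ == "__main__":
    # Toy held-out data: two component models, four tokens (made-up values).
    held_out = [[0.20, 0.05], [0.10, 0.02], [0.01, 0.30], [0.15, 0.10]]
    print(estimate_mixture_weights(held_out))
```

Each EM iteration cannot decrease the held-out likelihood of the interpolated model, which is why this procedure is a frequent choice when a small in-domain corpus is supplemented with larger out-of-domain text.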