2007 Fiscal Year Final Research Report Summary
Automatic indexing for lecture speech and its advanced utilization through speech interaction
Project/Area Number |
17300064
|
Research Category |
Grant-in-Aid for Scientific Research (B)
|
Allocation Type | Single-year Grants |
Section | General |
Research Field |
Perception information processing/Intelligent robotics
|
Research Institution | Toyohashi University of Technology |
Principal Investigator |
NAKAGAWA Seiichi Toyohashi University of Technology, Faculty of Engineering, Professor (20115893)
|
Co-Investigator(Kenkyū-buntansha) |
AKIBA Tomoyoshi Toyohashi University of Technology, Faculty of Engineering, Assistant Professor (00356346)
TSUCHIYA Masatoshi Toyohashi University of Technology, Faculty of Engineering, Associate Professor (70378256)
KITAOKA Norihide Nagoya University, Graduate School of Information Science, Associate Professor (10333501)
KOGURE Satoru Shizuoka University, Faculty of Engineering, Associate Professor (40359758)
NISHIZAKI Hiromitsu University of Yamanashi, Faculty of Engineering, Associate Professor (40362082)
|
Project Period (FY) |
2005 – 2007
|
Keywords | classroom lecture speech / speech recognition / spoken language / language model / speech summarization / indexing / speech retrieval / browsing |
Research Abstract |
We collected classroom lecture speech from 16 speakers, comprising 114 lectures and 3,860 minutes, and published the corpus. For classroom lecture data from our university's graduate course, we developed procedures for automatic speech recognition, sentence extraction, segmentation/indexing, and speech retrieval, and constructed a lecture browsing system. These processes are necessary to improve the usability of broadcast audio or video data. In the case of lectures, summarized and indexed lecture speech or video enables students to learn more effectively. Our goal was to construct a framework for such structured lecture content. To achieve this goal, we first investigated the influence of recording methods on speech recognition performance. We found a 23% difference in accuracy between a high-quality handheld microphone and a low-quality lapel microphone. Furthermore, we improved the domain-dependent language model by using related Web texts and developed a filler insertion model. Second, we attempted automatic summarization by extracting important sentences and obtained κ values of 0.319-0.456, comparable with the 0.407-0.477 obtained by humans. Finally, we constructed a lecture browsing system that enables users to learn more effectively by using the results of the procedures described above, and evaluated it.
|
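The abstract reports summarization quality as κ values without defining the statistic. As a minimal sketch, assuming κ here means Cohen's kappa computed over per-sentence "important / not important" labels (the summary does not say which variant was used), chance-corrected agreement between two labelings could be computed as follows. The function name `cohen_kappa` and the sample labels are hypothetical and are not taken from the project.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(labels_a)

    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Expected agreement by chance, from each labeler's marginal label counts.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    categories = set(count_a) | set(count_b)
    expected = sum(count_a[c] * count_b[c] for c in categories) / (n * n)

    if expected == 1.0:
        # Both labelers used a single identical label for every item.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical data: 1 = sentence selected as important, 0 = not selected.
system = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]
human = [1, 0, 1, 1, 0, 0, 0, 0, 1, 0]
kappa = cohen_kappa(system, human)
print(round(kappa, 3))
```

Under this reading, a system-versus-human κ that falls inside or near the range of human-versus-human κ values is what the abstract describes as "comparable".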
Research Products
(15 results)