Content Retrieval against large-scale spoken documents based on the integration of speech and language processing
Project/Area Number | 22500090 |
Research Category | Grant-in-Aid for Scientific Research (C) |
Allocation Type | Single-year Grants |
Section | General |
Research Field | Media informatics/Database |
Research Institution | Toyohashi University of Technology |
Principal Investigator | AKIBA Tomoyosi, Toyohashi University of Technology, Graduate School of Engineering, Associate Professor (00356346) |
Co-Investigator (Kenkyū-buntansha) | NAKAGAWA Seiichi, Toyohashi University of Technology, Graduate School of Engineering, Professor (20115893) |
Project Period (FY) | 2010 – 2012 |
Project Status | Completed (Fiscal Year 2012) |
Budget Amount |
¥4,420,000 (Direct Cost: ¥3,400,000, Indirect Cost: ¥1,020,000)
Fiscal Year 2012: ¥1,430,000 (Direct Cost: ¥1,100,000, Indirect Cost: ¥330,000)
Fiscal Year 2011: ¥1,040,000 (Direct Cost: ¥800,000, Indirect Cost: ¥240,000)
Fiscal Year 2010: ¥1,950,000 (Direct Cost: ¥1,500,000, Indirect Cost: ¥450,000)
|
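Keywords | information retrieval / speech information processing / natural language processing / spoken document processing / spoken document retrieval / detection of search terms in speech / spoken content retrieval / spoken documents / retrieval / spoken search-term detection / indexing / passage retrieval / speech recognition / query expansion / countermeasures against recognition errors / Spoken Term Detection / relevance model |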
Research Abstract |
We conducted research and development on spoken content retrieval targeting large-scale spoken documents. First, for the spoken term detection (STD) task, which aims to detect the position in a spoken document at which a given term appears, we developed a method that requires no detection threshold and instead outputs the candidates in order of their plausibility. Ultimately, we achieved detection about 70 times faster than the baseline continuous DP matching, at almost the same detection performance. Next, for the spoken content retrieval (SCR) task, which aims to find the segment of a spoken document that is relevant to a query topic expressed in natural language, we developed a method that uses STD as a preprocessing step and is robust to recognition errors and out-of-vocabulary (OOV) words. We found that the proposed method is effective for queries that include OOV words and that it works complementarily with the conventional SCR method, which relies on large vocabulary continuous speech recognition (LVCSR); combining the two improved retrieval performance.
|
Report (4 results)
Research Products (73 results)
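[Journal Article] Construction and Analysis of Test Collections for Spoken Term Detection (音声中の検索語検出のためのテストコレクションの構築と分析), 2013
Author(s)
Yoshiaki Itoh, Hiromitsu Nishizaki, Seiichi Nakagawa, Tomoyosi Akiba, Tatsuya Kawahara, Xinhui Hu, Hiroaki Nanjo, Tomoko Matsui, Yoichi Yamashita, Kiyoaki Aikawa
Journal Title
IPSJ Journal (情報処理学会論文誌)
Volume: Vol. 54, No. 2
Pages: 471-483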
Related Report
Peer Reviewed
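[Presentation] NTCIR-9 Summary and Future Outlook (NTCIR-9総括と今後の展望), 2012
Author(s)
Tetsuya Sakai, Hideo Joho, Noriko Kando, Tsuneaki Kato, Akiko Aizawa, Tomoyosi Akiba, Isao Goto, Fuminori Kimura, Teruko Mitamura, Hiromitsu Nishizaki, Hideki Shima, Masaharu Yoshioka, Shlomo Geva, Ling-Xiang Tang, Andrew Trotman, Yue Xu
Organizer
IPSJ SIG Technical Reports (情報処理学会研究報告)
Place of Presentation
Shirayuri University (白百合女子大学)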
Year and Date
2012-03-26
Related Report
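[Presentation] NTCIR-9 Summary and Future Outlook (NTCIR-9総括と今後の展望), 2012
Author(s)
Tetsuya Sakai, Hideo Joho, Noriko Kando, Tsuneaki Kato, Akiko Aizawa, Tomoyosi Akiba, Isao Goto, Fuminori Kimura, Teruko Mitamura, Hiromitsu Nishizaki, Hideki Shima, Masaharu Yoshioka, Shlomo Geva, Ling-Xiang Tang, Andrew Trotman, Yue Xu
Organizer
106th Meeting of the IPSJ Special Interest Group on Information Fundamentals and Access Technologies (第106回情報基礎とアクセス技術研究会)
Place of Presentation
Shirayuri University (白百合女子大学), Tokyo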
Year and Date
2012-03-26
Related Report
[Presentation] Constructing Japanese Test Collections for Spoken Term Detection, 2010
Author(s)
Yoshiaki Itoh, Hiromitsu Nishizaki, Xinhui Hu, Hiroaki Nanjo, Tomoyosi Akiba, Tatsuya Kawahara, Seiichi Nakagawa, Tomoko Matsui, Yoichi Yamashita, Kiyoaki Aikawa
Organizer
In Proceedings of 11th Annual Conference of the International Speech Communication Association (INTERSPEECH 2010)
Place of Presentation
Makuhari, Chiba
Year and Date
2010-09-28
Related Report