Automatic recognition of topic transition for newspaper articles and application to document summary
Project/Area Number |
15500086
|
Research Category |
Grant-in-Aid for Scientific Research (C)
|
Allocation Type | Single-year Grants |
Section | General |
Research Field |
Intelligent informatics
|
Research Institution | University of Yamanashi |
Principal Investigator |
SUZUKI Yoshimi University of Yamanashi, Interdisciplinary Graduate School of Medicine and Engineering, Associate Professor (20206551)
|
Project Period (FY) |
2003 – 2004
|
Project Status |
Completed (Fiscal Year 2004)
|
Budget Amount |
¥3,900,000 (Direct Cost: ¥3,900,000)
Fiscal Year 2004: ¥1,700,000 (Direct Cost: ¥1,700,000)
Fiscal Year 2003: ¥2,200,000 (Direct Cost: ¥2,200,000)
|
Keywords | newspaper articles / subsequent articles / synonym / summarization / subject cluster / topic transition / extraction of subsequent articles / automatic summarization |
Research Abstract |
This study addressed automatic summarization of newspaper articles. As a first step toward accurate multi-document summarization, we investigated the following. (1) A subject template is built from a large-scale corpus and used to extract the subsequent articles of a target article; the extracted subsequent articles are then classified into subject clusters. (2) Each subject cluster is summarized, and the full set of subsequent articles is summarized taking the connections between clusters into account. Regarding (1), we proposed a topic tracking method that uses subject templates and machine learning (support vector machines), and showed on large corpora (the Topic Detection and Tracking corpus and articles from the Mainichi Shimbun newspaper) that it extracts subsequent articles with high accuracy (research papers 4, 5). Regarding (2), we found that multi-document summarization requires extracting the synonyms of each word, and we proposed a method for identifying synonym pairs in Japanese newspaper text (research papers 3, 4). For synonym identification, we compared Lin's method with Hindle's method, found that Lin's method performs better on Japanese documents, and based our proposed method on Lin's method. We also ran sentence extraction experiments that use the automatically extracted synonym pairs together with newspaper article titles (research papers 1, 2). The procedure is as follows: first, synonyms of the words in each article title are extracted from newspaper articles with the proposed Lin-based method; then sentence extraction is performed using those synonyms. The results show that identifying synonyms is useful for sentence extraction.
|
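The sentence extraction step described in the abstract (expand the title words with synonyms, then pick sentences using the result) can be illustrated with a minimal sketch. This is not the project's implementation: the scoring rule (counting matched terms), the whitespace tokenization, the function names `expand_title_terms` and `rank_sentences`, and the toy synonym table are all assumptions made for illustration. Japanese text would need a morphological analyzer instead of `split()`, and the synonym pairs would come from the Lin-based method rather than a hand-written dictionary.

```python
from typing import Dict, List, Set, Tuple


def expand_title_terms(title_terms: Set[str], synonyms: Dict[str, Set[str]]) -> Set[str]:
    """Return the title terms together with any synonyms listed for them."""
    expanded = set(title_terms)
    for term in title_terms:
        expanded |= synonyms.get(term, set())
    return expanded


def rank_sentences(
    title: str, sentences: List[str], synonyms: Dict[str, Set[str]]
) -> List[Tuple[int, str]]:
    """Score each sentence by the number of expanded title terms it contains.

    Assumed scoring rule: plain term overlap. Ties keep document order
    because sorted() is stable, also with reverse=True.
    """
    terms = expand_title_terms(set(title.lower().split()), synonyms)
    scored = [(len(terms & set(s.lower().split())), s) for s in sentences]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical toy input; real synonym pairs would be extracted automatically.
title = "minister resigns"
synonyms = {"resigns": {"quits"}, "minister": {"secretary"}}
sentences = [
    "the weather was mild",
    "the secretary quits today",
    "the minister spoke",
]
ranked = rank_sentences(title, sentences, synonyms)
```

In this toy input, the second sentence shares no surface word with the title, yet it ranks first once the title terms are expanded with synonyms. With an empty synonym table it scores zero, which is the kind of gap the abstract reports synonym identification helps close.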
Report
(3 results)
Research Products
(14 results)