2015 Fiscal Year Research-status Report
A Study on Social Context Summarization
Project/Area Number |
15K16048
|
Research Institution | Japan Advanced Institute of Science and Technology |
Principal Investigator |
NGUYEN MinhLe Japan Advanced Institute of Science and Technology, School of Information Science, Associate Professor (30509401)
|
Project Period (FY) |
2015-04-01 – 2018-03-31
|
Keywords | comment extraction / sentence extraction / text summarization |
Outline of Annual Research Achievements |
In this research, we proposed a novel framework that uses reader comments in social context summarization. We proposed a Dual Wing Entailment Graph, which applies textual entailment recognition techniques to a graph built from news and comments. Our system obtained state-of-the-art results on the benchmark data. This work was published at a major information retrieval conference (ECIR 2016) under the title "SoRTESum: A Social Context Framework for Single-Document Summarization". In addition, we extended the framework by designing a semantic similarity ranking method for summarizing news and comments. We created a data set for news and comment summarization, which is available for research purposes. This work has been submitted to the ECAI 2016 conference. The systems from the two papers are also available on the web: (http://150.65.242.101:9293/?paper=ecir) and (http://150.65.242.101:9293/?paper=ecai). We have also implemented an abstractive text summarization method that uses phrase selection and merging with integer linear programming (ILP) techniques. A demonstration of the system is available on the web (http://150.65.242.101:9998/). We also developed a feature selection method and a feature-weighting tool for SVM with an RBF kernel using a genetic algorithm (GA). This machine-learning tool can be used to train text summarization models. We have submitted this work to an international journal.
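As a rough illustration of the general idea of ranking news sentences by their similarity to comments, the sketch below scores each sentence by its average bag-of-words cosine similarity to the comments. This is an assumption-laden toy and not the SoRTESum method or the ranking method submitted to ECAI 2016: the function names (`bag_of_words`, `cosine`, `rank_sentences`), the tokenization, and the averaging scheme are all hypothetical choices made for this example.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count its alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_sentences(sentences, comments, top_k=1):
    """Score each sentence by its mean similarity to the comments.

    Hypothetical scoring scheme for illustration only; returns the
    top_k sentences, highest score first, ties in document order.
    """
    comment_vecs = [bag_of_words(c) for c in comments]
    scored = []
    for idx, sent in enumerate(sentences):
        vec = bag_of_words(sent)
        total = sum(cosine(vec, c) for c in comment_vecs)
        score = total / max(len(comment_vecs), 1)
        scored.append((score, idx, sent))
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [sent for _, _, sent in scored[:top_k]]
```

Calling `rank_sentences(sentences, comments, top_k=2)` with a list of news sentences and a list of comment strings returns the two sentences that share the most vocabulary with the comments. A real system would replace the word-overlap score with a semantic measure such as textual entailment.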
|
Current Status of Research Progress |
1: Research has progressed more than originally planned.
Reason
In the current work, we developed a social context summarization framework that uses comments for text summarization. The results showed that the proposed model achieved state-of-the-art results on the benchmark data. We found that the semantic similarity between comments and news also helps improve the performance of summarization systems. Our results are promising, and we have published one result at a top conference on information retrieval (ECIR 2016).
|
Strategy for Future Research Activity |
In future work, we will focus on multi-document summarization in social context summarization. We will study how multi-sentence compression and abstractive summarization can be applied to social context summarization. In addition, we would like to investigate deep learning models for scoring important sentences and comments under a social context framework.
|
Causes of Carryover |
The carry-over in FY 2015 arose for the following reasons. We planned to buy a Mac server for 900,000 yen, but bought a cheaper one for 476,000 yen. In addition, my student (research collaborator) and I planned to attend the ECIR 2016 conference, but only the research collaborator attended to present our results. As a result, part of the budget was saved. In FY 2016, we would like to use more of the budget to hire LA/RA students to develop summarization systems.
|
Expenditure Plan for Carryover Budget |
We would like to hire three RA/LA students next year. We would also like to attend an international conference (KSE 2016) and COLING 2016.
|
Research Products
(5 results)