Project/Area Number |
18K19841
|
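Research Institution | Kyoto University |
Principal Investigator |
Adam Jatowt, Kyoto University, Graduate School of Informatics, Program-Specific Associate Professor (00415861)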
|
Project Period (FY) |
2018-06-29 – 2020-03-31
|
Keywords | question answering / document archives |
Outline of Research Achievements |
We designed a working demo system that answers queries about the similarity of objects across time. The system also accepts a viewpoint as input to further specify the query. This work was published as a demo paper at the WSDM 2019 conference.
We have also built the foundations of a generic question answering system for news archives. It uses Solr as the search engine and a state-of-the-art answer selector based on a bidirectional LSTM to extract answers from a pool of candidate documents. The main focus was on re-ranking the search results returned by Solr so that the documents most likely to contain the answer are collected. On a corpus of 200 manually created temporal questions, we found that our re-ranking approach improves results by 10%.
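The re-ranking idea can be illustrated with a minimal sketch, under stated assumptions. It is not the project's implementation: the names `rerank` and `overlap_score`, the linear weighting `alpha`, and the toy token-overlap scorer (a stand-in for the bidirectional LSTM answer selector) are all hypothetical, and the search hits are passed in as plain tuples rather than fetched from Solr.

```python
from typing import Callable, List, Tuple

# Hypothetical hit format: (doc_id, text, retrieval_score from the search engine)
Hit = Tuple[str, str, float]


def overlap_score(question: str, text: str) -> float:
    """Toy stand-in for an answer selector: fraction of question tokens found in the text."""
    q_tokens = set(question.lower().split())
    t_tokens = set(text.lower().split())
    if not q_tokens:
        return 0.0
    return len(q_tokens & t_tokens) / len(q_tokens)


def rerank(
    question: str,
    hits: List[Hit],
    scorer: Callable[[str, str], float] = overlap_score,
    alpha: float = 0.5,
) -> List[Tuple[str, float]]:
    """Order hits by a weighted mix of normalized retrieval score and answer likelihood."""
    if not hits:
        return []
    # Normalize retrieval scores by the maximum; fall back to 1.0 if the maximum is 0.
    max_ret = max(h[2] for h in hits) or 1.0
    scored = []
    for doc_id, text, ret in hits:
        combined = alpha * (ret / max_ret) + (1 - alpha) * scorer(question, text)
        scored.append((doc_id, combined))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


hits = [
    ("d1", "stock markets fell sharply", 10.0),
    ("d2", "the olympic games opened in 1992 in barcelona", 8.0),
]
ranked = rerank("where were the olympic games in 1992", hits)
print([doc_id for doc_id, _ in ranked])
```

In this example the document with the lower retrieval score moves to the top because it is more likely to contain the answer. In practice the scorer would be replaced by a trained answer selection model.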
|
Current Status of Research Progress (Category) |
2: Progressing mostly smoothly
Reason
There is no substantial delay in this project. We plan to submit a conference paper later this year, and we will soon release the question dataset.
|
Strategy for Future Research Activity |
We will create larger datasets of questions and answers in the near future. The next step is to incorporate a supervised approach based on neural networks to improve the computation or extraction of the answer. The decision whether the answer should be computed or extracted should also be made automatically in the future. We will also continue improving the document re-ranking functions to further boost performance.
|
Reason for Carryover of Funds to the Next Fiscal Year |
We need to use the grant funds in the next fiscal year in order to continue the project, in particular to develop a neural-network-based mechanism for answering questions. This research field is very new and challenging, so developing methods is quite difficult and requires considerable effort and manual testing. Furthermore, we want to develop large-scale test sets for evaluating the developed methods. We will therefore employ crowdsourcing to gather a large number of questions (around 2,000 manually created questions).
|