2016 Fiscal Year Annual Research Report
Real-time, Best-effort Query Processing of Semantic Web data
Project/Area Number | 15K15994 |
Research Institution | National Institute of Advanced Industrial Science and Technology |
Principal Investigator | Lynden Steven, National Institute of Advanced Industrial Science and Technology, Artificial Intelligence Research Center, Researcher (30528279) |
Project Period (FY) | 2015-04-01 – 2017-03-31 |
Keywords | Linked Open Data / Query Processing / Semantic Web / Web Data Integration |
Outline of Annual Research Achievements |
Further research has been performed into techniques to support efficient distributed query processing, in two complementary directions. First, to optimize distributed querying over federated SPARQL endpoints, the application of machine learning techniques to predict endpoint behavior, in terms of response time and number of results returned, has been investigated. Such predictions support efficient distributed query processing by providing query plan optimizers with estimates of response times before queries are executed across multiple SPARQL endpoints. This has been demonstrated successfully by applying machine learning techniques such as Random Forest regression and gradient boosting to SPARQL query logs from several endpoints deployed on the Web. Second, techniques to support Linked Data queries over structured data (e.g. RDFa, Microdata, JSON-LD) embedded in Web pages have been developed. It has been demonstrated that machine learning can be used to predict the presence or absence of relevant structured data in Web pages, using data from previously explored pages. The use of data mining techniques to automatically link knowledge bases such as DBpedia to structured data on the Web has also been demonstrated.
|
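To make the second direction more concrete, the following is a minimal sketch of how a previously explored page might be labelled by the structured data syntaxes it contains (RDFa, Microdata, JSON-LD). Such labels could serve as training targets for a presence/absence predictor. The report does not describe the labelling procedure or the features used, so the attribute sets, the class `StructuredDataDetector`, and the function `detect_structured_data` are illustrative assumptions, not the project's implementation.

```python
from html.parser import HTMLParser

# Attribute names treated as evidence of each syntax.
# These sets are a simplifying assumption for illustration only.
MICRODATA_ATTRS = {"itemscope", "itemtype", "itemprop"}
RDFA_ATTRS = {"typeof", "property", "vocab", "about", "resource"}


class StructuredDataDetector(HTMLParser):
    """Hypothetical helper: records which embedded syntaxes occur in a page."""

    def __init__(self):
        super().__init__()
        self.found = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names; valueless
        # attributes such as a bare `itemscope` have the value None.
        attr_map = dict(attrs)
        script_type = (attr_map.get("type") or "").strip().lower()
        if tag == "script" and script_type == "application/ld+json":
            self.found.add("JSON-LD")
        if MICRODATA_ATTRS & attr_map.keys():
            self.found.add("Microdata")
        if RDFA_ATTRS & attr_map.keys():
            self.found.add("RDFa")


def detect_structured_data(html):
    """Return the set of structured data syntaxes detected in an HTML string."""
    detector = StructuredDataDetector()
    detector.feed(html)
    detector.close()
    return detector.found
```

Calling `detect_structured_data(page_html)` on each crawled page yields a set of syntax names; an empty set would mark the page as a negative example. A real system would likely also parse the extracted data to judge relevance to a query, which this sketch does not attempt.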