2003 Fiscal Year Final Research Report Summary
A Metaphoric Search Method for HTML documents
Project/Area Number |
13480086
|
Research Category |
Grant-in-Aid for Scientific Research (B)
|
Allocation Type | Single-year Grants |
Section | General |
Research Field |
Intelligent informatics
|
Research Institution | Hokkaido University |
Principal Investigator |
HARAGUCHI Makoto Hokkaido Univ., Graduate School of Engineering, Professor (40128450)
|
Co-Investigator(Kenkyū-buntansha) |
SADOHARA Ken National Institute of Advanced Industrial Science and Technology, Researcher (90344168)
OKUBO Yoshiaki Hokkaido Univ., Graduate School of Engineering, Instructor (40271639)
|
Project Period (FY) |
2001 – 2003
|
Keywords | Metaphorical Search / HTML documents / Concept Graph Representation / Similarity among texts |
Research Abstract |
It is normally troublesome for users of search engines to state their intention explicitly in advance. This is one of the major reasons why search results often fail to match the user's intention. Instead of requiring a query that describes the intention precisely, we take as a query a pair consisting of an abstract query and examples of it. The search target is an instance of the abstract query that is similar to the examples. In other words, our search task is to "find an instance of the abstract query that is like the examples". An HTML document, the object of our search task, can be viewed as a rooted tree of tags with text contents as its leaves. To judge the similarity of both text contents and tag structure, we consider an ordering on the class of concept graph representations. Both the instance-generalization relationship and the similarity relationship can be defined in terms of this ordering. Based on this fundamental structure of the objects in our search problem, we have developed an algorithm that finds an instance of an abstract query, given its examples. First, it computes a set of sentence segments in the text contents from the abstract query. Second, by matching those text segments, it forms an instance of the query that is a generalization of the given example documents. Finally, any document subsumed by that instance is regarded as relevant to the initial query. Our experimental results show that the algorithm can compute the generalized document of about 50 sentences within 3 seconds.
|
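The abstract's view of an HTML document as a rooted tree of tags with text leaves, and of relevance as subsumption by a generalized instance, can be illustrated with a small Python sketch. This is not the project's algorithm: the names `Node`, `parse_html`, `generalize`, `subsumes`, and the `*` wildcard are hypothetical, text leaves are compared as whole strings rather than as matched sentence segments, and children are paired by position only. It also assumes well-formed markup in which every opened tag is closed.

```python
from dataclasses import dataclass, field
from html.parser import HTMLParser
from typing import List

WILDCARD = "*"  # hypothetical marker: a non-text node with this label matches anything


@dataclass
class Node:
    label: str                 # tag name for inner nodes, raw text for leaves
    is_text: bool = False
    children: List["Node"] = field(default_factory=list)


class TreeBuilder(HTMLParser):
    """Builds a rooted tree of tags with text contents as leaves."""

    def __init__(self):
        super().__init__()
        self.root = Node("#root")
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = Node(tag)
        self.stack[-1].children.append(node)
        self.stack.append(node)

    def handle_endtag(self, tag):
        # Pop back to the nearest matching open tag; never pop the root.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].label == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.stack[-1].children.append(Node(text, is_text=True))


def parse_html(source: str) -> Node:
    builder = TreeBuilder()
    builder.feed(source)
    builder.close()
    return builder.root


def is_wildcard(node: Node) -> bool:
    return node.label == WILDCARD and not node.is_text


def generalize(a: Node, b: Node) -> Node:
    """Toy generalization: keep what two trees share, wildcard the rest."""
    if a.is_text != b.is_text or a.label != b.label:
        return Node(WILDCARD)
    # Pair children by position; unmatched trailing children are dropped.
    kids = [generalize(x, y) for x, y in zip(a.children, b.children)]
    return Node(a.label, a.is_text, kids)


def subsumes(pattern: Node, doc: Node) -> bool:
    """True if doc is an instance of pattern under the toy ordering."""
    if is_wildcard(pattern):
        return True
    if pattern.is_text != doc.is_text or pattern.label != doc.label:
        return False
    if len(pattern.children) > len(doc.children):
        return False
    return all(subsumes(p, d) for p, d in zip(pattern.children, doc.children))


examples = [
    parse_html("<ul><li>red apple</li><li>fruit</li></ul>"),
    parse_html("<ul><li>green pear</li><li>fruit</li></ul>"),
]
pattern = generalize(examples[0], examples[1])
candidate = parse_html("<ul><li>yellow banana</li><li>fruit</li><li>new</li></ul>")
other = parse_html("<ul><li>carrot</li><li>vegetable</li></ul>")
print(subsumes(pattern, candidate), subsumes(pattern, other))
```

In this sketch the generalized pattern keeps the shared `ul`/`li` structure and the shared text `fruit`, and replaces the differing first item with a wildcard. Any document that the pattern subsumes is then treated as relevant, mirroring the final step described in the abstract.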
Research Products
(10 results)