An Information Retrieval using Conceptual Index Term for Technical Papers
Project/Area Number |
09480076
|
Research Category |
Grant-in-Aid for Scientific Research (B)
|
Allocation Type | Single-year Grants |
Section | General |
Research Field |
Information Systems (including Information and Library Science)
|
Research Institution | Nara Institute of Science and Technology |
Principal Investigator |
NAKAMURA Takayuki, Nara Institute of Science and Technology, Graduate School of Information Science, Dept. of Information Systems, Research Associate (50291969)
|
Co-Investigator(Kenkyū-buntansha) |
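IMAI Masakazu, Nara Institute of Science and Technology, Graduate School of Information Science, Dept. of Information Systems, Associate Professor (60193653)
OGASAWARA Tsukasa, Nara Institute of Science and Technology, Graduate School of Information Science, Dept. of Information Systems, Professor (30304158)
FUJIKAWA Kazutoshi, Nara Institute of Science and Technology, Graduate School of Information Science, Research Associate (30252729)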
|
Project Period (FY) |
1997 – 1998
|
Project Status |
Completed (Fiscal Year 1998)
|
Budget Amount |
¥11,100,000 (Direct Cost: ¥11,100,000)
Fiscal Year 1998: ¥4,000,000 (Direct Cost: ¥4,000,000)
Fiscal Year 1997: ¥7,100,000 (Direct Cost: ¥7,100,000)
|
Keywords | Conceptual Retrieval / Digital Library / Information Retrieval / Literature Retrieval |
Research Abstract |
In this research, we proposed a new Information Retrieval (IR) method that uses semantic information extracted from technical papers. The proposed method is suited to Digital Libraries (DL), whose users need to retrieve information that meets their semantic requirements. An important problem is reducing the retrieval errors caused by differences in how individual users express their requests. To address this problem, we used natural language processing techniques together with dictionaries that describe the relations between words and concepts. To extract semantic information from technical papers, we applied a morphological analysis program to text data obtained by OCR from images of the papers. After morphological analysis, we extracted only the nouns and computed their word frequency distribution for later use. We also looked up the concepts of these nouns in the EDR concept dictionary and computed the concept frequency distribution. By combining the word frequency distribution with the concept frequency distribution, we obtained the concepts that correspond to the subject of each technical paper. One of the key ideas of this research is to handle each concept as a concept path, which includes the relations among concepts. This helps abstract the concepts that describe the subject of a technical paper. Experimental results show the effectiveness of the proposed method. We also built a prototype system for conceptual information retrieval. This work focused on technical papers written in Japanese. Because the EDR dictionary also includes a concept dictionary for English words, the method can readily be applied to technical papers written in English. A characteristic of the EDR concept dictionary is that concept representations are shared between Japanese and English words, which should help cross-lingual information retrieval.
|
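The steps named in the abstract (noun extraction, word frequency, concept paths, concept frequency) can be illustrated with a minimal Python sketch. It is not the project's implementation. The token format, the toy dictionary `TOY_CONCEPT_PATHS`, and the function names are hypothetical. The rule for combining the two distributions is also an assumption, since the abstract does not specify it: each noun's count is added to every concept on its path, and ties are broken in favor of the more specific concept.

```python
from collections import Counter

# Hypothetical stand-in for a concept dictionary: each noun maps to one
# concept path, ordered from the most general to the most specific concept.
# This is NOT the EDR dictionary format.
TOY_CONCEPT_PATHS = {
    "retrieval": ("activity", "information-processing", "search"),
    "index": ("activity", "information-processing", "search"),
    "library": ("thing", "organization", "library"),
    "paper": ("thing", "document", "paper"),
}


def noun_frequencies(tagged_tokens):
    """Count nouns in (surface, part_of_speech) pairs.

    The pairs are assumed to come from a morphological analyzer.
    """
    return Counter(word for word, pos in tagged_tokens if pos == "noun")


def subject_concepts(tagged_tokens, concept_paths, top_n=3):
    """Rank concepts by accumulated noun frequency along concept paths.

    Assumed combination rule: a noun's count is added to every concept on
    its path; among equal counts the deeper (more specific) concept wins.
    """
    word_freq = noun_frequencies(tagged_tokens)
    concept_freq = Counter()
    depth = {}
    for word, count in word_freq.items():
        for level, concept in enumerate(concept_paths.get(word, ())):
            concept_freq[concept] += count
            depth[concept] = max(depth.get(concept, 0), level)
    ranked = sorted(
        concept_freq,
        key=lambda c: (concept_freq[c], depth[c]),
        reverse=True,
    )
    return [(c, concept_freq[c]) for c in ranked[:top_n]]


if __name__ == "__main__":
    tokens = [
        ("retrieval", "noun"), ("of", "particle"), ("paper", "noun"),
        ("index", "noun"), ("retrieval", "noun"), ("library", "noun"),
    ]
    print(subject_concepts(tokens, TOY_CONCEPT_PATHS))
```

In this toy example, "retrieval" and "index" are different words that share a concept path, so their counts reinforce each other at the concept level. This is the kind of abstraction over individual word choice that the abstract attributes to concept paths.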
Report
(3 results)
Research Products
(8 results)