Project/Area Number |
10558055
|
Research Category |
Grant-in-Aid for Scientific Research (B)
|
Allocation Type | Single-year Grants |
Section | Developmental Research |
Research Field |
Information systems science (including information and library science)
|
Research Institution | Nara Institute of Science and Technology |
Principal Investigator |
IMAI Masakazu Nara Institute of Science and Technology, Graduate School of Information Science, Associate Professor (60193653)
|
Co-Investigator(Kenkyū-buntansha) |
MATSUMOTO Yoshio Nara Institute of Science and Technology, Graduate School of Information Science, Research Associate (00314534)
NAKAMURA Takayuki Nara Institute of Science and Technology, Graduate School of Information Science, Research Associate (50291969)
OGASAWARA Tsukasa Nara Institute of Science and Technology, Graduate School of Information Science, Professor (30304158)
HADA Hisakazu Nara Institute of Science and Technology, Graduate School of Information Science, Research Associate, Graduate School of Biological Sciences, Research Associate (00311788)
ATARASHI Rei Nara Institute of Science and Technology, Graduate School of Information Science, Research Associate (60294287)
|
Project Period (FY) |
1998 – 1999
|
Project Status |
Completed (Fiscal Year 1999)
|
Budget Amount |
¥11,700,000 (Direct Cost: ¥11,700,000)
Fiscal Year 1999: ¥4,400,000 (Direct Cost: ¥4,400,000)
Fiscal Year 1998: ¥7,300,000 (Direct Cost: ¥7,300,000)
|
Keywords | conceptual information retrieval / information retrieval / digital libraries / semantic information retrieval / digital archives |
Research Abstract |
In this research, we proposed a new information retrieval (IR) method that uses semantic information extracted from technical papers. The method is suited to digital libraries (DL), whose users need to retrieve information that meets their semantic requirements. An important problem is reducing the retrieval errors caused by differences in requests among individual users. To address this problem, we used natural language processing techniques together with dictionaries that describe the relations between words and concepts. To extract semantic information from technical papers, we applied a morphological analysis program to text obtained by OCR from page images of the papers. After morphological analysis, we extracted only the nouns and recorded the distribution of their appearance frequencies for later use. We also looked up the concepts of these nouns in the EDR concept dictionary and calculated the distribution of concept appearance frequencies. By combining the word frequency distribution with the concept frequency distribution, we obtained the concepts that correspond to the subject of each technical paper. One key idea of this research is to handle a concept as a concept path, which includes the relations among concepts. This helps in abstracting the concepts that make up the subject of a technical paper. Experimental results show the effectiveness of the proposed method. We also realized a prototype system for conceptual information retrieval. We focused on technical papers written in both Japanese and English. Because the EDR dictionary also includes a concept dictionary for English words, we could readily apply our method to technical papers written in English. One characteristic of the EDR concept dictionary is that concept expressions are shared between Japanese and English words. This helps cross-lingual information retrieval, which we also realized.
|
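The extraction steps described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: the dictionary `TOY_CONCEPT_PATHS`, the function names, the `min_share` threshold, and the rule used to combine the counts are all assumptions made for this example. A hand-written mapping stands in for the EDR concept dictionary, and a pre-extracted noun list stands in for the output of OCR and morphological analysis.

```python
from collections import Counter

# Hypothetical toy dictionary: noun -> concept path, general to specific.
# It only stands in for a real concept dictionary such as EDR.
TOY_CONCEPT_PATHS = {
    "retrieval": ("information", "search"),
    "query": ("information", "search"),
    "index": ("information", "storage"),
    "library": ("institution", "library"),
}


def word_frequencies(nouns):
    """Count how often each noun appears in one document."""
    return Counter(nouns)


def concept_path_frequencies(word_freq, concept_paths):
    """Credit each noun's count to every prefix of its concept path.

    Counting prefixes lets general concepts accumulate evidence from
    several specific nouns (an assumed way to use concept paths).
    """
    path_freq = Counter()
    for word, count in word_freq.items():
        path = concept_paths.get(word)
        if path is None:
            continue  # noun not covered by the dictionary
        for depth in range(1, len(path) + 1):
            path_freq[path[:depth]] += count
    return path_freq


def subject_concepts(path_freq, min_share=0.5):
    """Return concept paths covering at least min_share of mapped nouns,
    most specific first."""
    total = sum(c for p, c in path_freq.items() if len(p) == 1)
    if total == 0:
        return []
    selected = [p for p, c in path_freq.items() if c / total >= min_share]
    return sorted(selected, key=lambda p: (-len(p), p))


# Nouns as they might come out of morphological analysis (made-up data).
nouns = ["retrieval", "query", "retrieval", "index", "library", "retrieval"]
word_freq = word_frequencies(nouns)
path_freq = concept_path_frequencies(word_freq, TOY_CONCEPT_PATHS)
subjects = subject_concepts(path_freq)
print(subjects)
```

With this toy input, the path `("information", "search")` is selected ahead of the more general `("information",)`, which shows how a path can name a subject at more than one level of abstraction.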