Project/Area Number |
07558273
|
Research Category |
Grant-in-Aid for Scientific Research (A)
|
Allocation Type | Single-year Grants |
Section | Developmental Research |
Research Field |
Information Systems Science (including Information and Library Science)
|
Research Institution | The University of Tokushima |
Principal Investigator |
AOE Junichi, The University of Tokushima, Faculty of Engineering, Information Science, Professor (90108853)
|
Co-Investigator(Kenkyū-buntansha) |
ONO Norihiko, The University of Tokushima, Faculty of Engineering, Information Science, Professor (60194594)
SATO Takashi, Osaka-kyoiku University, Faculty of Education, Information Science, Associate Professor (20124117)
|
Project Period (FY) |
1995 – 1997
|
Project Status |
Completed (Fiscal Year 1997)
|
Budget Amount |
¥3,100,000 (Direct Cost: ¥3,100,000)
Fiscal Year 1997: ¥1,600,000 (Direct Cost: ¥1,600,000)
Fiscal Year 1996: ¥1,500,000 (Direct Cost: ¥1,500,000)
|
Keywords | partial match / keyword search / multi-attribute keys / text database / information retrieval / multi-attribute retrieval / document processing |
Research Abstract |
Extracting keywords efficiently is an important task in text retrieval systems. Japanese text contains many compound words made up of several kinds of characters (Katakana, Kanji, etc.), and it has no delimiters between words. Extracting keywords from such text therefore takes a lot of time. This research presents a technique for detecting keywords within compound keywords by introducing a set of rules that represent multi-attribute conditions for keyword construction. A string pattern matching machine for a finite number of patterns is applied to match the rules and to store keyword candidates together with information about both long term and short term words. The approach is evaluated by theoretical analysis. Simulation results for 34 Japanese text files show that the presented algorithm ran at 19.4 ms/KB and that the ratio of expected keywords extracted increased compared with traditional approaches.
|
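The abstract does not say how the "string pattern matching machine for a finite number of patterns" is built. One common realization of such a machine is an Aho-Corasick automaton, and the sketch below assumes that choice purely for illustration. The function names `build_machine` and `find_keywords` and the sample patterns are hypothetical and do not come from the project. The sketch covers only the multi-pattern matching step on undelimited text; it does not model the multi-attribute rules or the long/short word information described above.

```python
from collections import deque


def build_machine(patterns):
    """Build goto, failure and output tables for a finite set of patterns."""
    goto = [{}]   # goto[state][char] -> next state
    fail = [0]    # fail[state] -> fallback state on mismatch
    out = [[]]    # out[state] -> patterns recognized on reaching state

    # Phase 1: insert every pattern into a trie.
    for p in patterns:
        s = 0
        for ch in p:
            if ch not in goto[s]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[s][ch] = len(goto) - 1
            s = goto[s][ch]
        out[s].append(p)

    # Phase 2: breadth-first pass to fill in failure links.
    queue = deque(goto[0].values())  # depth-1 states fall back to the root
    while queue:
        r = queue.popleft()
        for ch, s in goto[r].items():
            queue.append(s)
            f = fail[r]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[s] = goto[f].get(ch, 0)
            # Inherit patterns that end at the fallback state.
            out[s] = out[s] + out[fail[s]]
    return goto, fail, out


def find_keywords(text, machine):
    """Scan text once and return (start_index, pattern) for every match."""
    goto, fail, out = machine
    s = 0
    hits = []
    for i, ch in enumerate(text):
        while s and ch not in goto[s]:
            s = fail[s]
        s = goto[s].get(ch, 0)
        for p in out[s]:
            hits.append((i - len(p) + 1, p))
    return hits


# Hypothetical keyword candidates inside an undelimited compound word.
machine = build_machine(["情報", "検索", "情報検索", "システム"])
print(find_keywords("情報検索システム", machine))
# [(0, '情報'), (0, '情報検索'), (2, '検索'), (4, 'システム')]
```

Because the text is scanned once regardless of how many patterns are loaded, both the compound word and its constituents are reported in a single pass, which is the property that makes this kind of machine attractive for text without word delimiters.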