Project/Area Number |
07558274
|
Research Category |
Grant-in-Aid for Scientific Research (A)
|
Allocation Type | Single-year Grants |
Section | Developmental Research |
Research Field |
Information Systems Science (including Information and Library Science)
|
Research Institution | The Science University of Tokyo |
Principal Investigator |
FUJISAKI Hiroya Science University of Tokyo, Faculty of Industrial Science and Technology, Professor (80010776)
|
Co-Investigator(Kenkyū-buntansha) |
KURASHIMA Tokihisa Sanseido Publishing Company, Publishing Division (Database System Research and Development), Managing Director and Head of the Publishing Division
OHNO Sumio Science University of Tokyo, Faculty of Industrial Science and Technology, Research Associate (80256677)
KAMEDA Hiroyuki Tokyo Engineering University, Faculty of Engineering, Associate Professor (00194994)
|
Project Period (FY) |
1995 – 1996
|
Project Status |
Completed (Fiscal Year 1996)
|
Budget Amount |
¥2,100,000 (Direct Cost: ¥2,100,000)
Fiscal Year 1996: ¥2,100,000 (Direct Cost: ¥2,100,000)
|
Keywords | Computer-readable Lexical Database / Acquisition of Lexical Information / Detection of Unknown Words / Inference on Syntactic Information / Inference on Semantic Information / Lexical Database System / Natural Language Processing / Machine Learning / Dictionary Database System / Unknown Word Detection / Part-of-Speech Estimation of Unknown Words / Knowledge Acquisition |
Research Abstract |
(1) Expansion of lexical data and classification of newspaper article data: Existing lexical data were expanded using the "Shin-Meikai Kokugo Jiten" (Sanseido Publishing Co.) and the EDR Electronic Dictionary (Japan Electronic Dictionary Research Institute, Ltd.), and newspaper article data were classified. (2) Determination of the data structure for describing the semantic system: The data structure was determined on the basis of the EDR Electronic Dictionary, and the data for the semantic system were classified on a computer. (3) Design and implementation of the subsystem for automatic detection of unknown words: A program was designed and implemented for the morphological and syntactic analysis of text and for detecting unknown words. (4) Design and implementation of the subsystem for automatic inference on syntactic and semantic information of unknown words: The data structure was determined, and a program was designed and implemented, for the subsystem that automatically infers syntactic and semantic information about unknown words. (5) Implementation of the basic part of the advanced lexical database system and preliminary confirmation of its operation. (6) Determination of the detailed specifications for the lexical database system: A main memory of 40 MB was adopted for the system. (7) Determination of the detailed specifications for the lexical data: The numbers of lexical items are 187,868 nouns, 645 pronouns, 10,620 verbs, 1,124 adjectives, 1,345 adverbs, and 144 others. (8) Construction of the advanced lexical database system: The system was constructed in Arity/Prolog and currently occupies 142 KB of memory. (9) Evaluation of the system: The performance of the system was evaluated using the lexical data and electronic texts of newspaper articles, and the results confirmed the basic validity of the current system.
|