Project/Area Number |
08558027
|
Research Category |
Grant-in-Aid for Scientific Research (A)
|
Allocation Type | Single-year Grants |
Section | Developmental Research |
Research Field |
Intelligent informatics
|
Research Institution | The University of Tokyo |
Principal Investigator |
TSUJI Junichi The University of Tokyo, Graduate School of Science, Professor (20026313)
|
Co-Investigator(Kenkyū-buntansha) |
IKEHARA Satoru The University of Tottori, Faculty of Engineering, Professor (70283968)
KAGEURA Kyo National Center for Science Information Systems, Associate Professor (00211152)
KOYAMA Teruo National Center for Science Information Systems, Professor (80124410)
KIYONO Masaki Matsushita Electric Industrial Company, Tokyo Research Institute, Researcher
|
Project Period (FY) |
1996 – 1998
|
Project Status |
Completed (Fiscal Year 1998)
|
Budget Amount |
¥13,200,000 (Direct Cost: ¥13,200,000)
Fiscal Year 1998: ¥2,600,000 (Direct Cost: ¥2,600,000)
Fiscal Year 1997: ¥3,200,000 (Direct Cost: ¥3,200,000)
Fiscal Year 1996: ¥7,400,000 (Direct Cost: ¥7,400,000)
|
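Keywords | knowledge acquisition / semantic classification / database / technical term extraction / technical terms / ontology / dependency analysis / distribution model / corpus / automatic extraction / symbolic processing program / statistical language processing / terminology / knowledge representation / information retrieval |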
Research Abstract |
The goal of this project was to build systems that acquire terminological knowledge from texts semi-automatically. To that end, we developed the following three systems.
1. Central database for terminology: We created a terminology database system by integrating the text/lexicon database developed by EDR with LiLFeS, a programming language developed at the University of Tokyo for easy and flexible treatment of linguistic entities. This system allows systematic maintenance of the knowledge acquired by the following two systems.
2. Systems for term recognition: The research group at NACSIS introduced a statistical metric for identifying technical terminology in texts and built programs that recognize terms using this metric. The group at the University of Tokyo approached the same problem from a different perspective and produced a term recognition method based on character n-grams. These programs were integrated so that they work as a front end to the database system described in 1.
3. Systems for acquiring ontological knowledge about terms: The research group at the University of Tokyo developed programs that obtain semantic classifications of words from surface clues appearing in texts. The Matsushita research group developed a similar technique that uses deeper syntactic structures of texts. These systems were applied to genome-related documents, news articles about stock markets, and other texts.
|