2004 Fiscal Year Final Research Report Summary
Analysis of the Relationship between Proper nouns in Large Scale Corpus
Project/Area Number | 15500090 |
Research Category | Grant-in-Aid for Scientific Research (C) |
Allocation Type | Single-year Grants |
Section | General |
Research Field | Intelligent informatics |
Research Institution | Toyohashi University of Technology |
Principal Investigator | UMEMURA Kyoji, Toyohashi University of Technology, Faculty of Engineering, Information and Computer Sciences, Professor (80273324) |
Project Period (FY) | 2003 – 2004 |
Keywords | Statistical Analysis / Support Vector Machine / Medical System / Synonym |
Research Abstract |
In the first year, we built a computer cluster from parts and developed a specialized software package for frequency analysis. Although most of this work combines existing results, it gave us a powerful environment for analyzing the corpus. In the second year, we used a support vector machine (SVM) to detect keywords in the corpus. The input to the SVM is a set of statistical values for many strings, and the SVM judges whether each string is a keyword. Since this method does not use any dictionary, the identical program works for both Japanese and Chinese. It is an interesting and remarkable result that keywords can be extracted without any dictionary; all we need are sample keywords in each language. We also applied our environment to analyze disease names in a medical information system. The data in the system consist of 7 years of medical records. Without our environment, it would have been very difficult to analyze the data and obtain synonyms of disease names from them.
|
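The dictionary-free keyword detection described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say which statistical values were used or which SVM implementation was run, so everything specific here is an assumption: the features (term frequency, document frequency and their ratio), the function names `substring_stats`, `features`, `train_linear_svm` and `is_keyword`, and the Pegasos-style subgradient training loop, which stands in for whatever SVM solver the project actually used.

```python
import math
import random
from collections import Counter


def substring_stats(docs, max_len=3):
    """Count term frequency (tf) and document frequency (df) for every
    substring of up to max_len characters. No dictionary or tokenizer is
    needed, so the same code applies to Japanese or Chinese text."""
    tf, df = Counter(), Counter()
    for doc in docs:
        seen = set()
        for i in range(len(doc)):
            for n in range(1, max_len + 1):
                if i + n > len(doc):
                    break
                s = doc[i:i + n]
                tf[s] += 1
                seen.add(s)
        df.update(seen)  # each substring counts once per document
    return {s: (tf[s], df[s]) for s in tf}


def features(s, stats, n_docs):
    """Hypothetical feature vector for one string (an assumption, not the
    project's actual statistics): log tf, an idf-like value, tf/df, bias."""
    tf, df = stats.get(s, (0, 0))
    return [
        math.log1p(tf),
        math.log((n_docs + 1) / (df + 1)),
        tf / df if df else 0.0,
        1.0,  # constant bias term
    ]


def train_linear_svm(X, y, lam=0.01, epochs=200, seed=0):
    """Train a linear SVM with Pegasos-style stochastic subgradient steps
    on the regularized hinge loss. Labels in y must be +1 or -1."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    order = list(range(len(X)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            w = [(1.0 - eta * lam) * wj for wj in w]  # regularization shrink
            if margin < 1.0:  # hinge loss is active for this sample
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w


def is_keyword(x, w):
    """Judge a feature vector: a positive score means 'keyword'."""
    return sum(wj * xj for wj, xj in zip(w, x)) > 0
```

In use, one would compute `substring_stats` over the corpus, build feature vectors for the sample keywords (label +1) and for sample non-keywords (label -1), call `train_linear_svm`, and then apply `is_keyword` to the features of candidate strings. Only the labeled samples differ between languages.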
Research Products
(6 results)