Analysis of the Relationship between Proper nouns in Large Scale Corpus
Project/Area Number |
15500090
|
Research Category |
Grant-in-Aid for Scientific Research (C)
|
Allocation Type | Single-year Grants |
Section | General |
Research Field |
Intelligent informatics
|
Research Institution | Toyohashi University of Technology |
Principal Investigator |
UMEMURA Kyoji Toyohashi University of Technology, Information and Computer Sciences, Professor, Faculty of Engineering, Professor (80273324)
|
Project Period (FY) |
2003 – 2004
|
Project Status |
Completed (Fiscal Year 2004)
|
Budget Amount |
¥3,300,000 (Direct Cost: ¥3,300,000)
Fiscal Year 2004: ¥1,700,000 (Direct Cost: ¥1,700,000)
Fiscal Year 2003: ¥1,600,000 (Direct Cost: ¥1,600,000)
|
Keywords | Statistical Analysis / Support Vector Machine / Medical System / Synonym / Related Words / Thesaurus / Statistical Language Processing |
Research Abstract |
In the first year, we built a computer cluster from parts and developed a specialized software package for frequency analysis. Although most of this work combines existing results, it gave us a powerful environment for corpus analysis. In the second year, we used a support vector machine (SVM) to detect keywords in a corpus. The input to the SVM is a set of statistical values computed for many strings, and the SVM judges whether each string is a keyword. Since this method does not use any kind of dictionary, the identical program works for both Japanese and Chinese. It is an interesting and remarkable result that keywords can be extracted without any dictionary; all we need are samples of keywords in each language. We also applied our environment to analyze the disease names in a medical information system. The data in the system consist of seven years of medical records. Without our environment, it would have been very difficult to analyze the data and extract synonyms of disease names from them.
|
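The dictionary-free idea in the abstract can be illustrated with a minimal sketch. The abstract does not say which statistical values were fed to the SVM, so the features here (term frequency, document frequency, and their ratio) are assumptions chosen only for illustration, and the function names `substring_stats` and `feature_vector` are hypothetical. The sketch covers only the feature computation over raw character substrings; a classifier such as an SVM would then be trained on these vectors using labeled keyword samples.

```python
from collections import Counter


def substring_stats(documents, max_len=4):
    """Return {substring: (term_frequency, document_frequency)}.

    Operates on raw characters, so no dictionary or word segmenter
    is needed (the same code runs on Japanese or Chinese text).
    """
    tf = Counter()  # total occurrences across the corpus
    df = Counter()  # number of documents containing the substring
    for doc in documents:
        seen = set()
        for start in range(len(doc)):
            for length in range(1, max_len + 1):
                if start + length > len(doc):
                    break
                s = doc[start:start + length]
                tf[s] += 1
                seen.add(s)
        df.update(seen)
    return {s: (tf[s], df[s]) for s in tf}


def feature_vector(stats, candidate):
    """Assumed features for one candidate string: [tf, df, tf / df]."""
    tf, df = stats.get(candidate, (0, 0))
    ratio = tf / df if df else 0.0
    return [tf, df, ratio]


# Toy corpus, for illustration only.
docs = ["abcabc", "abx", "xyz"]
stats = substring_stats(docs, max_len=3)
print(feature_vector(stats, "ab"))  # [3, 2, 1.5]
```

Because the statistics are computed over every substring up to `max_len` characters, the only language-specific input such a pipeline would need is the set of labeled keyword examples, which matches the claim in the abstract.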
Report
(3 results)
Research Products
(16 results)