Grammatical Analysis of Japanese KANA-Sentences in Old Style by Computer
Project/Area Number |
13680492
|
Research Category |
Grant-in-Aid for Scientific Research (C)
|
Allocation Type | Single-year Grants |
Section | General |
Research Field |
Information systems (including information and library science)
|
Research Institution | Musashi Institute of Technology |
Principal Investigator |
UEHARA Tetsuzou MUSASHI INST. OF TECHNOLOGY, FACULTY OF ENGINEERING, PROF. (60257102)
|
Co-Investigator (Kenkyū-buntansha) |
SHIMIZU Yumiko MUSASHI INST. OF TECHNOLOGY, FACULTY OF ENGINEERING, ASST. PROF., Faculty of Environmental and Information Studies, Assistant Professor (30298020)
ARAI Shuuichi MUSASHI INST. OF TECHNOLOGY, FACULTY OF ENGINEERING, ASST. PROF. (20212590)
|
Project Period (FY) |
2001 – 2002
|
Project Status |
Completed (Fiscal Year 2002)
|
Budget Amount |
¥2,400,000 (Direct Cost: ¥2,400,000)
Fiscal Year 2002: ¥1,100,000 (Direct Cost: ¥1,100,000)
Fiscal Year 2001: ¥1,300,000 (Direct Cost: ¥1,300,000)
|
Keywords | OLD JAPANESE TEXT / CORPUS / GRAMMAR / CONCEPT / NATURAL LANGUAGE PROCESSING / JAPANESE LANGUAGE PROC. / LEXICAL PROCESSING / SEARCH |
Research Abstract |
Dictionaries annotated with concept data and corpora annotated with part-of-speech tags are important resources in natural language processing. However, no such resources exist for old-style Japanese. We therefore studied a method for preparing such a corpus and built an old-Japanese corpus of 150,000 words. Using statistical learning from this part-of-speech-tagged corpus, we carried out a lexical analysis experiment on old Japanese. In addition, we studied a method of acquiring the concept common to a set of words by finding the portion of a thesaurus where the concepts associated with those words are concentrated. To evaluate the method, we applied it to an English-Japanese dictionary, which is a sequence of records each consisting of an English headword and its Japanese equivalents. The results indicate that the method is useful to some extent. However, the method is not yet sufficient for obtaining the concepts of old-Japanese words from an old-Japanese dictionary; this remains for future study.
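The idea of locating where concepts concentrate on a thesaurus can be illustrated with a minimal sketch. It is not the project's actual algorithm, which the abstract does not specify. It assumes the thesaurus is a tree, that each word maps to one or more concept nodes, and that the "common concept" is the deepest node whose subtree covers a chosen fraction of the words. The toy thesaurus, the romanized words, the function names, and the `min_coverage` threshold are all hypothetical.

```python
from collections import Counter

# Hypothetical toy thesaurus: child concept -> parent concept (the root maps to None).
PARENT = {
    "entity": None,
    "animal": "entity",
    "plant": "entity",
    "dog": "animal",
    "cat": "animal",
    "tree": "plant",
}

# Hypothetical word -> concept assignments (an ambiguous word may have several).
WORD_CONCEPTS = {
    "inu": {"dog"},
    "neko": {"cat"},
    "ki": {"tree"},
}


def ancestors(concept):
    """Return the concept followed by all of its ancestors up to the root."""
    chain = []
    while concept is not None:
        chain.append(concept)
        concept = PARENT[concept]
    return chain


def depth(concept):
    """Number of edges between the concept and the root."""
    return len(ancestors(concept)) - 1


def common_concept(words, min_coverage=0.6):
    """Return the deepest node whose subtree covers at least
    min_coverage of the known words, or None if there is none."""
    known = [w for w in words if w in WORD_CONCEPTS]
    if not known:
        return None
    coverage = Counter()
    for word in known:
        covered = set()
        for concept in WORD_CONCEPTS[word]:
            covered.update(ancestors(concept))
        coverage.update(covered)  # each word counts at most once per node
    candidates = [c for c, n in coverage.items() if n / len(known) >= min_coverage]
    if not candidates:
        return None
    # Deepest node wins; ties are broken by coverage, then by name.
    return max(candidates, key=lambda c: (depth(c), coverage[c], c))


# Japanese equivalents listed under one (hypothetical) English headword.
print(common_concept(["inu", "neko", "ki"]))
```

With this toy data, two of the three words fall under `animal`, so that node is selected. Raising `min_coverage` to 1.0 would push the answer up to the root `entity`. The threshold therefore trades specificity of the acquired concept against how many of the equivalents it must explain.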
|
Report
(3 results)
Research Products
(11 results)