1995 Fiscal Year Final Research Report Summary
COMPUTER PROCESSING OF JAPANESE SENTENCES IN OLD WRITING STYLE
Project/Area Number |
06680383
|
Research Category |
Grant-in-Aid for General Scientific Research (C)
|
Allocation Type | Single-year Grants |
Research Field |
Information Systems (including Information and Library Science)
|
Research Institution | MUSASHI INSTITUTE OF TECHNOLOGY |
Principal Investigator |
UEHARA Tetuzo MUSASHI INST. OF TECH., DEP. OF ENG., PROFESSOR (60257102)
|
Co-Investigator (Kenkyū-buntansha) |
ISHIKAWA Tomo MUSASHI INST. OF TECH., DEP. OF ENG., PROFESSOR (00202961)
|
Project Period (FY) |
1994 – 1995
|
Keywords | OLD JAPANESE TEXT / GRAMMAR / MORPHOLOGICAL ANALYSIS / DICTIONARY / SYNTACTIC ANALYSIS / KANJI-KANA TRANSLATION / JAPANESE LANGUAGE PROCESSING / NATURAL LANGUAGE PROCESSING |
Research Abstract |
We carried out the following two experiments on computer processing of classical Japanese sentences from the old tale Ise Monogatari.
1. Word segmentation and Kanji-to-Kana conversion: We ran these experiments with Wnn, a Japanese text input system that performs Kanji-to-Kana and Kana-to-Kanji conversion according to a user-defined dictionary and grammar. The word dictionary we prepared records only the words used in Ise Monogatari. The grammar we defined is more general and is applicable to literature contemporary with Ise Monogatari. The success ratios for word segmentation and Kanji-to-Kana conversion were about 90 and 97 percent, respectively.
2. Syntactic analysis: We attempted syntactic analysis of sentences in Ise Monogatari, using as input a sequence of words annotated with part-of-speech data. We prepared syntactic rules that define the inter-word modification relationships of syntactically well-formed sentences. The result of the syntactic analysis of a sentence consists of the possible sequences of words, each annotated with its modification relation to another word of the sentence. We also prepared heuristic rules that estimate the relative likelihood of the candidate analyses of a sentence. The syntactic rules are based on the ordinary Japanese grammar taught in school; the heuristic rules reflect properties of the sentence expressions encountered during the experiments. In the syntactic analysis of 107 sentences, 96 were analyzed correctly (about 90 percent).
Several directions for near-future study are as follows. 1. To make the part-of-speech categories finer and to update the syntactic rules accordingly. 2. To strengthen the heuristic rules, e.g., by applying case grammar. 3. To collect word data and corpora of old literature.
|
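The dictionary-driven word segmentation and Kanji-to-Kana conversion described in the abstract can be illustrated with a minimal sketch. It is not Wnn's algorithm and not the project's code: it assumes a simple greedy longest-match strategy, and the dictionary entries, readings, and function names (`TOY_DICT`, `segment`, `to_kana`) are hypothetical, chosen only for illustration.

```python
# Hypothetical toy dictionary: surface form -> (kana reading, part of speech).
# These entries are illustrative assumptions, not the project's dictionary.
TOY_DICT = {
    "昔": ("むかし", "noun"),
    "男": ("をとこ", "noun"),
    "あり": ("あり", "verb"),
    "けり": ("けり", "auxiliary"),
}


def segment(text, dictionary):
    """Greedy longest-match segmentation.

    Returns a list of (surface, reading, part_of_speech) tuples.
    Characters not covered by the dictionary are passed through
    one at a time and tagged "unknown".
    """
    max_len = max(len(word) for word in dictionary)
    result = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shorter ones.
        for length in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            if candidate in dictionary:
                reading, pos = dictionary[candidate]
                result.append((candidate, reading, pos))
                i += length
                break
        else:
            # No dictionary entry starts here: emit the character unchanged.
            result.append((text[i], text[i], "unknown"))
            i += 1
    return result


def to_kana(text, dictionary):
    """Concatenate the readings of the segmented words."""
    return "".join(reading for _, reading, _ in segment(text, dictionary))


if __name__ == "__main__":
    for surface, reading, pos in segment("昔男ありけり", TOY_DICT):
        print(surface, reading, pos)
    print(to_kana("昔男ありけり", TOY_DICT))
```

A greedy matcher has no grammar at all, so it cannot resolve the ambiguities that a user-defined grammar is meant to handle; it only shows how a dictionary restricted to the words of one text supports both segmentation and reading conversion.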