COMPUTER PROCESSING OF JAPANESE SENTENCES IN OLD WRITING STYLE

Research Project

Project/Area Number	06680383
Research Category	Grant-in-Aid for General Scientific Research (C)
Allocation Type	Single-year Grants
Research Field	Information system studies (including information library studies)
Research Institution	MUSASHI INSTITUTE OF TECHNOLOGY
Principal Investigator	UEHARA Tetuzo MUSASHI INST. OF TECH., DEP. OF ENG., PROFESSOR (60257102)
Co-Investigator (Kenkyū-buntansha)	ISHIKAWA Tomo MUSASHI INST. OF TECH., DEP. OF ENG., PROFESSOR (00202961)
Project Period (FY)	1994 – 1995
Project Status	Completed (Fiscal Year 1995)
Budget Amount	¥2,100,000 (Direct Cost: ¥2,100,000) Fiscal Year 1995: ¥500,000 (Direct Cost: ¥500,000) Fiscal Year 1994: ¥1,600,000 (Direct Cost: ¥1,600,000)
Keywords	OLD JAPANESE TEXT / GRAMMAR / MORPHOLOGICAL ANALYSIS / DICTIONARY / SYNTACTIC ANALYSIS / KANJI-KANA TRANSLATION / JAPANESE LANGUAGE PROCESSING / NATURAL LANGUAGE PROCESSING
Research Abstract	We carried out the following two experiments on computer processing of classical Japanese sentences from the old tale Ise Monogatari. 1. Word segmentation and Kanji-to-Kana conversion: We ran these experiments with the Japanese text input system Wnn, which performs Kanji-to-Kana and Kana-to-Kanji conversion according to a user-defined dictionary and grammar. The word dictionary we prepared records only the words used in Ise Monogatari. The grammar we defined is more general and is applicable to literature contemporary with Ise Monogatari. The success ratios for word segmentation and Kanji-to-Kana conversion were about 90 and 97 percent, respectively. 2. Syntactic analysis: We attempted syntactic analysis of sentences in Ise Monogatari, taking as input a sequence of words annotated with part-of-speech data. We prepared syntactic rules that define the inter-word modification relationships of syntactically well-formed sentences. The result of analyzing a sentence is the set of possible word sequences in which each word carries its modification relation to another word in the sentence. We also prepared heuristic rules that estimate the likelihood of each candidate analysis of a sentence. The syntactic rules are based on the ordinary Japanese grammar taught in school. The heuristic rules reflect properties of the sentence expressions encountered during the experiments. In the syntactic analysis of 107 sentences, 96 were analyzed correctly. Directions for near-future study are as follows. 1. To refine the part-of-speech categories and update the syntactic rules accordingly. 2. To strengthen the heuristic rules, e.g., by applying case grammar. 3. To collect word data and corpora on old literature.
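The first experiment in the abstract (dictionary-driven word segmentation followed by Kanji-to-Kana conversion) can be illustrated with a minimal sketch. It is not Wnn's algorithm and it applies no grammar rules: it assumes a greedy longest-match lookup against a small word dictionary. The dictionary entries, the sample phrase, and the function names `segment` and `to_kana` are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical toy dictionary (surface form -> kana reading in historical spelling).
# The project's real dictionary covered the vocabulary of Ise Monogatari.
TOY_DICT: Dict[str, str] = {
    "昔": "むかし",
    "男": "をとこ",
    "あり": "あり",
    "けり": "けり",
}


def segment(text: str, dictionary: Dict[str, str]) -> List[Tuple[str, str]]:
    """Split text into (word, reading) pairs by greedy longest match.

    Assumption: the longest dictionary entry starting at the current
    position is always taken; no grammatical connectivity is checked.
    """
    max_len = max(len(word) for word in dictionary)
    result: List[Tuple[str, str]] = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shorter ones.
        for length in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            if candidate in dictionary:
                result.append((candidate, dictionary[candidate]))
                i += length
                break
        else:
            # Unknown character: pass it through unchanged.
            result.append((text[i], text[i]))
            i += 1
    return result


def to_kana(text: str, dictionary: Dict[str, str]) -> str:
    """Concatenate the readings of the segmented words."""
    return "".join(reading for _, reading in segment(text, dictionary))


if __name__ == "__main__":
    sample = "昔男ありけり"
    print([word for word, _ in segment(sample, TOY_DICT)])
    print(to_kana(sample, TOY_DICT))
```

A greedy match is the simplest possible strategy and will mis-segment ambiguous strings; a system that also consults a grammar, as described in the abstract, can reject word sequences that the grammar does not allow.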

Report

(3 results)

1995 Annual Research Report; Final Research Report Summary
1994 Annual Research Report