2007 Fiscal Year Final Research Report Summary
Plain Japanese and Paraphrasing for Readability Enhancement
Project/Area Number |
16200009
|
Research Category |
Grant-in-Aid for Scientific Research (A)
|
Allocation Type | Single-year Grants |
Section | General |
Research Field |
Intelligent informatics
|
Research Institution | Nagoya University (2005-2007), Kyoto University (2004) |
Principal Investigator |
SATO Satoshi Nagoya University, Graduate School of Engineering, Professor (30205918)
|
Co-Investigator(Kenkyū-buntansha) |
UTSURO Takehito Tsukuba University, Graduate School of Systems and Information Engineering, Associate Professor (90263433)
YAMAMOTO Kazuhide Nagaoka University of Technology, Department of Electrical Engineering, Associate Professor (40359708)
INUI Kentaro Nara Institute of Science and Technology, Graduate School of Information Science, Associate Professor (60272689)
FUJITA Atsushi Nagoya University, Graduate School of Engineering, Assistant Professor (10402801)
|
Project Period (FY) |
2004 – 2007
|
Keywords | Basic Vocabulary / Functional Expressions / Idioms / Paraphrasing / Readability / Dictionaries |
Research Abstract |
1. We have compiled JC2.2, a basic vocabulary of Japanese, which contains 5,814 basic words and terms, including 211 functional expressions and 177 idioms.
2. We have compiled a dictionary of Japanese functional expressions. It contains 341 functional expressions and their 16,711 surface forms, which are hierarchically organized. Using this dictionary, we have implemented a paraphraser of Japanese functional expressions.
3. We have compiled MUST1, an example database of Japanese compound functional expressions. Combining this database with a machine learning method, we have implemented a detector of Japanese functional expressions.
4. We have compiled a textbook corpus that contains 1,478 sample passages (about 1M characters) extracted from 127 textbooks for elementary school, junior high school, high school, and university. Using this textbook corpus as a criterion, we have implemented a readability analyzer for Japanese texts. For a given Japanese text, the analyzer outputs one of 13 grades based on character unigram models constructed from the textbook corpus. Its performance, measured by the correlation coefficient, is high (R > 0.9).
5. We studied a dynamic phrasal thesaurus, which generates phrasal synonyms for a given phrase. We also studied application-oriented paraphrasing methods: paraphrasing verbal noun phrases into compound nouns, paraphrasing verb phrases for easier understanding, and transforming sentence ends into news-headline style. We have also proposed a new summarization method, summarization by analogy.
6. We have released JC2.2, MUST1, and the readability analyzer to the public.
|
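The grade assignment described in item 4 of the abstract can be illustrated with a minimal sketch. The abstract states only that the analyzer outputs one of 13 grades based on character unigram models built from the textbook corpus. The selection rule used here (pick the grade whose model gives the text the highest log-likelihood), the add-one smoothing, the function names, and the two-grade toy corpus are all assumptions made for illustration, not the released analyzer's implementation.

```python
import math
from collections import Counter


def train_unigram_models(passages_by_grade):
    """Count characters per grade. Input shape: {grade: [passage, ...]}."""
    counts = {g: Counter("".join(texts)) for g, texts in passages_by_grade.items()}
    # Shared character vocabulary across all grades, used for smoothing.
    vocab = set().union(*counts.values())
    return counts, vocab


def log_likelihood(text, counter, vocab_size):
    """Log-probability of `text` under one grade's character unigram model."""
    total = sum(counter.values())
    # Add-one smoothing (an assumption), with one extra slot for unseen characters.
    denom = total + vocab_size + 1
    return sum(math.log((counter[ch] + 1) / denom) for ch in text)


def predict_grade(text, counts, vocab):
    """Return the grade whose model assigns `text` the highest log-likelihood."""
    return max(counts, key=lambda g: log_likelihood(text, counts[g], len(vocab)))


# Hypothetical toy corpus with two grades; the real corpus has 13 grades
# and 1,478 passages.
toy_corpus = {
    1: ["いぬがいます。ねこもいます。"],
    13: ["言語処理技術の研究開発を推進する。"],
}
counts, vocab = train_unigram_models(toy_corpus)
print(predict_grade("ねこがいます。", counts, vocab))
print(predict_grade("技術の研究", counts, vocab))
```

With realistic data, each grade's model would be trained on all passages for that grade, and the predicted grades could then be compared with the true grades through a correlation coefficient, as the abstract reports.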
Research Products
(49 results)
[Journal Article] Development and Analysis of An Example Database of Japanese Compound Functional Expressions (2006)
Author(s)
Masatoshi Tsuchiya, Takehito Utsuro, Suguru Matsuyoshi, Satoshi Sato, Seiichi Nakagawa
Journal Title
IPSJ Journal Vol.47, No.6
Pages: 1728-1741
Description
From the Final Research Report Summary (English version)
-
[Presentation] A Study of a Concentrate-and-Reconstitute Sentence Summarization Model (濃縮還元型文要約モデルの検討) (2006)
Author(s)
池田 諭史, 牧野 恵, 山本 和英
Organizer
IPSJ SIG Technical Report, 2006-NL-174
Place of Presentation
Hakodate
Year and Date
2006-07-28
Description
From the Final Research Report Summary (Japanese version)