2019 Fiscal Year Annual Research Report
Construction of a model for automatically extracting high-quality usage examples from large-scale corpora, and development of a prototype
Project/Area Number |
18F18808
|
Research Institution | National Institute for Japanese Language and Linguistics |
Principal Investigator |
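PARDESHI P.V. National Institute for Japanese Language and Linguistics, National Institutes for the Humanities (Inter-University Research Institute Corporation), Theory & Typology Division, Professor (00374984)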
|
Co-Investigator (Kenkyū-buntansha) |
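HMELJAK MARIJA National Institute for Japanese Language and Linguistics, National Institutes for the Humanities (Inter-University Research Institute Corporation), Theory & Typology Division, International Research Fellow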
|
Project Period (FY) |
2018-11-09 – 2020-03-31
|
Keywords | example sentences / learners' dictionary / lexicography |
Outline of Annual Research Achievements |
The aim of this project was to develop a model for selecting pedagogically valid Japanese example sentences from a general corpus by investigating automatically measurable criteria of readability, typicality and informativity. We collected example sentences from learners' dictionaries, reference works, graded readers and learner corpora, and built a graded corpus of example sentences. This corpus serves as a data set for testing whether existing readability formulas are usable on single sentences or short usage examples for learners of Japanese as a foreign language. We applied existing readability formulas to these graded example sentences and found that, while the formulas work well for longer texts, they are not applicable to single sentences. We also annotated a set of sentences extracted from a web corpus, manually scoring their readability and informativity for learners of Japanese as a foreign language, in order to identify measurable criteria that characterize readable and informative sentences. The analysis of these criteria is still in progress. We are currently exploring possible interfaces through which learners, teachers and lexicographers of Japanese as a foreign language can use the corpus of constructed single example sentences and the annotated set of sentences extracted from texts.
|
Research Progress Status |
Not entered, because FY2019 (Reiwa 1) is the final year of the project.
|
Strategy for Future Research Activity |
Not entered, because FY2019 (Reiwa 1) is the final year of the project.
|
Research Products
(3 results)