Development of the methods of Exploratory Data Analysis for Japanese linguistics
Project/Area Number |
15520289
|
Research Category |
Grant-in-Aid for Scientific Research (C)
|
Allocation Type | Single-year Grants |
Section | General |
Research Field |
Japanese linguistics
|
Research Institution | Osaka University |
Principal Investigator |
ISHII Masahiko Osaka University, Graduate School of Letters, Associate Professor (10159676)
|
Project Period (FY) |
2003 – 2005
|
Project Status |
Completed (Fiscal Year 2005)
|
Budget Amount |
¥3,300,000 (Direct Cost: ¥3,300,000)
Fiscal Year 2005: ¥600,000 (Direct Cost: ¥600,000)
Fiscal Year 2004: ¥1,200,000 (Direct Cost: ¥1,200,000)
Fiscal Year 2003: ¥1,500,000 (Direct Cost: ¥1,500,000)
|
Keywords | Exploratory Data Analysis / Quantitative linguistics / Corpus linguistics / Japanese linguistics / Statistics |
Research Abstract |
This research had three aims: (1) to verify that Exploratory Data Analysis (EDA) is effective in Japanese linguistics by reexamining the results and findings of many past statistical studies; (2) to show clearly which EDA techniques can be used for which kinds of investigation and research in Japanese linguistics; and (3) to produce a guide describing concrete procedures for applying EDA in Japanese linguistics. To these ends, we selected several past statistical studies of Japanese and reanalyzed them with EDA. We also built corpora of junior high school textbooks and newspaper columns and analyzed them with EDA. Based on these analyses, we clarified the validity and necessity of EDA for Japanese linguistics and drew up a guide to its use. The guide introduces ten EDA techniques considered effective in Japanese linguistics and explains each with concrete examples of its application to research. The ten techniques are: stem-and-leaf display, numerical summary and parallel boxplots, data smoothing, re-expression of data, resistant line, wandering boxplot, rootogram, two-way analysis (median polish), analysis relative to an identified distribution, and logit transformation. Japanese linguistics that makes use of these EDA techniques should be able to discover new properties and structures hidden in a wide range of Japanese language data.
|
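As an illustration of one technique named in the abstract, two-way analysis by median polish, the following is a minimal sketch in Python. It is not taken from the project's guide. The function name `median_polish` and the small table of counts (text types by word classes) are hypothetical and invented purely for demonstration. The sketch decomposes each cell into an overall value, a row effect, a column effect, and a residual, using medians so that a few unusual cells do not distort the fit.

```python
from statistics import median


def median_polish(table, max_iter=10, tol=1e-9):
    """Tukey-style median polish (illustrative sketch).

    Decomposes table[i][j] into overall + row_eff[i] + col_eff[j] + resid[i][j].
    """
    resid = [[float(x) for x in row] for row in table]
    n_rows, n_cols = len(resid), len(resid[0])
    overall = 0.0
    row_eff = [0.0] * n_rows
    col_eff = [0.0] * n_cols

    for _ in range(max_iter):
        # Sweep each row's median out of the residuals into the row effects.
        row_med = [median(row) for row in resid]
        for i in range(n_rows):
            row_eff[i] += row_med[i]
            for j in range(n_cols):
                resid[i][j] -= row_med[i]
        # Move the median of the column effects into the overall term.
        shift = median(col_eff)
        overall += shift
        col_eff = [c - shift for c in col_eff]

        # Sweep each column's median out of the residuals into the column effects.
        col_med = [median(resid[i][j] for i in range(n_rows)) for j in range(n_cols)]
        for j in range(n_cols):
            col_eff[j] += col_med[j]
            for i in range(n_rows):
                resid[i][j] -= col_med[j]
        # Move the median of the row effects into the overall term.
        shift = median(row_eff)
        overall += shift
        row_eff = [r - shift for r in row_eff]

        # Stop once a full pass no longer changes anything.
        if max(abs(m) for m in row_med + col_med) < tol:
            break

    return overall, row_eff, col_eff, resid


# Hypothetical counts, invented for illustration only:
# rows = text types, columns = word classes.
counts = [
    [120, 45, 30],
    [150, 60, 28],
    [90, 40, 55],
]
overall, row_eff, col_eff, resid = median_polish(counts)
print("overall:", overall)
print("row effects:", row_eff)
print("column effects:", col_eff)
print("residuals:", resid)
```

Large residuals flag cells that depart from a simple additive row-plus-column pattern, which is the kind of hidden structure EDA aims to surface. Count data are often re-expressed (for example with logarithms or square roots) before polishing; that step is omitted here for brevity.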
Report
(4 results)
Research Products
(7 results)