On probability distributions of Japanese sentence lengths.
Project/Area Number |
20520389
|
Research Category |
Grant-in-Aid for Scientific Research (C)
|
Allocation Type | Single-year Grants |
Section | General |
Research Field |
Linguistics
|
Research Institution | The University of Tokushima |
Principal Investigator |
ISHIDA Motohiro The University of Tokushima, Graduate School, Institute of Socio-Arts and Sciences, Associate Professor (40232318)
|
Project Period (FY) |
2008 – 2010
|
Project Status |
Completed (Fiscal Year 2010)
|
Budget Amount |
¥2,730,000 (Direct Cost: ¥2,100,000, Indirect Cost: ¥630,000)
Fiscal Year 2010: ¥650,000 (Direct Cost: ¥500,000, Indirect Cost: ¥150,000)
Fiscal Year 2009: ¥650,000 (Direct Cost: ¥500,000, Indirect Cost: ¥150,000)
Fiscal Year 2008: ¥1,430,000 (Direct Cost: ¥1,100,000, Indirect Cost: ¥330,000)
|
Keywords | Quantitative linguistics / Text mining / Japanese / English / German / Statistics / Linguistics / Probability theory |
Research Abstract |
The first aim of this study was to determine whether any of three probability distributions (the log-normal distribution, the Pascal distribution, and the negative binomial distribution) actually fits Japanese sentence lengths. These distributions have long been hypothesized as the best fit for sentence lengths in some European languages (and the log-normal distribution for Japanese), but none of them proved appropriate. Applying a generalized linear model further suggested that sentence lengths may be affected by fixed factors in addition to purely probabilistic phenomena. The second aim, developing software that calculates Japanese sentence lengths automatically, has been achieved, and two original software packages are now publicly available.
|
Report
(4 results)
Research Products
(13 results)