2010 Fiscal Year Final Research Report
On probability distributions of Japanese sentence lengths.
Project/Area Number |
20520389
|
Research Category |
Grant-in-Aid for Scientific Research (C)
|
Allocation Type | Single-year Grants |
Section | General |
Research Field |
Linguistics
|
Research Institution | The University of Tokushima |
Principal Investigator |
ISHIDA Motohiro The University of Tokushima, Graduate School, Institute of Socio-Arts and Sciences, Associate Professor (40232318)
|
Project Period (FY) |
2008 – 2010
|
Keywords | Quantitative linguistics / Text mining / Japanese / English / German |
Research Abstract |
The first aim of this study was to test whether any of three probability distributions (log-normal, Pascal, and negative binomial) actually fits Japanese sentence lengths. These distributions had long been hypothesized as the best fit for some European languages (and the log-normal distribution for Japanese), but none of them proved appropriate. Applying a generalized linear model further suggested that sentence lengths may be affected by fixed factors other than purely probabilistic phenomena. The second aim, developing software that calculates Japanese sentence lengths automatically, has been achieved, and two original software packages are now publicly available.
|
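As a rough illustration of the kind of analysis the abstract describes, the following Python sketch measures sentence lengths and checks a log-normal fit. It is not the project's released software. Several things here are assumptions rather than details from the report: lengths are counted in characters (the report does not state the unit), sentences are split on the punctuation marks 。！？, and the function names `sentence_lengths`, `fit_lognormal`, and `ks_statistic` are hypothetical. The Kolmogorov-Smirnov distance is used only as a simple goodness-of-fit indicator; for discrete data with ties it is approximate.

```python
import math
import re


def sentence_lengths(text):
    """Split text on sentence-final punctuation (assumed: 。！？)
    and return the character count of each non-empty sentence."""
    parts = re.split(r"[。！？]", text)
    return [len(p.strip()) for p in parts if p.strip()]


def fit_lognormal(lengths):
    """Maximum-likelihood log-normal fit: the mean and standard
    deviation of the log-lengths."""
    logs = [math.log(n) for n in lengths]
    mu = sum(logs) / len(logs)
    var = sum((x - mu) ** 2 for x in logs) / len(logs)
    return mu, math.sqrt(var)


def lognormal_cdf(x, mu, sigma):
    """CDF of the log-normal distribution (requires sigma > 0)."""
    z = (math.log(x) - mu) / (sigma * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))


def ks_statistic(lengths, mu, sigma):
    """Largest gap between the empirical CDF and the fitted CDF."""
    xs = sorted(lengths)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = lognormal_cdf(x, mu, sigma)
        d = max(d, abs(f - i / n), abs((i + 1) / n - f))
    return d


if __name__ == "__main__":
    sample = "今日は晴れです。明日は雨が降るでしょう。はい！"
    lengths = sentence_lengths(sample)
    mu, sigma = fit_lognormal(lengths)
    print(lengths, mu, sigma, ks_statistic(lengths, mu, sigma))
```

A large distance on a real corpus would be consistent with the report's finding that the log-normal distribution does not fit well, although a proper test would need a much larger sample and a significance threshold.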
Research Products
(7 results)
[Remarks] Website, etc.
[Remarks] Development and public release of language analysis software