Budget Amount |
¥1,820,000 (Direct Cost: ¥1,400,000, Indirect Cost: ¥420,000)
Fiscal Year 2012: ¥520,000 (Direct Cost: ¥400,000, Indirect Cost: ¥120,000)
Fiscal Year 2011: ¥650,000 (Direct Cost: ¥500,000, Indirect Cost: ¥150,000)
Fiscal Year 2010: ¥650,000 (Direct Cost: ¥500,000, Indirect Cost: ¥150,000)
|
Research Abstract |
As corpora grow larger and “user-friendly” environments develop and spread, corpora are now widely used in linguistic research. Useful as they are, these environments also turn corpora and corpus studies into a kind of “black box”: users often attend only to the output of software, disregarding the input and the process and failing to examine whether the output can be interpreted as appropriate data for their studies. To improve corpus-based research, I conducted the following surveys: i) I examined the basic data reported in several articles and, where errors were detected, inferred the processes and causes behind them and classified them; ii) viewing a grammar as part of the internal state of a speaker, I examined what kinds of information, about which aspects of the grammar, can be obtained from the corpora now publicly available; and iii) I examined, among other issues, problems concerning the representativeness of corpora and corpus data, the reliability and validity of statistical scores such as the t-score and MI, and the ambiguity of the term “collocation.”
|