2012 Fiscal Year Final Research Report
On the Objectivity and Reliability of Corpus Data for Linguistic Research
Project/Area Number |
22520494
|
Research Category |
Grant-in-Aid for Scientific Research (C)
|
Allocation Type | Single-year Grants |
Section | General |
Research Field |
English linguistics
|
Research Institution | Nagoya University |
Principal Investigator |
OHNA Tsutomu Nagoya University, Graduate School of International Development, Professor (00233205)
|
Project Period (FY) |
2010 – 2012
|
Keywords | linguistics / corpus / grammar / usage |
Research Abstract |
As corpora grow larger and “user-friendly” environments develop and spread, corpora are now widely used in linguistic research. Useful as they are, these environments also turn corpora and corpus studies into a kind of “black box”: users often attend only to the output of the software, disregarding the input and the process, and do not examine whether the output can be interpreted as appropriate data for their studies. To improve corpus-based research, I conducted the following surveys: i) I examined the basic data reported in several articles and, where errors were detected, inferred the processes and causes behind them and classified them; ii) viewing a grammar as part of the internal state of a speaker, I examined what kinds of information about which aspects of the grammar can be obtained from currently publicly available corpora; and iii) I examined, among other issues, problems concerning the representativeness of corpora and corpus data, the reliability and validity of statistical scores such as the t-score and MI, and the ambiguity of the term “collocation.”
|
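As an illustration of why the abstract questions the reliability of scores such as the t-score and MI (mutual information), the following minimal sketch computes both from raw frequencies using the formulations commonly found in corpus-linguistics textbooks. It is not taken from the report: the function names, the `span` parameter, and all frequency values are hypothetical, and individual tools may define these scores differently.

```python
import math


def expected_cooccurrence(f_node, f_collocate, corpus_size, span=1):
    """Expected co-occurrence count if the two words were independent.

    `span` is an assumed window-size multiplier; some tools use it, some do not.
    """
    return f_node * f_collocate * span / corpus_size


def mi_score(observed, expected):
    """MI as log2 of observed over expected co-occurrence frequency."""
    return math.log2(observed / expected)


def t_score(observed, expected):
    """t-score as (observed - expected) / sqrt(observed)."""
    return (observed - expected) / math.sqrt(observed)


# Hypothetical frequencies in a corpus of one million tokens.
CORPUS_SIZE = 1_000_000
frequent = {"observed": 30, "f_node": 1000, "f_collocate": 500}
rare = {"observed": 2, "f_node": 2, "f_collocate": 2}


def score_pair(pair, corpus_size=CORPUS_SIZE):
    """Return (MI, t-score) for one word pair described by a frequency dict."""
    expected = expected_cooccurrence(pair["f_node"], pair["f_collocate"], corpus_size)
    return mi_score(pair["observed"], expected), t_score(pair["observed"], expected)


for label, pair in (("frequent", frequent), ("rare", rare)):
    mi, t = score_pair(pair)
    print(f"{label}: MI = {mi:.2f}, t-score = {t:.2f}")
```

With these made-up numbers, the pair that occurs only twice receives the higher MI but the lower t-score, so the two measures rank the same pairs in opposite orders. Which ranking counts as appropriate data depends on the input frequencies and on how the software computes the score, which is the kind of check the abstract argues users tend to skip.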
Research Products
(9 results)