Project/Area Number |
11551009
|
Research Category |
Grant-in-Aid for Scientific Research (B)
|
Allocation Type | Single-year Grants |
Section | Developmental Research |
Research Field |
Japanese linguistics
|
Research Institution | Tokyo Metropolitan University |
Principal Investigator |
OGINO Tsunao Tokyo Metropolitan Univ., Faculty of Social Sciences and Humanities, Professor (00111443)
|
Co-Investigator(Kenkyū-buntansha) |
KUMAGAI Yasuo The National Institute for Japanese Language, Department of Language Information and Resources, Director of Division (30215016)
SANADA Shinji Osaka Univ., Faculty of Letters, Professor (00099912)
MIYAJIMA Tatsuo Kyoto Tachibana Women's Univ., Faculty of Letters, Professor (30099915)
NAKANO Hiroshi The National Institute for Japanese Language, Center for Japanese Language Education, Director of Center (40000426)
|
Project Period (FY) |
1999 – 2001
|
Project Status |
Completed (Fiscal Year 2001)
|
Budget Amount |
¥7,800,000 (Direct Cost: ¥7,800,000)
Fiscal Year 2001: ¥2,200,000 (Direct Cost: ¥2,200,000)
Fiscal Year 2000: ¥2,700,000 (Direct Cost: ¥2,700,000)
Fiscal Year 1999: ¥2,900,000 (Direct Cost: ¥2,900,000)
|
Keywords | quantitative study / database of research papers / CD-ROM / Japanese linguistics |
Research Abstract |
1. We prepared a preliminary trial version of a CD-ROM of quantitative research papers on Japanese. We then recruited monitors and asked them to evaluate the CD-ROM and offer suggestions. We also considered valid and useful ways of using the CD-ROM.
2. We ran scanned papers through five optical character recognition (OCR) software packages to evaluate whether they can be used in practice for character recognition. The results showed that even the most effective package could not recognize our documents completely; the error rate was on the order of a few percent. OCR software therefore falls well short of a practically useful level, and we cannot use it as an input method.
3. We surveyed hundreds of research papers in each research field to estimate the amount of data in each paper. We classified the hard-copy research papers to be stored on the CD-ROM by field (vocabulary, sociolinguistics, letters, dialect, phonetics, and grammar) and checked the amount of data in each. The results indicated that the amount of data varies widely from paper to paper, and we found it difficult to identify any general tendency. The reason is that each study has its own specific purpose, different from that of the others.
|