2007 Fiscal Year Final Research Report Summary
Automatic Measuring of English Language Proficiency for Global Communications
Project/Area Number |
16300048
|
Research Category |
Grant-in-Aid for Scientific Research (B)
|
Allocation Type | Single-year Grants |
Section | General |
Research Field |
Intelligent informatics
|
Research Institution | Doshisha University (2005-2007); Advanced Telecommunications Research Institute International (2004) |
Principal Investigator |
YAMAMOTO Seiichi Doshisha University, Faculty of Engineering, Professor (20374100)
|
Co-Investigator(Kenkyū-buntansha) |
SUMITA Eiichiro Doshisha University, ATR, SLP Laboratories, Department head (90395020)
YASUDA Keiji Doshisha University, ATR, SLP Laboratories, Researcher (50395018)
YANAGIDA Masuzo Doshisha University, Faculty of Engineering, Professor (00116120)
YOSHIDA Kensaku Sophia University, Faculty of Foreign Studies, Professor (80053718)
NISHINO Haruo Doshisha University, Institute for Language and Culture, Associate Professor (50172680)
|
Project Period (FY) |
2004 – 2007
|
Keywords | e-Learning / TOEIC / translatability / English corpus / learner corpus / fill-in-the-blank question / measuring of translation quality / measuring of proficiency |
Research Abstract |
The purposes of this research are (1) to create a learner corpus containing English sentences translated by Japanese subjects of varying English proficiency, together with additional items for subjective and objective evaluation of translation quality, and (2) to develop a method for automatically measuring English proficiency. The main results are as follows.
1. As a resource for automatically measuring English proficiency, we created a learner corpus of 150,000 English sentences translated by 500 Japanese subjects. The Japanese source sentences were randomly selected from BTEC (Basic Travel Expression Corpus), an English-Japanese parallel corpus of 500,000 sentence pairs, and English textbooks for junior and senior high schools. Ten reference sentences were prepared for each source sentence so that features such as edit distance and n-gram similarity between the reference sentences and the subjects' translations could be measured. The quality of some of the translated sentences in the learner corpus was subjectively evaluated by native English speakers who are fluent speakers of Japanese.
2. Correlations were calculated between the subjective evaluations by the native English speakers and objective measures such as BLEU for English sentences translated by the subjects. A high correlation was obtained when dozens of English sentences were used. This result confirmed that BLEU, which was originally proposed for objectively evaluating the quality of machine-translated sentences, can be used to measure the subjects' English proficiency.
3. Different methods for selecting target sentences for translation were compared in terms of the number of target sentences needed to obtain a high correlation, and a selection method that achieves high correlation with fewer sentences was proposed.
4. The learner corpus is expected to be useful for applications beyond automatic measurement of English proficiency. One such application is building a probabilistic language model that represents the lexical and syntactic differences between utterances by native English speakers and by Japanese speakers. To verify this idea, we created probabilistic language models (bigram and trigram) from the learner corpus and linearly interpolated them with language models trained on BTEC, a large text corpus of canonical English sentences. Experiments showed that the linearly interpolated language model gave better word accuracy than a language model trained on BTEC alone. This shows that language model adaptation combining a model trained on the learner corpus with a model trained on a large corpus of native English is effective in compensating for the mismatch between the lexical and syntactic characteristics of native speakers and second-language speakers.
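As a rough illustration of the kind of objective features mentioned in result 1 (edit distance and n-gram similarity between a subject's translation and multiple reference sentences), the following is a minimal Python sketch. The function names, the whitespace tokenization with lowercasing, and the example sentences are assumptions made for illustration only; the report does not describe the project's actual implementation, and full BLEU additionally combines several n-gram orders and applies a brevity penalty.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter (multiset) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def clipped_ngram_precision(candidate, references, n):
    """BLEU-style clipped n-gram precision of a candidate against several references."""
    cand = ngrams(candidate.lower().split(), n)
    if not cand:
        return 0.0
    # For each n-gram, record the most times it occurs in any single reference.
    max_ref = Counter()
    for ref in references:
        for gram, count in ngrams(ref.lower().split(), n).items():
            max_ref[gram] = max(max_ref[gram], count)
    # Candidate counts are clipped by the reference maximum before summing.
    matched = sum(min(count, max_ref[gram]) for gram, count in cand.items())
    return matched / sum(cand.values())


def word_edit_distance(a, b):
    """Word-level Levenshtein distance between two sentences."""
    a, b = a.lower().split(), b.lower().split()
    prev = list(range(len(b) + 1))
    for i, wa in enumerate(a, start=1):
        curr = [i]
        for j, wb in enumerate(b, start=1):
            cost = 0 if wa == wb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def min_edit_distance(candidate, references):
    """Smallest word-level edit distance from the candidate to any reference."""
    return min(word_edit_distance(candidate, ref) for ref in references)


# Hypothetical example: one learner translation and two reference sentences.
references = ["where is the station", "where can i find the station"]
learner = "where is station"
unigram_precision = clipped_ngram_precision(learner, references, 1)
bigram_precision = clipped_ngram_precision(learner, references, 2)
distance = min_edit_distance(learner, references)
```

In a setup like the one the abstract describes, per-sentence features of this kind would be aggregated over dozens of translated sentences per subject and then correlated with the native speakers' subjective scores.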
|
Research Products
(21 results)
[Journal Article] The ATR Multilingual Speech-to-Speech Translation System (2006)
Author(s)
S. Nakamura, K. Markov, H. Nakaiwa, G. Kikui, H. Kawai, T. Jitsuhiro, J. Zhang, H. Yamamoto, E. Sumita, S. Yamamoto
Journal Title
IEEE Trans. ASLP Vol.14, No.2
Pages: 365-376
Description
From the "Summary of Research Results Report (English)"