Summary of Research Achievements |
Last year we worked to finish the corpus and develop the word list. We digitized all the texts required for the corpus, which included scanning, OCRing, and cleaning them. We also worked with our partner schools to make sure the texts in the corpus reflected the texts being used in the classroom. We cleaned and processed the files following the procedure set out by previous studies (e.g., Green & Lambert, 2018; Greene & Coxhead, 2015). Cleaning and processing involved manually checking the text files for errors that may have occurred during the OCR process, removing page numbers, and tagging headings, captions, figures, and tables so that they are easy to identify when processing the texts. These texts were then compiled into a final database for analysis. We are now processing the files in order to identify words for inclusion in the word list and checking the validity of the resulting word list. Following the methodology of previous studies, this includes lemmatizing the text files and tagging them for part of speech (POS) using R and spaCy. We have started compiling a number of reference corpora, of both similar and different text types, against which to validate the word list. When this is finished, we will compare our word list with existing word lists against these reference corpora.
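As a rough illustration of the comparison step, the sketch below computes how much of a reference corpus a word list covers. It assumes that validation is measured as token coverage, which the report does not state explicitly; the function names `tokenize` and `coverage` and the toy data are hypothetical, and the sketch matches surface forms rather than the lemmas produced in the actual R and spaCy processing.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def coverage(word_list, corpus_text):
    """Return the share of corpus tokens found in word_list (0.0 to 1.0)."""
    tokens = tokenize(corpus_text)
    if not tokens:
        return 0.0
    targets = {w.lower() for w in word_list}
    counts = Counter(tokens)
    covered = sum(n for word, n in counts.items() if word in targets)
    return covered / len(tokens)


# Hypothetical toy data, not taken from the project corpus.
reference_corpus = "The cell divides. The cell membrane controls what enters the cell."
our_list = ["cell", "membrane", "divides"]
existing_list = ["the", "controls"]

for name, words in [("ours", our_list), ("existing", existing_list)]:
    print(f"{name}: {coverage(words, reference_corpus):.1%}")
```

Running the same function over each reference corpus and each word list would give a simple table of coverage figures to compare; a fuller version would likely operate on lemmatized text instead of raw tokens.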
|
Strategy for Future Research Activity |
We now need to check the word lists against the corpora we are creating and begin to write the final papers for publication. We have identified three potential papers that we can write over the next year using the research we have conducted to date. The next semester will be used to write up these papers and submit them to international journals. We will also present our research at a number of international conferences.
|
Causes of Carryover |
Due to the difficulties of presenting at international conferences caused by COVID, a number of our planned conferences were held virtually rather than face to face. As a result, the money originally budgeted for travel to these conferences was not used. We hope to present at more international conferences this year, as they are an important part of networking, making other researchers aware of the project, and getting feedback.
|