2021 Fiscal Year Annual Research Report
Natural language processing for academic writing in English
Project/Area Number |
18K11446
|
Research Institution | The University of Kitakyushu |
Principal Investigator |
Goh ChooiLing The University of Kitakyushu, Faculty of Environmental Engineering, Specially Appointed Associate Professor (90531616)
|
Co-Investigator (Kenkyū-buntansha) |
LEPAGE YVES Waseda University, Faculty of Science and Engineering (Graduate School of Information, Production and Systems / Center), Professor (70573608)
|
Project Period (FY) |
2018-04-01 – 2022-03-31
|
Keywords | academic writing aids / word embeddings / lexical bundles / text style transfer / sentence embeddings / text generation |
Outline of Annual Research Achievements |
In the extended year, a word embedding model trained on the ACL Anthology Reference Corpus (ACL-ARC) and several publicly available masked language models, such as BERT and SciBERT, were used to fill in blanked-out verbs in academic writing from the NLP field. The experiments showed promising results and motivated us to build a learning and writing system that combines word embedding and masked language model features. We have set up a website to host the word infilling system. The system helps users look for words with similar word vectors, or proposes word candidates based on the surrounding context (one paper published at a peer-reviewed international conference, PACLIC 2021). The research aims to transform research ideas expressed in uncertain and simple writing into professional writing, helping researchers publish more papers in less time and at lower cost. This can give their work wider dissemination and help it be taken into account in global rankings. To achieve this goal, we (1) collected texts from ACL-ARC for use in our experiments, (2) built a word embedding model, and (3) extracted lexical bundles from it. Furthermore, (4) a text style transfer model between abstracts and conclusions was developed, and (5) the typicality of a lexical bundle for a given type of section was calculated. This typicality measure ensures the use of plagiarism-free lexical bundles in given sections of articles. Finally, (6) a writing system that allows users to search for suitable vocabulary when writing an academic article was built.
|
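The lookup of "words that have similar word vectors" mentioned in the outline can be illustrated with a minimal sketch. It is not the project's implementation: the three-dimensional vectors, the vocabulary, and the function names `cosine` and `similar_words` are hypothetical, chosen only to show how candidates could be ranked by cosine similarity. A real system would load vectors trained on a corpus such as ACL-ARC.

```python
import math

# Toy vectors invented for illustration only; they are NOT taken from
# the ACL-ARC embedding model described in the report.
TOY_VECTORS = {
    "propose": [0.9, 0.1, 0.2],
    "present": [0.8, 0.2, 0.3],
    "introduce": [0.85, 0.15, 0.25],
    "evaluate": [0.1, 0.9, 0.2],
    "measure": [0.2, 0.8, 0.3],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    # Return 0.0 for a zero vector instead of dividing by zero.
    return dot / norm if norm else 0.0


def similar_words(query, vectors, top_n=3):
    """Return up to top_n (word, score) pairs most similar to the query word."""
    if query not in vectors:
        return []  # out-of-vocabulary query: nothing to rank against
    q = vectors[query]
    scored = [(w, cosine(q, v)) for w, v in vectors.items() if w != query]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


if __name__ == "__main__":
    for word, score in similar_words("propose", TOY_VECTORS, top_n=2):
        print(f"{word}\t{score:.3f}")
```

Under these assumptions, a writer who types "propose" would be offered the verbs whose vectors point in the most similar direction. The context-based candidates from a masked language model would need a separate component and are not covered by this sketch.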