Summary of Research Achievements |
In the extended year, a word embedding model trained on the ACL Anthology Reference Corpus (ACL-ARC) and several publicly available masked language models, such as BERT and SciBERT, were used to fill in blanks for verbs used in academic writing in the NLP field. Our experiments show promising results and motivate us to build a learning and writing system that includes both word embedding and masked language model features. We have set up a website to host the word infilling system. This system helps users look for words that have similar word vectors, or proposes word candidates based on the surrounding context (one paper published at a peer-reviewed international conference, PACLIC 2021). The research aims to transform research ideas expressed in tentative, simple writing into professional writing, which helps researchers publish more papers in less time and at lower cost. This can lead to wider dissemination of their work and to its being taken into account in global rankings. To achieve this goal, we have (1) collected texts from ACL-ARC and used them in our experiments, (2) built a word embedding model, and (3) extracted lexical bundles from it. Furthermore, (4) a text style transfer model between abstracts and conclusions was developed, and (5) the typicality of a lexical bundle, that is, how characteristic it is of a given type of section, was calculated. This typicality measure ensures the use of plagiarism-free lexical bundles in given sections of articles. Finally, (6) a writing system was built that allows users to search for suitable vocabulary when writing an academic article.
|