FY2019 Research-status Report

Feature visualizer and detector for scientific texts

Research Project

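Project/Area Number	19K00850
Research Institution	University of Aizu
Principal Investigator	BLAKE John University of Aizu, School of Computer Science and Engineering, Associate Professor (80635954)
Co-Investigator	Mozgovoy Maxim University of Aizu, School of Computer Science and Engineering, Associate Professor (60571776)
Project Period (FY)	2019-04-01 – 2022-03-31
Keywords	lexical patterns / grammatical patterns / genre / feature visualization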
Outline of Annual Research Achievements	In the first year we achieved all our target objectives. We annotated a small corpus of short research articles that will form the dataset for the feature visualizer. We also created a number of explanatory videos to be displayed in the online feature detector. We built low-fidelity and high-fidelity prototypes in order to select a user-friendly interface with the required functionality. The base for the feature visualizer was created using Django and Vue.js and is now deployed online. We have also made progress on the second-year goals. We wrote Python programs that automatically identify grammatical tense and voice. We have created an initial prototype of the feature detector, which will allow users to input their own texts for analysis.
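The report does not describe how the tense and voice programs work. As a minimal sketch only, assuming a purely surface-level approach with regular expressions and a small hand-made participle list, the idea of labelling tense and voice in raw text could look like the following. The function name `describe_verb_group`, the word sets and the rules are all hypothetical illustrations, not the project's implementation.

```python
import re

# Hypothetical, deliberately small lexicon of irregular past participles.
IRREGULAR_PARTICIPLES = {
    "been", "built", "chosen", "done", "found", "given",
    "known", "made", "seen", "shown", "taken", "written",
}
BE_FORMS = {"am", "is", "are", "was", "were", "be", "been", "being"}
PRESENT_BE = {"am", "is", "are"}
PAST_BE = {"was", "were"}


def is_participle(token):
    """Crude participle test: known irregular form or an -ed ending.

    This over-generates (e.g. adjectives such as "red"); a real system
    would need part-of-speech information.
    """
    return token in IRREGULAR_PARTICIPLES or token.endswith("ed")


def describe_verb_group(sentence):
    """Return a rough tense and voice label for one simple sentence."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    tense = "unknown"
    voice = "active"
    for i, tok in enumerate(tokens):
        following = tokens[i + 1] if i + 1 < len(tokens) else ""
        # The first auxiliary found decides the tense label.
        if tense == "unknown":
            if tok == "will":
                tense = "future"
            elif tok in {"has", "have"} and is_participle(following):
                tense = "present perfect"
            elif tok == "had" and is_participle(following):
                tense = "past perfect"
            elif tok in PRESENT_BE:
                tense = "present"
            elif tok in PAST_BE:
                tense = "past"
        # A form of "be" directly followed by a participle marks a passive.
        if tok in BE_FORMS and is_participle(following):
            voice = "passive"
    return {"tense": tense, "voice": voice}


print(describe_verb_group("The corpus was annotated by two raters."))
print(describe_verb_group("We have created a prototype."))
```

Sentences whose main verb carries the tense without an auxiliary (for example "We annotated a corpus") are left as "unknown" in this sketch, which is one reason a tagger or parser would normally be preferred.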
Current Status of Research Progress (Category)	1: Research has progressed more than originally planned. Reason: We have been able to address some of the goals set for the second year. In addition to creating programs that match pre-annotated segments of text, we have created programs that run on raw text. Initially, we expected to have to rely on annotations to visualize complex features such as tense and aspect. However, we were able to create a program that works on raw text, which removes the need for additional annotations. These functionalities will be incorporated into both the feature visualizer and the feature detector. A prototype of the feature detector is currently deployed online via Heroku. The deployed feature detector currently incorporates readability statistics and lexical profiles (using academic word and academic vocabulary lists).
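The report does not say which readability statistics or word lists the deployed detector uses. As a hedged sketch, the following computes the Flesch Reading Ease score with a heuristic syllable counter and the share of tokens found in a word list. `ACADEMIC_SAMPLE` is a tiny hypothetical stand-in for a real academic word list, and both function names are illustrative assumptions.

```python
import re

# Hypothetical stand-in for an academic word list; a real list is far larger.
ACADEMIC_SAMPLE = {"analysis", "approach", "data", "method", "research", "theory"}


def count_syllables(word):
    """Approximate syllables as runs of vowel letters (heuristic only)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_reading_ease(text):
    """Flesch Reading Ease: 206.835 - 1.015*(words/sentences) - 84.6*(syllables/words)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word")
    syllables = sum(count_syllables(w) for w in words)
    return (206.835
            - 1.015 * (len(words) / len(sentences))
            - 84.6 * (syllables / len(words)))


def academic_coverage(text, word_list=ACADEMIC_SAMPLE):
    """Fraction of tokens in the text that appear in the word list."""
    words = [w.lower() for w in re.findall(r"[A-Za-z]+", text)]
    if not words:
        return 0.0
    hits = sum(1 for w in words if w in word_list)
    return hits / len(words)


sample = "The data support the theory."
print(round(flesch_reading_ease(sample), 2))
print(academic_coverage(sample))
```

The coverage function matches exact word forms only; matching inflected forms (for example "theories") would require word families or lemmatization.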
Strategy for Future Research Activity	In the second year, we aim to improve the feature visualizer by integrating more functionalities, such as tense-aspect identification and various types of information structure (e.g. information flow, information focus and end weight). Our focus will be on developing programs that work on natural language without the need for pre-annotation. This will enable the same functionalities to be deployed both in the feature visualizer for the pre-annotated corpus and in the feature detector, which is designed for users to input their own texts. The key challenge will be to increase the accuracy and precision of the pattern-matching functions.
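The report does not state how accuracy and precision will be measured. One common way, sketched here under the assumption that both the pattern matcher and the annotated corpus yield labelled spans of the form (start, end, label), is to compare the two sets directly. The function name and the example spans are hypothetical.

```python
def precision_recall(predicted, gold):
    """Compare predicted labelled spans with gold annotations.

    Both arguments are iterables of (start, end, label) tuples; a prediction
    counts as correct only if it matches a gold span exactly.
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical spans: token offsets plus a feature label.
gold_spans = {(0, 3, "passive"), (5, 7, "present perfect"), (9, 10, "future")}
predicted_spans = {
    (0, 3, "passive"),
    (5, 7, "past perfect"),   # right span, wrong label
    (9, 10, "future"),
    (12, 14, "passive"),      # span absent from the gold annotations
}

precision, recall = precision_recall(predicted_spans, gold_spans)
print(precision, recall)
```

Exact matching is the strictest choice; partial-overlap scoring would give credit for spans that are nearly right.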
Causes of Carryover	The balance of approximately 15000 will be added to the second-year budget.

Research Products
(2 results)

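All: Journal Article (1 result) (of which international co-authorship: 1, peer reviewed: 1, open access: 1); Presentation (1 result) (of which international conference: 1)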

[Journal Article] Annotated scientific text visualizer: Design, development and deployment (2019)
- Author(s)
  Blake, John
- Journal Title
  CALL and complexity - EUROCALL
  Volume: 1 Pages: 45-50
- DOI
  10.14705/rpnet.2019.38.984
- Peer Reviewed / Open Access / Int'l Joint Research
[Presentation] Generic integrity: Visualizing lexicogrammatical features in scientific articles (2019)
- Author(s)
  Blake, John
- Organizer
  British Association of Applied Linguistics Conference
- Int'l Conference