2019 Fiscal Year Research-status Report
Feature visualizer and detector for scientific texts
Project/Area Number |
19K00850
|
Research Institution | The University of Aizu |
Principal Investigator |
BLAKE John, The University of Aizu, School of Computer Science and Engineering, Associate Professor (80635954)
|
Co-Investigator(Kenkyū-buntansha) |
Mozgovoy Maxim, The University of Aizu, School of Computer Science and Engineering, Associate Professor (60571776)
|
Project Period (FY) |
2019-04-01 – 2022-03-31
|
Keywords | lexical patterns / grammatical patterns / genre / feature visualization |
Outline of Annual Research Achievements |
In the first year we achieved all our target objectives. We annotated a small corpus of short research articles that will form the dataset for the feature visualizer. We also created a number of explanatory videos to be displayed in the online feature detector. We created low-fidelity and high-fidelity prototypes in order to select a user-friendly interface with the required functionality. The base of the feature visualizer was built using Django and Vue.js and is now deployed online. We have also made progress on the second-year goals. We wrote Python programs that automatically identify grammatical tense and voice. We have created an initial prototype of the feature detector, which will allow users to input their own texts for analysis.
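To illustrate what identifying tense and voice on raw text can involve, the following is a minimal rule-based sketch in Python. It is not the project's code: the regular expressions, the short irregular-participle list, and the function names `detect_voice` and `detect_tense` are all assumptions made for this example, and the heuristics are deliberately crude (for instance, any word ending in "-ed", including "need", counts as a past form).

```python
import re

# Forms of "be" that can introduce a passive construction.
BE_FORMS = r"(?:am|is|are|was|were|be|been|being)"
# Hypothetical, deliberately tiny list of irregular past participles.
IRREGULAR = r"(?:shown|given|taken|made|done|found|known|seen|written|built)"

# "be" + optional -ly adverb + past participle, e.g. "were carefully analyzed".
PASSIVE = re.compile(
    rf"\b(?P<be>{BE_FORMS})\s+(?:\w+ly\s+)?(?:\w+ed|{IRREGULAR})\b",
    re.IGNORECASE,
)
FUTURE = re.compile(r"\b(?:will|shall)\s+\w+", re.IGNORECASE)
# Crude past marker: a past auxiliary or any word ending in -ed.
PAST = re.compile(r"\b(?:was|were|had|did)\b|\b\w+ed\b", re.IGNORECASE)


def detect_voice(sentence: str) -> str:
    """Return 'passive' if a be + participle pattern is found, else 'active'."""
    return "passive" if PASSIVE.search(sentence) else "active"


def detect_tense(sentence: str) -> str:
    """Return a coarse tense label: 'future', 'past' or 'present'."""
    if FUTURE.search(sentence):
        return "future"
    match = PASSIVE.search(sentence)
    if match:
        # In a passive clause the auxiliary carries the tense, not the participle.
        return "past" if match.group("be").lower() in {"was", "were"} else "present"
    return "past" if PAST.search(sentence) else "present"


if __name__ == "__main__":
    for s in [
        "The samples were analyzed using HPLC.",
        "The results are shown in Table 1.",
        "We will evaluate the model.",
    ]:
        print(s, "->", detect_tense(s), detect_voice(s))
```

A production system would more plausibly rely on part-of-speech tagging or parsing than on surface patterns; the sketch only shows that such labels can be derived without pre-annotation.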
|
Current Status of Research Progress |
1: Research has progressed more than originally planned.
Reason
We have been able to address some of the goals set for the second year. Initially, we expected to rely on annotations to visualize complex features such as tense and aspect. However, in addition to programs that match pre-annotated segments of text, we have created programs that run on raw text, which removes the need for additional annotations. These functionalities will be incorporated into both the feature visualizer and the feature detector. A prototype of the feature detector is currently deployed online via Heroku. The deployed feature detector currently incorporates readability statistics and lexical profiles (using academic word and academic vocabulary lists).
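As a rough illustration of readability statistics and a lexical profile, the sketch below computes mean sentence length, mean word length, and the share of tokens found in a word list. It is an assumption-laden example rather than the deployed detector: `ACADEMIC_WORDS` is a hypothetical six-word stand-in for a real academic word list, and the function names and the choice of statistics are invented for this illustration.

```python
import re

# Hypothetical stand-in for an academic word list; real lists are far larger.
ACADEMIC_WORDS = {"analyze", "data", "method", "research", "significant", "approach"}


def tokenize(text: str) -> list[str]:
    """Lower-case the text and extract alphabetic word tokens."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def split_sentences(text: str) -> list[str]:
    """Split on whitespace that follows sentence-final punctuation."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def profile(text: str) -> dict[str, float]:
    """Return simple readability statistics and academic-word coverage."""
    words = tokenize(text)
    sentences = split_sentences(text)
    if not words or not sentences:
        return {
            "sentences": 0,
            "words": 0,
            "mean_sentence_length": 0.0,
            "mean_word_length": 0.0,
            "academic_coverage": 0.0,
        }
    academic = [w for w in words if w in ACADEMIC_WORDS]
    return {
        "sentences": len(sentences),
        "words": len(words),
        "mean_sentence_length": len(words) / len(sentences),
        "mean_word_length": sum(len(w) for w in words) / len(words),
        "academic_coverage": len(academic) / len(words),
    }


if __name__ == "__main__":
    print(profile("We analyze the data. The method is new."))
```

Matching on exact word forms is a simplification; word lists of this kind are usually organized by word family or lemma, so a fuller version would lemmatize tokens before lookup.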
|
Strategy for Future Research Activity |
In the second year, we aim to improve the feature visualizer by integrating more functionalities, such as tense-aspect identification and various types of information structure (e.g. information flow, information focus and end weight). Our focus will be on developing programs that work on natural language without the need for pre-annotation. This will enable the same functionalities to be deployed in the feature visualizer for the pre-annotated corpus and for the feature detector that is designed for users to input their own texts. The key challenge will be to increase the accuracy and precision of the pattern-matching functions.
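One way accuracy and precision could be measured is by comparing a pattern-matching function's labels against the annotated corpus. The sketch below assumes a simple setup that is not described in the report: one gold label and one predicted label per sentence, with a hypothetical function name `accuracy_and_precision`.

```python
def accuracy_and_precision(gold, predicted, positive):
    """Compare predicted labels with gold annotations.

    accuracy  = correct predictions / all predictions
    precision = true positives / all items predicted as `positive`
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("at least one labelled item is required")

    correct = sum(g == p for g, p in zip(gold, predicted))
    true_positives = sum(
        g == positive and p == positive for g, p in zip(gold, predicted)
    )
    predicted_positives = sum(p == positive for p in predicted)

    accuracy = correct / len(gold)
    # Precision is undefined when nothing is predicted positive; report 0.0 here.
    precision = true_positives / predicted_positives if predicted_positives else 0.0
    return accuracy, precision


if __name__ == "__main__":
    gold = ["passive", "active", "passive", "active"]
    predicted = ["passive", "passive", "passive", "active"]
    print(accuracy_and_precision(gold, predicted, positive="passive"))
```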
|
Causes of Carryover |
The balance of approximately 15000 will be added to the second-year budget.
|
Research Products
(2 results)