Summary of Research Achievements |
In the second year, we set out to improve the feature detector by integrating additional functions, namely tense-aspect identification and several types of information structure analysis, and we achieved this aim. The tense-aspect identification function classifies and labels grammatical tenses using the twelve commonly used terms (e.g. past progressive, future perfect). It also classifies finite verbs by voice, so that feature will be available to users as well. The information structure function, which identifies information focus, information flow and end-weight, is currently deployed. The accuracy and precision of both functions can be improved further. In the deployed feature detector, users can do the following for any submitted text: 1. Create a text profile using standard lists such as the Academic Word List and the Academic Vocabulary List; 2. Identify particular sets of words, such as TOEIC vocabulary and words related to computer science; 3. Display readability indices (e.g. Gunning Fog and Flesch-Kincaid scores); 4. Show text statistics (e.g. percentage of complex words, average words per sentence); 5. Identify whether sentences are front-heavy or adhere to the end-weight principle; 6. Display the thematic development of successive sentences (e.g. constant or ruptured); and 7. Show the information focus (e.g. new or given information). Links to the deployed version are available on the homepage of the principal investigator.
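To illustrate items 3 and 4 above, the following is a minimal sketch of how readability indices and text statistics of this kind could be computed. It is not the deployed implementation: the function names `count_syllables` and `text_statistics` are hypothetical, the syllable counter is a rough vowel-group heuristic, and "complex word" is simplified to any word of three or more syllables. The formulas are the standard published ones for the Gunning Fog index and the Flesch-Kincaid grade level.

```python
import re

VOWEL_GROUP = re.compile(r"[aeiouy]+")


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption, not a dictionary lookup):
    count vowel groups and discount a trailing silent 'e'."""
    w = word.lower()
    n = len(VOWEL_GROUP.findall(w))
    if w.endswith("e") and not w.endswith("le") and n > 1:
        n -= 1
    return max(n, 1)


def text_statistics(text: str) -> dict:
    """Return basic text statistics and two readability indices."""
    # Naive sentence split on terminal punctuation.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Words are runs of letters, optionally with one internal apostrophe.
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")

    syllables = sum(count_syllables(w) for w in words)
    # Simplification: a "complex" word has three or more syllables.
    complex_words = sum(1 for w in words if count_syllables(w) >= 3)

    words_per_sentence = len(words) / len(sentences)
    percent_complex = 100.0 * complex_words / len(words)

    return {
        "sentences": len(sentences),
        "words": len(words),
        "avg_words_per_sentence": words_per_sentence,
        "percent_complex_words": percent_complex,
        # Gunning Fog = 0.4 * (words/sentence + percent complex words)
        "gunning_fog": 0.4 * (words_per_sentence + percent_complex),
        # Flesch-Kincaid grade = 0.39*(words/sentence) + 11.8*(syllables/word) - 15.59
        "flesch_kincaid_grade": 0.39 * words_per_sentence
        + 11.8 * (syllables / len(words))
        - 15.59,
    }


if __name__ == "__main__":
    sample = "The detector shows language features in raw text. It is deployed."
    for name, value in text_statistics(sample).items():
        print(f"{name}: {value}")
```

A production tool would likely replace the heuristic syllable counter with a pronunciation dictionary and apply the fuller Gunning Fog definition of complex words, which excludes proper nouns and words that reach three syllables only through common suffixes.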
|
Strategy for Future Research Activity |
In the third year, our focus will be on increasing the usability of both the text visualizer, which reveals language features in a pre-annotated corpus, and the text detector, which shows language features in raw text. Functions developed for the text detector that can be adapted for use in the text visualizer will be identified and incorporated. A systematic evaluation of accuracy, usability and efficacy will be conducted to identify areas for future work.
|