
Feature visualizer and detector for scientific texts

Research Project

Project/Area Number 19K00850
Research Category

Grant-in-Aid for Scientific Research (C)

Allocation Type Multi-year Fund
Section General
Review Section Basic Section 02100: Foreign language education-related
Research Institution The University of Aizu

Principal Investigator

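BLAKE John  University of Aizu, School of Computer Science and Engineering, Senior Associate Professor (80635954)

Co-Investigator(Kenkyū-buntansha) Mozgovoy Maxim  University of Aizu, School of Computer Science and Engineering, Associate Professor (60571776)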
Project Period (FY) 2019-04-01 – 2022-03-31
Project Status Granted (Fiscal Year 2020)
Budget Amount
¥3,900,000 (Direct Cost: ¥3,000,000, Indirect Cost: ¥900,000)
Fiscal Year 2021: ¥1,300,000 (Direct Cost: ¥1,000,000, Indirect Cost: ¥300,000)
Fiscal Year 2020: ¥1,300,000 (Direct Cost: ¥1,000,000, Indirect Cost: ¥300,000)
Fiscal Year 2019: ¥1,300,000 (Direct Cost: ¥1,000,000, Indirect Cost: ¥300,000)
Keywords language processing / feature extraction / tense identification / feature visualization / lexical patterns / grammatical patterns / genre / visualization / language features / nlp / iCALL
Outline of Research at the Start

This research aims to develop and evaluate an interactive online multimedia tool that visualizes the typical language features of scientific texts written in English. The tool has two functions. (1) The feature visualizer shows and explains commonly used language features present in a corpus of fully annotated short research articles. (2) The feature detector identifies core language features in texts submitted by users. This helps students compare their own writing with the expected conventions of scientific writing.

Outline of Annual Research Achievements

In the second year, we improved the feature detector by integrating further functions, such as tense-aspect identification and the analysis of various types of information structure.
The tense-aspect identification function classifies and labels grammatical tenses using the twelve commonly used terms (e.g. past progressive, future perfect). It also classifies finite verbs by voice, so that feature will also be available to users. The information structure function, which identifies information focus, information flow and end-weight, is currently deployed. The accuracy and precision of both functions can be further improved.
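To illustrate how the twelve labels combine three tenses with four aspects, here is a minimal rule-based sketch. It is not the project's implementation: the function name `label_tense_aspect`, the auxiliary word sets, and the assumption that the verb group has already been extracted as a list of tokens are all hypothetical. It ignores modal verbs other than will/shall, does not classify voice, and guesses the tense of a main verb from an "-ed" ending, so irregular past forms and words such as "need" are mislabelled.

```python
# Toy labeller: 3 tenses x 4 aspects = 12 labels.
# All names and word lists here are illustrative assumptions.
FUTURE_AUX = {"will", "shall"}
HAVE_FORMS = {"have", "has", "had"}
BE_FORMS = {"am", "is", "are", "was", "were", "be", "been", "being"}
PAST_FORMS = {"had", "was", "were", "did"}


def label_tense_aspect(verb_group):
    """Return a label such as 'past progressive' for a pre-extracted verb group."""
    tokens = [t.lower() for t in verb_group]
    if not tokens:
        raise ValueError("verb_group must contain at least one token")

    # Tense is read from the first (finite) element of the verb group.
    first = tokens[0]
    if first in FUTURE_AUX:
        tense = "future"
    elif first in PAST_FORMS or first.endswith("ed"):
        tense = "past"  # crude: misses irregular verbs, misfires on 'need'
    else:
        tense = "present"

    # Perfect: a form of 'have' used as an auxiliary (i.e. not the last token).
    perfect = any(t in HAVE_FORMS for t in tokens[:-1])
    # Progressive: a form of 'be' immediately followed by an -ing form.
    progressive = any(
        a in BE_FORMS and b.endswith("ing") for a, b in zip(tokens, tokens[1:])
    )

    if perfect and progressive:
        aspect = "perfect progressive"
    elif perfect:
        aspect = "perfect"
    elif progressive:
        aspect = "progressive"
    else:
        aspect = "simple"
    return f"{tense} {aspect}"


if __name__ == "__main__":
    for group in (["was", "running"], ["will", "have", "finished"], ["writes"]):
        print(" ".join(group), "->", label_tense_aspect(group))
```

A production system would more plausibly rely on part-of-speech tags and a lemmatizer than on surface suffixes; the sketch only shows how tense and aspect can be decided independently and then combined into one label.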
In the deployed feature detector, users can do the following for any submitted text: 1. Create a text profile using standard lists such as the academic word list and academic vocabulary list; 2. Identify particular sets of words, such as TOEIC vocabulary and words related to computer science; 3. Display readability indices (e.g. Gunning Fog and Flesch-Kincaid scores); 4. Show text statistics (e.g. percentage of complex words, average words per sentence); 5. Identify whether sentences are front-heavy or adhere to the end-weight principle; 6. Display the thematic development of subsequent sentences (e.g. constant or ruptured); and 7. Show the information focus (e.g. new or given information). Links to the deployed version are available on the homepage of the principal investigator.
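As an illustration of items 3 and 4, the sketch below computes the two text statistics and the two readability indices named above. It uses the standard published formulas for the Gunning Fog index and the Flesch-Kincaid grade level, but everything else is an assumption rather than the deployed tool's method: the function names, the regular-expression tokenizer, the vowel-run syllable heuristic, and the definition of a complex word as one with three or more estimated syllables.

```python
import re


def count_syllables(word):
    """Estimate syllables as the number of vowel runs (a rough heuristic)."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def text_statistics(text):
    """Return basic statistics and readability indices for an English text."""
    # Naive splitting: sentences end at ., ! or ?; words are alphabetic runs.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")

    syllables = sum(count_syllables(w) for w in words)
    # Assumption: a 'complex' word has three or more estimated syllables.
    complex_words = sum(1 for w in words if count_syllables(w) >= 3)

    words_per_sentence = len(words) / len(sentences)
    percent_complex = 100 * complex_words / len(words)
    return {
        "average_words_per_sentence": words_per_sentence,
        "percent_complex_words": percent_complex,
        # Gunning Fog: 0.4 * (words per sentence + percentage of complex words)
        "gunning_fog": 0.4 * (words_per_sentence + percent_complex),
        # Flesch-Kincaid grade: 0.39 * W/S + 11.8 * syllables/W - 15.59
        "flesch_kincaid_grade": 0.39 * words_per_sentence
        + 11.8 * (syllables / len(words))
        - 15.59,
    }


if __name__ == "__main__":
    sample = "The detector identifies language features. It shows statistics."
    for name, value in text_statistics(sample).items():
        print(f"{name}: {value:.2f}")
```

Because the syllable counts are estimates, the scores may differ slightly from those of tools that use a pronunciation dictionary.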

Current Status of Research Progress

1: Research has progressed more than originally planned.

Reason

Many of the technical challenges have been overcome. The primary focus now is on increasing the accuracy and precision of pattern-matching functions, and increasing the usability of the system.

Strategy for Future Research Activity

In the third year, our focus will be on increasing the usability of both the text visualizer, which reveals language features in a pre-annotated corpus, and the text detector, which shows language features in raw text. Functionalities developed for the text detector that can be adapted for use in the text visualizer will be identified and incorporated. A systematic evaluation of accuracy, usability and efficacy will be conducted to identify areas for future work.

Report

(2 results)
  • 2020 Research-status Report
  • 2019 Research-status Report

Research Products

(6 results)

All 2020 2019

All Journal Article (5 results) (of which Int'l Joint Research: 5 results,  Peer Reviewed: 5 results,  Open Access: 5 results) Presentation (1 results) (of which Int'l Joint Research: 1 results)

  • [Journal Article] Development of an online tense and aspect identifier for English (2020)

    • Author(s)
      Blake, John
    • Journal Title

      CALL for widening participation: short papers from EUROCALL 2020

      Volume: 1 Pages: 36-41

    • DOI

      10.14705/rpnet.2020.48.1161

    • Related Report
      2020 Research-status Report
    • Peer Reviewed / Open Access / Int'l Joint Research
  • [Journal Article] English Verb Analyzer: Identifying tense, voice, aspect, sense and grammatical meaning in context for pedagogic purposes (2020)

    • Author(s)
      Blake, John
    • Journal Title

      Proceedings of 8th Swedish Language Technology Conference 2020

      Volume: 1 Pages: 1-5

    • Related Report
      2020 Research-status Report
    • Peer Reviewed / Open Access / Int'l Joint Research
  • [Journal Article] Automatic identification of tense and grammatical meaning in context (2020)

    • Author(s)
      Blake, John
    • Journal Title

      Proceedings of the International Conference on Computers in Education 2020

      Volume: 2 Pages: 739-742

    • Related Report
      2020 Research-status Report
    • Peer Reviewed / Open Access / Int'l Joint Research
  • [Journal Article] Generic integrity: Visualizing lexicogrammatical features in scientific articles (2020)

    • Author(s)
      Blake, John
    • Journal Title

      Selected online proceedings of the British Association of Applied Linguists Annual Conference 2019

      Volume: 1 Pages: 1-3

    • Related Report
      2020 Research-status Report
    • Peer Reviewed / Open Access / Int'l Joint Research
  • [Journal Article] Annotated scientific text visualizer: Design, development and deployment (2019)

    • Author(s)
      Blake, John
    • Journal Title

      CALL and complexity - EUROCALL

      Volume: 1 Pages: 45-50

    • DOI

      10.14705/rpnet.2019.38.984

    • Related Report
      2019 Research-status Report
    • Peer Reviewed / Open Access / Int'l Joint Research
  • [Presentation] Generic integrity: Visualizing lexicogrammatical features in scientific articles (2019)

    • Author(s)
      Blake, John
    • Organizer
      British Association of Applied Linguistics Conference
    • Related Report
      2019 Research-status Report
    • Int'l Joint Research


Published: 2019-04-18   Modified: 2021-12-27  
