
2017 Fiscal Year Research-status Report

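Extraction and application of visual semantic information from a Japanese treebank tagged with syntactic and semantic analysis information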

Research Project

Project/Area Number 15K02469
Research Institution National Institute for Japanese Language and Linguistics

Principal Investigator

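Alastair Butler  National Institutes for the Humanities (Inter-University Research Institute Corporation), National Institute for Japanese Language and Linguistics, Department of an Inter-University Research Institute, Researcher (90588873)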

Project Period (FY) 2015-04-01 – 2019-03-31
Keywords semantic dependencies / parsed corpus / visualisation / annotation / predicate arguments / discourse relations
Outline of Annual Research Achievements

The research aim has been to develop methods of visualising and making accessible semantic information from analyses of Japanese and English, e.g., predicate argument information, but also higher levels of analysis, such as propositional connectives as well as modals, negation and factors of discourse.

The key part of this work has been the development of a visualisation tool for semantic relationships derivable from a parsed corpus. This enables human annotators to assess whether their interpretations of discourse have been adequately captured by the parsed corpus. As now realised, this tool can capture many relationships found in discourse, providing a framework in which a fleshed-out account of semantic roles, quantification, and modality becomes feasible.

Current Status of Research Progress

1: Research has progressed more than originally planned.

Reason

The visualisation tool is now being used as a key part of the creation and presentation chain of three corpus resources: the NINJAL Parsed Corpus of Modern Japanese (NPCMJ; http://npcmj.ninjal.ac.jp), the Oxford-NINJAL Corpus of Old Japanese (ONCOJ; http://oncoj.ninjal.ac.jp/?lang=en), and the Treebank Semantics Parsed Corpus (TSPC; http://www.compling.jp/ajb129/tspc.html).

The developed visualisation tool has revealed layers of dependencies that were not easily visible before. At the same time, the tool has revealed inadequacies of analyses in the present state of the corpus data.

Strategy for Future Research Activity

Until now, two essential components for establishing semantic dependencies (allocation of "sort" information and the specification of clause linkages) have been handled by a small number of specialists who are able to cash out the results of complex grammatical rules (such as those involving an antecedent hierarchy) and build these into annotation information without the aid of visualisation tools.
Now the project is in a position to turn these tasks over to non-specialists who need only have intuitions about meaningful relationships in texts and enough knowledge to be able to spot whether they are represented in the visualisation or not.
Only after reviewing the results of a program of annotation that takes advantage of this new technology can the adequacy of the tool be properly assessed, and the feasibility of including additional layers of semantic information be ascertained.
For the remainder of the term of the project the plan is to increase the volume of relevant data by hiring annotators, and to publicise the results of the project domestically and abroad at academic conferences.

Causes of Carryover

The developed visualisation tool has revealed layers of dependencies that were not easily visible before. At the same time, the tool has revealed inadequacies of analyses in the present state of the corpus data.

For the remainder of the term of the project the plan is to increase the volume and quality of relevant data by hiring annotators, and to publicise the results of the project domestically and abroad at academic conferences.

Remarks

The Treebank Semantics Parsed Corpus (TSPC) and the Keyaki Treebank are corpus resources that can be viewed and downloaded. Treebank Semantics is an implementation of a method for obtaining meaning representations.

  • Research Products

    (8 results)


All Journal Article (2 results) (of which Int'l Joint Research: 1 results,  Open Access: 2 results,  Peer Reviewed: 1 results) Presentation (3 results) Remarks (3 results)

  • [Journal Article] Development of a search interface for corpora with syntactic parse information (2018)

    • Author(s)
      Iku Nagasaki and Alastair Butler and Stephen Wright Horn and Prashant Pardeshi and Yoshimoto
    • Journal Title

      Proceedings of the 24th Annual Meeting of the Association for Natural Language Processing

      Volume: - Pages: 1123-1126

    • Open Access
  • [Journal Article] Annotating syntax and lexical semantics with(out) indexing (2017)

    • Author(s)
      Alastair Butler and Stephen Wright Horn
    • Journal Title

      Proceedings of the Fourteenth International Workshop of Logic and Engineering of Natural Language Semantics (LENLS 14)

      Volume: - Pages: -

    • Peer Reviewed / Open Access / Int'l Joint Research
  • [Presentation] Developing a model of typical Japanese grammar development: The role of parsed corpora and parsing programs (2017)

    • Author(s)
      Susanne Miyata and Alastair Butler
    • Organizer
      Exploiting Parsed Corpora: Applications in Research, Pedagogy, and Processing
  • [Presentation] Developing a model of typical Japanese grammar development: The role of parsed corpora and parsing programs (2017)

    • Author(s)
      Stephen Wright Horn, Alastair Butler and Iku Nagasaki
    • Organizer
      Exploiting Parsed Corpora: Applications in Research, Pedagogy, and Processing
  • [Presentation] Annotating syntax and lexical semantics with(out) indexing (2017)

    • Author(s)
      Alastair Butler and Stephen Wright Horn
    • Organizer
      Logic and Engineering of Natural Language Semantics (LENLS 14)
  • [Remarks] The Treebank Semantics Parsed Corpus (TSPC)

    • URL

      http://www.compling.jp/ajb129/tspc.html

  • [Remarks] Treebank Semantics

    • URL

      http://www.compling.jp/ajb129/ts.html

  • [Remarks] The Keyaki Treebank Homepage

    • URL

      http://www.compling.jp/keyaki/

Published: 2018-12-17  
