2016 Fiscal Year Research-status Report

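Extraction and application of visual semantic information from a Japanese treebank tagged with syntactic and semantic analysis information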

Research Project

Project/Area Number 15K02469
Research Institution National Institute for Japanese Language and Linguistics

Principal Investigator

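Alastair Butler  National Institute for Japanese Language and Linguistics (National Institutes for the Humanities, Inter-University Research Institute Corporation), Theory and Typology Division, Project Adjunct Researcher (90588873)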

Project Period (FY) 2015-04-01 – 2018-03-31
Keywords corpus / Japanese / semantics / syntax
Outline of Annual Research Achievements

The research aims to develop methods for visualising and making accessible semantic information: predicate-argument information, but also higher levels of analysis, such as propositional connectives that distinguish coordination from subordination. Such information makes it possible, for example, to map out binding dependencies, which has proved useful as a method for reconstructing unpronounced arguments (zero pronouns) in Japanese, and to extract valence patterns for predicates, an essential part of word meaning.
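
As a rough illustration of what extracting valence patterns from predicate-argument information might involve, the following sketch counts argument-role frames per predicate. The records, the role labels (`arg0`, `arg1`) and the function name `valence_patterns` are hypothetical and are not drawn from the project's data or tools.

```python
from collections import Counter, defaultdict

# Hypothetical predicate-argument records: (predicate, {role: filler}).
# A filler of None stands for an unpronounced argument (zero pronoun)
# whose role has nevertheless been reconstructed by the analysis.
RECORDS = [
    ("taberu", {"arg0": "Taro", "arg1": "ringo"}),
    ("taberu", {"arg0": None, "arg1": "pan"}),
    ("iku", {"arg0": "Hanako"}),
]


def valence_patterns(records):
    """Count, per predicate, how often each set of argument roles occurs."""
    patterns = defaultdict(Counter)
    for predicate, args in records:
        frame = tuple(sorted(args))  # role names only, in a stable order
        patterns[predicate][frame] += 1
    return patterns


if __name__ == "__main__":
    for predicate, frames in valence_patterns(RECORDS).items():
        for frame, count in frames.items():
            print(predicate, frame, count)
```

Because reconstructed zero pronouns still contribute their role to the frame, the second `taberu` record counts towards the same two-place pattern as the first.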

To carry out this work it has been necessary to continue developing a method for deriving semantic representations automatically from syntactically parsed representations, and to create a large base of already analysed and human-checked syntactic structures that can be transformed into semantic representations. This base serves as training data for producing more data of the same kind, with the potential to scale to large volumes.

Current Status of Research Progress

1: Research has progressed more than originally planned.

Reason

The pipeline for producing analysed data has continued to improve. Models resulting from training are slightly smaller than a year ago despite a large increase in new data, reflecting improvements to the annotation.

The work on developing methods of visualising and making accessible semantic information has focused on ways to embed information back into parsed data. This has led to the enrichment of the existing corpus data with a second layer of special-purpose annotation made up of indexing information. This corpus semantic information can now be searched because of a transformation to the TIGER-XML format that includes a structure sharing mechanism (multi-dominance) that can be queried.
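
To make the idea of querying structure sharing concrete, here is a minimal sketch that reads a hand-written fragment in the style of TIGER-XML and lists nodes reachable from more than one parent. The fragment, its category and edge labels, and the function name `shared_nodes` are assumptions made for illustration; they are not NPCMJ data or the project's query tooling, and treating both `edge` and `secedge` as parent-to-child links is a simplification.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hand-written fragment in the style of TIGER-XML (hypothetical labels).
# Terminal s1_1 is a child of s1_501 through an ordinary edge and is also
# linked from s1_500 through a secondary edge, so it is shared.
TIGER_FRAGMENT = """
<s id="s1">
  <graph root="s1_501">
    <terminals>
      <t id="s1_1" word="Taro" pos="NPR"/>
      <t id="s1_2" word="kite" pos="VB"/>
      <t id="s1_3" word="suwatta" pos="VB"/>
    </terminals>
    <nonterminals>
      <nt id="s1_500" cat="IP-ADV">
        <edge label="HD" idref="s1_2"/>
        <secedge label="SBJ" idref="s1_1"/>
      </nt>
      <nt id="s1_501" cat="IP-MAT">
        <edge label="SBJ" idref="s1_1"/>
        <edge label="ADV" idref="s1_500"/>
        <edge label="HD" idref="s1_3"/>
      </nt>
    </nonterminals>
  </graph>
</s>
"""


def shared_nodes(xml_text):
    """Return {node id: [(parent id, label, edge kind), ...]} for nodes
    that are the target of more than one edge or secedge."""
    root = ET.fromstring(xml_text)
    parents = defaultdict(list)
    for nt in root.iter("nt"):
        for link in nt:
            if link.tag in ("edge", "secedge"):
                parents[link.get("idref")].append(
                    (nt.get("id"), link.get("label"), link.tag)
                )
    return {node: links for node, links in parents.items() if len(links) > 1}


if __name__ == "__main__":
    for node, links in shared_nodes(TIGER_FRAGMENT).items():
        print(node, links)
```

In this toy fragment the subject terminal is shared between the matrix clause and the adverbial clause, which is the kind of dependency that indexing information can encode.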

Research results can be seen in the interfaces of the NINJAL Parsed Corpus of Modern Japanese (NPCMJ; http://npcmj.ninjal.ac.jp/interfaces/). Aside from the default tree view of the syntactic annotation, examples can be displayed as predicate logic formulas capturing semantic content (semantic view), or as trees with the calculated semantic content embedded as indexing information (indexed view). In addition, there is a visualisation of how the semantics was derived (eval view).

Strategy for Future Research Activity

The semantic component will continue to be developed, especially in use as a basis for visualising dependencies. The existing indexing component will be extended so as to produce the character-indexed report format of FrameNet. This will allow creation of browsable reports that display semantic dependencies in a very intuitive way.

A new "scaffolding" component will be built as a layer of automated analysis to further specify part-of-speech analysis derived from systems of morphological analysis (mecab/Comainu). It is expected that additional specification will lead to improvements of the automatic parsing.

The project will also extend the range of data analysed to more genres and to historical Japanese texts.

Causes of Carryover

Funds have been carried over to pay for assistance with human correction of annotation.

Expenditure Plan for Carryover Budget

Funds have been carried over to pay for assistance with human correction of annotation.

  • Research Products

    (11 results)

All Journal Article (4 results) (of which Int'l Joint Research: 3 results,  Open Access: 4 results,  Peer Reviewed: 3 results,  Acknowledgement Compliant: 3 results) Presentation (5 results) (of which Int'l Joint Research: 3 results,  Invited: 2 results) Remarks (1 results) Funded Workshop (1 results)

  • [Journal Article] Keyaki Treebank segmentation and part-of-speech labelling (2017)

    • Author(s)
      Alastair Butler, Stephen Wright Horn and Kei Yoshimoto
    • Journal Title

      Proceedings of the 23rd Annual Meeting of the Association for Natural Language Processing (言語処理学会第23回年次大会発表論文集)

      Volume: none Pages: 414-417

    • Open Access
  • [Journal Article] From meaning representations to syntactic trees (2016)

    • Author(s)
      Alastair Butler
    • Journal Title

      Proceedings of the Thirteenth International Workshop of Logic and Engineering of Natural Language Semantics 13 (LENLS 13)

      Volume: none Pages: 147-160

    • Peer Reviewed / Open Access / Int'l Joint Research / Acknowledgement Compliant
  • [Journal Article] DynamicPower at SemEval-2016 Task 8: Processing syntactic parse trees with a Dynamic Semantics core (2016)

    • Author(s)
      Alastair Butler
    • Journal Title

      Proceedings of SemEval-2016

      Volume: none Pages: 1148-1153

    • Peer Reviewed / Open Access / Int'l Joint Research / Acknowledgement Compliant
  • [Journal Article] Deterministic natural language generation from meaning representations for machine translation (2016)

    • Author(s)
      Alastair Butler
    • Journal Title

      Proceedings of the 2nd Workshop on Semantics-Driven Machine Translation

      Volume: none Pages: 1-9

    • Peer Reviewed / Open Access / Int'l Joint Research / Acknowledgement Compliant
  • [Presentation] From meaning representations to syntactic trees (2016)

    • Author(s)
      Alastair Butler
    • Organizer
      Logic and Engineering of Natural Language Semantics (LENLS 13)
    • Place of Presentation
      Tokyo, Japan
    • Year and Date
      2016-11-15
    • Int'l Joint Research
  • [Presentation] Treebank annotation of FraCaS and JSeM (2016)

    • Author(s)
      Alastair Butler, Ai Kubota, Shota Hiyama and Kei Yoshimoto
    • Organizer
      Logic and Engineering of Natural Language Semantics (LENLS 13)
    • Place of Presentation
      Tokyo, Japan
    • Year and Date
      2016-11-13
    • Int'l Joint Research
  • [Presentation] Parsed Corpus Semantics (2016)

    • Author(s)
      Alastair Butler
    • Organizer
      New Landscapes in Theoretical Computational Linguistics
    • Place of Presentation
      Ohio State University, USA
    • Year and Date
      2016-10-15
    • Invited
  • [Presentation] A parsed corpus of Japanese enriched to reach levels of semantic analysis (2016)

    • Author(s)
      Alastair Butler, Shiro Akasegawa, Prashant Pardeshi and Kei Yoshimoto
    • Organizer
      None
    • Place of Presentation
      Brandeis University, Boston, USA
    • Year and Date
      2016-09-02
    • Invited
  • [Presentation] Deterministic natural language generation from meaning representations for machine translation (2016)

    • Author(s)
      Alastair Butler
    • Organizer
      2nd Workshop on Semantics-Driven Machine Translation
    • Place of Presentation
      San Diego, California
    • Year and Date
      2016-06-16
    • Int'l Joint Research
  • [Remarks] Alastair Butler - Homepage

    • URL

      http://www.compling.jp/ajb129/index.html

  • [Funded Workshop] Unshared Task at LENLS 13 (Theory and System analysis with FraCaS, MultiFraCaS and JSeM Test Suites) (2016)

    • Place of Presentation
      National Institute for Japanese Language and Linguistics, Tokyo, Japan
    • Year and Date
      2016-11-13 – 2016-11-13

Published: 2018-01-16  
