2017 Fiscal Year Research-status Report

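Extraction and application of visual semantic information from a Japanese treebank tagged with syntactic and semantic analysis information (統辞・意味解析情報タグ付き日本語ツリーバンクからの視覚意味情報の抽出と応用)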

Research Project

Project/Area Number	15K02469
Research Institution	National Institute for Japanese Language and Linguistics
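Principal Investigator	Alastair Butler, National Institute for Japanese Language and Linguistics (National Institutes for the Humanities, Inter-University Research Institute Corporation), Departments of Inter-University Research Institutes, Researcher (90588873)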
Project Period (FY)	2015-04-01 – 2019-03-31
Keywords	semantic dependencies / parsed corpus / visualisation / annotation / predicate arguments / discourse relations
Outline of Annual Research Achievements	The research aim has been to develop methods of visualising and making accessible semantic information from analyses of Japanese and English. This covers predicate-argument information as well as higher levels of analysis, such as propositional connectives, modals, negation and factors of discourse. The key part of this work has been the development of a visualisation tool for semantic relationships derivable from a parsed corpus. The tool enables human annotators to assess whether their interpretations of discourse have been adequately captured by the parsed corpus. As now realised, the tool can capture many relationships found in discourse, providing a framework in which a fleshed-out account of semantic roles, quantification, and modality becomes feasible.
Current Status of Research Progress	1: Research has progressed more than originally planned.
Reason	The visualisation tool is now used as a key part of the creation and presentation chain of three corpus resources: the NINJAL Parsed Corpus of Modern Japanese (NPCMJ; http://npcmj.ninjal.ac.jp), the Oxford-NINJAL Corpus of Old Japanese (ONCOJ; http://oncoj.ninjal.ac.jp/?lang=en), and the Treebank Semantics Parsed Corpus (TSPC; http://www.compling.jp/ajb129/tspc.html). The developed visualisation tool has revealed layers of dependencies that were not easily visible before. At the same time, the tool has revealed inadequacies of analyses in the present state of the corpus data.
Strategy for Future Research Activity	Until now, two essential components for establishing semantic dependencies (the allocation of "sort" information and the specification of clause linkages) have been handled by a small number of specialists. These specialists are able to cash out the results of complex grammatical rules (such as those involving an antecedent hierarchy) and build them into annotation information without the aid of visualisation tools. The project is now in a position to turn these tasks over to non-specialists, who need only have intuitions about meaningful relationships in texts and enough knowledge to spot whether those relationships are represented in the visualisation. Only after reviewing the results of a programme of annotation that takes advantage of this new technology can the adequacy of the tool be properly assessed and the feasibility of including additional layers of semantic information be ascertained. For the remainder of the project term, the plan is to increase the volume of relevant data by hiring annotators, and to publicise the results of the project domestically and abroad at academic conferences.
Causes of Carryover	The developed visualisation tool has revealed layers of dependencies that were not easily visible before. At the same time, the tool has revealed inadequacies of analyses in the present state of the corpus data. For the remainder of the term of the project the plan is to increase the volume and quality of relevant data by hiring annotators, and to publicise the results of the project domestically and abroad at academic conferences.
Remarks	The Treebank Semantics Parsed Corpus (TSPC) and the Keyaki Treebank are corpus resources that can be viewed and downloaded. Treebank Semantics is the implementation used to obtain meaning representations.

Research Products
(8 results)

Journal Article (2 results; of which Int'l Joint Research: 1, Open Access: 2, Peer Reviewed: 1), Presentation (3 results), Remarks (3 results)

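[Journal Article] Development of a search interface for a corpus with syntactic parsing information (統語解析情報付きコーパス検索用インタフェースの開発) (2018)
- Author(s)
  Iku Nagasaki and Alastair Butler and Stephen Wright Horn and Prashant Pardeshi and Yoshimoto
- Journal Title
  Proceedings of the 24th Annual Meeting of the Association for Natural Language Processing (言語処理学会第24回年次大会発表論文集)
  Volume: - Pages: 1123-1126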
- Open Access
[Journal Article] Annotating syntax and lexical semantics with(out) indexing (2017)
- Author(s)
  Alastair Butler and Stephen Wright Horn
- Journal Title
  Proceedings of the Fourteenth International Workshop of Logic and Engineering of Natural Language Semantics (LENLS 14)
  Volume: - Pages: -
- Peer Reviewed / Open Access / Int'l Joint Research
[Presentation] Developing a model of typical Japanese grammar development: The role of parsed corpora and parsing programs (2017)
- Author(s)
  Susanne Miyata and Alastair Butler
- Organizer
  Exploiting Parsed Corpora: Applications in Research, Pedagogy, and Processing
[Presentation] Developing a model of typical Japanese grammar development: The role of parsed corpora and parsing programs (2017)
- Author(s)
  Stephen Wright Horn, Alastair Butler and Iku Nagasaki
- Organizer
  Exploiting Parsed Corpora: Applications in Research, Pedagogy, and Processing
[Presentation] Annotating syntax and lexical semantics with(out) indexing (2017)
- Author(s)
  Alastair Butler and Stephen Wright Horn
- Organizer
  Logic and Engineering of Natural Language Semantics (LENLS 14)
[Remarks] The Treebank Semantics Parsed Corpus (TSPC)
- URL
  http://www.compling.jp/ajb129/tspc.html
[Remarks] Treebank Semantics
- URL
  http://www.compling.jp/ajb129/ts.html
[Remarks] The Keyaki Treebank Homepage
- URL
  http://www.compling.jp/keyaki/
