2016 Fiscal Year Research-status Report

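Extraction and Application of Visual Semantic Information from a Japanese Treebank Tagged with Syntactic and Semantic Analysis Information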

Research Project

Project/Area Number	15K02469
Research Institution	National Institute for Japanese Language and Linguistics
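Principal Investigator	Alastair Butler, Inter-University Research Institute Corporation National Institutes for the Humanities, National Institute for Japanese Language and Linguistics, Theory and Typology Division, Project Adjunct Researcher (90588873)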
Project Period (FY)	2015-04-01 – 2018-03-31
Keywords	Corpus / Japanese / Semantics / Syntax
Outline of Annual Research Achievements	The research aims to develop methods for visualising and making accessible semantic information: predicate-argument information, but also higher levels of analysis, such as propositional connectives that distinguish coordination from subordination of structure. Such information makes it possible, for example, to map out binding dependencies, which has proved useful both for reconstructing unpronounced arguments (zero pronouns) in Japanese and for extracting valence patterns for predicates, an essential part of word meaning. Carrying out this work has required continued development of a method for deriving semantic representations automatically from syntactically parsed representations, together with the creation of a large base of analysed, human-checked syntactic structures that can be transformed into semantic representations. This base serves as training data for producing further analysed data of the same kind, with the potential to scale to large volumes of data.
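The link between reconstructed arguments and valence patterns can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the records, the role names (`ARG0`, `ARG1`, `ARG2`), the `zero` filler standing for a reconstructed zero pronoun, and the function name `valence_patterns` are all hypothetical and are not taken from the project's data or tools.

```python
from collections import Counter

# Hypothetical predicate-argument records: (predicate, {role: filler}).
# "zero" marks an argument reconstructed from a zero pronoun.
# All predicates, roles, and fillers are invented for illustration.
RECORDS = [
    ("taberu", {"ARG0": "Taro", "ARG1": "ringo"}),
    ("taberu", {"ARG0": "zero", "ARG1": "pan"}),
    ("taberu", {"ARG0": "Hanako"}),
    ("ageru", {"ARG0": "Taro", "ARG1": "hon", "ARG2": "Hanako"}),
]


def valence_patterns(records):
    """Count, per predicate, how often each set of argument roles occurs."""
    counts = {}
    for predicate, args in records:
        # A pattern is the sorted tuple of role names, ignoring the fillers.
        pattern = tuple(sorted(args))
        counts.setdefault(predicate, Counter())[pattern] += 1
    return counts


if __name__ == "__main__":
    for predicate, patterns in valence_patterns(RECORDS).items():
        print(predicate, dict(patterns))
```

Because the second record carries a reconstructed `ARG0`, it counts towards the two-argument pattern rather than appearing as a one-argument use, which is the reason zero-pronoun reconstruction matters for valence extraction.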
Current Status of Research Progress	1: Research has progressed more than originally planned. Reason: The pipeline for producing analysed data has continued to improve. Models resulting from training are slightly smaller than a year ago despite a large increase in new data, reflecting improvements to the annotation. The work on developing methods of visualising and making accessible semantic information has focused on ways to embed information back into the parsed data. This has led to the enrichment of the existing corpus data with a second layer of special-purpose annotation consisting of indexing information. This corpus semantic information can now be searched thanks to a transformation to the TIGER-XML format, which includes a structure-sharing mechanism (multi-dominance) that can be queried. Research results can be seen in the interfaces of the NINJAL Parsed Corpus of Modern Japanese (NPCMJ; http://npcmj.ninjal.ac.jp/interfaces/). Aside from a default tree view of the syntactic annotation, examples can be viewed as predicate logic formulas capturing semantic content (semantic view) and as trees in which the calculated semantic content is embedded as indexing information (indexed view). In addition, there is a visualisation of how the semantics was derived (eval view).
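The kind of query that structure sharing makes possible can be sketched in a few lines of Python. This is a minimal illustration, not the project's own tooling: the fragment is invented, the node labels and ids are hypothetical, and the function name `shared_nodes` is an assumption. It relies only on the general TIGER-XML convention that nonterminals (`nt`) list their children through `edge` (and optionally `secedge`) elements carrying an `idref`.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical TIGER-XML fragment, invented for illustration (not NPCMJ data).
# The NP-SBJ node s1_500 is referenced from two clauses, i.e. it is shared.
SAMPLE = """
<s id="s1">
  <graph root="s1_502">
    <terminals>
      <t id="s1_1" word="Taro" pos="NPR"/>
      <t id="s1_2" word="came" pos="VB"/>
      <t id="s1_3" word="left" pos="VB"/>
    </terminals>
    <nonterminals>
      <nt id="s1_500" cat="NP-SBJ">
        <edge label="--" idref="s1_1"/>
      </nt>
      <nt id="s1_501" cat="IP-ADV">
        <edge label="--" idref="s1_500"/>
        <edge label="--" idref="s1_2"/>
      </nt>
      <nt id="s1_502" cat="IP-MAT">
        <edge label="--" idref="s1_501"/>
        <edge label="--" idref="s1_500"/>
        <edge label="--" idref="s1_3"/>
      </nt>
    </nonterminals>
  </graph>
</s>
"""


def shared_nodes(xml_text):
    """Return {node id: [parent ids]} for nodes with more than one parent."""
    root = ET.fromstring(xml_text)
    parents = defaultdict(list)
    for nt in root.iter("nt"):
        for child in nt:
            # Both primary and secondary edges point at a child via idref.
            if child.tag in ("edge", "secedge"):
                parents[child.get("idref")].append(nt.get("id"))
    return {node: ps for node, ps in parents.items() if len(ps) > 1}


if __name__ == "__main__":
    print(shared_nodes(SAMPLE))
```

In this invented fragment the subject node is dominated by both the adverbial clause and the matrix clause, so a single lookup recovers the argument that the two predicates share.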
Strategy for Future Research Activity	The semantic component will continue to be developed, especially as a basis for visualising dependencies. The existing indexing component will be extended to produce the character-indexed report format of FrameNet. This will allow the creation of browsable reports that display semantic dependencies in an intuitive way. A new "scaffolding" component will be built as a layer of automated analysis that further specifies the part-of-speech analysis derived from morphological analysis systems (MeCab/Comainu). This additional specification is expected to improve automatic parsing. The project will also extend the range of data analysed to more genres and to historical Japanese texts.
Causes of Carryover	Funds have been carried over to pay for assistance with human correction of the annotation.
Expenditure Plan for Carryover Budget	Funds have been carried over to pay for assistance with human correction of the annotation.

Research Products
(11 results)

Journal Article (4 results) (of which Int'l Joint Research: 3 results, Open Access: 4 results, Peer Reviewed: 3 results, Acknowledgement Compliant: 3 results) Presentation (5 results) (of which Int'l Joint Research: 3 results, Invited: 2 results) Remarks (1 result) Funded Workshop (1 result)

[Journal Article] Keyaki Treebank segmentation and part-of-speech labelling (2017)
- Author(s)
  Alastair Butler, Stephen Wright Horn and Kei Yoshimoto
- Journal Title
  Proceedings of the 23rd Annual Meeting of the Association for Natural Language Processing
  Volume: none Pages: 414-417
- Open Access
[Journal Article] From meaning representations to syntactic trees (2016)
- Author(s)
  Alastair Butler
- Journal Title
  Proceedings of the Thirteenth International Workshop of Logic and Engineering of Natural Language Semantics 13 (LENLS 13)
  Volume: none Pages: 147-160
- Peer Reviewed / Open Access / Int'l Joint Research / Acknowledgement Compliant
[Journal Article] DynamicPower at SemEval-2016 Task 8: Processing syntactic parse trees with a Dynamic Semantics core (2016)
- Author(s)
  Alastair Butler
- Journal Title
  Proceedings of SemEval-2016
  Volume: none Pages: 1148-1153
- Peer Reviewed / Open Access / Int'l Joint Research / Acknowledgement Compliant
[Journal Article] Deterministic natural language generation from meaning representations for machine translation (2016)
- Author(s)
  Alastair Butler
- Journal Title
  Proceedings of the 2nd Workshop on Semantics-Driven Machine Translation
  Volume: none Pages: 1-9
- Peer Reviewed / Open Access / Int'l Joint Research / Acknowledgement Compliant
[Presentation] From meaning representations to syntactic trees (2016)
- Author(s)
  Alastair Butler
- Organizer
  Logic and Engineering of Natural Language Semantics (LENLS 13)
- Place of Presentation
  Tokyo, Japan
- Year and Date
  2016-11-15
- Int'l Joint Research
[Presentation] Treebank annotation of FraCaS and JSeM (2016)
- Author(s)
  Alastair Butler, Ai Kubota, Shota Hiyama and Kei Yoshimoto
- Organizer
  Logic and Engineering of Natural Language Semantics (LENLS 13)
- Place of Presentation
  Tokyo, Japan
- Year and Date
  2016-11-13
- Int'l Joint Research
[Presentation] Parsed Corpus Semantics (2016)
- Author(s)
  Alastair Butler
- Organizer
  New Landscapes in Theoretical Computational Linguistics
- Place of Presentation
  Ohio State University, USA
- Year and Date
  2016-10-15
- Invited
[Presentation] A parsed corpus of Japanese enriched to reach levels of semantic analysis (2016)
- Author(s)
  Alastair Butler, Shiro Akasegawa, Prashant Pardeshi and Kei Yoshimoto
- Organizer
  None
- Place of Presentation
  Brandeis University, Boston, USA
- Year and Date
  2016-09-02
- Invited
[Presentation] Deterministic natural language generation from meaning representations for machine translation (2016)
- Author(s)
  Alastair Butler
- Organizer
  2nd Workshop on Semantics-Driven Machine Translation
- Place of Presentation
  San Diego, California
- Year and Date
  2016-06-16
- Int'l Joint Research
[Remarks] Alastair Butler - Homepage
- URL
  http://www.compling.jp/ajb129/index.html
[Funded Workshop] Unshared Task at LENLS 13 (Theory and System analysis with FraCaS, MultiFraCaS and JSeM Test Suites) (2016)
- Place of Presentation
  National Institute for Japanese Language and Linguistics, Tokyo, Japan
- Year and Date
  2016-11-13 – 2016-11-13
