FY2015 Annual Research Report

Research on Context-Aware Access to Mathematical Knowledge

Research Project

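Project/Area Number	14J09896
Research Institution	The University of Tokyo
Principal Investigator	KRISTIANTO GIOVANNIYOKO, The University of Tokyo, Graduate School of Information Science and Technology, Research Fellow (DC1)
Project Period (FY)	2014-04-25 – 2017-03-31
Keywords	Mathematical Knowledge / Dependency relationships / Math search system / MathML indexing / Learning-to-rank / Unification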
Summary of Research Achievements	The goal of this research is to design an intelligent browsing system for mathematical information that helps researchers explore mathematical concepts shared across different scientific disciplines. To pursue this goal, this research attempts to establish a general framework for knowledge extraction based on semantic understanding of mathematical expressions. The following issues are addressed during the research period: (1) extracting and analyzing textual descriptions of math entities; (2) capturing dependency relationships between math expressions within a document; (3) building a concept graph using the relationships between expressions obtained from (2). The extraction of descriptions for mathematical expressions was completed in the first year. By the end of the second year, we had developed a heuristic method to capture dependency relationships between similar or related mathematical expressions. This method extracted the relationships with an accuracy of 82.44%, compared with 65.41% for the baseline method and 73.38% for our initial method developed in the first year. Furthermore, we exploited these dependency relationships to attach textual descriptions to symbols and sub-expressions inside mathematical expressions. We then used this information in a mathematical search system to investigate how effective dependency relationships between math expressions are for retrieving math expressions.
Current Status of Progress (Category)	2: Progressing mostly smoothly. Reason: The progress achieved in the second year is consistent with the research plan. As planned, we improved the extraction accuracy of dependency relationships between math expressions and demonstrated their effectiveness in a math search system. Our heuristic method for extracting dependency relationships between math expressions was submitted to the "Information Retrieval Journal", and we are still waiting for the review result. Meanwhile, we improved the accuracy of our math search system using dependency relationships and several additional techniques. We first investigated the effectiveness of applying machine learning (learning-to-rank) algorithms to our system. The experimental results showing the effectiveness of combining score normalization with learning-to-rank were published in "The 3rd International Workshop on Digitization and E-Inclusion in Mathematics and Science 2016". Finally, we evaluated our search system by participating in the NTCIR-12 MathIR Task, where our system was the top performer. The results of our participation will appear in the NTCIR-12 proceedings in June.
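The idea of normalizing scores before combining formula and text evidence can be illustrated with a minimal sketch. The report does not say which normalization or ranking model was used, so everything here is an assumption made for illustration: min-max normalization, a fixed interpolation weight standing in for a learned ranking model, and the hypothetical names `min_max_normalize`, `combine_scores`, and `w_formula`.

```python
from typing import Dict, List, Tuple


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1]; if all scores are equal, map them to 0.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 0.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def combine_scores(
    formula_scores: Dict[str, float],
    text_scores: Dict[str, float],
    w_formula: float = 0.5,  # assumption: a learning-to-rank model would learn this
) -> List[Tuple[str, float]]:
    """Normalize each score source, then rank documents by a weighted sum."""
    f = min_max_normalize(formula_scores)
    t = min_max_normalize(text_scores)
    combined = {
        doc: w_formula * f.get(doc, 0.0) + (1.0 - w_formula) * t.get(doc, 0.0)
        for doc in set(f) | set(t)
    }
    # Sort by descending score; break ties by document id for determinism.
    return sorted(combined.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    # Hypothetical raw scores on very different scales.
    formula = {"a": 10.0, "b": 20.0, "c": 30.0}
    text = {"a": 8.0, "b": 0.0, "c": 4.0}
    for doc, score in combine_scores(formula, text):
        print(doc, round(score, 3))
```

Normalizing first keeps the source with the larger raw scale from dominating the combined ranking, which is the usual motivation for this step.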
Plan for Future Research	To construct an intelligent browsing system for math information that helps researchers explore math concepts, we need not only a math search system but also a knowledge/concept graph. The research focus in the third year is to obtain a knowledge/concept graph from each scientific document. To achieve this goal, we plan to use the dependency graph of math expressions. We need to investigate whether such a concept graph can be obtained directly by applying graph clustering or community detection methods to the dependency graph of math expressions. We also need to evaluate the accuracy of the obtained concept graph. The plan for the next academic year is as follows: (a) developing a gold-standard concept graph from Wikipedia; (b) performing graph clustering or community detection over the dependency graph of math expressions to obtain an initial, automatically constructed concept graph; (c) depending on the result of step (b), exploiting other types of information if needed, such as the surrounding text or topics of each math expression, to obtain a better concept graph; (d) quantitatively evaluating the concept graph using the annotated data obtained from step (a); and (e) qualitatively evaluating the concept graph. The concept graph obtained for each document can be interpreted as the prerequisite knowledge needed to understand the document.
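One way the quantitative evaluation in step (d) could be set up is to compare the edges of an automatically constructed concept graph against the gold-standard graph. The report does not specify the evaluation measure, so the sketch below assumes edge-level precision, recall, and F1 over undirected edges; the function name `edge_prf` and the example concept names are hypothetical.

```python
from typing import FrozenSet, Iterable, Set, Tuple

Edge = FrozenSet[str]


def undirected(edges: Iterable[Tuple[str, str]]) -> Set[Edge]:
    """Treat (u, v) and (v, u) as the same edge and drop self-loops."""
    return {frozenset((u, v)) for u, v in edges if u != v}


def edge_prf(
    predicted: Iterable[Tuple[str, str]],
    gold: Iterable[Tuple[str, str]],
) -> Tuple[float, float, float]:
    """Return edge-level (precision, recall, F1) of predicted vs. gold edges."""
    pred, ref = undirected(predicted), undirected(gold)
    true_positives = len(pred & ref)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(ref) if ref else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2.0 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical gold-standard and automatically constructed concept graphs.
    gold = [("group", "ring"), ("ring", "field"), ("field", "vector space")]
    predicted = [("ring", "group"), ("ring", "field"),
                 ("group", "field"), ("field", "module")]
    p, r, f = edge_prf(predicted, gold)
    print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

Edge-level scoring is only one option; if the concept graph is produced by clustering, a cluster-level measure may be a better fit.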

Research Products
(3 results)


All: Journal Article (1 result) (of which peer reviewed: 1), Presentation (2 results) (of which international conference: 2)

[Journal Article] Efficient Algorithm for Math Formula Semantic Search (2016)
- Authors/Presenters
  Shunsuke Ohashi, Giovanni Yoko Kristianto, Goran Topic, Akiko Aizawa
- Journal Title
  IEICE Transactions on Information and Systems
  Volume: E99-D (4) Pages: 979-988
- DOI
  10.1587/transinf.2015DAP0023
- Peer Reviewed
[Presentation] Combining Effectively Math Expressions and Textual Keywords in Math IR (2016)
- Authors/Presenters
  Giovanni Yoko Kristianto, Goran Topic, Akiko Aizawa
- Conference Name
  The 3rd International Workshop on Digitization and E-Inclusion in Mathematics and Science 2016
- Place of Presentation
  Shonan Village Center, Kanagawa, Japan
- Date
  2016-02-04 – 2016-02-06
- International Conference
[Presentation] Math Expressions, Text, and Dependency Graph in MathIR (2015)
- Authors/Presenters
  Giovanni Yoko Kristianto
- Conference Name
  The 2nd Asian Summer School in Information Access
- Place of Presentation
  National Taiwan Normal University, Taipei City, Taiwan
- Date
  2015-08-24 – 2015-08-27
- International Conference
