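FY2014 Annual Research Report

Research on Context-Aware Access to Mathematical Knowledge

Research Project

Project/Area Number	14J09896
Research Institution	The University of Tokyo
Principal Investigator	KRISTIANTO GIOVANNIYOKO, The University of Tokyo, Graduate School of Information Science and Technology, JSPS Research Fellow (DC1)
Project Period (FY)	2014-04-25 – 2017-03-31
Keywords	Mathematical knowledge / Description / Dependency graph / Math formulae search / MathML indexing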
Outline of Research Achievements	The goal of this research is to design an intelligent browsing system for mathematical information that helps researchers explore mathematical concepts shared across scientific disciplines. To this end, the research aims to establish a general framework for knowledge extraction based on semantic understanding of mathematical formulae. The following issues will be addressed during the research period: (1) Extracting and analyzing textual descriptions of math entities. (2) Capturing relationships between formulae within a document. (3) Finding related and similar formulae across documents. This is important for using external math resources to compensate for implicitly assumed domain-specific knowledge. By the end of the first year, we had developed a method to automatically extract textual descriptions of mathematical expressions. We also developed a method to capture relationships between similar or related mathematical expressions using simple substring matching. We call the graph that depicts these relationships between mathematical expressions a "dependency graph". We then exploited the dependency graph to attach textual descriptions to symbols and sub-expressions inside mathematical expressions. Finally, we developed a mathematical search system to investigate how effective textual descriptions and the dependency graph are for retrieving mathematical expressions.
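The substring-matching idea behind the dependency graph can be illustrated with a minimal Python sketch. It is not the project's implementation: the report does not say how expressions are represented (the keywords suggest MathML), so plain normalized strings are assumed here, and the names `build_dependency_graph`, `describe_subexpressions`, and the sample expressions and descriptions are all hypothetical.

```python
from collections import defaultdict


def build_dependency_graph(expressions):
    """Link each expression to the other expressions that occur inside it.

    `expressions` maps a (hypothetical) expression id to a string form of
    the formula. An edge parent -> child is added whenever the child's
    string is a proper substring of the parent's string.
    """
    graph = defaultdict(set)
    for parent_id, parent in expressions.items():
        for child_id, child in expressions.items():
            if parent_id == child_id or parent == child:
                continue  # skip self-links and exact duplicates
            if child and child in parent:
                graph[parent_id].add(child_id)
    return {node: sorted(children) for node, children in graph.items()}


def describe_subexpressions(graph, descriptions):
    """For each expression, collect the known descriptions of its parts."""
    enriched = {}
    for parent_id, children in graph.items():
        found = {c: descriptions[c] for c in children if c in descriptions}
        if found:
            enriched[parent_id] = found
    return enriched


# Toy input, invented for illustration only.
expressions = {"e1": "E=mc^2", "e2": "m", "e3": "c", "e4": "mc^2"}
descriptions = {"e2": "rest mass", "e3": "speed of light"}

graph = build_dependency_graph(expressions)
enriched = describe_subexpressions(graph, descriptions)
```

In this toy input, `e1` links to `e2`, `e3`, and `e4`, so the descriptions extracted for `m` and `c` become available for the symbols inside `E=mc^2`. Raw string containment is a deliberately crude stand-in; it would, for example, match a single-letter symbol inside a longer identifier.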
Current Status of Research Progress	Category 2: Generally progressing smoothly. Reason: The progress achieved in the first year was consistent with the research plan. As planned, we accomplished two tasks required to establish a math browsing system. First, we developed a machine-learning-based method to automatically extract textual descriptions of mathematical expressions. This method was published at "The 3rd International Workshop on Mining Scientific Publications". Second, we developed a method to capture relationships between similar or related mathematical expressions (the dependency graph of mathematical expressions). Subsequently, we built a mathematical search system that accepts both a mathematical expression and free text as a query. The experimental results showed that using descriptions and the dependency graph together delivered better retrieval performance than when no text was provided in the queries. These results were published at "The 9th International Conference on Digital Information Management", where we won the "Best Paper Award". Finally, we also evaluated our search system by participating in the NTCIR-11 Math-2 task. The results of our participation were published in the NTCIR-11 proceedings.
Strategy for Future Research Activity	The research focus in the second year is to increase the number of relationships between mathematical expressions captured in the current dependency graph. To achieve this goal, we need a heuristic method that captures relationships overlooked by the substring matching method. We also need to detect the scope of math expressions, that is, to investigate whether the meaning of each mathematical expression stays the same throughout a document. Details of the plan for the next year are as follows. (a) Performing manual annotation to create a gold-standard dataset of relationships between mathematical expressions. (b) Developing a heuristic method to create dependency graphs. We are considering a heuristic that resembles the generalization technique commonly used in logic for automated reasoning. (c) Evaluating the heuristic method using the annotated data. The baseline method for the evaluation will be the substring matching method. (d) Detecting the semantic scope of mathematical expressions within a document.
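Step (c) could be carried out by comparing the edges each method proposes against the gold-standard annotation. The report does not name the evaluation metrics, so the sketch below assumes edge-level precision, recall, and F1; the function `edge_scores` and all edge sets are hypothetical and invented purely for illustration, not experimental results.

```python
def edge_scores(predicted, gold):
    """Return (precision, recall, f1) for predicted directed edges.

    Each edge is a (parent_id, child_id) pair. The choice of metric is an
    assumption; the report does not state which measures will be used.
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy edge sets, invented for illustration only.
gold_edges = {("e1", "e2"), ("e1", "e3"), ("e1", "e4"), ("e5", "e2")}
baseline_edges = {("e1", "e2"), ("e1", "e3"), ("e1", "e4")}
heuristic_edges = gold_edges | {("e5", "e3")}

baseline = edge_scores(baseline_edges, gold_edges)
heuristic = edge_scores(heuristic_edges, gold_edges)
```

With these toy sets the baseline misses one annotated edge (lower recall), while the hypothetical heuristic recovers it at the cost of one spurious edge (lower precision), which is the trade-off such an evaluation would need to expose.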

Research Products
(3 results)


[Presentation] The MCAT Math Retrieval System for NTCIR-11 Math Track (2014)
- Author(s)/Presenter(s)
  Giovanni Yoko Kristianto
- Conference name
  The 11th NTCIR Conference
- Place of presentation
  National Institute of Informatics, Tokyo
- Date
  2014-12-09 – 2014-12-12
[Presentation] Exploiting Textual Descriptions and Dependency Graph for Searching Mathematical Expressions in Scientific Papers (2014)
- Author(s)/Presenter(s)
  Giovanni Yoko Kristianto
- Conference name
  The 9th International Conference on Digital Information Management
- Place of presentation
  Bangkok, Thailand
- Date
  2014-09-29 – 2014-10-01
[Presentation] Extracting Textual Descriptions of Mathematical Expressions in Scientific Papers (2014)
- Author(s)/Presenter(s)
  Giovanni Yoko Kristianto
- Conference name
  The 3rd International Workshop on Mining Scientific Publications
- Place of presentation
  London, United Kingdom
- Date
  2014-09-08 – 2014-09-12
