
Improvement and performance evaluation of the mathematical formula recognition method for digitalization of mathematical journals

Research Project

Project/Area Number 14580446
Research Category

Grant-in-Aid for Scientific Research (C)

Allocation Type Single-year Grants
Section General
Research Field Information systems (including information and library science)
Research Institution Shinshu University

Principal Investigator

OKAMOTO Masayuki  Shinshu University, Faculty of Engineering, Department of Information Engineering, Professor (50109196)

Co-Investigator (Kenkyū-buntansha) SUZUKI Masakazu  Kyushu University, Graduate School of Mathematics, Professor (20112302)
Project Period (FY) 2002 – 2004
Project Status Completed (Fiscal Year 2004)
Budget Amount
¥3,300,000 (Direct Cost: ¥3,300,000)
Fiscal Year 2004: ¥900,000 (Direct Cost: ¥900,000)
Fiscal Year 2003: ¥1,000,000 (Direct Cost: ¥1,000,000)
Fiscal Year 2002: ¥1,400,000 (Direct Cost: ¥1,400,000)
Keywords Mathematical Formula Recognition / Document Image Processing / Character Recognition / Pattern Recognition
Research Abstract

This research project aimed to improve and evaluate the performance of the mathematical formula recognition system developed in our laboratory. Automatic recognition of mathematical formulas plays an important role in the digitization of scientific and engineering documents. However, current OCR systems cannot handle mathematical formulas because of the two-dimensional layout of their characters and symbols.
We collaborated with Professor Michler of the University of Essen, Germany, on the project "Retro-digitalization of mathematical journals, and their integration searchable digital libraries", in which we developed a mathematical formula recognition system. In the present project, we improved this system to cope with problems such as the wide variety of formula types, low printing quality, and touching or separated characters and symbols. To evaluate recognition performance, two kinds of mathematical journals were scanned and a ground truth of 21,472 formula images was created. Measured against this ground truth, the recognition rates for symbols and for structures were 99.4% and 99.09%, respectively. These results show the potential of OCR for converting scientific documents into electronic form.
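As a rough illustration of how recognition rates of this kind might be computed against a ground truth, the following Python sketch counts correctly recognized symbols and correctly recovered structures. The record does not state how the two rates were defined, so the `FormulaResult` type, its fields, and the one-to-one alignment of symbols are assumptions made for this example only.

```python
from dataclasses import dataclass


@dataclass
class FormulaResult:
    """Hypothetical record pairing one ground-truth formula with system output."""
    truth_symbols: list[str]        # symbols annotated in the ground truth
    recognized_symbols: list[str]   # symbols produced by the recognizer
    structure_correct: bool         # True if the recovered layout matches the truth


def symbol_rate(results: list[FormulaResult]) -> float:
    """Percentage of ground-truth symbols recognized correctly.

    Assumes recognized symbols are already aligned one-to-one with the truth.
    """
    total = sum(len(r.truth_symbols) for r in results)
    if total == 0:
        raise ValueError("no ground-truth symbols to evaluate")
    correct = sum(
        sum(t == p for t, p in zip(r.truth_symbols, r.recognized_symbols))
        for r in results
    )
    return 100.0 * correct / total


def structure_rate(results: list[FormulaResult]) -> float:
    """Percentage of formulas whose structure was recovered correctly."""
    if not results:
        raise ValueError("no formulas to evaluate")
    return 100.0 * sum(r.structure_correct for r in results) / len(results)


sample = [
    FormulaResult(["x", "+", "y"], ["x", "+", "v"], False),
    FormulaResult(["a"], ["a"], True),
]
print(symbol_rate(sample))     # 75.0
print(structure_rate(sample))  # 50.0
```

Counting structures per formula and symbols per character is only one plausible convention; a real evaluation would also need a rule for aligning symbols when characters are merged or split.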

Report

(4 results)
  • 2004 Annual Research Report   Final Research Report Summary
  • 2003 Annual Research Report
  • 2002 Annual Research Report
  • Research Products

    (21 results)

Journal Article (12 results) / Publications (9 results)

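  • [Journal Article] Performance Evaluation of a Mathematical Formula Recognition System Using a Large Number of Printed Formula Images (2005)

    • Author(s)
      北原卓, 仲正幸, 岡本正行
    • Journal Title

      IEICE Technical Report PRMU2004-212-230

      Pages: 31-36

    • NAID

      110003314500

    • Description
      From the "Summary of Research Results Report (Japanese)"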
    • Related Report
      2004 Annual Research Report 2004 Final Research Report Summary
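  • [Journal Article] A Ground-Truthed Character and Symbol Image Database of English Mathematical Documents (2005)

    • Author(s)
      野村明弘, 内田誠一, 鈴木昌和
    • Journal Title

      IEICE Technical Report PRMU2004-212-230

      Pages: 37-42

    • NAID

      10015557140

    • Description
      From the "Summary of Research Results Report (Japanese)"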
    • Related Report
      2004 Annual Research Report 2004 Final Research Report Summary
  • [Journal Article] Performance Evaluation of a Mathematical Formula Recognition System with a Large Scale of Printed Formula Images (2005)

    • Author(s)
      T.Kitahara
    • Journal Title

      IEICE Technical Report PRMU2004-212-230

      Pages: 31-36

    • Description
      From the "Summary of Research Results Report (English)"
    • Related Report
      2004 Final Research Report Summary
  • [Journal Article] A Ground-Truthed Mathematical Character and Symbol Image Database (2005)

    • Author(s)
      A.Nomura
    • Journal Title

      IEICE Technical Report PRMU2004-212-230

      Pages: 37-42

    • Description
      From the "Summary of Research Results Report (English)"
    • Related Report
      2004 Final Research Report Summary
  • [Journal Article] Detection and Segmentation of Touching Characters in Mathematical Expressions (2003)

    • Author(s)
      A.Nomura, K.Michishita, S.Uchida, M.Suzuki
    • Journal Title

      Proceedings of ICDAR2003

      Pages: 126-130

    • Description
      From the "Summary of Research Results Report (Japanese)"
    • Related Report
      2004 Final Research Report Summary
  • [Journal Article] Detection of Matrices and Segmentation of Matrix Elements in Scanned Images of Scientific Documents (2003)

    • Author(s)
      T.Kanahori, M.Suzuki
    • Journal Title

      Proceedings of ICDAR2003

      Pages: 433-437

    • Description
      From the "Summary of Research Results Report (Japanese)"
    • Related Report
      2004 Final Research Report Summary
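  • [Journal Article] A Discussion on Mathematical Formula Recognition System (2003)

    • Author(s)
      中塚 翼, 仲正幸, 岡本正行
    • Journal Title

      Report on Problems on Automatic Processing of Scientific Information and Its Applications

      Pages: 30-33

    • Description
      From the "Summary of Research Results Report (Japanese)"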
    • Related Report
      2004 Final Research Report Summary
  • [Journal Article] Detection and Segmentation of Touching Characters in Mathematical Expressions (2003)

    • Author(s)
      A.Nomura
    • Journal Title

      Proceedings of ICDAR2003

      Pages: 126-130

    • Description
      From the "Summary of Research Results Report (English)"
    • Related Report
      2004 Final Research Report Summary
  • [Journal Article] Detection of Matrices and Segmentation of Matrix Elements in Scanned Images of Scientific Documents (2003)

    • Author(s)
      T.Kanahori
    • Journal Title

      Proceedings of ICDAR2003

      Pages: 433-437

    • Description
      From the "Summary of Research Results Report (English)"
    • Related Report
      2004 Final Research Report Summary
  • [Journal Article] A Discussion on Mathematical Formula Recognition System (2003)

    • Author(s)
      T.Nakatsuka
    • Journal Title

      Report on Problems on Automatic Processing of Scientific Information and Its Applications

      Pages: 30-33

    • Description
      From the "Summary of Research Results Report (English)"
    • Related Report
      2004 Final Research Report Summary
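  • [Journal Article] Creation of a Database for Performance Evaluation of Mathematical Formula Recognition (2002)

    • Author(s)
      中塚 翼, 仲正幸, 岡本正行
    • Journal Title

      Report on Electronic Information Processing in the Scientific and Engineering Field

      Pages: 11-13

    • Description
      From the "Summary of Research Results Report (Japanese)"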
    • Related Report
      2004 Final Research Report Summary
  • [Journal Article] Ground Truth for Performance Evaluation of Mathematical Formula Recognition (2002)

    • Author(s)
      T.Nakatsuka
    • Journal Title

      Report on Electronic Information Processing in the Scientific and Engineering Field

      Pages: 11-13

    • Description
      From the "Summary of Research Results Report (English)"
    • Related Report
      2004 Final Research Report Summary
  • [Publications] A.Nomura: "Detection and Segmentation of Touching Characters in Mathematical Expressions" Proceedings of ICDAR2003. Vol.1. 126-130 (2003)

    • Related Report
      2003 Annual Research Report
  • [Publications] T.Kanahori: "Detection of Matrices and Segmentation of Matrix Elements in Scanned Images of Scientific Documents" Proceedings of ICDAR2003. Vol.1. 433-437 (2003)

    • Related Report
      2003 Annual Research Report
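  • [Publications] 中塚 翼: "A Discussion on Mathematical Formula Recognition System" Report on Problems on Automatic Processing of Scientific Information and Its Applications. 30-33 (2003)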

    • Related Report
      2003 Annual Research Report
  • [Publications] T.Kanahori: "A Recognition Method of Matrices by Using Variable Block Pattern Elements Generating Rectangular Areas" Graphics Recognition, Lecture Notes in Computer Science, Springer. 2390. 320-329 (2002)

    • Related Report
      2002 Annual Research Report
  • [Publications] Y.Baba: "An Annotated Corpus and a Grammar Model of Theorem Description" Mathematical Knowledge Management, Lecture Notes in Computer Science, Springer. 2594. 93-104 (2003)

    • Related Report
      2002 Annual Research Report
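  • [Publications] 中塚 翼: "Creation of a Database for Performance Evaluation of Mathematical Formula Recognition" Report on Electronic Information Processing in the Scientific and Engineering Field. 11-13 (2003)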

    • Related Report
      2002 Annual Research Report
  • [Publications] T.Kanahori: "Detection of Matrices and Segmentation of Matrix Elements in Scanned Images of Scientific Documents" Proceedings of ICDAR2003. (to be determined).

    • Related Report
      2002 Annual Research Report
  • [Publications] A.Nomura: "Detection and Segmentation of Touching Characters in Mathematical Expressions" Proceedings of ICDAR2003. (to be determined).

    • Related Report
      2002 Annual Research Report
  • [Publications] 野村 明弘: "A Segmentation Method for Touching Characters in Mathematical Formulas Based on Image Matching" IEICE Technical Report PRMU2002. 243-263. 31-35 (2003)

    • Related Report
      2002 Annual Research Report

Published: 2002-04-01   Modified: 2016-04-21  
