
Research on the digitization system of scientific documents

Research Project

Project/Area Number 14380182
Research Category

Grant-in-Aid for Scientific Research (B)

Allocation Type: Single-year Grants
Section: General
Research Field: Information systems (including information and library science)
Research Institution: KYUSHU UNIVERSITY

Principal Investigator

SUZUKI Masakazu  Kyushu University, Faculty of Mathematics, Professor (20112302)

Co-Investigator (Kenkyū-buntansha) OKAMOTO Masayuki  Shinshu University, Faculty of Engineering, Professor (50109196)
UCHIDA Seiichi  Kyushu University, Faculty of Information Systems, Associate Professor (70315125)
TAMARI Fumikazu  Fukuoka University of Education, Faculty of Education, Professor (70036937)
FUJIMOTO Mitsushi  Fukuoka University of Education, Faculty of Education, Associate Professor (20270241)
KANAHORI Toshihiro  Tsukuba University of Technology, Gakunai Kyodo Riyou Shisetsu, Associate Professor (00352568)
OHTAKE Nobuyuki  Tsukuba College of Technology, Educational Methods Development Center, Associate Professor (10223851)
KISE Koichi  大阪府立 (Osaka Prefectural), Faculty of Engineering, Associate Professor (80224939)
Project Period (FY) 2002 – 2005
Project Status Completed (Fiscal Year 2005)
Budget Amount
¥14,400,000 (Direct Cost: ¥14,400,000)
Fiscal Year 2005: ¥2,900,000 (Direct Cost: ¥2,900,000)
Fiscal Year 2004: ¥2,800,000 (Direct Cost: ¥2,800,000)
Fiscal Year 2003: ¥2,800,000 (Direct Cost: ¥2,800,000)
Fiscal Year 2002: ¥5,900,000 (Direct Cost: ¥5,900,000)
Keywords: Formula recognition / Math recognition / Structure analysis / Optical character recognition / Document digitization / Assistive technology / Visually impaired
Research Abstract

1. Throughout the research period, we built a ground-truthed database of page images of mathematical articles. Using this database, we developed and improved the math symbol recognition engine and the method for segmenting pages into text areas and math expression areas. Part of the database is now publicly available on our web site.
2. To improve the math structure analysis method based on the virtual link network developed in our previous research, we tuned the link costs of the network in detail using the database above. We also introduced the notion of a "center band", which is computed robustly against character misrecognition and considerably stabilizes the structure analysis of math expressions.
3. We developed a method to segment touching characters in math expressions by matching sub-patterns against the patterns of other, non-touching characters on the same page. We also extended a framework frequently used to segment characters in text areas so that it is adapted to images of math formulae.
4. We developed a method to recognize complicated matrices, including those with repetition symbols or area symbols, using variable block pattern elements.
5. We investigated methods to detect bibliographic data and the logical structure of math papers from the recognition results.
6. Finally, we also studied the recognition of commutative diagrams in math papers and of graphs of elementary functions in the figures of math texts. These studies, however, are still at a preliminary stage.
7. The math document recognition software "Infty Reader", developed using the results of this research, is freely available from the web site: http://www.inftyproject.org./
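
As an illustration of the idea in item 3, the following is a minimal sketch of splitting a touching pair by matching its parts against non-touching character patterns taken from the same page. It is not the method of the project itself, whose details are not given in this abstract: the glyph representation (rows of '#' and '.'), the pixel-mismatch cost, the exhaustive search over template pairs, and all names (`split_touching`, `TEMPLATES`, `blob`) are assumptions made for this example.

```python
from itertools import product
from typing import Dict, Optional, Tuple

# A glyph is a tuple of rows; '#' is ink, '.' is background.
# All glyphs are assumed to share the same height (an assumption of this sketch).
Glyph = Tuple[str, ...]


def width(glyph: Glyph) -> int:
    return len(glyph[0])


def crop(glyph: Glyph, start: int, stop: int) -> Glyph:
    """Return columns start..stop-1 of a glyph."""
    return tuple(row[start:stop] for row in glyph)


def mismatch(a: Glyph, b: Glyph) -> int:
    """Count pixels that differ between two patterns of the same size."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def split_touching(
    blob: Glyph, templates: Dict[str, Glyph]
) -> Optional[Tuple[int, str, str, int]]:
    """Try every (left, right) template pair whose widths add up to the
    blob width and return (cost, left_name, right_name, cut_column) for
    the pair with the fewest mismatching pixels, or None if no pair fits."""
    best = None
    for (lname, left), (rname, right) in product(templates.items(), repeat=2):
        cut = width(left)
        if cut + width(right) != width(blob):
            continue
        cost = (mismatch(crop(blob, 0, cut), left)
                + mismatch(crop(blob, cut, width(blob)), right))
        if best is None or cost < best[0]:
            best = (cost, lname, rname, cut)
    return best


# Hypothetical non-touching patterns collected from the same page.
TEMPLATES: Dict[str, Glyph] = {
    "I": (".#.", ".#.", ".#."),
    "L": ("#..", "#..", "###"),
    "T": ("###", ".#.", ".#."),
}

# An "L" followed by a "T", with one extra ink pixel joining them.
blob: Glyph = ("#..###", "#...#.", "#####.")

result = split_touching(blob, TEMPLATES)
print(result)  # (1, 'L', 'T', 3)
```

The single mismatching pixel in the result is the joining pixel itself. A practical system would also have to handle templates of different sizes, vertical offsets (for example subscripts), and more than two touching characters, none of which this sketch attempts.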

Report

(5 results)
  • 2005 Annual Research Report   Final Research Report Summary
  • 2004 Annual Research Report
  • 2003 Annual Research Report
  • 2002 Annual Research Report
  • Research Products

    (50 results)


All Journal Article (39 results) Publications (11 results)

  • [Journal Article] A Structural Analysis of Mathematical Formulae with Verification based on Formula Description Grammar (2006)

    • Author(s)
      S.Toyota, S.Uchida, M.Suzuki
    • Journal Title

      Lecture Notes in Computer Sciences, Springer, (Document Analysis Systems VII) 3872

      Pages: 153-163

    • Description
      From "Final Research Report Summary (Japanese)"
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] A Support Vector Machines for Mathematical Symbol Recognition (2006)

    • Author(s)
      C.Malon, S.Uchida, M.Suzuki
    • Journal Title

      Technical Report of IEICE Vol.105, No.614

      Pages: 49-54

    • NAID

      110004662908

    • Description
      From "Final Research Report Summary (Japanese)"
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] Refinement of digitized documents through recognition of mathematical formulae (2006)

    • Author(s)
      T.Kanahori, M.Suzuki
    • Journal Title

      Proceedings of the 2nd International Workshop on Document Image Analysis for Libraries, Lyon

      Pages: 297-302

    • Description
      From "Final Research Report Summary (Japanese)"
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] Performance Evaluation of a Mathematical Formula Recognition System with a large scale of printed formula images (2006)

    • Author(s)
      Kazuki Ashida, Masayuki Okamoto, Hiroki Imai
    • Journal Title

      Proceedings of the 2nd International Workshop on Document Image Analysis for Libraries, Lyon

      Pages: 320-331

    • NAID

      110003314500

    • Description
      From "Final Research Report Summary (Japanese)"
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] A Structural Analysis of Mathematical Formulae with Verification based on Formula Description Grammar (2006)

    • Author(s)
      S.Toyota, S.Uchida, M.Suzuki
    • Journal Title

      Document Analysis Systems VII, Lecture Notes in Computer Sciences 3872 (Proceedings of the 7th International Workshop DAS 2006, Nelson, New Zealand).

      Pages: 153-163

    • Description
      From "Final Research Report Summary (English)"
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] A Support Vector Machines for Mathematical Symbol Recognition (2006)

    • Author(s)
      C.Malon, S.Uchida, M.Suzuki
    • Journal Title

      Technical Report of IEICE Vol.105,No614 (PRMU-2005-192)

      Pages: 49-54

    • NAID

      110004662908

    • Description
      From "Final Research Report Summary (English)"
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] Refinement of digitized documents through recognition of mathematical formulae (2006)

    • Author(s)
      T.Kanahori, M.Suzuki
    • Journal Title

      Proceedings of the 2nd International Workshop on Document Image Analysis for Libraries, Lyon

      Pages: 297-302

    • Description
      From "Final Research Report Summary (English)"
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] Performance Evaluation of a Mathematical Formula Recognition System with a large scale of printed formula images (2006)

    • Author(s)
      Kazuki Ashida, Masayuki Okamoto, Hiroki Imai
    • Journal Title

      Proceedings of the 2nd International Workshop on Document Image Analysis for Libraries, Lyon

      Pages: 320-331

    • NAID

      110003314500

    • Description
      From "Final Research Report Summary (English)"
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] A Structural Analysis of Mathematical Formulae with Verification based on Formula Description Grammar (2006)

    • Author(s)
      S.Toyota, S.Uchida, M.Suzuki
    • Journal Title

      Lecture Notes in Computer Sciences 3872

      Pages: 153-163

    • Related Report
      2005 Annual Research Report
  • [Journal Article] A Support Vector Machines for Mathematical Symbol Recognition (2006)

    • Author(s)
      C.Malon, S.Uchida, M.Suzuki
    • Journal Title

      IEICE Technical Report, PRMU2005 Vol.105, No.614

      Pages: 49-54

    • NAID

      110004662908

    • Related Report
      2005 Annual Research Report
  • [Journal Article] Refinement of digitized documents through recognition of mathematical formulae (2006)

    • Author(s)
      T.Kanahori, M.Suzuki
    • Journal Title

      Proceedings of the 2nd International Workshop on Document Image Analysis for Libraries (to appear; details undetermined)

    • Related Report
      2005 Annual Research Report
  • [Journal Article] Quantitative Analysis of Mathematical Documents (2005)

    • Author(s)
      S.Uchida, A.Nomura, M.Suzuki
    • Journal Title

      International Journal on Document Analysis and Recognition Vol.7, No.4

    • NAID

      110003314232

    • Description
      From "Final Research Report Summary (Japanese)"
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] Mathematical knowledge browser with automatic hyperlink detection (2005)

    • Author(s)
      K.Nakagawa, M.Suzuki
    • Journal Title

      Lecture Notes in Computer Sciences, Springer, (Mathematical Knowledge Management) 3863

      Pages: 190-202

    • Description
      From "Final Research Report Summary (Japanese)"
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] A Ground-Truthed Mathematical Character and Symbol Image Database (2005)

    • Author(s)
      M.Suzuki, S.Uchida, A.Nomura
    • Journal Title

      Proceedings of 8th International Conference on Document Analysis and Recognition (ICDAR2005), Seoul, Korea, Vol.2, IEEE Computer Society Press

      Pages: 675-679

    • Description
      From "Final Research Report Summary (Japanese)"
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] Quantitative Analysis of Mathematical Documents (2005)

    • Author(s)
      S.Uchida, A.Nomura, M.Suzuki
    • Journal Title

      International Journal on Document Analysis and Recognition, (Springer) Vol.7,No.4

      Pages: 211-218

    • NAID

      110003314232

    • Description
      From "Final Research Report Summary (English)"
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] Mathematical knowledge browser with automatic hyperlink detection (2005)

    • Author(s)
      K.Nakagawa, M.Suzuki
    • Journal Title

      Mathematical Knowledge Management, The 4th International Conference, MKM 2005, Bremen, Germany, Revised Selected Papers, Lecture Notes in Computer Sciences 3863

      Pages: 190-202

    • Description
      From "Final Research Report Summary (English)"
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] A Ground-Truthed Mathematical Character and Symbol Image Database (2005)

    • Author(s)
      M.Suzuki, S.Uchida, A.Nomura
    • Journal Title

      Proceedings of 8th International Conference on Document Analysis and Recognition (ICDAR 2005), Seoul, Korea, (IEEE Computer Society Press) Vol.2

      Pages: 675-679

    • NAID

      10015557140

    • Description
      From "Final Research Report Summary (English)"
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] Quantitative Analysis of Mathematical Documents (2005)

    • Author(s)
      S.Uchida, A.Nomura, M.Suzuki
    • Journal Title

      International Journal on Document Analysis and Recognition Vol.7, No.4

      Pages: 211-218

    • NAID

      110003314232

    • Related Report
      2005 Annual Research Report
  • [Journal Article] Mathematical knowledge browser with automatic hyperlink detection (2005)

    • Author(s)
      K.Nakagawa, M.Suzuki
    • Journal Title

      Lecture Notes in Computer Sciences (Springer) 3863

      Pages: 190-202

    • Related Report
      2005 Annual Research Report
  • [Journal Article] A Ground-Truthed Mathematical Character and Symbol Image Database (2005)

    • Author(s)
      M.Suzuki, S.Uchida, A.Nomura
    • Journal Title

      Proceedings of the 8th International Conference on Document Analysis and Recognition (IEEE Computer Society Press) Vol.2

      Pages: 675-679

    • NAID

      10015557140

    • Related Report
      2005 Annual Research Report
  • [Journal Article] A Ground-Truthed Character and Symbol Image Database of English Mathematical Documents (in Japanese) (2005)

    • Author(s)
      A.Nomura, S.Uchida, M.Suzuki
    • Journal Title

      IEICE Technical Report, PRMU2004 (details undetermined)

    • NAID

      10015557140

    • Related Report
      2004 Annual Research Report
  • [Journal Article] Recognition of Figures and Graphs in Printed Documents (in Japanese) (2005)

    • Author(s)
      T.Yamamoto, M.Suzuki
    • Journal Title

      Proceedings of the Hinokuni Information Symposium 2005, IPSJ Kyushu Chapter (details undetermined)

    • Related Report
      2004 Annual Research Report
  • [Journal Article] An Integrated OCR Software for Mathematical Documents and Its Output with Accessibility (2004)

    • Author(s)
      M.Suzuki, T.Kanahori, N.Ohtake, K.Yamaguchi
    • Journal Title

      Lecture Notes in Computer Sciences, Springer, (Computers Helping People with Special Needs) 3119

      Pages: 648-655

    • Description
      From "Final Research Report Summary (Japanese)"
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] Extraction of Logical Structure from Articles in Mathematics -- Mathematical Knowledge Browser (2004)

    • Author(s)
      K.Nakagawa, A.Nomura, M.Suzuki
    • Journal Title

      Lecture Notes in Computer Sciences, (Mathematical Knowledge Management) 3119

      Pages: 276-289

    • Description
      From "Final Research Report Summary (Japanese)"
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] An Integrated OCR Software for Mathematical Documents and Its Output with Accessibility (2004)

    • Author(s)
      M.Suzuki, T.Kanahori, N.Ohtake, K.Yamaguchi
    • Journal Title

      Computers Helping People with Special Needs, 9th International Conference ICCHP2004, Paris, July 2004, Lecture Notes in Computer Sciences 3119, Springer

      Pages: 648-655

    • Description
      From "Final Research Report Summary (English)"
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] Extraction of Logical Structure from Articles in Mathematics -- Mathematical Knowledge Browser (2004)

    • Author(s)
      K.Nakagawa, A.Nomura, M.Suzuki
    • Journal Title

      Mathematical Knowledge Management, The 3rd International Conference MKM2004, Bialowieja, Poland, Lecture Notes in Computer Sciences 3119, Springer

      Pages: 276-289

    • Description
      From "Final Research Report Summary (English)"
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] Extraction of Logical Structure from Articles in Mathematics (2004)

    • Author(s)
      K.Nakagawa, A.Nomura, M.Suzuki
    • Journal Title

      3rd International Conference MKM2004,Bialowieja, Poland, Lecture Notes in Computer Sciences, Springer 3119

      Pages: 276-289

    • Related Report
      2004 Annual Research Report
  • [Journal Article] An Integrated OCR Software for Mathematical Documents and Its Output with Accessibility (2004)

    • Author(s)
      M.Suzuki, T.Kanahori, N.Ohtake, K.Yamaguchi
    • Journal Title

      9th International Conference ICCHP2004 Paris, Lecture Notes in Computer Sciences, Springer 3119

      Pages: 648-655

    • Related Report
      2004 Annual Research Report
  • [Journal Article] An Annotated Corpus and a Grammar Model of Theorem Description (2003)

    • Author(s)
      Y.Baba, M.Suzuki
    • Journal Title

      Lecture Notes in Computer Sciences, Springer, (Mathematical Knowledge Management) 2594

      Pages: 93-104

    • Description
      From "Final Research Report Summary (Japanese)"
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] INFTY - An integrated OCR system for mathematical documents (2003)

    • Author(s)
      M.Suzuki, F.Tamari, R.Fukuda, S.Uchida, T.Kanahori
    • Journal Title

      Proceedings of the 2003 ACM Symposium on Document Engineering, Grenoble

      Pages: 95-104

    • Description
      From "Final Research Report Summary (Japanese)"
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] Detection of Matrices and Segmentation of Matrix Elements in Scanned Images of Scientific Documents (2003)

    • Author(s)
      T.Kanahori, M.Suzuki
    • Journal Title

      Proceedings of the 7th International Conference on Document Analysis and Recognition (ICDAR2003), Edinburgh, IEEE Computer Society Press

      Pages: 433-437

    • Description
      From "Final Research Report Summary (Japanese)"
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] Detection and Segmentation of Touching Characters in Mathematical Expressions (2003)

    • Author(s)
      A.Nomura, K.Michishita, S.Uchida, M.Suzuki
    • Journal Title

      Proceedings of the 7th International Conference on Document Analysis and Recognition (ICDAR2003),Edinburgh,IEEE Computer Society Press

      Pages: 126-130

    • Description
      From "Final Research Report Summary (Japanese)"
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] Infty A Handwriting Interface to Various Computer Algebra Systems via OpenXM Servers (2003)

    • Author(s)
      M.Fujimoto, T.Kanahori, M.Suzuki
    • Journal Title

      Computer Algebra - Algorithms, Implementations and Applications, RIMS Kokyuroku Vol.1335

      Pages: 217-226

    • Description
      From "Final Research Report Summary (Japanese)"
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] An Annotated Corpus and a Grammar Model of Theorem Description (2003)

    • Author(s)
      Y.Baba, M.Suzuki
    • Journal Title

      Mathematical Knowledge Management, The 2nd International Conference MKM2003, Bologna, Italy, Lecture Notes in Computer Sciences 2594, Springer

      Pages: 93-104

    • Description
      From "Final Research Report Summary (English)"
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] INFTY - An integrated OCR system for mathematical documents (2003)

    • Author(s)
      M.Suzuki, F.Tamari, R.Fukuda, S.Uchida, T.Kanahori
    • Journal Title

      Proceedings of the 2003 ACM Symposium on Document Engineering, Grenoble

      Pages: 95-104

    • Description
      From "Final Research Report Summary (English)"
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] Detection of Matrices and Segmentation of Matrix Elements in Scanned Images of Scientific Documents (2003)

    • Author(s)
      T.Kanahori, M.Suzuki
    • Journal Title

      Proceedings of the 7th International Conference on Document Analysis and Recognition(ICDAR2003), Edinburgh, IEEE Computer Society Press

      Pages: 433-437

    • Description
      From "Final Research Report Summary (English)"
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] Detection and Segmentation of Touching Characters in Mathematical Expressions (2003)

    • Author(s)
      A.Nomura, K.Michishita, S.Uchida, M.Suzuki
    • Journal Title

      Proceedings of the 7th International Conference on Document Analysis and Recognition(ICDAR2003), Edinburgh, IEEE Computer Society Press

      Pages: 126-130

    • Description
      From "Final Research Report Summary (English)"
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] Infty A Handwriting Interface to Various Computer Algebra Systems via OpenXM Servers (2003)

    • Author(s)
      M.Fujimoto, T.Kanahori, M.Suzuki
    • Journal Title

      Computer Algebra - Algorithms, Implementations and Applications, RIMS Kokyuroku Vol.1335

      Pages: 217-226

    • Description
      From "Final Research Report Summary (English)"
    • Related Report
      2005 Final Research Report Summary
  • [Journal Article] Quantitative Analysis of Mathematical Documents

    • Author(s)
      S.Uchida, A.Nomura, M.Suzuki
    • Journal Title

      International Journal on Document Analysis and Recognition (to appear)

    • NAID

      110003314232

    • Related Report
      2004 Annual Research Report
  • [Publications] M.Fujimoto, T.Kanahori, M.Suzuki: "Infty Editor - A Mathematics Typesetting Tool with a Handwriting Interface and a Graphical Front-End to Open XM Servers" Kyoto University, RIMS Kokyuroku. 1335. 217-226 (2003)

    • Related Report
      2003 Annual Research Report
  • [Publications] T.Kanahori, M.Suzuki: "Detection of Matrices and Segmentation of Matrix Elements in Scanned Images of Scientific Documents" Proceedings of the 7th International Conference on Document Analysis and Recognition, IEEE Computer Society Press. 433-437 (2003)

    • Related Report
      2003 Annual Research Report
  • [Publications] A.Nomura, K.Michishita, S.Uchida, M.Suzuki: "Detection and Segmentation of Touching Characters in Mathematical Expressions" Proceedings of the 7th International Conference on Document Analysis and Recognition. 126-130 (2003)

    • Related Report
      2003 Annual Research Report
  • [Publications] M.Suzuki, F.Tamari, R.Fukuda, S.Uchida, T.Kanahori: "INFTY - An integrated OCR system for mathematical documents" Proceedings of the 2003 ACM Symposium on Document Engineering, Ed. C. Vanoirbeek, C. Roisin, E. Munson. 95-104 (2003)

    • Related Report
      2003 Annual Research Report
  • [Publications] T.Kanahori, H.Nishimura, M.Fujimoto, M.Suzuki: "A System for Creating Teaching Materials with Interactive Content for Mathematics Classes" (in Japanese) IEICE Technical Report. ET2003-80. 117-122 (2003)

    • Related Report
      2003 Annual Research Report
  • [Publications] S.Uchida, A.Nomura, M.Suzuki: "Analysis of a Mathematical Document Database" (in Japanese) IEICE Technical Report. PRMU2003-48. 19-24 (2003)

    • Related Report
      2003 Annual Research Report
  • [Publications] T.Kanahori, M.Suzuki: "A Recognition Method of Matrices by Using Variable Block Pattern Elements Generating Rectangular Areas" Graphics Recognition, Lecture Notes in Computer Sciences, Springer. 2390. 320-329 (2001)

    • Related Report
      2002 Annual Research Report
  • [Publications] Y.Baba, M.Suzuki: "An Annotated Corpus and a Grammar Model of Theorem Description" Mathematical Knowledge Management, Lecture Notes in Computer Sciences, Springer. 2594. 93-104 (2002)

    • Related Report
      2002 Annual Research Report
  • [Publications] T.Kanahori, M.Suzuki: "Detection of Matrices and Segmentation of Matrix Elements in Scanned Images of Scientific Documents" Proceedings of the 7th International Conference on Document Analysis and Recognition. (To appear). (2003)

    • Related Report
      2002 Annual Research Report
  • [Publications] A.Nomura, K.Michishita: "Detection and Segmentation of Touching Characters in Mathematical Expressions" Proceedings of the 7th International Conference on Document Analysis and Recognition. (To appear). (2003)

    • Related Report
      2002 Annual Research Report
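  • [Publications] A.Nomura, K.Michishita, S.Uchida, M.Suzuki: "A Segmentation Method for Touching Characters in Mathematical Expressions Based on Image Matching" (in Japanese) IEICE Technical Report. PRMU (details undetermined). (2003)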

    • Related Report
      2002 Annual Research Report


Published: 2002-04-01   Modified: 2016-04-21  
