2002 Fiscal Year Final Research Report Summary

Studies on Development of OCR system for Historical Documents and Application to Technologies in Electronic Dictionary

Research Project

Project/Area Number 12558037
Research Category

Grant-in-Aid for Scientific Research (B)

Allocation Type: Single-year Grants
Section: Developmental Research
Research Field: Information systems (including information and library science)
Research Institution: Osaka City University

Principal Investigator

SHIBAYAMA Mamoru  Osaka City Univ., Media Center, Professor (10162645)

Co-Investigator (Kenkyū-buntansha) NAMIKI Mitaro  Tokyo University of Agriculture and Technology, Faculty of Engineering, Associate Prof. (10208077)
HARA Shoichiro  National Institute of Japanese Literature, Research Information Department, Associate Prof. (50218616)
YAMADA Shoji  International Research Center for Japanese Studies, Research Division, Associate Prof. (20248751)
IWASAKI Hiroshi  Kyoto Univ., Professor Emeritus (Faculty of Community Development, Professor) (50087904)
KAWAGUCHI Hiroshi  Tezukayama Univ., Faculty of Information and Management, Associate Prof. (80224749)
Project Period (FY) 2000 – 2002
Keywords: Historical Document Images / OCR / Character Recognition / Character Segmentation / Recognition Dictionary / Transliteration
Research Abstract

The purpose of this research is to build electronic versions of two dictionaries, the "Kuzushiji Kaidoku dictionary" and the "Kuzushiji Yourei dictionary", which specialists in history, paleography, and literature use to decipher handwritten historical documents, so that they can be consulted on computers, including mobile and notebook machines, and to develop a computerized dictionary that can be used in a mobile environment.
A further aim is to apply the dictionaries directly to character recognition research in the transliteration support system for historical documents (historical document OCR).
The following results were obtained during the research period.
(1) The index images of the "Kuzushiji Yourei dictionary" (which lets users retrieve letter shapes and examples of letter use through the stroke (Kihitsu-jun) index) were input as images with attributes such as the "Kuzushiji Yourei dictionary code", the "Mojikyo code", and the "Shift-JIS" internal code, and an electronic Moji database was built.
(2) A retrieval function that lets the user search for similar characters in the above-mentioned dictionary was developed.
(3) The n-gram method was applied to research on the historical document transliteration support system (historical document OCR), and it was confirmed that n-grams are effective for estimating lost or missing characters in a document.
(4) To build a character pattern dictionary of about 240,000 characters from historical documents for use in the recognition process, a segmentation program was developed and character selection work was carried out.
(5) The second edition of the HCD series in the historical document character database, listed below, was produced as one of the computerized dictionaries.
    (a) HCD2: title lines of debt bonds, Fushimiya Zenbei document, 200 lines, 1,378 characters, binary format.
    (b) HCD2a: title lines of bonds, Fushimiya Zenbei document, 200 lines, 1,378 characters, 256 gradation levels.
    (c) HCD2b: title lines of debt bonds, Fushimiya Zenbei document, 200 lines, 1,378 characters, 24-bit color format.
    (d) HCD3: title lines of debt bonds, Fushimiya Zenbei document, 183 character types, 4,933 characters, binary format.
(6) Character recognition focused on the title lines of documents was carried out using the above-mentioned dictionary. Recognition techniques that match character patterns in a title line without first segmenting it into individual characters were developed.
(7) A study on estimating stroke order from the "Database of Kuzushiji Kaidoku dictionary", which was built from the dictionary, was carried out. Research reports for this study, including an intermediate version, were published in March 2001 and 2000, respectively, in addition to papers on the historical document transliteration support system.
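
Result (3) can be illustrated with a minimal sketch. The report does not describe the project's actual n-gram model, so the code below assumes the simplest case: character bigrams counted over already transliterated lines, with candidates for a single missing character ranked by how often they follow the left neighbor and precede the right neighbor. The function names and the toy lines are hypothetical and are not taken from the project's data.

```python
from collections import Counter, defaultdict


def train_bigrams(lines):
    """Count character bigrams over transliterated lines.

    follow[a][b] is the number of times character b directly follows a.
    """
    follow = defaultdict(Counter)
    for line in lines:
        for a, b in zip(line, line[1:]):
            follow[a][b] += 1
    return follow


def guess_missing(follow, left, right, top_k=3):
    """Rank candidates c for the gap in 'left ? right'.

    Score = count(left, c) * count(c, right); this scoring rule is an
    assumption made for illustration, not the method used in the report.
    """
    scores = {}
    for c, n_left in follow[left].items():
        n_right = follow[c][right]
        if n_right:
            scores[c] = n_left * n_right
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Hypothetical toy corpus (made-up title lines, for illustration only).
toy_lines = ["借用申金子之事", "借用申銀子之事", "借用申金子之事"]
model = train_bigrams(toy_lines)

# A damaged line reads "借用申?子之事": which character fills the gap?
print(guess_missing(model, "申", "子"))  # ['金', '銀']
```

A practical system would likely use longer n-grams, smoothing for unseen pairs, and a far larger corpus; the sketch only shows how neighboring characters constrain the candidates for a gap.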

  • Research Products

    (11 results)

All Publications (11 results)

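  • [Publications] 山田奨治, 柴山 守 et al.: "Development of an Electronic Kuzushiji Dictionary with a Similar-Character Search Function" IPSJ SIG Technical Report 2002-CH-54. Vol.2002, No.23. 43-50 (2002)

    • Description
      From the Research Report Summary (Japanese)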
  • [Publications] 山田奨治, 柴山 守 et al.: "Research on Character Recognition for Historical Documents" Information Processing. Vol.43, No.9. 950-955 (2002)

    • Description
      From the Research Report Summary (Japanese)
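  • [Publications] 近藤博人, 松本隆, 柴山 守, 山田奨治, 荒木義彦: "Title Recognition in Historical Documents without Presupposing Character Segmentation" IPSJ SIG Technical Report 2003-CH-57. Vol.2003, No.5. 1-8 (2002)

    • Description
      From the Research Report Summary (Japanese)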
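  • [Publications] 安倍広多, 中塚麻記子, 柴山 守: "An Attempt to Extract Stroke Order from Character Images of the Kuzushiji Kaidoku Dictionary" Bulletin of Osaka City University Media Center. Vol.4. 19-23 (2003)

    • Description
      From the Research Report Summary (Japanese)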
  • [Publications] Kota Abe, Makiko Nakatsuka, and Mamoru Shibayama: "An Attempt to Extract Stroke Order from Handwritten Cursive Japanese Character Image" Bulletin of Osaka City University Media Center. 14. (2003)

    • Description
      From the Research Report Summary (English)
  • [Publications] Hirohito Kondo, Ryuichi Matsumoto, Mamoru Shibayama, and Yoshihiko Araki: "Character Recognition without Segmentation for Title in Historical Document Images" IPSJ SIG-Report 2002. 57. 1-8 (2003)

    • Description
      From the Research Report Summary (English)
  • [Publications] Shoji Yamada and Mamoru Shibayama: "Studies on Character Recognition for Historical Document" Information Processing. 43 No.9. 950-955 (2002)

    • Description
      From the Research Report Summary (English)
  • [Publications] Shoji Yamada, Yuji Waizumi, Nei Kato, and Mamoru Shibayama: "Development of Digital Dictionary of Historical Characters with Search Function of Similar Characters" IPSJ SIG-Report 2002. 54. 43-50 (2002)

    • Description
      From the Research Report Summary (English)
  • [Publications] Shoji Yamada, Nei Kato, Mamoru Shibayama, et al.: "Historical Character Recognition (HCR) Project Report (2)" IPSJ SIG-Report 2001. 50. 9-16 (2001)

    • Description
      From the Research Report Summary (English)
  • [Publications] Koji OZAKI, Mamoru SHIBAYAMA, and Yoshihiko ARAKI: "Layout Recognition and Title Extraction for Historical Document Image" Proceedings of Symposium on Computer and the Humanities, IPSJ. (2000)

    • Description
      From the Research Report Summary (English)
  • [Publications] Shoji YAMADA, Mamoru SHIBAYAMA: "A study of a historical document research supporting system using n-gram" IPSJ Symposium Series. 2000, No.17. 185-192 (2000)

    • Description
      From the Research Report Summary (English)

Published: 2004-04-14  