Studies on OCR for Historical Document

Research Project

Project/Area Number 11410090
Research Category

Grant-in-Aid for Scientific Research (B)

Allocation Type: Single-year Grants
Section: General
Research Field: Japanese history
Research Institution: Osaka City University

Principal Investigator

SHIBAYAMA Mamoru  Osaka City Univ., Media Center, Professor, 学術情報総合センター, 教授 (10162645)

Co-Investigator (Kenkyū-buntansha)

NAMIKI Mitaro  Tokyo University of Agriculture and Technology, Faculty of Engineering, Associate Prof., 工学部, 助教授 (10208077)
TSUKADA Takashi  Osaka City Univ., Faculty of Literature, Associate Prof., 大学院・文学研究科, 教授 (60126125)
YAMADA Shoji  International Research Center for Japanese Studies, Research Division, Associate Prof., 研究部, 助教授 (20248751)
HOSHINO Satoshi  Kyoto Univ., Professor of Emeritus, 名誉教授 (90025867)
KAWAGUCHI Hiroshi  Tezukayama Univ., Faculty of Information and Management, Associate Prof., 経営情報学部, 助教授 (80224749)
大島 真理夫  Osaka City University, Faculty of Economics, Professor (30128730)
Project Period (FY) 1999 – 2001
Project Status Completed (Fiscal Year 2001)
Budget Amount
¥5,900,000 (Direct Cost: ¥5,900,000)
Fiscal Year 2001: ¥1,400,000 (Direct Cost: ¥1,400,000)
Fiscal Year 2000: ¥2,400,000 (Direct Cost: ¥2,400,000)
Fiscal Year 1999: ¥2,100,000 (Direct Cost: ¥2,100,000)
Keywords: Historical Document Images / OCR / Character Recognition / Character Segmentation / Recognition Dictionary / Transliteration / Historical Document Recognition / Historical Document Transcription Support / Early Modern Documents / Automatic Reading
Research Abstract

This research is a trial study that attempts to develop OCR (interpreted here as automatic recognition) for images of early modern historical documents, and to elucidate the mechanism of character recognition for historical documents written in cursive styles with a writing brush. The research also aims to open a new aspect of Japanese historical studies by introducing and supporting a basic, limited character recognition system.
The research results are as follows.
(1) To build the dictionary for character recognition, characters were segmented from the documents and the related segmentation programs were developed.
(2) As basic research on the segmentation and recognition of historical document characters, layout recognition of document images and automatic extraction of document titles were carried out. In the character recognition experiments, a new system that does not segment cursive characters was introduced.
(3) To support the transliteration of documents, the n-gram method was used and its effectiveness was confirmed.
(4) In the historical document character recognition process, the regularizing operation was found to increase the similarity. A new system must therefore be researched in the next stage.
(5) A character database focused on document titles was developed. This database, which contains about 900 titles and 192 kinds of characters, has been made available.
For details, refer to the research reports "Research of the historical document transcription support system" (1) and (2), published in March 2000 and March 2001, respectively.
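
The n-gram support mentioned in result (3) can be illustrated with a minimal sketch. This page does not describe the project's actual model, so the following is only an assumption of how a character bigram table might rank candidates for the next character during transliteration. The function names `train_bigrams` and `suggest` and the three-line toy corpus are hypothetical and invented for illustration.

```python
from collections import Counter

def train_bigrams(corpus):
    """Count character bigrams in a list of already transcribed strings."""
    counts = {}
    for line in corpus:
        for prev, nxt in zip(line, line[1:]):
            counts.setdefault(prev, Counter())[nxt] += 1
    return counts

def suggest(counts, prev_char, k=3):
    """Return up to k characters that most often follow prev_char."""
    followers = counts.get(prev_char, Counter())
    return [ch for ch, _ in followers.most_common(k)]

# Hypothetical toy corpus of formulaic document titles (not project data).
corpus = ["借用申金子之事", "借用申銀子之事", "預り申金子之事"]
counts = train_bigrams(corpus)

# Candidates for the character after "申", most frequent first.
print(suggest(counts, "申", 2))  # → ['金', '銀']
```

In a transcription aid of this kind, such a ranked list would be offered to the reader as suggestions when a character is hard to read, rather than used as a final answer.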

Report

(4 results)
  • 2001 Annual Research Report   Final Research Report Summary
  • 2000 Annual Research Report
  • 1999 Annual Research Report
  • Research Products

    (16 results)

All Publications (16 results)

  • [Publications] 富田宏章, 柴山 守他: "古文書画像の2値化レベル制御による対話型文字分割とその評価"電気学会論文誌C. 118・C・4. 503-509 (1999)

    • Description
      From the Summary of Research Results Report (Japanese)
    • Related Report
      2001 Final Research Report Summary
  • [Publications] 山田奨治, 柴山 守: "n-gramによる古文書証文類翻刻支援の検討"情報処理学会人文科学とコンピュータシンポジウム論文集. 2000・17. 185-192 (2000)

    • Description
      From the Summary of Research Results Report (Japanese)
    • Related Report
      2001 Final Research Report Summary
  • [Publications] 尾崎浩司, 柴山 守他: "古文書画像の標題文字セグメンテーション"情報処理学会人文科学とコンピュータシンポジウム論文集. 2000・17. 279-286 (2000)

    • Description
      From the Summary of Research Results Report (Japanese)
    • Related Report
      2001 Final Research Report Summary
  • [Publications] Hiroaki TOMITA, Mamoru SHIBAYAMA et al.: "Interactive Character Segmentation of Ancient Documents by Controlling Binary Level and its Evaluation"Trans. IEE of Japan. Vol.118-C, No.4. 503-509 (1999)

    • Description
      From the Summary of Research Results Report (English)
    • Related Report
      2001 Final Research Report Summary
  • [Publications] Shoji YAMADA, Mamoru SHIBAYAMA: "A study of a Historical document research supporting system using n-gram"IPSJ Symposium Series. Vol.2000, No.17. 185-192 (2000)

    • Description
      From the Summary of Research Results Report (English)
    • Related Report
      2001 Final Research Report Summary
  • [Publications] Koji Ozaki, Mamoru SHIBAYAMA et al.: "Title Character Segmentation for Historical Document Images"IPSJ Symposium Series. Vol.2000, No.17. 279-286 (2000)

    • Description
      From the Summary of Research Results Report (English)
    • Related Report
      2001 Final Research Report Summary
  • [Publications] 山田奨治, 柴山 守他: "古文書翻刻支援システム開発プロジェクト報告(2)"情報処理学会研究報告 2001-CH-50. 2001・5. 9-15 (2001)

    • Related Report
      2001 Annual Research Report
  • [Publications] 山田奨治, 柴山 守他: "類似文字検索機能をそなえた電子くずし字辞典の開発"情報処理学会研究報告 2002-CH-54. (予定). (2002)

    • Related Report
      2001 Annual Research Report
  • [Publications] 山田奨治,柴山守 他: "古文書翻刻支援システム開発プロジェクト報告(1)"情報処理学会研究報告2000-CH-45. 2000・5. 1-8 (2000)

    • Related Report
      2000 Annual Research Report
  • [Publications] 和泉勇治,加藤寧 他: "ニューラルネットワークを用いた古文書文字認識に関する一検討"情報処理学会研究報告2000-CH-45. 2000・5. 9-15 (2000)

    • Related Report
      2000 Annual Research Report
  • [Publications] 山田奨治,柴山守 他: "n-gramによる古文書証文類翻刻支援の検討"人文科学とコンピュータシンポジウム2000論文集. (2000)

    • Related Report
      2000 Annual Research Report
  • [Publications] 尾崎浩司,柴山守 他: "古文書画像のレイアウト認識と標題抽出"情報処理学会研究報告2000-CH-47. 2000・67. 47-54 (2000)

    • Related Report
      2000 Annual Research Report
  • [Publications] 尾崎浩司,柴山守 他: "古文書画像の標題文字セグメンテーション"人文科学とコンピュータシンポジウム2000論文集. (2000)

    • Related Report
      2000 Annual Research Report
  • [Publications] 尾崎浩司、柴山守、荒木義彦: "古文書レイアウト画像のピラミット型抽象化と標題の自動抽出"電気関係学会関西支部連合大会論文集 G12-6. G266 (1999)

    • Related Report
      1999 Annual Research Report
  • [Publications] 山田奨治 他: "古文書翻刻支援システム開発プロジェクト報告(1)"情報処理学会研究報告 2000-CH-45. 2000.8. 1-8 (2000)

    • Related Report
      1999 Annual Research Report
  • [Publications] 和泉勇治、加藤 寧 他: "ニューラルネットワークを用いた古文書個別文字認識に関する一検討"情報処理学会研究報告 2000-CH-45. 2000.8. 9-15 (2000)

    • Related Report
      1999 Annual Research Report

Published: 1999-04-01   Modified: 2016-04-21  
