
Image-based contents analysis for untranscribed document image archives

Research Project

Project/Area Number 17K00241
Research Category

Grant-in-Aid for Scientific Research (C)

Allocation Type Multi-year Fund
Section General
Research Field Perceptual information processing
Research Institution Future University-Hakodate

Principal Investigator

Terasawa Kengo  Future University-Hakodate, School of Systems Information Science, Associate Professor (10435985)

Project Period (FY) 2017-04-01 – 2020-03-31
Project Status Completed (Fiscal Year 2019)
Budget Amount
¥2,730,000 (Direct Cost: ¥2,100,000, Indirect Cost: ¥630,000)
Fiscal Year 2019: ¥910,000 (Direct Cost: ¥700,000, Indirect Cost: ¥210,000)
Fiscal Year 2018: ¥910,000 (Direct Cost: ¥700,000, Indirect Cost: ¥210,000)
Fiscal Year 2017: ¥910,000 (Direct Cost: ¥700,000, Indirect Cost: ¥210,000)
Keywords Image, text, and speech recognition / Pattern recognition / Database / Digital archive
Outline of Final Research Achievements

In this study, we extracted frequently appearing words from untranscribed document images that are not machine-readable, and evaluated the importance of the extracted words, using image-based analysis of the frequency and pattern of occurrence of certain text strings. We also summarized the content of each document and extracted the parts that are highly related to a specific topic. We conducted an experiment on untranscribed newspaper images published in the Meiji era and confirmed the performance and effectiveness of the proposed method. These results will promote the effective use of digital archives of document images.
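As an illustration only, the idea of weighting frequently appearing words by their pattern of occurrence can be sketched with a standard TF-IDF score. This is not the project's published method. The sketch assumes that word images have already been grouped into visual clusters by some image-matching step, so that each document is a list of hypothetical cluster IDs such as `"c1"`; the function names `tfidf_scores` and `top_keywords` are likewise assumptions made for this example.

```python
import math
from collections import Counter


def tfidf_scores(docs):
    """Score cluster IDs per document with plain TF-IDF.

    docs: list of documents, each a list of cluster IDs (assumed to come
    from an earlier, unspecified word-image clustering step).
    Returns one dict per document mapping cluster ID -> score.
    """
    n_docs = len(docs)
    # Document frequency: number of documents in which each cluster occurs.
    df = Counter()
    for doc in docs:
        df.update(set(doc))

    scores = []
    for doc in docs:
        tf = Counter(doc)
        total = len(doc)
        scores.append({
            cluster: (count / total) * math.log(n_docs / df[cluster])
            for cluster, count in tf.items()
        })
    return scores


def top_keywords(docs, k=3):
    """Return the k highest-scoring cluster IDs for each document."""
    return [
        [c for c, _ in sorted(s.items(), key=lambda kv: (-kv[1], kv[0]))[:k]]
        for s in tfidf_scores(docs)
    ]


# Toy data: "c2" occurs in every document, so it carries no weight.
docs = [
    ["c1", "c1", "c2", "c7"],
    ["c2", "c3", "c3", "c3"],
    ["c2", "c4"],
]
print(top_keywords(docs, k=1))
```

In this toy setting a cluster that occurs in every document receives a score of zero, while a cluster concentrated in one document ranks highest there, which is the intuition behind treating such clusters as candidate keywords without transcribing them.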

Academic Significance and Societal Importance of the Research Achievements

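The results of this research make it possible to present readers with a summary of the content, or with the passages most closely related to a specific topic, even for document images that are difficult to read by machine because they are handwritten or have deteriorated with age. As a result, the digital archives of document images that are being developed and accumulated in many regions are expected to grow in value as easy-to-use and convenient reference material, not only for specialist researchers but also for the general public and people interested in local history.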

Report

(4 results)
  • 2019 Annual Research Report / Final Research Report (PDF)
  • 2018 Research-status Report
  • 2017 Research-status Report
  • Research Products

    (3 results)


Presentation (3 results, of which Int'l Joint Research: 1 result, Invited: 1 result)

  • [Presentation] Extraction of Distinctive Keywords and Articles from Untranscribed Historical Newspaper Images (2020)

    • Author(s)
      Sora Ito and Kengo Terasawa
    • Organizer
      International Workshop on Advanced Image Technology, IWAIT2020
    • Related Report
      2019 Annual Research Report
    • Int'l Joint Research
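  • [Presentation] A Character Image Clustering Method for Analyzing Historical Document Images for Which Character Recognition Is Difficult (2018)

    • Author(s)
      Sora Ito and Kengo Terasawa
    • Organizer
      IEICE Technical Report, PRMU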
    • Related Report
      2018 Research-status Report
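  • [Presentation] Efforts toward Content Analysis of Historical Document Images (2018)

    • Author(s)
      Kengo Terasawa
    • Organizer
      116th Meeting of the IPSJ Special Interest Group on Computers and the Humanities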
    • Related Report
      2017 Research-status Report
    • Invited

Published: 2017-04-28   Modified: 2021-02-19  
