
Extracting Knowledge from Japanese Early-Modern Printed Books

Research Project

Project/Area Number 17H01829
Research Category

Grant-in-Aid for Scientific Research (B)

Allocation Type Single-year Grants
Section General
Research Field Library and information science/Humanistic social informatics
Research Institution Nara Women's University

Principal Investigator

Joe Kazuki  Nara Women's University, Faculty Division of Human Life and Environmental Sciences, Professor (90283928)

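Co-Investigator (Kenkyū-buntansha) Takata Masami  Nara Women's University, Faculty Division of Human Life and Environmental Sciences, Lecturer (20397574)
Ishikawa Yu  Shiga University, Center for Data Science Education and Research, Assistant Professor (20814370)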
Project Period (FY) 2017-04-01 – 2020-03-31
Project Status Completed (Fiscal Year 2019)
Budget Amount
¥15,990,000 (Direct Cost: ¥12,300,000, Indirect Cost: ¥3,690,000)
Fiscal Year 2019: ¥2,730,000 (Direct Cost: ¥2,100,000, Indirect Cost: ¥630,000)
Fiscal Year 2018: ¥7,280,000 (Direct Cost: ¥5,600,000, Indirect Cost: ¥1,680,000)
Fiscal Year 2017: ¥5,980,000 (Direct Cost: ¥4,600,000, Indirect Cost: ¥1,380,000)
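Keywords Automatic text conversion / Deep learning (深層学習) / CNN / Layout analysis / Language translation / Digital archive / Character recognition / Text conversion / Deep learning (ディープラーニング) / Knowledge processing / Automatic text conversion of early-modern books / Automatic translation of literary-style Japanese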
Outline of Final Research Achievements

This study produced four results. First, in 2017 we integrated our previous recognition methods and, despite the limited training data, came close to practical application: the recognition rate for 2,678 early-modern Japanese printed characters exceeded 90%. Second, to increase the training data, we used deep learning to automatically generate unknown early-modern printed character types, and presented this work in 2018. Third, in 2019 we rebuilt the existing recognition method around deep learning and obtained a recognition rate comparable to the 2017 result; applying transfer learning then raised the rate from around 90% to 98%. Fourth, we showed that deep learning can also be used for layout analysis, which is essential for practical application.
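As a reading aid for the figures above, the following minimal sketch shows how a character recognition rate is commonly computed and what a rise from about 90% to 98% means in terms of misrecognized characters. It is an illustration only, not the project's evaluation code: the function names and the ten-character sample are hypothetical, and the project's own evaluation protocol is not described in this record.

```python
def recognition_rate(predicted, expected):
    """Fraction of characters whose predicted label matches the ground truth."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if not expected:
        raise ValueError("at least one character is required")
    correct = sum(p == e for p, e in zip(predicted, expected))
    return correct / len(expected)


def relative_error_reduction(rate_before, rate_after):
    """Share of the original misrecognitions removed by an improvement."""
    error_before = 1.0 - rate_before
    error_after = 1.0 - rate_after
    return (error_before - error_after) / error_before


# Hypothetical toy data: ten ground-truth characters and a recognizer's output.
expected = list("明治大正昭和文語活版")
predicted = list("明治大正昭和文語活板")  # the last character is misread

rate = recognition_rate(predicted, expected)
print(f"recognition rate: {rate:.0%}")

# Assumes exactly 90% before and 98% after; the record says "around 90%".
reduction = relative_error_reduction(0.90, 0.98)
print(f"relative error reduction: {reduction:.0%}")
```

Under those assumed values the error rate falls from 10% to 2%, so roughly four out of five previous misrecognitions disappear.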

Academic Significance and Societal Importance of the Research Achievements

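In recent years, the capacity of personally owned storage media such as HDDs has grown dramatically, and data can now be accessed freely over the Internet. As a result, early-modern books and similar materials that survived only on paper are rapidly being archived. However, archives that consist of images alone do not support full-text search, so establishing automatic text-conversion technology that can handle letterpress printing from the period before today's standards were defined is an urgent task. This study aims to establish that technology, and the research has now progressed to a level extremely close to practical use.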

Report

(4 results)
  • 2019 Annual Research Report   Final Research Report ( PDF )
  • 2018 Annual Research Report
  • 2017 Annual Research Report
  • Research Products

    (8 results)


Journal Article (4 results) (of which Peer Reviewed: 4 results, Open Access: 3 results) Presentation (4 results) (of which Int'l Joint Research: 1 result)

  • [Journal Article] Applying CNNs to Early-Modern Printed Japanese Character Recognition (2019)

    • Author(s)
      Suzuka Yasunami, Norie Koiso, Yuki Takemoto, Yu Ishikawa, Masami Takata, Kazuki Joe
    • Journal Title

      The 2019 International Conference on Parallel and Distributed Processing Techniques and Applications

      Volume: 1 Pages: 189-195

    • Related Report
      2019 Annual Research Report
    • Peer Reviewed / Open Access
  • [Journal Article] Structure of Neural Network Automatically Generating Fonts for Early-Modern Japanese Printed Books (2019)

    • Author(s)
      Yuki Takemoto, Yu Ishikawa, Masami Takata, Kazuki Joe
    • Journal Title

      The 2019 International Conference on Parallel and Distributed Processing Techniques and Applications

      Volume: 1 Pages: 182-188

    • Related Report
      2019 Annual Research Report
    • Peer Reviewed / Open Access
  • [Journal Article] Layout Analysis using Semantic Segmentation for Imperial Meeting Minutes (2019)

    • Author(s)
      Sayaka Iida, Yuki Takemoto, Yu Ishikawa, Masami Takata, Kazuki Joe
    • Journal Title

      The 2019 International Conference on Parallel and Distributed Processing Techniques and Applications

      Volume: 1 Pages: 135-141

    • Related Report
      2019 Annual Research Report
    • Peer Reviewed / Open Access
  • [Journal Article] Automatic Font Generation for Early-Modern Japanese Printed Books (2018)

    • Author(s)
      Yuki Takemoto, Yu Ishikawa, Masami Takata, Kazuki Joe
    • Journal Title

      The 2018 International Conference on Parallel and Distributed Processing Techniques and Applications

      Volume: On-site Edition Pages: 326-332

    • Related Report
      2018 Annual Research Report
    • Peer Reviewed
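  • [Presentation] Acquisition of Low-Frequency Character Types in Early-Modern Books (2019)

    • Author(s)
      藤田未希, Yuki Takemoto, Yu Ishikawa, Masami Takata, Kazuki Joe
    • Organizer
      Information Processing Society of Japan (IPSJ), Special Interest Group on Mathematical Modeling and Problem Solving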
    • Related Report
      2019 Annual Research Report
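  • [Presentation] Comparison of Layout Analysis Methods for Imperial Diet Minutes (2018)

    • Author(s)
      Sayaka Iida, Yuki Takemoto, Yu Ishikawa, Masami Takata, Kazuki Joe
    • Organizer
      Information Processing Society of Japan (IPSJ), Special Interest Group on Mathematical Modeling and Problem Solving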
    • Related Report
      2018 Annual Research Report
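  • [Presentation] An Attempt at Automatic Translation between Early-Modern Literary-Style and Contemporary Colloquial-Style Japanese (2018)

    • Author(s)
      林 英里香, Yuki Takemoto, Yu Ishikawa, Masami Takata, Kazuki Joe
    • Organizer
      Information Processing Society of Japan (IPSJ), Special Interest Group on Mathematical Modeling and Problem Solving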
    • Related Report
      2018 Annual Research Report
  • [Presentation] Automatic Font Generation For Early-Modern Japanese Printed Books (2018)

    • Author(s)
      Yuki Takemoto
    • Organizer
      International Conference on Parallel and Distributed Systems and Applications 2018
    • Related Report
      2017 Annual Research Report
    • Int'l Joint Research


Published: 2017-04-28   Modified: 2021-02-19  
