
Integration of Crowdsourcing and Machine Learning for Large-scale Transcription of Pre-modern Historical Manuscripts

Research Project

Project/Area Number 18K18338
Research Category

Grant-in-Aid for Early-Career Scientists

Allocation Type Multi-year Fund
Review Section Basic Section 90020: Library and information science, humanistic and social informatics-related
Research Institution National Museum of Japanese History

Principal Investigator

Hashimoto Yuta  National Museum of Japanese History, Departments of Inter-University Research Institutes, etc., Assistant Professor (10802712)

Project Period (FY) 2018-04-01 – 2020-03-31
Project Status Completed (Fiscal Year 2019)
Budget Amount
¥2,080,000 (Direct Cost: ¥1,600,000, Indirect Cost: ¥480,000)
Fiscal Year 2019: ¥520,000 (Direct Cost: ¥400,000, Indirect Cost: ¥120,000)
Fiscal Year 2018: ¥1,560,000 (Direct Cost: ¥1,200,000, Indirect Cost: ¥360,000)
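Keywords crowdsourcing / transcription (honkoku) / IIIF / kuzushiji (cursive script) / historical materials / character recognition / OCR / machine learning / classical Japanese books / historical documents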
Outline of Final Research Achievements

This research program aimed to develop an efficient method for transcribing pre-modern Japanese documents written in cursive characters (kuzushiji). The program initially planned to develop its own OCR technology for kuzushiji, but the field advanced over the past few years far more rapidly than the author had expected.
This led the author to collaborate with the AI researchers working on kuzushiji OCR rather than compete with them. Through this collaboration, the author launched in 2019 a new version of "Minna de Honkoku", a crowdsourced transcription platform that supports automatic recognition of kuzushiji. Since its launch, more than 800 participants have transcribed 2.5 million characters on the platform.

Academic Significance and Societal Importance of the Research Achievements

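The version of "Minna de Honkoku" that supports AI recognition achieved the transcription of 2.5 million characters in the short span of 300 days. One direct contribution of this research is that it demonstrated in practice that AI assistance can make transcription work by citizens more efficient.
From a broader perspective, the research showed that an appropriate combination of technologies allows three parties, (1) humanities researchers, (2) citizens, and (3) AI technology (and its researchers), to build a mutually beneficial relationship. The impact that advances in AI technology will have on the future of humanities research and citizen-participation research has been widely debated, and the results of this research should serve as one important reference case.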

Report

(3 results)
  • 2019 Annual Research Report   Final Research Report (PDF)
  • 2018 Research-status Report
  • Research Products

    (6 results)


Presentation (4 results) (of which Int'l Joint Research: 3 results, Invited: 1 result) Book (1 result) Remarks (1 result)

  • [Presentation] Digital Humanities Research in National Museum of Japanese History (2019)

    • Author(s)
      Yuta Hashimoto
    • Organizer
      The International Conference for Museums of Language & Writing 2019
    • Related Report
      2019 Annual Research Report
    • Int'l Joint Research / Invited
  • [Presentation] Honkoku2: Towards a Large-scale Transcription of Pre-modern Japanese Manuscripts (2019)

    • Author(s)
      Yuta Hashimoto
    • Organizer
      The 9th Conference of Japanese Association for Digital Humanities (JADH2019)
    • Related Report
      2019 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Minna De Honkoku: Learning-Driven Crowdsourced Transcription Of Pre-Modern Japanese Earthquake Records (2018)

    • Author(s)
      Yuta Hashimoto, et al.
    • Organizer
      Digital Humanities 2018
    • Related Report
      2018 Research-status Report
    • Int'l Joint Research
  • [Presentation] Development of a Lightweight Markup Language for the Structured Description of Japanese Historical Documents (2018)

    • Author(s)
      Yuta Hashimoto, Shinya Miyagawa (宮川真弥)
    • Organizer
      Computers and the Humanities Symposium 2018
    • Related Report
      2018 Research-status Report
  • [Book] Digital Archive Basics 2 (デジタルアーカイブ・ベーシックス 2) (2019)

    • Author(s)
      Fumihiko Imamura (supervising editor) / Chikahiko Suzuki (editor-in-chief)
    • Total Pages
      208
    • Publisher
      Bensei Shuppan (勉誠出版)
    • ISBN
      9784585202820
    • Related Report
      2019 Annual Research Report
  • [Remarks] Minna de Honkoku (みんなで翻刻)

    • URL

      https://honkoku.org/

    • Related Report
      2019 Annual Research Report


Published: 2018-04-23   Modified: 2021-02-19  
