Recognizing Japanese brush script in image
Project/Area Number |
16K12545
|
Research Category |
Grant-in-Aid for Challenging Exploratory Research
|
Allocation Type | Multi-year Fund |
Research Field |
Library and information science/Humanistic social informatics
|
Research Institution | National Institute of Japanese Literature |
Principal Investigator |
Nomoto Tadashi, National Institute of Japanese Literature, Research Department, Associate Professor (20321557)
|
Co-Investigator(Kenkyū-buntansha) |
相田 満, National Institute of Japanese Literature, Research Department, Associate Professor (00249921)
|
Research Collaborator |
Terasawa Kengo
|
Project Period (FY) |
2016-04-01 – 2019-03-31
|
Project Status |
Completed (Fiscal Year 2018)
|
Budget Amount |
¥3,380,000 (Direct Cost: ¥2,600,000, Indirect Cost: ¥780,000)
Fiscal Year 2018: ¥910,000 (Direct Cost: ¥700,000, Indirect Cost: ¥210,000)
Fiscal Year 2017: ¥780,000 (Direct Cost: ¥600,000, Indirect Cost: ¥180,000)
Fiscal Year 2016: ¥1,690,000 (Direct Cost: ¥1,300,000, Indirect Cost: ¥390,000)
|
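Keywords | Kuzushi-ji (cursive script) / image retrieval / character image recognition / deep learning / character recognition / Japanese classical texts / brush-script image analysis / image processing / artificial intelligence / information retrieval / brush-script images |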
Outline of Final Research Achievements |
The goal of this work is to develop an approach that recognizes Japanese brush script in images without resorting to OCR or hand-annotated labels. To this end, we considered three approaches: (1) mapping a modern Kanji to its corresponding Kuzushi-ji, which come in a variety of shapes and forms, and using the latter to identify the Kuzushi-ji character of interest; (2) using the modern Kanji itself in place of the Kuzushi-ji for identification; and (3) using CycleGAN to generate a pseudo Kuzushi-ji, which serves as a query to match against the image. In our experiments, the first method, which relies on mapping a modern Kanji to its possible Kuzushi-ji forms, performed significantly better than the other two, suggesting that Kuzushi-ji recognition benefits greatly from this mapping.
|
Academic Significance and Societal Importance of the Research Achievements |
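With advances in digital technology, large numbers of historical texts in Japan have been digitized and archived. Because most of them are stored as images, they cannot be searched freely by keyword, which is a major barrier to reusing their content and to efforts to turn it into intellectual property. Search based on manual or OCR transcription has also been proposed but has not yet reached practical use. In this respect, the present work can be expected to make a useful contribution.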
|
Report
(4 results)
Research Products
(6 results)