2016 Fiscal Year Final Research Report
Implementation of supporting system and environment for auto-extracting texts from early-modern printed books
Project/Area Number |
26280119
|
Research Category |
Grant-in-Aid for Scientific Research (B)
|
Allocation Type | Partial Multi-year Fund |
Section | General |
Research Field |
Library and information science/Humanistic social informatics
|
Research Institution | Nara Women's University |
Principal Investigator |
Joe Kazuki Nara Women's University, Division of Human Life and Environmental Sciences, Professor (90283928)
|
Co-Investigator(Kenkyū-buntansha) |
Takata Masami (高田 雅美) Nara Women's University, Division of Human Life and Environmental Sciences, Lecturer (20397574)
|
Research Collaborator |
KIMEZAWA Tsukasa National Diet Library, West Building, Digital Library Division, Clerk
|
Project Period (FY) |
2014-04-01 – 2017-03-31
|
Keywords | OCR for early-modern books / character recognition / feature values / ensemble learning |
Outline of Final Research Achievements |
In this research, we implemented a support system and environment for automatically extracting text from early-modern printed books. Unlike text produced with current DTP (desktop publishing), recognizing early-modern printed characters requires images taken from early-modern printed books as learning samples. Collecting such samples is not very difficult for up to about 1,000 character types, but it becomes almost impossible at about 2,000 types. We therefore implemented an early-modern printed character recognition system that can be applied to early-modern printed books even when the learning samples are insufficient. The system detects character types it cannot recognize and asks the user for the correct type. Correctly recognized characters are added to the learning samples so that the recognition system improves.
|
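The feedback loop described in the outline (reject what cannot be recognized, ask the user, grow the learning samples) can be illustrated with a minimal sketch. This is not the project's actual system: the nearest-centroid classifier, the distance threshold `reject_distance`, and the names `RejectingRecognizer`, `process`, and `ask_user` are all hypothetical choices made for illustration, and the sketch assumes that only user-confirmed samples are added to the learning set.

```python
import math


class RejectingRecognizer:
    """Hypothetical nearest-centroid classifier with a reject option."""

    def __init__(self, reject_distance):
        # Assumed threshold: inputs farther than this from every known
        # character type are treated as unrecognizable.
        self.reject_distance = reject_distance
        self.samples = {}  # character type -> list of feature vectors

    def add_sample(self, label, features):
        """Add one labelled feature vector to the learning samples."""
        self.samples.setdefault(label, []).append(list(features))

    @staticmethod
    def _centroid(vectors):
        n = len(vectors)
        return [sum(column) / n for column in zip(*vectors)]

    def recognize(self, features):
        """Return the nearest character type, or None if none is close enough."""
        best_label, best_dist = None, math.inf
        for label, vectors in self.samples.items():
            dist = math.dist(features, self._centroid(vectors))
            if dist < best_dist:
                best_label, best_dist = label, dist
        return best_label if best_dist <= self.reject_distance else None


def process(recognizer, features, ask_user):
    """Recognize one character; on rejection, ask the user and learn from the answer."""
    label = recognizer.recognize(features)
    if label is None:
        label = ask_user(features)             # the user supplies the correct type
        recognizer.add_sample(label, features)  # the learning samples grow
    return label
```

In this sketch, a character type that was first labelled by the user is recognized automatically the next time a similar feature vector appears, which mirrors the way the described system improves as samples accumulate.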
Free Research Field |
Pattern recognition
|