Establishment of Reading Support System of Ancient Documents aided by Handwritten Character Recognition Technology
Project/Area Number |
14580432
|
Research Category |
Grant-in-Aid for Scientific Research (C)
|
Allocation Type | Single-year Grants |
Section | General |
Research Field |
Intelligent informatics
|
Research Institution | Osaka Electro-Communication University |
Principal Investigator |
UMEDA Michio Osaka Electro-Communication University, Faculty of Information Science & Arts, Professor (30213490)
|
Project Period (FY) |
2002 – 2004
|
Project Status |
Completed (Fiscal Year 2004)
|
Budget Amount |
¥3,300,000 (Direct Cost: ¥3,300,000)
Fiscal Year 2004: ¥700,000 (Direct Cost: ¥700,000)
Fiscal Year 2003: ¥1,100,000 (Direct Cost: ¥1,100,000)
Fiscal Year 2002: ¥1,500,000 (Direct Cost: ¥1,500,000)
|
Keywords | ancient documents / character recognition / character segmentation / character spotting / feature extraction / neural network / document reading / expert system / thinning |
Research Abstract |
This research presents a character segmentation and spotting method for ancient documents. In the segmentation method, the result of the character recognition process is used to cope with cursive scripts and the mutual encroachment of characters, both of which are peculiar to ancient documents. In the spotting method, only previously designated characters are extracted from the character string. As an initial segmentation, the character string pattern is divided into connected components by labelling. Each connected component is enclosed in a rectangle, and the character patterns are separated from one another using the shape of the rectangles, such as their height and width. Next, individual character recognition is applied to each segmented pattern. Based on the recognition result, rectangles for which segmentation failed are identified, and re-segmentation is applied to the string containing them, so that the string is expected to be divided at the best position. For spotting, a neural network corresponding to the previously designated character is prepared. The difference between the input and output values of the network, applied to a segmented pattern, is calculated, and patterns that satisfy the condition are extracted as the spotting result. In an extraction experiment on 615 character strings with 5 designated characters, the correct spotting rate was 94.22% with the re-segmentation process, compared with 87.58% without it. A reading support system of ancient documents for beginners was established using the segmentation and spotting methods.
|
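The initial segmentation step described in the abstract (labelling connected components, enclosing each in a rectangle, and judging the rectangles by height and width) can be illustrated with a minimal sketch. This is not the project's implementation: the function names `label_components` and `flag_suspect_boxes`, the use of 8-connectivity, and the `max_aspect` threshold are all assumptions made for illustration. In the abstract, rectangles are selected for re-segmentation from the recognition result; the shape test here is only a stand-in for that selection.

```python
from collections import deque


def label_components(image):
    """Return bounding boxes (top, left, bottom, right) of the
    8-connected foreground components of a binary image (list of rows)."""
    rows = len(image)
    cols = len(image[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if not image[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill from an unvisited foreground pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def flag_suspect_boxes(boxes, max_aspect=1.5):
    """Return boxes much taller than wide, as candidates that may hold
    several touching characters (max_aspect is a hypothetical threshold)."""
    suspects = []
    for top, left, bottom, right in boxes:
        height = bottom - top + 1
        width = right - left + 1
        if height > max_aspect * width:
            suspects.append((top, left, bottom, right))
    return suspects
```

Calling `label_components` on a binarized image of a vertically written string yields one rectangle per connected stroke group, and `flag_suspect_boxes` marks those that might be passed on for re-segmentation.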
Report
(4 results)
Research Products
(10 results)