Research on Automatic Character Recognition of Wooden Blocked Tibetan Manuscripts
Project/Area Number |
06680382
|
Research Category |
Grant-in-Aid for General Scientific Research (C)
|
Allocation Type | Single-year Grants |
Research Field |
Information Systems Science (including Information and Library Science)
|
Research Institution | Tohoku Institute of Technology |
Principal Investigator |
KOJIMA Masami Tohoku Institute of Technology, Faculty of Engineering, Associate Professor (60085420)
|
Co-Investigator(Kenkyū-buntansha) |
KAWAZOE Yoshiyuki Tohoku University, Faculty of Engineering, Professor (Japanese record: Institute for Materials Research, Professor) (30091672)
|
Project Period (FY) |
1994 – 1995
|
Project Status |
Completed (Fiscal Year 1995)
|
Budget Amount |
¥2,000,000 (Direct Cost: ¥2,000,000)
Fiscal Year 1995: ¥700,000 (Direct Cost: ¥700,000)
Fiscal Year 1994: ¥1,300,000 (Direct Cost: ¥1,300,000)
|
Keywords | Wooden blocked / Tibetan manuscripts / Euclidean distance with differential weight / Similar characters / Character recognition / Object oriented design / Buddhist texts / Character segmentation / Automatic character recognition / Object-oriented character dictionary / Crushed characters / Connected characters |
Research Abstract |
Many important Buddhist texts survive as wooden blocked Tibetan manuscripts, and applying object-oriented technology is expected to make their study more efficient. This research originated from the desire to facilitate the coding and compilation of these wooden blocked manuscripts into Romanized form, and so to support Buddhist literature studies with present-day computer assistance. Many Tibetan characters closely resemble one another. We attempt to distinguish such similar characters in wooden blocked Tibetan manuscripts by using Euclidean distance with differential weight. The present object-oriented procedure achieved a 96% recognition rate on 530 closed data samples. To recognize Tibetan characters automatically, each syllable must first be separated from the wooden blocked manuscript. However, 75% of wooden blocked characters have connected structures, and the character segmentation rate achieved so far is 70%. For printed Tibetan characters, the present object-oriented procedure has already achieved a 99.5% recognition rate. Image data is first digitized with an image scanner, loaded into a personal computer, and then recognized automatically. Recognizing one Tibetan character takes about 5 seconds. We are currently studying how to apply this already developed method to the recognition of wooden blocked Tibetan manuscripts.
|
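The abstract names a weighted Euclidean distance as the way similar characters are told apart, but it gives no formula or feature definition. The following is only a minimal sketch of that general idea, not the project's implementation: the 9-element feature vectors, the `difference_weights` helper, and the `boost` value are all hypothetical, chosen purely for illustration. Features on which two similar templates disagree receive a larger weight, and an input is assigned to the template with the smallest weighted distance.

```python
import numpy as np


def weighted_euclidean(x, template, weights):
    """Square root of the weighted sum of squared feature differences."""
    diff = np.asarray(x, dtype=float) - np.asarray(template, dtype=float)
    return float(np.sqrt(np.sum(np.asarray(weights, dtype=float) * diff ** 2)))


def difference_weights(template_a, template_b, boost=4.0):
    """Hypothetical weighting: emphasise features where two templates differ."""
    a = np.asarray(template_a, dtype=float)
    b = np.asarray(template_b, dtype=float)
    weights = np.ones_like(a)
    weights[a != b] = boost
    return weights


def classify(x, templates, weights):
    """Return the label of the template nearest to x under the weighted distance."""
    return min(templates, key=lambda label: weighted_euclidean(x, templates[label], weights))


# Two made-up templates that differ only in the last feature (assumption).
TEMPLATES = {
    "a": np.array([1, 1, 1, 0, 1, 0, 0, 1, 0], dtype=float),
    "b": np.array([1, 1, 1, 0, 1, 0, 0, 1, 1], dtype=float),
}
WEIGHTS = difference_weights(TEMPLATES["a"], TEMPLATES["b"])

# A noisy sample whose last feature is closer to template "b".
sample = np.array([1, 0.8, 1, 0.1, 1, 0, 0, 0.9, 0.7])
label = classify(sample, TEMPLATES, WEIGHTS)
print(label)
```

With only two templates the extra weight does not change which one is nearest; it would matter once more templates, or per-class weights, compete for the same input.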
Report
(3 results)
Research Products
(5 results)