Budget Amount
¥2,000,000 (Direct Cost : ¥2,000,000)
Fiscal Year 1992 : ¥2,000,000 (Direct Cost : ¥2,000,000)
A new approach to reducing the image noise that disturbs optical character recognition has been studied. The distinctive feature of the study is its use of color information to better separate "true" letters from image noise such as red letters, the paper itself, and pseudo-letters that are written on the reverse side of translucent paper and show through. Original Japanese classical books written in Chinese black ink on white traditional Japanese paper were selected as the research samples.
The results are as follows :
(1) Characteristics of Color Distribution : Original images were digitized with a color image scanner (100 dpi, 256 gray levels for each of R, G and B). Each picture cell (pixel) was represented as a 3-dimensional vector in RGB chromaticity coordinates and then analyzed. The characteristics of the color distribution are: (a) most picture cells have colors distributed along the line R=G=B, (b) red letters have a color distribution different from that in (a), (c) the brightness histograms of R, G and B are almost bimodal.
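As an illustration of characteristics (a) and (b), the following minimal sketch converts an RGB picture cell to chromaticity coordinates and flags cells whose red component stands out from the achromatic (R=G=B) case. The report does not give the exact coordinate definition or any decision rule, so the normalization by R+G+B, the function names, and the `margin` value are all assumptions made for this example.

```python
def chromaticity(r, g, b):
    """Map an RGB picture cell to chromaticity coordinates that sum to 1.

    Assumed definition: each channel divided by R + G + B.
    """
    total = r + g + b
    if total == 0:
        # Pure black has no defined chromaticity; treat it as achromatic.
        return (1 / 3, 1 / 3, 1 / 3)
    return (r / total, g / total, b / total)


def is_reddish(pixel, margin=0.08):
    """Hypothetical rule: red chromaticity exceeds the achromatic value 1/3
    by more than `margin` (an illustrative value, not from the report)."""
    cr, _, _ = chromaticity(*pixel)
    return cr - 1 / 3 > margin


# Illustrative values only: black ink, paper, and a red annotation.
samples = {
    "ink": (30, 28, 27),
    "paper": (230, 225, 210),
    "red letter": (200, 60, 50),
}
flags = {name: is_reddish(rgb) for name, rgb in samples.items()}
```

Under this definition, every cell on the line R=G=B maps to the single point (1/3, 1/3, 1/3), which is why near-gray ink and paper stay close together in chromaticity while red letters move away from them.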
(2) Classification of Images : (a) Characteristics (a) and (b) in (1) are useful for distinguishing red letters from other image components. (b) The discriminant threshold selection method (Otsu's method) was applied to each brightness histogram to determine the threshold between black-letter and paper segments. This method separates the two segments sharply, but it tends to slice off the peripheral picture cells of the "true" black letters. (c) Cluster analysis was introduced to classify "true" black letters and paper segments more precisely, and it gives better results.
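The two classification steps in (b) and (c) can be sketched as follows. `otsu_threshold` is the standard discriminant threshold selection: it picks the gray level that maximizes the between-class variance of a histogram. The report does not say which clustering algorithm was used, so `two_means` (plain k-means with two clusters on RGB vectors) is only an assumed stand-in, and all names and sample values are hypothetical.

```python
def otsu_threshold(hist):
    """Return the level t that maximizes between-class variance.

    Class 0 holds levels <= t, class 1 holds levels > t.
    """
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))
    w0 = 0        # number of cells in class 0
    sum0 = 0      # sum of levels in class 0
    best_t, best_var = 0, -1.0
    for t, count in enumerate(hist):
        w0 += count
        sum0 += t * count
        w1 = total - w0
        if w0 == 0:
            continue          # class 0 still empty
        if w1 == 0:
            break             # class 1 empty from here on
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def dist2(p, q):
    """Squared Euclidean distance between two RGB vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def two_means(pixels, iters=20):
    """Assumed clustering step: k-means with k=2 on RGB vectors.

    Label 0 starts from the darkest cell, label 1 from the brightest.
    """
    centres = [min(pixels, key=sum), max(pixels, key=sum)]
    labels = []
    for _ in range(iters):
        labels = [
            0 if dist2(p, centres[0]) <= dist2(p, centres[1]) else 1
            for p in pixels
        ]
        new_centres = []
        for k in (0, 1):
            members = [p for p, lab in zip(pixels, labels) if lab == k]
            if members:
                mean = tuple(sum(ch) / len(members) for ch in zip(*members))
                new_centres.append(mean)
            else:
                new_centres.append(centres[k])  # keep centre of empty cluster
        if new_centres == centres:
            break
        centres = new_centres
    return labels, centres


# Illustrative bimodal histogram: dark ink near level 30, paper near 220.
hist = [0] * 256
hist[30] = 100
hist[220] = 300
t = otsu_threshold(hist)

# Illustrative cells: two ink cells, one grayish edge cell, two paper cells.
cells = [(20, 20, 20), (30, 28, 25), (90, 85, 80),
         (230, 225, 210), (240, 235, 220)]
labels, centres = two_means(cells)
```

The threshold acts on one channel at a time, whereas the clustering sketch uses all three channels of each cell together, which is one plausible way a cluster-based step could treat intermediate cells at letter edges differently from a single global threshold.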
This study verifies the usefulness of color information for eliminating image noise.