1991 Fiscal Year Final Research Report Summary
Study of French Text Analysis Programming (on Balzac's Novels)
Project/Area Number |
02801060
|
Research Category |
Grant-in-Aid for General Scientific Research (C)
|
Allocation Type | Single-year Grants |
Research Field |
French Language and Literature
|
Research Institution | Saitama University |
Principal Investigator |
KURIU Kazuo, Saitama Univ., Faculty of Liberal Arts, Professor
|
Project Period (FY) |
1990 – 1991
|
Keywords | Balzac / "La Comedie Humaine" / Concordance / French Literature / Frantext / Textual Database / String Treatment |
Research Abstract |
As in the preceding year, I continued to compile a concordance of Balzac's "La Comedie Humaine". (1) This year in particular, I made frequent use of an OCR (Optical Character Recognition) system, and several assistants helped with the work. I now have almost the entire work in machine-readable form. It runs to more than 11,000 pages in the Pleiade edition (short, medium-length, and long novels) and is stored on more than 300 floppy disks. (2) The OCR produced a non-negligible number of misreadings, so the text must be re-read and corrected. This time-consuming work is nevertheless almost finished: no more than 500 pages remain to be re-read. (3) The computer programs that build the concordance are working quite well. They extract each keyword together with the context before and after it, sort the entries in alphabetical order, and so on. (4) In this way I will soon have a very voluminous result: several hundred million bytes on 520 floppy disks. (5) For personal use that is sufficient, but in what form should the concordance be offered to Balzac scholars? There are three possibilities: first, to make it one of the databases at the Scientific Information Center; second, to print it out and deposit it in a university library; or third, to publish it on CD-ROM. In any of these cases, I will need public financial support. In parallel, I tried using the French national database "FRANTEXT". I spoke about this at the annual congress of the Society of French Language and Literature in 1990 and published an account in the Faculty reports in 1991. The French staff responsible for the database and I are now discussing how my corpus could be used from France.
|
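The concordance step described in item (3) of the abstract, extracting each keyword with its context before and after and then sorting alphabetically, can be sketched roughly as follows. This is only an illustrative keyword-in-context routine, not the author's actual programs, whose language and design the report does not describe. The names `kwic` and `sort_key`, the `width` parameter, and the choice to ignore accents when sorting are all assumptions made for this sketch.

```python
import re
import unicodedata


def sort_key(word):
    """Hypothetical collation key: lowercase the word and drop accents,
    so that 'père' sorts as 'pere'. Real French collation rules are more
    subtle; this is an assumption made for the sketch."""
    decomposed = unicodedata.normalize("NFD", word.lower())
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def kwic(text, width=30):
    """Return (keyword, left context, right context) triples for every
    word in `text`, sorted alphabetically by keyword.

    `width` is the number of characters of context kept on each side.
    """
    entries = []
    # \w+ matches runs of letters and digits, including accented letters.
    for match in re.finditer(r"\w+", text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        entries.append((match.group().lower(), left, right))
    # sorted() is stable, so identical keywords stay in text order.
    return sorted(entries, key=lambda entry: sort_key(entry[0]))
```

Calling `kwic(text, width=40)` on a corrected page of text would yield one entry per word occurrence. A full concordance would also record the volume and page of each occurrence, which this sketch omits.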
Research Products
(2 results)