2016 Fiscal Year Final Research Report
Research on visualization and information extraction from ancient Mongolian historical documents
Project/Area Number | 26730166 |
Research Category | Grant-in-Aid for Young Scientists (B) |
Allocation Type | Multi-year Fund |
Research Field | Library and information science/Humanistic social informatics |
Research Institution | Ritsumeikan University |
Principal Investigator |
|
Project Period (FY) | 2014-04-01 – 2017-03-31 |
Keywords | historical documents / traditional Mongolian / named entity extraction / digital library / machine learning |
Outline of Final Research Achievements |
In this research, we proposed a named entity extraction method for digitized ancient Mongolian documents. Named entities such as personal names and place names were extracted using a Support Vector Machine (SVM), with the aim of reducing the labor-intensive manual analysis of historical texts. Using the extracted results, we built a digital edition of a Mongolian historical manuscript written in traditional Mongolian script. The Text Encoding Initiative (TEI) guidelines were adopted to encode the named entities, commentaries, and transliterations. A web-based prototype was developed for digital humanities scholarship. The prototype can display and search traditional Mongolian text and its Latin-letter transliteration, together with the highlighted named entities and the scanned images of the source manuscript. We believe the proposed system has social significance in helping to uncover knowledge in ancient Mongolian historical documents that is not available in modern Mongolian documents.
|
Free Research Field | Digital Humanities |
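The outline above mentions encoding extracted named entities according to the Text Encoding Initiative guidelines. As a rough illustration only, the following Python sketch shows how labeled tokens from an extractor might be wrapped in the TEI elements `persName` and `placeName`. The label names (`PER`, `LOC`, `O`), the function `encode_paragraph`, and the sample tokens are all hypothetical and are not taken from the project; the real system's markup and pipeline may differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from extractor labels to TEI element names.
# persName and placeName are standard TEI elements; the labels are assumed.
TEI_TAGS = {"PER": "persName", "LOC": "placeName"}


def encode_paragraph(tokens):
    """Wrap labeled tokens in a <p> element, tagging named entities.

    tokens: list of (word, label) pairs; any label not in TEI_TAGS
    is treated as plain text.
    """
    p = ET.Element("p")
    last = None  # most recently added entity element, if any
    for word, label in tokens:
        tag = TEI_TAGS.get(label)
        if tag is None:
            # Plain text goes into p.text before the first entity,
            # and into the tail of the latest entity afterwards.
            if last is None:
                p.text = (p.text or "") + word + " "
            else:
                last.tail = (last.tail or "") + word + " "
        else:
            last = ET.SubElement(p, tag)
            last.text = word
            last.tail = " "
    # Drop the trailing separator left by the loop.
    if last is None:
        p.text = (p.text or "").rstrip()
    else:
        last.tail = (last.tail or "").rstrip()
    return p


# Made-up transliterated tokens; "word1"/"word2" stand in for ordinary words.
sample = [("Temujin", "PER"), ("word1", "O"), ("Onon", "LOC"), ("word2", "O")]
xml_text = ET.tostring(encode_paragraph(sample), encoding="unicode")
print(xml_text)
```

In a full TEI document this paragraph would sit inside a `text`/`body` structure with a `teiHeader`, which the sketch leaves out for brevity.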