Research on visualization and information extraction from ancient Mongolian historical documents
Project/Area Number |
26730166
|
Research Category |
Grant-in-Aid for Young Scientists (B)
|
Allocation Type | Multi-year Fund |
Research Field |
Library and information science/Humanistic social informatics
|
Research Institution | Ritsumeikan University |
Principal Investigator |
|
Project Period (FY) |
2014-04-01 – 2017-03-31
|
Project Status |
Completed (Fiscal Year 2016)
|
Budget Amount |
¥3,380,000 (Direct Cost: ¥2,600,000, Indirect Cost: ¥780,000)
Fiscal Year 2016: ¥1,040,000 (Direct Cost: ¥800,000, Indirect Cost: ¥240,000)
Fiscal Year 2015: ¥1,300,000 (Direct Cost: ¥1,000,000, Indirect Cost: ¥300,000)
Fiscal Year 2014: ¥1,040,000 (Direct Cost: ¥800,000, Indirect Cost: ¥240,000)
|
Keywords | historical documents / traditional Mongolian / named entity extraction / digital library / machine learning |
Outline of Final Research Achievements |
In this research, we proposed a named entity extraction method for digitized ancient Mongolian documents. Named entities such as personal names and place names were extracted using a Support Vector Machine (SVM), with the aim of reducing the labor-intensive manual analysis of historical texts. Using the extracted results, we built a digital edition of a Mongolian historical manuscript written in traditional Mongolian script. The Text Encoding Initiative (TEI) guidelines were adopted to encode the named entities, commentaries, and transliterations. A web-based prototype was developed for digital humanities scholarship. The prototype can display and search the traditional Mongolian text and its Latin transliteration, along with the highlighted named entities and the scanned images of the source manuscript. We believe the proposed system will be socially significant for uncovering knowledge in ancient Mongolian historical documents that is not available in modern Mongolian documents.
|
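As an illustration of the TEI encoding step described in the outline, the following minimal sketch wraps tokens that a classifier has labeled as personal or place names in the TEI elements persName and placeName. It is not the project's implementation: the function names, the label set ("PER", "LOC", "O"), and the sample transliterated sentence are assumptions made for this example, and each labeled token is wrapped on its own rather than grouped into multi-token entities.

```python
import xml.etree.ElementTree as ET

# Assumed mapping from classifier labels to TEI element names.
# The labels "PER" and "LOC" are hypothetical; persName and placeName
# are standard TEI elements.
LABEL_TO_TEI = {"PER": "persName", "LOC": "placeName"}


def encode_sentence(tokens, labels):
    """Build a TEI <s> element, wrapping labeled tokens in entity elements."""
    s = ET.Element("s")
    last = None  # most recently appended entity element, if any
    for token, label in zip(tokens, labels):
        tag = LABEL_TO_TEI.get(label)
        if tag is not None:
            # Entity token: becomes its own child element.
            last = ET.SubElement(s, tag)
            last.text = token
            last.tail = " "
        elif last is None:
            # Plain token before any entity: goes into the <s> text.
            s.text = (s.text or "") + token + " "
        else:
            # Plain token after an entity: goes into that entity's tail.
            last.tail = (last.tail or "") + token + " "
    # Drop the trailing space left by the last token.
    if last is None:
        s.text = (s.text or "").rstrip()
    else:
        last.tail = (last.tail or "").rstrip()
    return s


def to_xml(element):
    """Serialize an element to a Unicode XML string."""
    return ET.tostring(element, encoding="unicode")


# Made-up transliterated sentence with hypothetical classifier output.
sample_tokens = ["Temujin", "returned", "to", "Onon"]
sample_labels = ["PER", "O", "O", "LOC"]
print(to_xml(encode_sentence(sample_tokens, sample_labels)))
```

In a fuller pipeline, the labels would come from the trained classifier, and the resulting elements would sit inside a complete TEI document alongside the commentaries and transliterations.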
Report
(4 results)
Research Products
(23 results)