Project/Area Number |
26730166
|
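Research Institution | Ritsumeikan University |
Principal Investigator |
バトジャルガル ビルゲ, Ritsumeikan University, Research Organization of Science and Technology, Researcher (30725396)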
|
Project Period (FY) |
2014-04-01 – 2017-03-31
|
Keywords | historical documents / traditional Mongolian / named entity extraction / digital library / machine learning |
Outline of Annual Research Achievements |
In AY2015, we proposed a named entity extraction method for digitized ancient Mongolian historical documents that uses features of the traditional Mongolian script. The method extracts personal names and place names with machine learning techniques, with the aim of reducing labor-intensive manual analysis of historical text. In this approach, an ancient Mongolian corpus is tokenized, each token is annotated, and the resulting gold-standard annotations serve as training input for the machine learning algorithms. The method learns extraction rules for personal names and place names from the annotated training corpora and then applies them to extract such names from ancient Mongolian texts. We also tagged generational or dynastic information, inherited or lifetime titles of nobility, and traditional descriptive phrases or nicknames.
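The tokenize, annotate, and learn steps described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the whitespace tokenizer, the BIO label scheme with hypothetical `PER` and `LOC` types, the generic context features (the script-specific features used in the project are not listed here), and the toy romanized tokens, which are not taken from the project's corpus.

```python
from typing import Dict, List, Tuple

# Hypothetical entity span: (start token index, end index exclusive, type).
# PER = personal name, LOC = place name; both labels are assumptions.
Span = Tuple[int, int, str]


def tokenize(text: str) -> List[str]:
    """Split romanized text on whitespace (a simplification for illustration)."""
    return text.split()


def to_bio(tokens: List[str], spans: List[Span]) -> List[str]:
    """Turn gold-standard entity spans into one BIO label per token."""
    labels = ["O"] * len(tokens)
    for start, end, kind in spans:
        labels[start] = f"B-{kind}"
        for i in range(start + 1, end):
            labels[i] = f"I-{kind}"
    return labels


def token_features(tokens: List[str], i: int) -> Dict[str, str]:
    """Generic context features that a sequence labeler could be trained on."""
    return {
        "token": tokens[i],
        "suffix2": tokens[i][-2:],
        "prev": tokens[i - 1] if i > 0 else "<BOS>",
        "next": tokens[i + 1] if i < len(tokens) - 1 else "<EOS>",
    }


# Toy input: one personal name followed by a two-token place name.
tokens = tokenize("temujin onon muren dur torobe")
labels = to_bio(tokens, [(0, 1, "PER"), (1, 3, "LOC")])
features = [token_features(tokens, i) for i in range(len(tokens))]
```

The resulting `features` and `labels` lists are the kind of paired input that a supervised sequence labeling algorithm would consume for training.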
Moreover, we are creating digital representations of ancient Mongolian historical documents in order to: 1) encode contextual information, formalizing and representing explicit information about context; 2) encode ancient words that were misspelled or written differently from the ancient orthography, together with their modern orthography, while preserving the writing of the original manuscripts; and 3) represent editorial markup, commentaries, alterations, revisions, corrections, transcriptions, and interpretations.
Parts of the ongoing research results and achievements have been published in two international conference papers.
|
Current Status of Research Progress (Category) |
2: Progressing generally smoothly
Reason
Our research has been conducted according to the research plan. As planned, the achievements in AY2015 advance the research toward its goal of developing an automatic named entity extraction method that employs machine learning techniques to reduce labor-intensive annotation of historical text.
Several important features of the traditional Mongolian script for distinguishing named entities have been investigated and identified. These features may serve as clues that the proposed method can use to distinguish named entities.
|
Strategy for Future Research Activity |
In AY2016, building on the results obtained so far, we will develop web-based systems that visualize historical figures, ancient place names, and the various commentaries, transcriptions, annotations, and interpretations, and make them available on the Internet. User evaluations will also be conducted with experts and humanities researchers.
Part-time research assistance from experts and students is necessary to 1) evaluate the proposed system, 2) conduct experiments, and 3) analyze the experimental results. The proposed system will be evaluated by 1) conducting experiments and calculating standard measures such as precision, recall, and F-measure; and 2) user evaluation among experts and users who have tried the proposed system. We plan to conduct evaluations at the National University of Mongolia and at Ritsumeikan University in Japan. We also plan to have several experts perform evaluations. Feedback from the researchers will be collected in a timely manner. Further improvements to the system will be made based on the evaluation results and user feedback. Research achievements and results will be presented at domestic and international conferences.
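As a minimal sketch of the standard measures mentioned above, precision, recall, and F-measure can be computed by comparing predicted entities with gold-standard ones. The exact-match criterion, the `(start, end, type)` entity representation, and the toy data are assumptions for illustration, not the project's evaluation protocol or results.

```python
from typing import Set, Tuple

# Hypothetical entity representation: (start token, end token exclusive, type).
Entity = Tuple[int, int, str]


def precision_recall_f1(gold: Set[Entity],
                        predicted: Set[Entity]) -> Tuple[float, float, float]:
    """Exact-match scores: an entity counts only if span and type both agree."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    # F-measure here is the balanced F1: harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy data: three gold entities; two predictions, one of them correct.
gold = {(0, 1, "PER"), (1, 3, "LOC"), (7, 8, "PER")}
predicted = {(0, 1, "PER"), (1, 2, "LOC")}
p, r, f = precision_recall_f1(gold, predicted)
```

With this toy data, one of two predictions is correct and one of three gold entities is found, so precision is 0.5, recall is about 0.33, and F1 is 0.4. The second prediction fails only because its span boundary is wrong, which shows how strict the exact-match criterion is.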
Once we obtain satisfactory results, we intend to extend the proposed method to other historical documents, not only those in Mongolian.
|
Reason for Carryover to the Next Fiscal Year |
In AY2015, I devoted my efforts to research activities that required less budget. However, in the final year, AY2016, my research activities will require more budget than in AY2015.
|
Expenditure Plan for the Carryover Budget |
I will use the remaining budget for next year’s research activities, which require a larger budget. These activities include 1) developing web-based systems and creating websites; 2) hiring experts and students on a part-time basis to a) evaluate the proposed system, b) conduct experiments, and c) analyze the experimental results; and 3) making business trips to a) meet researchers at the National University of Mongolia and obtain their feedback, advice, and evaluations, and b) present research achievements and results at domestic and international conferences.
|