Outline of Research Achievements |
This research proposes a comprehensive data extraction and analysis method for legal documents in the Mongolian language. In recent years, many modern Mongolian legal documents have been made publicly available in digital formats. However, these documents have not yet been analyzed, mainly because of the lack of Mongolian Natural Language Processing (NLP) tools that can handle modern Mongolian legal text. Manually reading and analyzing documents is not practical at a massive scale, while researchers and lawyers increasingly demand large-scale analysis of legal documents with prompt and accurate results. A reliable computerized analysis is therefore necessary, and it requires developing a new method for analyzing Mongolian legal documents. The proposed method aims to analyze Mongolian legal documents by utilizing deep learning techniques.
In FY2021, the following tasks were performed: 1) collecting and preparing training datasets, and 2) demonstrating existing deep learning models. Approximately 11,500 modern Mongolian legal documents, including Mongolian laws and decrees of government organizations, were prepared. Mongolian language resources were also prepared, including a corpus of 100K part-of-speech tagged words, an English-Mongolian law dictionary, “Law” category news from a 75K dataset, a news corpus of 700M words, 220K personal names, 90K clan/family names, and 192K company names.
Moreover, existing BERT-based deep learning models were demonstrated on the task of classifying modern Mongolian legal documents, and preliminary experiments were conducted.
|
Current Status of Research Progress (Category) |
3: Slightly delayed
Reason
The business trips and surveys planned in Mongolia were delayed significantly, and feedback and data could not be obtained as planned because of the COVID-19 situation. Travel restrictions due to COVID-19, the prohibition on entering the University, and the inaccessibility of research facilities slowed down this research. Thus, in FY2021, I devoted my efforts to research activities that require less budget, such as collecting and preparing training datasets. I was able to download a number of Mongolian legal documents from public sources via the Internet. Part of the budget remained unused because of the slight delays caused by COVID-19 restrictions and bans.
|
Strategy for Future Research Activity |
In FY2022, the business trips that were delayed due to COVID-19 will be conducted in order to 1) conduct surveys and evaluations among overseas users, and 2) obtain analyses and feedback through face-to-face meetings.
Development of the proposed method will also continue, and I will train a deep learning model for Mongolian legal documents. Experiments will be conducted continuously to improve the proposed method. Timely assistance from subject matter experts and feedback from researchers are necessary. Ongoing research results will be reported promptly, and achievements will be presented at domestic and international conferences.
In FY2022, I will also develop a web-based system and make it available on the Internet.
|
Reason for the Amount Carried Over to the Next Fiscal Year |
Part of the budget remained unused because of COVID-19 restrictions and bans. I will use the remaining budget in next year’s research for the activities that were delayed due to COVID-19: 1) conducting surveys and evaluations among overseas users, and 2) obtaining analyses and feedback through face-to-face meetings.
|