FY2021 Research Status Report

Research on analyzing Mongolian legal documents using deep learning

Research Project

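Project/Area Number	21K12600
Research Institution	Ritsumeikan University
Principal Investigator	Batjargal Biligsaikhan, Ritsumeikan University, Kinugasa Research Organization, Researcher (30725396)
Project Period (FY)	2021-04-01 – 2025-03-31
Keywords	deep learning / legal documents / text classification / Mongolian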
Outline of Annual Research Achievements	This research proposes a comprehensive data extraction and analysis method for legal documents in the Mongolian language. Many modern Mongolian legal documents have recently been made publicly available in digital formats. However, these documents have not yet been analyzed, mainly because of the lack of Mongolian Natural Language Processing (NLP) tools that can handle them. Manually reading and analyzing documents is not effective on a massive scale, and researchers and lawyers increasingly demand prompt and accurate large-scale analysis of legal documents. A reliable computerized analysis is therefore necessary, which requires developing an innovative method for analyzing Mongolian legal documents. The proposed method aims to analyze Mongolian legal documents by utilizing deep learning techniques. In FY2021, the following tasks were performed: 1) collecting and preparing training datasets, and 2) demonstrating existing deep learning models. Approximately 11,500 modern Mongolian legal documents, including Mongolian laws and decrees of government organizations, were prepared. Mongolian language resources were also prepared, including a corpus of 100K part-of-speech tagged words, an English-Mongolian law dictionary, “Law” category news from a 75K dataset, a news corpus of 700M words, 220K personal names, 90K clan/family names, and 192K company names. Moreover, existing BERT-based deep learning models were demonstrated for classifying modern Mongolian legal documents, and preliminary experiments were conducted.
Current Status of Research Progress	3: Progress in research has been slightly delayed. Reason: The business trips and surveys planned in Mongolia were significantly delayed, and feedback and data were not obtained as planned because of the COVID-19 situation. Travel restrictions due to COVID-19, the prohibition on entering the university, and the inaccessibility of research facilities slowed this research. Thus, in FY2021, I dedicated my efforts to research activities that require less budget, such as collecting and preparing training datasets. I was able to download several Mongolian legal documents from public domains via the Internet. Part of the budget remained unused because of the slight delays caused by the COVID-19 restrictions and bans.
Strategy for Future Research Activity	In FY2022, the business trips that were delayed due to COVID-19 will be carried out for 1) conducting surveys and evaluations among overseas users, and 2) obtaining analyses and feedback from face-to-face meetings. Development of the proposed method will continue, and I will train a deep learning model for Mongolian legal documents. Experiments will be conducted continuously to improve the proposed method. Timely assistance from subject matter experts and feedback from researchers are necessary. Ongoing research results will be reported in a timely manner, and achievements will be presented at domestic and international conferences. In FY2022, I will also develop a web-based system and make it available on the Internet.
Causes of Carryover	The remaining budget resulted from the COVID-19 restrictions and bans. I will use it for next year’s research for 1) conducting surveys and evaluations among overseas users, and 2) obtaining analyses and feedback from face-to-face meetings, which were delayed due to COVID-19.

Research Products
(4 results)

Journal Article (3 results; of which Int'l Joint Research: 1, Peer Reviewed: 1, Open Access: 3), Presentation (1 result)

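[Journal Article] 日本の歴史的書類におけるくずし字の認識 ――国際ARCセミナー・レビュー [Recognition of Kuzushiji in Japanese Historical Documents: International ARC Seminar Review] (2022)
- Author(s)/Presenter(s)
  Batjargal Biligsaikhan
- Journal Title
  紀要アート・リサーチ
  Volume: 22-2, Page: 2
- Open Access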
[Journal Article] A Prototypical Network-Based Approach for Low-Resource Font Typeface Feature Extraction and Utilization (2021)
- Author(s)/Presenter(s)
  Li Kangying, Batjargal Biligsaikhan, Maeda Akira
- Journal Title
  Data
  Volume: 6, Page: 134
- DOI
  10.3390/data6120134
- Peer Reviewed / Open Access / Int'l Joint Research
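[Journal Article] ARCポータルデータベースの機械判読可能形式データへの変換API開発 [Development of an API for Converting the ARC Portal Database into Machine-Readable Data] (2021)
- Author(s)/Presenter(s)
  Batjargal Biligsaikhan, 津田光弘, 山路正憲, 金子貴昭
- Journal Title
  紀要「アート・リサーチ」テクニカルサポート通信
  Volume: 22-1, Page: 11
- Open Access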
[Presentation] A Yet Another Trial to Apply Deep Learning Technologies to the Digitized Images of the Databases of the Art Research Center Owned Materials (2021)
- Author(s)/Presenter(s)
  Biligsaikhan Batjargal
- Organizer
  The 85th International ARC Seminar (Webinar), Art Research Center, Ritsumeikan University, Japan
