FY2020 Research Status Report

Multilingual Knowledge Discovery in Digital Cultural Collections

Research Project

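Project/Area Number	20K20135
Research Institution	Ritsumeikan University
Principal Investigator	SONG Yuting, Ritsumeikan University, College of Information Science and Engineering, Assistant Professor (50849388)
Project Period (FY)	2020-04-01 – 2023-03-31
Keywords	Word embeddings / MT evaluation / Metadata translation / Entity recognition / Relation extraction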
Summary of Research Achievements	This year we focused on improving bilingual word embedding models and collecting datasets of metadata records. First, we proposed a method to improve the accuracy of Japanese-English bilingual word embeddings. Second, we made a preliminary attempt to evaluate machine translation of ukiyo-e metadata records from Japanese to English. In addition, to support further experiments, we collected English human translations of Japanese ukiyo-e metadata records through a crowdsourcing platform. The machine translations of ukiyo-e metadata records were also evaluated by native speakers of both Japanese and English through a crowdsourcing platform (Lancers). Overall, the project has proceeded smoothly, step by step, in line with the research proposal.
Current Status of Research Progress (Category)	2: Progressing generally smoothly. Reason: The project is progressing smoothly as planned. We have proposed a method to improve Japanese-English word embeddings. We have also evaluated the performance of online machine translation systems (i.e., Google Translate, Microsoft Translator, and DeepL Translator) on translating Japanese ukiyo-e metadata into English. In addition, we have collected Japanese-English metadata records for future research. Furthermore, we have surveyed current neural-network-based models for entity and relation extraction, which can be applied to the ukiyo-e metadata dataset next year.
Plan for Future Research	For future work, we will focus on developing neural-network-based methods for learning multilingual representations of metadata and for extracting named entities from Japanese and English textual metadata in cultural collections. We will also manually annotate named entities in metadata records; these annotations are essential for training and evaluating entity extraction models.
Reason for Carryover to the Next Fiscal Year	We will use the budget to purchase hardware such as GPUs so that we can conduct research based on deep neural networks. Some funds will also be spent on crowdsourcing jobs for data annotation. Finally, we will attend conferences to disseminate research results.

Research Products
(5 results)

Journal Article (1 result, of which 1 peer-reviewed); Presentation (4 results, of which 2 at international conferences)

[Journal Article] Learning Japanese-English Bilingual Word Embeddings by Using Language Specificity (2020)
- Authors/Presenters
  Song Yuting, Batjargal Biligsaikhan, Maeda Akira
- Journal Title
  International Journal of Asian Language Processing
  Volume: 3 Pages: 14 pages
- DOI
  10.1142/S2717554520500149
- Peer Reviewed
[Presentation] Linking Ukiyo-e Records across Languages: An Application of Cross-Language Record Linkage Techniques to Digital Cultural Collections (2021)
- Authors/Presenters
  Yuting Song, Biligsaikhan Batjargal, and Akira Maeda
- Organizer/Conference
  The 5th Anniversary International Symposium of Asia-Japan Research at Ritsumeikan University
[Presentation] Joint Entity and Relation Extraction from Clinical Records Using Pre-trained Language Model (2021)
- Authors/Presenters
  FANG Xintao, SONG Yuting, Maeda Akira
- Organizer/Conference
  The 13th Forum on Data Engineering and Information Management (DEIM2021)
[Presentation] A Preliminary Attempt to Evaluate Machine Translations of Ukiyo-e Metadata Records (2020)
- Authors/Presenters
  Yuting Song, Biligsaikhan Batjargal, and Akira Maeda
- Organizer/Conference
  The 22nd International Conference on Asia-Pacific Digital Libraries
- International Conference
[Presentation] Finding Identical Ukiyo-e Prints across Databases in Japanese, English and Dutch (2020)
- Authors/Presenters
  Yuting Song, Biligsaikhan Batjargal, and Akira Maeda
- Organizer/Conference
  Digital Humanities 2020
- International Conference
