Research Project/Area Number | 22J13719 |
Allocation Type | Single-year Grants |
Research Institution | Kyoto University |
Principal Investigator | 毛 卓遠, Kyoto University, Graduate School of Informatics, JSPS Research Fellow (DC2) |
Project Period (FY) | 2022-04-22 – 2024-03-31 |
Keywords | multilingual translation / low-resource translation / multilingual embedding / model efficiency |
Outline of Annual Research Achievements |
In the past year, we focused on improving the efficiency of multilingual sentence representation learning and on exploring novel methods for improving multilingual machine translation; both lines of work advance research on multilingual / low-resource neural machine translation. (1) We proposed an efficient and effective training method and presented it at 言語処理学会 2023 (the Annual Meeting of the Association for Natural Language Processing). In addition, we proposed knowledge distillation for compressing a large model, which enables efficient inference; this work was accepted to the EACL 2023 main conference. With these achievements, the process of collecting parallel sentences for training translation systems is accelerated: the model training phase can be sped up by 4–16 times, and the model inference phase achieves a 2.5–5 times speedup, with even greater speedups on downstream tasks. (2) We explored novel ways to improve multilingual translation systems with a word-level contrastive learning technique and obtained better translation quality for low-resource language pairs; this work was accepted to the Findings of NAACL 2022. We also explained the improvements by showing the relationship between BLEU scores and the sentence retrieval performance of the NMT encoder, which suggests that future work can focus on further improving the encoder's retrieval performance in many-to-many NMT and on the contrastive objective's feasibility in a massively multilingual scenario.
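To illustrate the word-level contrastive technique mentioned in (2), the sketch below shows one plausible form of such an objective: aligned source/target word pairs (e.g. produced by an external word aligner) are pulled together, while the remaining target positions serve as in-sentence negatives. The function name, tensor shapes, and temperature value are assumptions for this illustration, not the exact formulation of the published method.

```python
import torch
import torch.nn.functional as F

def word_contrastive_loss(src_states, tgt_states, align_pairs, temperature=0.1):
    """InfoNCE-style word-level contrastive loss (illustrative sketch).

    src_states: (S, d) encoder states of the source sentence.
    tgt_states: (T, d) states of the target sentence.
    align_pairs: list of (i, j) index pairs of aligned source/target words.
    """
    src_idx = torch.tensor([i for i, _ in align_pairs])
    tgt_idx = torch.tensor([j for _, j in align_pairs])

    anchors = F.normalize(src_states[src_idx], dim=-1)   # (P, d) aligned source words
    candidates = F.normalize(tgt_states, dim=-1)          # (T, d) all target words

    # Each aligned source word should score highest against its aligned
    # target word; the other target positions act as negatives.
    logits = anchors @ candidates.t() / temperature        # (P, T)
    return F.cross_entropy(logits, tgt_idx)

# Toy usage: the contrastive term would be added to the standard NMT
# cross-entropy loss, e.g. total = nmt_loss + lambda_ctr * ctr_loss.
src, tgt = torch.randn(7, 512), torch.randn(6, 512)
ctr_loss = word_contrastive_loss(src, tgt, [(0, 0), (2, 1), (5, 4)])
```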
|
Current Status of Research Progress (Category) |
2: Research has progressed rather smoothly
Reason
We largely completed the plans intended for the past year, including proposing novel methods for training multilingual neural machine translation systems and exploring corpus construction for multilingual / low-resource neural machine translation. However, as recent work on large language models (GPT) shows that the scale of the model and of the training data is essential, we adjusted our original plan of constructing corpora ourselves. Instead, we focused on the efficiency of the methods for constructing new training data, for which we proposed two methods that improve training efficiency and inference efficiency, respectively. The research is therefore progressing well, with only an appropriate adjustment to one specific sub-plan.
|
Strategy for Future Research Activity |
In the following year, we will focus on improving translation quality for more language pairs, especially for zero-shot neural machine translation. Specifically, we will first explore the optimal model settings for training large-scale multilingual neural machine translation systems. Subsequently, we will explore ways to improve translation quality for zero-resource language pairs by training intermediate language-agnostic sentence representations within the encoder-decoder model architecture. Moreover, we will submit our previous efficient and effective sentence representation learning method for journal review and present our existing work at international conferences to promote progress in multilingual / low-resource machine translation. Furthermore, with the emergence of GPT-like large language models, we plan to add a new research topic as a sub-project within this series of translation research. Specifically, we will explore how to prompt large language models to perform well in any desired translation direction, utilizing our proposed multilingual sentence representation techniques to generate robust, translation-task-specific prompts.
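As a concrete illustration of the planned prompting sub-project, the sketch below retrieves the parallel examples closest to an input sentence in a multilingual embedding space and formats them as a few-shot translation prompt for a large language model. This is a minimal sketch of the idea only: LaBSE (via the sentence-transformers library) stands in for our own sentence representation model, and the function name, prompt template, and parameters are illustrative assumptions.

```python
import numpy as np
from sentence_transformers import SentenceTransformer

# LaBSE is used here only as a stand-in for our multilingual sentence encoder.
encoder = SentenceTransformer("sentence-transformers/LaBSE")

def build_translation_prompt(source_sentence, example_pool, src_lang, tgt_lang, k=3):
    """Build a few-shot translation prompt from retrieved parallel examples.

    example_pool: list of (src_text, tgt_text) parallel sentence pairs.
    The k examples whose source side is most similar to the input sentence
    (by cosine similarity of normalised embeddings) are used as demonstrations.
    """
    pool_vecs = encoder.encode([s for s, _ in example_pool], normalize_embeddings=True)
    query_vec = encoder.encode([source_sentence], normalize_embeddings=True)[0]
    scores = pool_vecs @ query_vec                 # cosine similarity
    top_k = np.argsort(-scores)[:k]

    lines = [f"Translate from {src_lang} to {tgt_lang}."]
    for i in top_k:
        s, t = example_pool[i]
        lines.append(f"{src_lang}: {s}\n{tgt_lang}: {t}")
    lines.append(f"{src_lang}: {source_sentence}\n{tgt_lang}:")
    return "\n\n".join(lines)
```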
|