Outline of Research at the Start
As globalization progresses, the demand for automatic multilingual language understanding and translation is increasing dramatically in many settings. We aim to tackle the technical barriers in low-resource machine translation (LMT) and design a robust multilingual translation system that supports a large number of languages, including several low-resource languages. (Low-resource languages are languages for which we do not have sufficient data resources to train translation models.)
Outline of Annual Research Achievements
In the last fiscal year, we developed a state-of-the-art lightweight sentence embedding model, LEALLA. With this pre-trained sentence-level semantic model, new parallel corpora can be constructed more efficiently. We also analyzed the Transformer model architecture for low-resource translation and published a paper at a top conference. Finally, we compiled all of this work into a thesis.

In general, this research embarks on a comprehensive exploration of multilingual representation learning, especially for low-resource translation, addressing three identified challenges within this domain: (1) To address the high computational demand that accompanies expanding the language coverage of multilingual models, we proposed an efficient and effective multilingual sentence embedding (MSE) model and introduced a new knowledge distillation method for training lightweight MSE models. (2) To tackle the challenge of data scarcity in low-resource languages, we proposed new pre-training objectives for low-resource NMT, introduced word-level contrastive learning for low-resource NMT that utilizes statistical word alignments, and introduced AlignInstruct to enhance the translation accuracy of large language models on low-resource languages. (3) To address the limitations of the Transformer architecture for zero-shot NMT, we first proposed a new Transformer architecture that constructs interlingual representations on top of the Transformer encoder, and then comprehensively examined the effects of layer normalization in zero-shot NMT.
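To illustrate how a pre-trained sentence embedding model such as LEALLA can make parallel corpus construction more efficient, the following is a minimal sketch of similarity-based bitext mining. It is not the project's actual pipeline: the `embed` placeholder, the 256-dimensional vectors, and the 0.8 similarity threshold are assumptions for illustration only; in practice `embed` would call the pre-trained model.

```python
# Minimal sketch of bitext mining with a sentence embedding model.
# Assumption: `embed` stands in for a pre-trained encoder (e.g., LEALLA);
# here it returns random unit vectors only so the sketch runs end to end.
import numpy as np

def embed(sentences):
    # Placeholder embedding: replace with the real pre-trained model.
    rng = np.random.default_rng(0)
    vecs = rng.normal(size=(len(sentences), 256))
    return vecs / np.linalg.norm(vecs, axis=1, keepdims=True)

def mine_parallel_pairs(src_sents, tgt_sents, threshold=0.8):
    """Pair each source sentence with its most similar target sentence
    and keep the pair if cosine similarity exceeds the threshold."""
    src_vecs = embed(src_sents)   # (n_src, dim), L2-normalized
    tgt_vecs = embed(tgt_sents)   # (n_tgt, dim), L2-normalized
    sims = src_vecs @ tgt_vecs.T  # cosine similarity matrix
    best = sims.argmax(axis=1)    # best target index per source sentence
    return [
        (src_sents[i], tgt_sents[j], float(sims[i, j]))
        for i, j in enumerate(best)
        if sims[i, j] >= threshold
    ]

pairs = mine_parallel_pairs(["Hello world.", "Good morning."],
                            ["Bonjour le monde.", "Bon matin."])
```

Because the embeddings are compared only by cosine similarity, a smaller (lightweight) embedding model directly reduces both the encoding and the pairwise-scoring cost when mining large monolingual collections.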