2023 Fiscal Year Annual Research Report
Unifying Pre-training and Multilingual Semantic Representation Learning for Low-resource Neural Machine Translation
Project/Area Number | 22KJ1843 |
Allocation Type | Multi-year Fund |
Research Institution | Kyoto University |
Principal Investigator | 毛 卓遠 (MAO Zhuoyuan), Kyoto University, Graduate School of Informatics, JSPS Research Fellow (DC2) |
Project Period (FY) | 2023-03-08 – 2024-03-31 |
Keywords | low-resource translation / sentence embedding |
Outline of Annual Research Achievements |
In the last fiscal year, we developed LEALLA, a state-of-the-art lightweight sentence embedding model; with such a pre-trained sentence-level semantic model, new parallel corpora can be constructed more efficiently. We also analyzed the Transformer architecture for low-resource translation and published a paper at a top conference. Finally, we consolidated all of this work into a thesis.

Overall, this research undertakes a comprehensive exploration of multilingual representation learning, especially for low-resource translation, addressing the three challenges identified in this domain:
(1) To address the high computational demand that accompanies expanding the language coverage of multilingual models, we proposed an efficient and effective multilingual sentence embedding (MSE) model, and we introduced a new knowledge distillation method for training lightweight MSE models (a minimal sketch of such a distillation objective is given after this list).
(2) To tackle data scarcity in low-resource languages, we proposed new pre-training objectives for low-resource NMT and introduced word-level contrastive learning for low-resource NMT that exploits statistical word alignments (also sketched after this list). We further introduced AlignInstruct to enhance the translation accuracy of large language models on low-resource languages.
(3) To address the limitations of the Transformer architecture for zero-shot NMT, we first proposed a new Transformer architecture that constructs interlingual representations on top of the Transformer encoder, and we comprehensively examined the effects of layer normalization in zero-shot NMT.
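As a minimal sketch of the kind of feature-matching distillation objective mentioned in (1), the snippet below trains a lightweight student embedding against a large frozen teacher. All names, dimensions, and loss terms here are illustrative placeholders under our own assumptions, not the actual LEALLA training code; random tensors stand in for real encoder outputs.

```python
import torch
import torch.nn.functional as F

# Toy sizes (hypothetical, not the actual LEALLA configuration).
batch, teacher_dim, student_dim = 32, 768, 128

# Frozen teacher sentence embeddings; random tensors stand in for real encoder outputs.
teacher_emb = torch.randn(batch, teacher_dim)

# Lightweight student embeddings (in practice produced by a small encoder).
student_emb = torch.randn(batch, student_dim, requires_grad=True)

# A linear head lifts student embeddings into the teacher space for comparison.
proj = torch.nn.Linear(student_dim, teacher_dim)
student_in_teacher_space = proj(student_emb)

# Feature-matching distillation: the projected student embedding should
# reproduce the teacher embedding (MSE), plus a cosine term so that the
# directions of the two embeddings also agree.
loss_mse = F.mse_loss(student_in_teacher_space, teacher_emb)
loss_cos = 1.0 - F.cosine_similarity(student_in_teacher_space, teacher_emb, dim=-1).mean()
loss = loss_mse + loss_cos
loss.backward()  # gradients flow to the student embeddings and the projection head
print(float(loss))
```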
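Similarly, the word-level contrastive objective in (2) can be pictured as an InfoNCE loss over hidden states of aligned word pairs, where the pairs come from a statistical word aligner. The sketch below is a simplified illustration under that assumption; the batch construction, temperature, and use of random stand-in states are hypothetical and differ from the actual training setup.

```python
import torch
import torch.nn.functional as F

n, dim, tau = 64, 256, 0.1  # aligned word pairs, hidden size, temperature (all hypothetical)

# Hidden states of aligned word pairs. In real training these would be
# encoder/decoder states of an NMT model, with pair (i, i) provided by a
# statistical word aligner; random tensors stand in for them here.
src_states = torch.randn(n, dim, requires_grad=True)
tgt_states = torch.randn(n, dim, requires_grad=True)
src_h = F.normalize(src_states, dim=-1)
tgt_h = F.normalize(tgt_states, dim=-1)

# Similarity between every source token and every target token in the batch.
logits = src_h @ tgt_h.T / tau

# InfoNCE: source token i must identify its aligned target token i among all
# other target tokens, which act as in-batch negatives.
labels = torch.arange(n)
loss = F.cross_entropy(logits, labels)
loss.backward()
print(float(loss))
```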