
General2Specialized LLMs Translation for E-commerce

Published 6 Mar 2024 in cs.CL and cs.AI (arXiv:2403.03689v2)

Abstract: Existing Neural Machine Translation (NMT) models mainly handle translation in the general domain, while overlooking domains with special writing conventions, such as e-commerce and legal documents. Taking e-commerce as an example, the texts usually contain large numbers of domain-specific terms and exhibit more grammatical irregularities, which leads to inferior performance from current NMT methods. To address these problems, we collect two domain-related resources: a set of term pairs (aligned Chinese-English bilingual terms) and a parallel corpus annotated for the e-commerce domain. Furthermore, we propose a two-step fine-tuning paradigm (named G2ST) with self-contrastive semantic enhancement to transfer a general NMT model to a specialized NMT model for e-commerce. The paradigm can be applied to NMT models based on LLMs. Extensive evaluations on real e-commerce titles demonstrate the superior translation quality and robustness of our G2ST approach compared with state-of-the-art NMT models such as LLaMA, Qwen, GPT-3.5, and even GPT-4.
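The abstract does not spell out how the self-contrastive semantic enhancement is computed. A common way to realize such a consistency term (in the style of R-Drop) is to forward the same input twice with different dropout masks and penalize the divergence between the two predictive distributions. The sketch below is an illustrative assumption, not the paper's implementation; the function names and the `alpha` weight are hypothetical.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kl_div(p, q):
    # KL(p || q) row-wise for probability vectors.
    return np.sum(p * (np.log(p) - np.log(q)), axis=-1)

def self_contrastive_loss(logits_a, logits_b, ce_a, ce_b, alpha=1.0):
    """Assumed R-Drop-style consistency objective: the same source
    sentence is forwarded twice with independent dropout masks,
    giving logits_a and logits_b (and cross-entropy losses ce_a,
    ce_b against the reference translation). A symmetric KL term
    pulls the two predictive distributions together."""
    p, q = softmax(logits_a), softmax(logits_b)
    sym_kl = 0.5 * (kl_div(p, q) + kl_div(q, p)).mean()
    ce = 0.5 * (ce_a + ce_b)
    return ce + alpha * sym_kl
```

When the two forward passes agree exactly, the KL term vanishes and the loss reduces to plain cross-entropy; disagreement between the dropout-perturbed passes adds a penalty scaled by `alpha`.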

