DeMPT: Decoding-enhanced Multi-phase Prompt Tuning for Making LLMs Be Better Context-aware Translators
Abstract: Generally, decoder-only LLMs are adapted to context-aware neural machine translation (NMT) in a concatenation mode: the LLM takes the concatenation of the source sentence (i.e., the intra-sentence context) and the inter-sentence context as input, and then generates the target tokens sequentially. This adaptation strategy, i.e., the concatenation mode, treats the intra-sentence and inter-sentence contexts with the same priority, despite an apparent difference between the two kinds of context. In this paper, we propose an alternative adaptation approach, named Decoding-enhanced Multi-phase Prompt Tuning (DeMPT), which enables LLMs to model and utilize the inter- and intra-sentence contexts discriminately and thereby adapts them to context-aware NMT more effectively. First, DeMPT divides the context-aware NMT process into three separate phases and introduces distinct continuous prompts in each phase, so that the LLM models each kind of information discriminately. Second, at the final decoding phase, DeMPT employs a heuristic to further enhance the discriminative utilization of the source-side inter- and intra-sentence information. Experiments show that our approach significantly outperforms the concatenation method and further improves the performance of LLMs in discourse modeling.
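One way to picture the multi-phase design is as attaching a different trainable soft prompt to each phase of a frozen decoder-only LM. Below is a minimal PyTorch sketch under that reading; the class name `MultiPhasePromptTuner`, the three phase names, and the HuggingFace-style `inputs_embeds`/`config.hidden_size` interface are all illustrative assumptions, not the authors' implementation, and the heuristic decoding enhancement of the final phase is omitted.

```python
import torch
import torch.nn as nn


class MultiPhasePromptTuner(nn.Module):
    """Phase-specific continuous (soft) prompts on top of a frozen
    decoder-only LM. The three phase names are illustrative stand-ins
    for DeMPT's split into separate modeling phases."""

    PHASES = ("inter_context", "intra_source", "decode")

    def __init__(self, lm, prompt_len: int = 16):
        super().__init__()
        self.lm = lm
        for p in self.lm.parameters():  # keep the backbone frozen
            p.requires_grad = False
        hidden = lm.config.hidden_size  # assumes a HuggingFace-style config
        # one independently trainable prompt per phase, so each kind of
        # context is modeled by its own parameters
        self.phase_prompts = nn.ParameterDict({
            name: nn.Parameter(0.02 * torch.randn(prompt_len, hidden))
            for name in self.PHASES
        })

    def forward(self, phase, input_embeds, attention_mask):
        assert phase in self.PHASES
        batch = input_embeds.size(0)
        prompt = self.phase_prompts[phase].unsqueeze(0).expand(batch, -1, -1)
        # prepend the phase-specific soft prompt to the token embeddings
        embeds = torch.cat([prompt, input_embeds], dim=1)
        prompt_mask = attention_mask.new_ones(batch, prompt.size(1))
        mask = torch.cat([prompt_mask, attention_mask], dim=1)
        return self.lm(inputs_embeds=embeds, attention_mask=mask)
```

In this sketch one would feed `input_embeds = lm.get_input_embeddings()(input_ids)` and optimize only `phase_prompts`, keeping the tuned parameter count at `3 * prompt_len * hidden` while the LLM itself stays untouched, which is the usual appeal of prompt tuning over full fine-tuning.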