Code-Based English Models' Surprising Performance on the Chinese QA Pair Extraction Task
Abstract: In previous studies, code-based models have consistently outperformed text-based models in reasoning-intensive scenarios. While generating our knowledge base for Retrieval-Augmented Generation (RAG), we observed that code-based models also perform exceptionally well on the Chinese QA pair extraction task. Moreover, our experiments and the metrics we designed reveal that code-based models pretrained on a certain amount of Chinese data achieve even better performance. Additionally, the capability of code-based English models on this specified Chinese task offers a distinct perspective on the philosophical "Chinese Room" thought experiment.
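As a rough illustration (not the authors' actual pipeline), one common way to elicit structured QA pairs from a code-oriented model is to frame extraction as code completion — asking the model to emit a JSON array — and then parse the result. The `build_extraction_prompt` and `parse_qa_pairs` helpers below are hypothetical, and the mocked model response stands in for any real LLM API call:

```python
import json

def build_extraction_prompt(passage: str) -> str:
    # Frame the task as code completion: code-based models tend to
    # follow structured output formats (here, a JSON array) reliably.
    return (
        "# Extract question-answer pairs from the Chinese passage below.\n"
        '# Complete the assignment with a JSON array of objects having\n'
        '# keys "question" and "answer".\n'
        f"# Passage: {passage}\n"
        "qa_pairs = "
    )

def parse_qa_pairs(model_output: str):
    # The model is expected to complete the assignment with a JSON array.
    pairs = json.loads(model_output)
    return [(p["question"], p["answer"]) for p in pairs]

# Example with a mocked model response (no real API call):
mock_output = '[{"question": "什么是RAG?", "answer": "检索增强生成"}]'
print(parse_qa_pairs(mock_output))
```

Framing the output as an assignment to a variable (`qa_pairs = `) plays to a code model's training distribution, which is one plausible reason code-based models handle this structured-extraction task well.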
- On the Cross-lingual Transferability of Monolingual Representations // Annual Meeting of the Association for Computational Linguistics. 2019.
- Qwen Technical Report // ArXiv. 2023. abs/2309.16609.
- DeepSeek LLM: Scaling Open-Source Language Models with Longtermism // ArXiv. 2024.
- Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision. 2023.
- ChatGLM2 Team. ChatGLM2-6B: An Open Bilingual Chat LLM. 2023.
- Efficient and Effective Text Encoding for Chinese LLaMA and Alpaca // ArXiv. 2023. abs/2304.08177.
- QLoRA: Efficient Finetuning of Quantized LLMs // ArXiv. 2023. abs/2305.14314.
- CodeBERT: A Pre-Trained Model for Programming and Natural Languages // ArXiv. 2020. abs/2002.08155.
- Fu Yao, Peng Hao, Khot Tushar. How does GPT Obtain its Ability? Tracing Emergent Abilities of Language Models to their Sources // Yao Fu’s Notion. Dec 2022.
- Scaling Laws for Neural Language Models // ArXiv. 2020. abs/2001.08361.
- Lample Guillaume, Conneau Alexis. Cross-lingual Language Model Pretraining // ArXiv. 2019. abs/1901.07291.
- MLQA: Evaluating Cross-lingual Extractive Question Answering // ArXiv. 2019. abs/1910.07475.
- Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks // ArXiv. 2020. abs/2005.11401.
- CLEVA: Chinese Language Models EVAluation Platform // ArXiv. 2023. abs/2308.04813.
- Lin Chin-Yew. ROUGE: A Package for Automatic Evaluation of Summaries // Text Summarization Branches Out. Barcelona, Spain: Association for Computational Linguistics, July 2004. 74–81.
- Moural Josef. John Searle: The Chinese Room Argument. 2003.
- RWKV: Reinventing RNNs for the Transformer Era // Conference on Empirical Methods in Natural Language Processing. 2023.
- Code Llama: Open Foundation Models for Code // ArXiv. 2023. abs/2308.12950.
- DRCD: a Chinese Machine Reading Comprehension Dataset // ArXiv. 2018. abs/1806.00920.
- A Hierarchical Encoding-Decoding Scheme for Abstractive Multi-document Summarization // Conference on Empirical Methods in Natural Language Processing. 2023.
- RoFormer: Enhanced Transformer with Rotary Position Embedding // ArXiv. 2021. abs/2104.09864.
- A Length-Extrapolatable Transformer // ArXiv. 2022. abs/2212.10554.
- Stanford Alpaca: An Instruction-following LLaMA model. 2023.
- Representing Numbers in NLP: a Survey and a Vision // ArXiv. 2021. abs/2103.13136.
- LLaMA: Open and Efficient Foundation Language Models // ArXiv. 2023. abs/2302.13971.
- Probing Pretrained Language Models for Lexical Semantics // Conference on Empirical Methods in Natural Language Processing. 2020.
- Code4Struct: Code Generation for Few-Shot Event Structure Prediction // Annual Meeting of the Association for Computational Linguistics. 2022.
- Baichuan 2: Open Large-scale Language Models // ArXiv. 2023. abs/2309.10305.