Taiwan LLM: Bridging the Linguistic Divide with a Culturally Aligned Language Model
Abstract: In the realm of large language models (LLMs), the nuanced linguistic and cultural intricacies of Traditional Chinese, as spoken in Taiwan, have been largely overlooked. This paper introduces Taiwan LLM, a pioneering LLM that specifically caters to the Traditional Chinese language, with a focus on the variant used in Taiwan. Leveraging a comprehensive pretraining corpus and instruction-finetuning datasets, we have developed a model that not only understands the complexities of Traditional Chinese but also embodies the cultural context of Taiwan. Taiwan LLM is the first of its kind: a model that is not only linguistically accurate but also culturally resonant with its user base. Our evaluations demonstrate that Taiwan LLM achieves superior performance in understanding and generating Traditional Chinese text, outperforming existing models trained predominantly on Simplified Chinese or English. The open-source release of Taiwan LLM invites collaboration and further innovation, ensuring that the linguistic diversity of Chinese speakers is embraced and well served. The model, datasets, and further resources are made publicly available to foster ongoing research and development in this field.
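Since the abstract notes that the model is released openly, a minimal sketch of loading and querying it with the Hugging Face `transformers` library is shown below. The repository ID is an assumption for illustration; check the authors' release page for the exact name.

```python
# Minimal sketch: load the open-source Taiwan LLM chat model and generate
# a Traditional Chinese response. Requires `transformers`, `torch`, and
# `accelerate` (for device_map="auto").
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "yentinglin/Taiwan-LLM-7B-v2.0-chat"  # assumed repo ID, not confirmed by the paper

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float16,  # half precision to fit a single consumer GPU
    device_map="auto",          # place layers automatically across available devices
)

prompt = "台灣最高的山是哪一座？"  # "Which mountain is the tallest in Taiwan?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

In practice, a chat-tuned checkpoint would typically be queried through its chat template (e.g., `tokenizer.apply_chat_template`) rather than a raw prompt, but the bare-prompt form above keeps the sketch self-contained.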