
Introducing DictaLM -- A Large Generative Language Model for Modern Hebrew

Published 25 Sep 2023 in cs.CL (arXiv:2309.14568v1)

Abstract: We present DictaLM, a large-scale LLM tailored for Modern Hebrew. Boasting 7B parameters, this model is predominantly trained on Hebrew-centric data. As a commitment to promoting research and development in the Hebrew language, we release both the foundation model and the instruct-tuned model under a Creative Commons license. Concurrently, we introduce DictaLM-Rab, another foundation model geared towards Rabbinic/Historical Hebrew. These foundation models serve as ideal starting points for fine-tuning on various Hebrew-specific tasks, such as instruction following, Q&A, sentiment analysis, and more. This release represents a preliminary step, offering an initial Hebrew LLM for the Hebrew NLP community to experiment with.

