Building a Lemmatizer and a Spell-checker for Sorani Kurdish

Published 27 Sep 2018 in cs.CL | (1809.10763v1)

Abstract: This paper presents a lemmatization system and a word-level error correction system for Sorani Kurdish. We propose a hybrid approach based on morphological rules and an n-gram language model. We have named the lemmatizer Peyv and the error correction system Rênûs; to the best of our knowledge, they are the first such tools presented for Sorani Kurdish. The Peyv lemmatizer achieves 86.7% accuracy. Rênûs achieves 96.4% accuracy with a lexicon and 87% accuracy without one. As two fundamental text processing tools, they can pave the way for further research on natural language processing applications for Sorani Kurdish.
