
YAD: Leveraging T5 for Improved Automatic Diacritization of Yorùbá Text

Published 28 Dec 2024 in cs.CL (arXiv:2412.20218v1)

Abstract: In this work, we present the Yorùbá Automatic Diacritization (YAD) benchmark dataset for evaluating Yorùbá diacritization systems. In addition, we pre-train a text-to-text transformer (T5) model for Yorùbá and show that it outperforms several multilingually trained T5 models. Lastly, we show that more data and larger models yield better diacritization for Yorùbá.
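The abstract frames diacritization as a text-to-text task: the model reads undiacritized Yorùbá and emits the fully diacritized form. A common way to build training pairs for such a setup (a minimal sketch, not the authors' actual pipeline) is to strip combining marks from gold diacritized text via Unicode normalization:

```python
import unicodedata

def strip_diacritics(text: str) -> str:
    """Remove diacritics to create the undiacritized source side
    of a (source, target) training pair for a text-to-text model."""
    # NFD splits accented characters into a base letter plus
    # combining marks; dropping the marks leaves the bare letters.
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

# Example pair: the model would be trained to map source -> target.
target = "Yorùbá"
source = strip_diacritics(target)
print(source)  # -> Yoruba
```

A seq2seq model such as T5 can then be fine-tuned on these (source, target) pairs, with the undiacritized line as input and the diacritized line as the generation target.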
