Optimality of UTF-8 as a byte-level output representation for ASR
Determine whether UTF-8 byte encoding is optimal as a byte-level output representation for end-to-end automatic speech recognition (ASR) systems.
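To make the question concrete, the following is a minimal sketch of what a UTF-8 byte-level output representation looks like, assuming the model's output vocabulary is simply the 256 possible byte values. The function names (utf8_byte_tokens, bytes_per_char, decode_tokens) are hypothetical helpers for illustration and are not taken from the cited paper.

```python
def utf8_byte_tokens(text: str) -> list[int]:
    """Map text to a sequence of byte-valued token IDs in the range 0-255."""
    return list(text.encode("utf-8"))


def bytes_per_char(text: str) -> list[int]:
    """Number of UTF-8 bytes used to encode each character of the text."""
    return [len(ch.encode("utf-8")) for ch in text]


def decode_tokens(tokens: list[int]) -> str:
    """Decode byte tokens back to text.

    A model that emits one byte at a time can produce sequences that are
    not valid UTF-8; here such bytes are replaced with U+FFFD rather than
    raising an error (an illustrative choice, not the paper's method).
    """
    return bytes(tokens).decode("utf-8", errors="replace")


if __name__ == "__main__":
    sample = "aé語"
    print(utf8_byte_tokens(sample))
    print(bytes_per_char(sample))
    print(decode_tokens(utf8_byte_tokens(sample)) == sample)
```

UTF-8 spends between one and four bytes per character, so target sequence length depends on the script being transcribed, and the decoder has to cope with byte sequences that do not form valid characters. Properties like these are what make it reasonable to ask whether a different byte-level mapping would suit ASR better.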
References
While UTF-8 has proven to be an effective output representation for ASR, it is unclear whether it is optimal.
— Optimizing Byte-level Representation for End-to-end ASR
(arXiv:2406.09676, Hsiao et al., 2024), Section 1 (Introduction)