DeLog: An Efficient Log Compression Framework with Pattern Signature Synthesis
Abstract: Parser-based log compression, which separates static tem- plates from dynamic variables, is a promising approach to exploit the unique structure of log data. However, its perfor- mance on complex production logs is often unsatisfactory. This performance gap coincides with a known degradation in the accuracy of its core log parsing component on such data, motivating our investigation into a foundational yet unverified question: does higher parsing accuracy necessarily lead to better compression ratio? To answer this, we conduct the first empirical study quanti- fying this relationship and find that a higher parsing accuracy does not guarantee a better compression ratio. Instead, our findings reveal that compression ratio is dictated by achiev- ing effective pattern-based grouping and encoding, i.e., the partitioning of tokens into low entropy, highly compressible groups. Guided by this insight, we design DeLog, a novel log com- pressor that implements a Pattern Signature Synthesis mecha- nism to achieve efficient pattern-based grouping. On 16 public and 10 production datasets, DeLog achieves state-of-the-art compression ratio and speed.
Paper Prompts
Sign up for free to create and run prompts on this paper using GPT-5.
Top Community Prompts
Collections
Sign up for free to add this paper to one or more collections.