IDEALEM: Statistical Similarity Based Data Reduction

Published 16 Nov 2019 in cs.DB, cs.IT, and math.IT | (1911.06980v1)

Abstract: Many applications such as scientific simulation, sensing, and power grid monitoring tend to generate massive amounts of data, which should be compressed before storage and transmission. These data, consisting mostly of floating-point values, are known to be difficult to compress using lossless compression. A few lossy compression methods have been proposed to compress these seemingly incompressible data. Unfortunately, they are all designed to minimize the Euclidean distance between the original data and the decompressed data, which fundamentally limits compression performance. We recently proposed a new class of lossy compression based on statistical similarity, called IDEALEM, which was also provided as a software package. IDEALEM has demonstrated its performance by reducing data volume much more than state-of-the-art compression methods while preserving unique patterns of data. IDEALEM can operate in two different modes depending on the stationarity of the input data. This paper presents compression performance analyses of these two modes, and investigates the difference between two transform techniques targeted at non-stationary data. This paper also discusses the data reconstruction quality of IDEALEM using spectral analysis and shows that important frequency components in the application domain are well preserved. We expand the capability of IDEALEM by adding a new min/max check that helps preserve significant patterns that last only briefly and were previously hard to capture. This min/max check also accelerates the encoding process significantly. Experiments show that IDEALEM preserves significant patterns in the original data while reducing encoding time.
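To make the idea of similarity-based reduction concrete, here is a minimal sketch of how such an encoder might work. It is not IDEALEM's implementation. The abstract does not state which similarity test, thresholds, or dictionary policy the package uses, so the two-sample Kolmogorov-Smirnov statistic, the parameters `block_size`, `ks_threshold`, and `range_tol`, and the function names `ks_statistic`, `encode`, and `decode` are all assumptions chosen for illustration. The sketch stores a block only when no earlier stored block is statistically similar to it, and the optional min/max pre-check rejects candidates whose extremes differ too much before the costlier similarity test runs.

```python
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap between
    the empirical CDFs of a and b. Used here as an assumed stand-in for
    whatever similarity measure the real package applies."""
    sa, sb = sorted(a), sorted(b)
    d = 0.0
    for x in sa + sb:
        fa = bisect_right(sa, x) / len(sa)
        fb = bisect_right(sb, x) / len(sb)
        d = max(d, abs(fa - fb))
    return d


def encode(data, block_size=32, ks_threshold=0.25, range_tol=None):
    """Hypothetical encoder. Returns (dictionary, stream), where each
    stream item is ("new", idx) for a freshly stored block or
    ("ref", idx) for a block replaced by a reference to a similar one.
    Assumes len(data) is a multiple of block_size; a trailing partial
    block is ignored in this sketch."""
    dictionary = []  # entries are (block, block_min, block_max)
    stream = []
    for start in range(0, len(data) - block_size + 1, block_size):
        block = data[start:start + block_size]
        lo, hi = min(block), max(block)
        match = None
        for idx, (ref, ref_lo, ref_hi) in enumerate(dictionary):
            # Cheap min/max pre-check: skip candidates whose extremes
            # differ by more than range_tol, without running the test.
            if range_tol is not None and (
                abs(lo - ref_lo) > range_tol or abs(hi - ref_hi) > range_tol
            ):
                continue
            if ks_statistic(block, ref) <= ks_threshold:
                match = idx
                break
        if match is None:
            dictionary.append((block, lo, hi))
            stream.append(("new", len(dictionary) - 1))
        else:
            stream.append(("ref", match))
    return dictionary, stream


def decode(dictionary, stream):
    """Rebuild a sequence by substituting the stored block for every
    stream item. The result is statistically similar to the input,
    not pointwise close to it."""
    out = []
    for _, idx in stream:
        out.extend(dictionary[idx][0])
    return out
```

In this sketch, a block containing a single large spike can have nearly the same empirical distribution as a spike-free block, so a distribution test alone may replace it with a reference and lose the spike. Setting `range_tol` forces such a block to be stored, which is one way a min/max check could preserve short-lived patterns while also skipping many similarity tests.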
