Evaluating high‑order multivariate dependencies learned by tabular generators

Develop evaluation methodologies to assess whether tabular synthetic data generators capture complex, high‑order multivariate relationships between features, beyond univariate fidelity metrics.

Background

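The TabEBM paper evaluates statistical fidelity primarily with univariate metrics (e.g., inverse KL divergence, the Kolmogorov-Smirnov test, the chi-square test). These metrics are standard, but each compares one feature's marginal distribution at a time, so they may not reflect a generator's capacity to reproduce multivariate feature dependencies.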

The authors highlight that some generators can score well on univariate tests while failing to model high‑order relationships, and explicitly identify the lack of robust multivariate evaluation as an open research question.
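The gap between univariate fidelity and high-order fidelity can be illustrated with a toy construction. This is a minimal sketch and not a method from the paper. It assumes three binary (+1/-1) features where the third is the product of the first two, so every pair of features is independent but the triple is fully dependent. It also assumes a hypothetical "generator" that shuffles each column independently. The helper names `mixed_moment` and `max_moment_gap` are made up for this example, and the mixed-moment gap is only one of many possible probes.

```python
import math
import random
from itertools import combinations
from statistics import fmean


def mixed_moment(columns, idx):
    """Mean of the row-wise product of the selected columns."""
    selected = (columns[i] for i in idx)
    return fmean(math.prod(row) for row in zip(*selected))


def max_moment_gap(real, synth, order):
    """Largest |real - synth| mixed-moment gap over all feature subsets of size `order`."""
    n_cols = len(real)
    return max(
        abs(mixed_moment(real, idx) - mixed_moment(synth, idx))
        for idx in combinations(range(n_cols), order)
    )


rng = random.Random(0)
n = 4000

# Toy "real" table, stored column-wise. x3 is the parity (product) of x1 and x2:
# all pairs are independent, but the three features together are not.
x1 = [rng.choice((-1, 1)) for _ in range(n)]
x2 = [rng.choice((-1, 1)) for _ in range(n)]
x3 = [a * b for a, b in zip(x1, x2)]
real = [x1, x2, x3]

# Hypothetical generator (an assumption for illustration): shuffle each column
# independently. Marginals are preserved exactly; joint structure is destroyed.
synth = [rng.sample(col, len(col)) for col in real]

# Order 1 compares marginal means, order 2 pairwise moments, order 3 the triple.
gaps = {order: max_moment_gap(real, synth, order) for order in (1, 2, 3)}
print(gaps)
```

In this construction the order-1 gap is zero and the order-2 gap is only sampling noise, while the order-3 gap is large because the real triple moment equals 1 by construction. An evaluation that stops at univariate or pairwise statistics would therefore rate this shuffling generator as faithful.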

References

However, evaluating their ability to learn more complex, high-order, relationships between features remains an open research question, which we leave for future work.

TabEBM: A Tabular Data Augmentation Method with Distinct Class-Specific Energy-Based Models (2409.16118 - Margeloiu et al., 2024), in Limitations and Future Work, Section 4 (Discussion & Related Work)