Simulating Complex Crossectional and Longitudinal Data using the simDAG R Package

Published 2 Jun 2025 in stat.ME and stat.CO | (2506.01498v1)

Abstract: Generating artificial data is a crucial step when performing Monte-Carlo simulation studies. Depending on the planned study, complex data generation processes (DGPs) containing multiple, possibly time-varying, variables with various forms of dependencies and data types may be required. Simulating data from such DGPs can therefore become a difficult and time-consuming endeavor. The simDAG R package offers a standardized approach to generating data from simple and complex DGPs, based on the definition of structural equations in directed acyclic graphs using arbitrary functions or regression models. The package offers a clear syntax with an enhanced formula interface and directly supports generating binary, categorical, count, and time-to-event data with arbitrary dependencies, possibly non-linear relationships, and interactions. It additionally includes a framework for conducting discrete-time simulations, which allows the generation of longitudinal data on a semi-continuous time scale. This approach may be used to generate time-to-event data with recurrent or competing events and possibly multiple time-varying covariates, which may themselves have arbitrary data types. In this article, we demonstrate the wide range of features included in simDAG by replicating the DGPs of multiple real Monte-Carlo simulation studies.
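
To make the idea of generating data from structural equations in a directed acyclic graph concrete, the following is a minimal sketch in plain Python. It is not the simDAG API and is not taken from the paper: the variable names (`age`, `sex`, `bmi`, `death`), all coefficients, and the function `simulate_dag` are hypothetical and chosen only for illustration. Root nodes are drawn first, and each child node is then generated from its parents through a regression-style equation.

```python
import math
import random


def simulate_dag(n, seed=0):
    """Draw n observations from a small, hypothetical DAG.

    Assumed structure (illustrative only, not from the paper):
        age, sex -> bmi -> death, and age -> death
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        # Root nodes: no parents, drawn from fixed distributions.
        age = rng.gauss(50, 4)
        sex = 1 if rng.random() < 0.5 else 0

        # Continuous child node: linear model plus Gaussian noise.
        bmi = 12 + 1.1 * sex + 0.4 * age + rng.gauss(0, 2)

        # Binary child node: logistic model on its parents.
        linear_predictor = -15 + 0.1 * age + 0.3 * bmi
        p_death = 1 / (1 + math.exp(-linear_predictor))
        death = 1 if rng.random() < p_death else 0

        rows.append({"age": age, "sex": sex, "bmi": bmi, "death": death})
    return rows


rows = simulate_dag(1000, seed=42)
```

Nodes are generated in topological order, so every parent value exists before a child is computed. The same ordering constraint applies to any DAG-based generator, whatever its syntax.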

Authors (2)
