Bayesian Nonparametric Causal Inference for High-Dimensional Nutritional Data via Factor-Based Exposure Mapping
Abstract: Diet plays a crucial role in health, and understanding the causal effects of dietary patterns is essential for informing public health policy and personalized nutrition strategies. However, causal inference in nutritional epidemiology faces several challenges: (i) high-dimensional and correlated food/nutrient intake data induce massive treatment levels; (ii) nutritional studies are interested in latent dietary patterns rather than single food items; and (iii) the goal is to estimate heterogeneous causal effects of these dietary patterns on health outcomes. We address these challenges by introducing a sophisticated exposure mapping framework that reduces the high-dimensional treatment space via factor analysis and enables the identification of dietary patterns. We also extend the Bayesian Causal Forest to accommodate three ordered levels of dietary exposure, better capturing the complex structure of nutritional data and enabling estimation of heterogeneous causal effects. We evaluate the proposed method through extensive simulations and apply it to a multi-center epidemiological study of Hispanic/Latino adults residing in the US. Using high-dimensional dietary data, we identify six dietary patterns and estimate their causal link with two key health risk factors: body mass index and fasting insulin levels. Our findings suggest that higher consumption of plant lipid-antioxidant, plant-based, animal protein, and dairy product patterns is associated with reduced risk.
Paper Prompts
Sign up for free to create and run prompts on this paper using GPT-5.
Top Community Prompts
Collections
Sign up for free to add this paper to one or more collections.