Flowco: Rethinking Data Analysis in the Age of LLMs

Published 18 Apr 2025 in cs.HC, cs.AI, cs.PL, and stat.CO | (2504.14038v1)

Abstract: Conducting data analysis typically involves authoring code to transform, visualize, analyze, and interpret data. LLMs are now capable of generating such code for simple, routine analyses. LLMs promise to democratize data science by enabling those with limited programming expertise to conduct data analyses, including in scientific research, business, and policymaking. However, analysts in many real-world settings must often exercise fine-grained control over specific analysis steps, verify intermediate results explicitly, and iteratively refine their analytical approaches. Such tasks present barriers to building robust and reproducible analyses using LLMs alone or even in conjunction with existing authoring tools (e.g., computational notebooks). This paper introduces Flowco, a new mixed-initiative system to address these challenges. Flowco leverages a visual dataflow programming model and integrates LLMs into every phase of the authoring process. A user study suggests that Flowco supports analysts, particularly those with less programming experience, in quickly authoring, debugging, and refining data analyses.


Summary

The paper introduces Flowco, a system that integrates a visual dataflow programming model with LLM-enhanced code synthesis to overcome limitations of traditional notebooks.
The paper details a multi-abstraction approach where each analysis node links high-level summaries, detailed requirements, and executable Python code.
The paper evaluates Flowco in a user study with 12 data science students; the results suggest that even participants with limited programming experience can author analyses effectively, aided by clearer dependency tracking and automated error detection.

This paper introduces Flowco, a mixed-initiative system designed to address challenges in data analysis authoring, particularly those arising from the limitations of computational notebooks and the integration of LLMs. Traditional notebooks, while useful for exploration, suffer from issues like lack of modularity, difficulty tracking dependencies, reproducibility problems, and potential state staleness due to non-linear execution. While LLMs can generate analysis code, they introduce challenges in prompt engineering, code validation, and managing code variability. Existing integrations often exacerbate notebook problems without ensuring robustness.

Flowco tackles these issues by combining a visual dataflow programming model with deep LLM integration. Users construct analyses by drawing dataflow graphs where nodes represent computational steps (e.g., loading data, clustering, plotting) and edges define the flow of data and dependencies. This visual model inherently promotes modularity, abstraction, and clear dependency tracking, aligning with how analysts often conceptualize workflows.
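To make the dataflow idea concrete, here is a minimal sketch in Python of a graph whose nodes are computational steps and whose edges determine execution order. It is not Flowco's implementation: the node functions, the `GRAPH` layout, and `run_graph` are hypothetical names chosen for illustration, and the standard-library `graphlib` module stands in for whatever scheduling Flowco actually uses.

```python
from graphlib import TopologicalSorter


# Hypothetical node functions: each receives the outputs of its upstream nodes.
def load_data():
    return [1.0, 2.0, 3.0, 4.0]


def normalize(values):
    top = max(values)
    return [v / top for v in values]


def summarize(values):
    return sum(values) / len(values)


# Assumed layout: node name -> (function, names of upstream nodes).
GRAPH = {
    "load": (load_data, []),
    "normalize": (normalize, ["load"]),
    "summary": (summarize, ["normalize"]),
}


def run_graph(graph):
    """Run every node once, in an order that respects the edges."""
    # TopologicalSorter expects a mapping of node -> predecessors.
    predecessors = {name: deps for name, (_, deps) in graph.items()}
    results = {}
    for name in TopologicalSorter(predecessors).static_order():
        func, deps = graph[name]
        results[name] = func(*(results[dep] for dep in deps))
    return results


results = run_graph(GRAPH)
print(results["summary"])
```

Because each node reads only from its declared upstream nodes, the execution order is derived from the graph rather than from the order in which cells happened to be run, which is the property the summary contrasts with non-linear notebook execution.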

LLMs are integrated into every stage of the Flowco authoring process:

Graph Creation/Editing: Users can create and modify nodes and edges visually or via an "Ask Me Anything!" (AMA) chat interface.
Code Synthesis: Flowco translates the graph structure and node specifications into executable Python code using an LLM. This process is modular, focusing on one node at a time.
Abstraction Layers: Each node has multiple abstraction levels: a high-level summary label, detailed prose requirements, and the synthesized code. Users can interact at any level, and Flowco uses the LLM to maintain consistency between them.
Validation & Verification: Flowco supports user-defined assertion checks (expressed in prose, including quantitative checks on data and qualitative/visual checks on plots) and unit tests for nodes. The LLM helps suggest checks/tests and translates them into executable validation code.
Error Detection & Repair: Flowco automatically detects syntax errors, runtime errors, and type mismatches in generated code. It employs the LLM to attempt automatic repairs, first locally within the node and escalating to global graph analysis if necessary.
Explanation & Exploration: The AMA chat agent allows users to query the system about the data, graph, code, or results, leveraging the LLM's ability to inspect the graph and run code snippets for analysis.
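The abstraction layers and prose checks listed above can be pictured as a single record per node. The sketch below is an assumption about how such a record might look, not the paper's data model: `Node`, `Check`, and the `compute` naming convention are hypothetical, and the executable predicate is hand-written here where Flowco would have the LLM translate the prose check.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Check:
    prose: str                            # the check as the analyst wrote it
    predicate: Callable[[object], bool]   # executable form (hand-written in this sketch)


@dataclass
class Node:
    label: str                # high-level summary
    requirements: list[str]   # detailed prose requirements
    code: str                 # synthesized source; assumed to define `compute`
    checks: list[Check] = field(default_factory=list)

    def run(self, *inputs):
        # Execute the stored source in a fresh namespace and call `compute`.
        namespace = {}
        exec(self.code, namespace)
        return namespace["compute"](*inputs)

    def failed_checks(self, output):
        # Return the prose of every check whose predicate rejects the output.
        return [c.prose for c in self.checks if not c.predicate(output)]


drop_missing = Node(
    label="Drop missing values",
    requirements=[
        "The output contains no None entries.",
        "The order of the remaining values is preserved.",
    ],
    code="def compute(values):\n    return [v for v in values if v is not None]\n",
    checks=[
        Check(
            "The output has no missing values",
            lambda out: all(v is not None for v in out),
        )
    ],
)

output = drop_missing.run([3, None, 5])
failures = drop_missing.failed_checks(output)
print(output, failures)
```

Keeping the label, requirements, code, and checks together is what would let a tool regenerate one layer when another changes; the consistency step itself is left out of this sketch because the summary does not describe how it is performed.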

Flowco's architecture aims to improve reliability through modular synthesis, consistency checking across abstraction layers, explicit dependency management via the graph, and built-in validation features.
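The local-then-global repair escalation mentioned in the list above could be organized roughly as follows. This is a sketch under stated assumptions: `try_run`, `run_with_repair`, and both repair stubs are hypothetical, the stubs return canned source where Flowco would call an LLM, and the `compute` convention is invented for the example.

```python
from typing import Callable, Optional

# A repair strategy receives the failing source and the error message and
# returns replacement source, or None if it has no suggestion.
Repair = Callable[[str, str], Optional[str]]


def try_run(source: str):
    """Return (result, None) on success or (None, error message) on failure."""
    namespace = {}
    try:
        exec(source, namespace)
        return namespace["compute"](), None
    except Exception as exc:  # covers syntax errors and runtime errors alike
        return None, f"{type(exc).__name__}: {exc}"


def run_with_repair(source: str, local_repair: Repair, global_repair: Repair):
    """Try the node as-is, then a node-local fix, then a graph-wide fix."""
    result, error = try_run(source)
    if error is None:
        return result, "no repair needed"
    for stage, repair in (("local", local_repair), ("global", global_repair)):
        candidate = repair(source, error)
        if candidate is None:
            continue
        result, new_error = try_run(candidate)
        if new_error is None:
            return result, f"{stage} repair"
    raise RuntimeError(f"unrepaired: {error}")


# Node source that fails at run time because `total` is undefined.
broken = "def compute():\n    return total / 2\n"


def local_stub(source, error):
    # Stand-in for a node-local repair that finds nothing to change.
    return None


def global_stub(source, error):
    # Stand-in for a graph-level repair that supplies the missing value.
    return "def compute():\n    total = 10\n    return total / 2\n"


result, how = run_with_repair(broken, local_stub, global_stub)
print(result, how)
```

Trying the cheaper node-local fix first keeps a change confined to one node when possible, and only widens the scope when the error cannot be resolved there.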

The paper evaluates Flowco through diverse example analyses (e.g., clustering, linear regression, multiverse analysis, logistic regression with cross-validation) demonstrating its expressiveness. A user study with 12 data science students found that participants, including those with limited programming experience, could successfully author analyses using Flowco. They found the dataflow model helpful for organization and understanding dependencies, preferring it over traditional notebooks for managing complexity. The integrated LLM support was seen as more trustworthy and efficient than using standalone LLMs like ChatGPT. While latency and potential scalability issues for very complex graphs were noted, participants generally found Flowco easy to learn and beneficial, particularly for exploratory analysis and for users new to programming.

The main contributions include the dataflow programming model with abstraction layers tailored for LLM interaction, the Flowco system itself, LLM-based techniques for graph design, implementation, and validation, and empirical validation showing its effectiveness.
