- The paper introduces Flowco, a system that integrates a visual dataflow programming model with LLM-enhanced code synthesis to overcome limitations of traditional notebooks.
- The paper details a multi-abstraction approach where each analysis node links high-level summaries, detailed requirements, and executable Python code.
- The paper evaluates Flowco through example analyses and a user study with 12 data science students, finding that even participants with limited programming experience could author analyses and that the dataflow model helped them track dependencies.
This paper introduces Flowco, a mixed-initiative system designed to address challenges in data analysis authoring, particularly those arising from the limitations of computational notebooks and the integration of LLMs. Traditional notebooks are useful for exploration but suffer from a lack of modularity, difficulty tracking dependencies, reproducibility problems, and potentially stale state caused by non-linear execution. LLMs can generate analysis code, but they introduce challenges of their own in prompt engineering, code validation, and managing code variability. Existing integrations of LLMs into notebooks often exacerbate these notebook problems without ensuring robustness.
Flowco tackles these issues by combining a visual dataflow programming model with deep LLM integration. Users construct analyses by drawing dataflow graphs where nodes represent computational steps (e.g., loading data, clustering, plotting) and edges define the flow of data and dependencies. This visual model inherently promotes modularity, abstraction, and clear dependency tracking, aligning with how analysts often conceptualize workflows.
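To make the dataflow model concrete, the following is a minimal sketch of a graph whose nodes are computational steps and whose edges carry data between them. It is not Flowco's implementation: the node functions, the `GRAPH` layout, and `run_graph` are hypothetical names chosen for illustration, and the only library used is Python's standard `graphlib`.

```python
from graphlib import TopologicalSorter

# Hypothetical node functions (assumptions for this sketch).
# Each one receives the outputs of its predecessor nodes as arguments.
def load_data():
    return [1.0, 2.0, 3.0, 4.0]

def compute_mean(values):
    return sum(values) / len(values)

def center(values, mean):
    return [v - mean for v in values]

# Each node name maps to (function, names of the nodes it depends on).
# The dependency lists are the incoming edges of the dataflow graph.
GRAPH = {
    "load_data": (load_data, []),
    "compute_mean": (compute_mean, ["load_data"]),
    "center": (center, ["load_data", "compute_mean"]),
}

def run_graph(graph):
    """Run every node once, in an order that respects the edges."""
    deps = {name: set(preds) for name, (_, preds) in graph.items()}
    results = {}
    for name in TopologicalSorter(deps).static_order():
        func, preds = graph[name]
        # A node only ever sees the outputs of its declared predecessors.
        results[name] = func(*(results[p] for p in preds))
    return results

results = run_graph(GRAPH)
```

Because each node can read only the outputs named on its incoming edges, the dependencies are explicit in the graph rather than hidden in shared notebook state, which is the property the dataflow model relies on.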
LLMs are integrated into every stage of the Flowco authoring process:
- Graph Creation/Editing: Users can create and modify nodes and edges visually or via an "Ask Me Anything!" (AMA) chat interface.
- Code Synthesis: Flowco translates the graph structure and node specifications into executable Python code using an LLM. This process is modular, focusing on one node at a time.
- Abstraction Layers: Each node has multiple abstraction levels: a high-level summary label, detailed prose requirements, and the synthesized code. Users can interact at any level, and Flowco uses the LLM to maintain consistency between them.
- Validation & Verification: Flowco supports user-defined assertion checks (expressed in prose, including quantitative checks on data and qualitative/visual checks on plots) and unit tests for nodes. The LLM helps suggest checks/tests and translates them into executable validation code.
- Error Detection & Repair: Flowco automatically detects syntax errors, runtime errors, and type mismatches in generated code. It employs the LLM to attempt automatic repairs, first locally within the node and escalating to global graph analysis if necessary.
- Explanation & Exploration: The AMA chat agent allows users to query the system about the data, graph, code, or results, leveraging the LLM's ability to inspect the graph and run code snippets for analysis.
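The error detection and repair step in the list above can be pictured as a bounded retry loop around a single node. The sketch below assumes a simplified setting: `run_node`, `run_with_repair`, and `fake_llm_repair` are hypothetical names, the "LLM" is a hard-coded stand-in rather than a real model call, and `exec` is used only for illustration. How Flowco actually performs repair is not specified here beyond "local first, then global".

```python
import traceback

def run_node(code, inputs):
    """Execute a node's code and return (result, error_text)."""
    namespace = dict(inputs)
    try:
        exec(code, namespace)  # illustration only; not safe for untrusted code
        return namespace["result"], None
    except Exception:
        return None, traceback.format_exc()

def fake_llm_repair(code, error):
    """Stand-in for an LLM call (assumption): applies one hard-coded fix."""
    return code.replace("valuez", "values")

def run_with_repair(code, inputs, repair, max_attempts=3):
    """Run a node, asking `repair` for a new version after each failure."""
    for attempt in range(max_attempts):
        result, error = run_node(code, inputs)
        if error is None:
            return result, code, attempt
        # Local repair: only this node's code and its error are passed along.
        code = repair(code, error)
    # A fuller system would escalate to graph-level analysis at this point.
    raise RuntimeError("local repair failed; escalate to graph-level analysis")

broken = "result = sum(valuez) / len(values)"
result, fixed_code, repairs = run_with_repair(
    broken, {"values": [2, 4, 6]}, fake_llm_repair
)
```

Bounding the number of attempts keeps a failing node from looping indefinitely and gives a natural point at which to widen the scope of the repair from one node to the whole graph.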
Flowco's architecture ensures reliability through modular synthesis, consistency checking across abstraction layers, explicit dependency management via the graph, and built-in validation features.
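One way to picture consistency checking across abstraction layers combined with graph-based dependency management is sketched below. This is an assumed mechanism, not the one the paper describes: each hypothetical `Node` remembers a fingerprint of the requirements its code was synthesized from, and a node is treated as stale when its requirements have changed since then or when any upstream node is stale.

```python
from dataclasses import dataclass, field
import hashlib

def fingerprint(text):
    """Stable hash of a prose layer, used to detect edits."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

@dataclass
class Node:
    label: str            # high-level summary
    requirements: str     # detailed prose requirements
    code: str = ""        # synthesized code
    built_from: str = ""  # fingerprint of the requirements the code reflects
    successors: list = field(default_factory=list)

    def record_synthesis(self, code):
        """Store new code and note which requirements it was built from."""
        self.code = code
        self.built_from = fingerprint(self.requirements)

    def is_stale(self):
        return self.built_from != fingerprint(self.requirements)

def stale_nodes(nodes):
    """Return labels of nodes that are stale or downstream of a stale node."""
    stale = set()

    def mark(node):
        if node.label in stale:
            return
        stale.add(node.label)
        for successor in node.successors:
            mark(successor)

    for node in nodes:
        if node.is_stale():
            mark(node)
    return stale

# Hypothetical two-node graph: "Load data" feeds "Plot".
load = Node("Load data", "Read the CSV file.")
plot = Node("Plot", "Scatter plot of x against y.")
load.successors.append(plot)
load.record_synthesis("df = read_csv(path)")
plot.record_synthesis("scatter(df)")

before = stale_nodes([load, plot])
load.requirements = "Read the CSV file and drop rows with missing values."
after = stale_nodes([load, plot])
```

In this sketch, editing the requirements of "Load data" marks both nodes for regeneration, while an edit to "Plot" alone would leave "Load data" untouched.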
The paper evaluates Flowco in two ways. First, diverse example analyses (e.g., clustering, linear regression, multiverse analysis, logistic regression with cross-validation) demonstrate its expressiveness. Second, a user study with 12 data science students found that participants, including those with limited programming experience, could successfully author analyses using Flowco. Participants found the dataflow model helpful for organizing work and understanding dependencies, and preferred it over traditional notebooks for managing complexity. They regarded the integrated LLM support as more trustworthy and efficient than standalone LLMs such as ChatGPT. Although they noted latency and potential scalability issues for very complex graphs, participants generally found Flowco easy to learn and beneficial, particularly for exploratory analysis and for users new to programming.
The main contributions are the dataflow programming model with abstraction layers tailored for LLM interaction, the Flowco system itself, LLM-based techniques for graph design, implementation, and validation, and an empirical evaluation of the system's effectiveness.