
SwarmFoam: Intelligent CFD Simulation

Updated 19 January 2026
  • SwarmFoam is a multi-agent CFD simulation framework that integrates LLMs for parsing multi-modal inputs and automating OpenFOAM case setups.
  • It orchestrates six specialized agents to manage tasks from geometry extraction and file generation to intelligent error diagnosis and correction.
  • The system employs Retrieval-Augmented Generation for context-rich file creation, achieving an 84% pass rate while reducing token usage by 83.85% relative to Foam-Agent.

SwarmFoam is a multi-agent system for intelligent Computational Fluid Dynamics (CFD) simulation built atop OpenFOAM and powered by multiple types of LLMs. It orchestrates six specialized agents to parse and translate multi-modal input (image and text), generate and correct OpenFOAM case files, execute simulations, diagnose errors, and produce scientific visualizations. SwarmFoam is distinguished by dedicated modules for Multi-Modal Perception, Intelligent Error Correction, and Retrieval-Augmented Generation (RAG), allowing it to handle complex geometry and simulation tasks that are challenging for prior LLM-driven CFD frameworks. Experimental evaluation demonstrates robust adaptability across simulation inputs, achieving a pass rate of 84% over 25 test cases (Yang et al., 12 Jan 2026).

1. System Composition and Agent Roles

SwarmFoam is architected as a six-agent system, each agent mapped to a distinct phase of the simulation pipeline. Agents communicate through a shared message bus and interact with modular LLM interfaces.

| Agent | Functional Role | Example Actions |
| --- | --- | --- |
| Observer | Input decomposition, perception | ObservePicture, DivideTask |
| Architect | Case structure selection | SetupFramework |
| InputWriter | File generation/correction | FirstWrite, CorrectFile |
| Runner | Simulation execution | RunFoamCase |
| Reviewer | Error identification/correction | HandleError, EndMark |
| ParaMaster | Post-processing/visualization | WriteCode, RunCode |

The system explicitly embeds each agent’s context into standardized prompt templates when invoking LLMs, maintaining consistency in both file format and simulation instructions.
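The excerpt does not show the bus implementation itself; a minimal sketch of a shared message bus the six agents could publish to and poll, with all class names, topics, and payload fields invented for illustration:

```python
from collections import defaultdict, deque

class MessageBus:
    """Minimal publish/subscribe bus: agents post payloads to named topics
    and other agents poll those topics in turn."""
    def __init__(self):
        self.queues = defaultdict(deque)

    def publish(self, topic, payload):
        self.queues[topic].append(payload)

    def consume(self, topic):
        # Return the oldest pending message for this topic, or None.
        return self.queues[topic].popleft() if self.queues[topic] else None

# Illustrative hand-off: the Observer passes a decomposed task to the Architect.
bus = MessageBus()
bus.publish("architect", {"action": "SetupFramework", "geometry": "2D channel"})
task = bus.consume("architect")
```

In a hub-and-spoke design like this, each agent only needs to know topic names, not the other agents' internals, which matches the paper's description of modular LLM interfaces behind a shared bus.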

2. LLM Support and Prompt Construction

SwarmFoam utilizes two categories of LLMs:

  • Text-only LLMs (Type 1 interface, e.g., Deepseek-R1, Deepseek-V3, gemini-2.5-pro, gpt-5-pro, gpt-4o) for natural language-driven subtasks.
  • Multi-Modal LLMs (Type 2 interface, e.g., gemini-2.5-flash, Qwen-series, Llama-4-series) for combined image and text input.

Agents select and initialize an LLM according to the modality required for their task. Prompts are constructed from standard templates with placeholders for task descriptions, physical and geometric information, and reference text snippets retrieved via RAG. The LLMs are run at temperature = 0.01 to enforce the strict output syntax OpenFOAM requires, followed by post-processing to remove extraneous tokens.
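The template fields themselves are not published in the excerpt; a plausible sketch of the placeholder-filling step using Python's `string.Template`, where every field name and the prompt wording are assumptions:

```python
from string import Template

# Hypothetical "standard template" with the three placeholder slots the paper
# names: task description, physical/geometric context, and RAG references.
PROMPT = Template(
    "Task: $task\n"
    "Geometry and physics: $context\n"
    "<Reference information>\n$references\n</Reference information>\n"
    "Emit only a valid OpenFOAM dictionary; no commentary."
)

prompt = PROMPT.substitute(
    task="Generate blockMeshDict for a lid-driven cavity",
    context="1 m x 1 m square domain, incompressible, Re = 100",
    references="blockMesh help excerpt retrieved via RAG",
)
```

Keeping the slots explicit like this is what lets every agent reuse one template family while staying consistent in file format and instructions.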

3. Multi-Modal Perception Pipeline

Image and text data fusion for geometry and physical property extraction is central to SwarmFoam’s flexibility. Two methods were tested; the pre-parsing method was chosen via ablation:

Method 1 (Pre-Parsing):

  • Tokenize and embed text ($v_{\text{text}}$) and image ($v_{\text{image}}$) using HuggingfaceEmbeddings.
  • Concatenate the embeddings and input them to a multi-modal LLM using a detailed ObservePicture prompt.
  • Extract structured "Geometric description" (e.g., dimensions, vertex coordinates, face labels) and "Physical description" (e.g., fluids, boundary conditions).
  • Pass these textual outputs to InputWriter for generating blockMeshDict and related files.

Method 2 (Direct Utilization):

  • InputWriter sends raw image and text directly to MM-LLM to generate blockMeshDict.
  • Ablation revealed severe degradation: 150.8% increase in iterations, 54% drop in pass rate.

The selection of Method 1 resulted in improved preservation of spatial and physical information, supporting complex OpenFOAM case preparation.
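The core of Method 1 is embed-then-concatenate before the multi-modal LLM call. A toy sketch of that fusion step, where `embed_text` and `embed_image` are stand-ins for the HuggingfaceEmbeddings models (whose real API and dimensions are not shown in the excerpt):

```python
# Toy deterministic "embeddings" for illustration only; the real system uses
# HuggingfaceEmbeddings for both modalities.

def embed_text(text: str, dim: int = 4) -> list[float]:
    # Derive dim pseudo-features from the string hash.
    return [float((hash(text) >> (8 * i)) & 0xFF) / 255.0 for i in range(dim)]

def embed_image(image_bytes: bytes, dim: int = 4) -> list[float]:
    # Normalize the first dim bytes into [0, 1].
    return [float(b) / 255.0 for b in image_bytes[:dim]]

v_text = embed_text("rectangular duct, inlet velocity 1 m/s")
v_image = embed_image(b"\x10\x80\xff\x20")
v_fused = v_text + v_image   # concatenation, as in Method 1 (pre-parsing)
```

The fused vector is then what accompanies the ObservePicture prompt, so the downstream LLM sees both modalities in one context rather than a raw image it must re-interpret per file.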

4. Intelligent Error Correction Mechanism

SwarmFoam’s Reviewer agent implements a first-error-priority algorithm for error handling during simulation:

Algorithm 1: Error Diagnosis and Correction
Require: Message history H, Configuration C
Ensure: Error diagnosis D, Error file path F
...
14: first_err ← errors[0]
15: (D_parsed, F_parsed) ← Parse_Error_Log(first_err)
16: error_type ← Classify_Error_Type(LLM_Agent, first_err, files)
...
24: return D, F

Only the first error in the simulation log is evaluated, classified as either “format error” or “missing file,” triggering file regeneration via InputWriter. This process iterates until either simulation success or the threshold $k_{\max} = 20$ is reached. This strategy yields marked reductions in token usage compared to prior frameworks.
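The run/diagnose/regenerate loop can be sketched as follows; `run_case`, `parse_log`, and `regenerate_file` are hypothetical stand-ins for the Runner, Reviewer, and InputWriter hand-off, which the excerpt describes but does not publish as code:

```python
def first_error_priority(run_case, regenerate_file, parse_log, k_max=20):
    """Iterate run -> diagnose first error -> regenerate, up to k_max rounds.

    run_case() returns (ok, log); parse_log(first_err) returns the offending
    file path and an error class ('format error' or 'missing file').
    Returns the number of iterations used.
    """
    for k in range(1, k_max + 1):
        ok, log = run_case()
        if ok:
            return k
        first_err = log.splitlines()[0]      # only the FIRST error is handled
        path, err_type = parse_log(first_err)
        regenerate_file(path, err_type)      # InputWriter rewrites that file
    return k_max

# Stub usage: one failure (missing blockMeshDict), fixed on regeneration.
state = {"fixed": False}
def run_case():
    return (state["fixed"], "FATAL: cannot find file blockMeshDict")
def parse_log(line):
    return ("system/blockMeshDict", "missing file")
def regenerate_file(path, err_type):
    state["fixed"] = True

iters = first_error_priority(run_case, regenerate_file, parse_log)  # -> 2
```

Handling only the first error keeps each correction prompt small, which is one plausible source of the token-usage reduction the paper reports.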

5. Retrieval-Augmented Generation (RAG) Framework

Six local help documents (case structures, solver guidance, etc.) are indexed using a chunking and embedding process:

  • Indexing: Documents split into text chunks, embedded via HuggingfaceEmbeddings, stored in a local vector DB.
  • Querying: Agents query the vector DB with the current prompt; the top-$k$ nearest chunks (cosine similarity) are retrieved and appended under <Reference information> in the agent’s template.

The RAG mechanism provides low-latency, context-rich inferences for architecture selection and file generation, enhancing case fidelity and robustness.
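The retrieval step reduces to a cosine-similarity top-$k$ search over chunk embeddings. A self-contained sketch with hand-made three-dimensional embeddings (the real system embeds chunks of the six help documents via HuggingfaceEmbeddings and stores them in a vector DB):

```python
import math

def cosine(a, b):
    # Standard cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def retrieve_top_k(query_vec, chunks, k=2):
    """chunks: list of (text, embedding); return the k most similar texts."""
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

# Toy index over three help-document chunks.
chunks = [
    ("blockMesh syntax overview", [1.0, 0.0, 0.0]),
    ("pisoFoam solver guidance",  [0.0, 1.0, 0.0]),
    ("ParaView scripting notes",  [0.0, 0.0, 1.0]),
]
refs = retrieve_top_k([0.9, 0.1, 0.0], chunks, k=2)
```

The retrieved texts would then be joined and placed under the <Reference information> slot of the agent's prompt template.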

6. Performance Evaluation and Metrics

Evaluation used 25 cases (10 natural language, 15 multi-modal), encompassing diverse CFD regimes (incompressible, multiphase, combustion, MHD). In the metrics below, $n$ counts successful cases and $m$ failed cases, so $m + n = 25$:

  • Iterations:

$$\text{Iterations} = \frac{1}{m+n} \sum_{i=1}^{m+n} k_i,\quad k_i \le k_{\max} = 20$$

  • Token Usage:

$$\text{TokenUsage} = \frac{1}{m+n} \sum_{i=1}^{m+n}\left(T_i^{\text{in}} + T_i^{\text{think}} + T_i^{\text{out}}\right)$$

  • Pass Rate:

$$\text{PassRate} = \frac{n}{m+n}$$

  • Cost:

$$\text{Cost} = \frac{1}{10\,000\,(m+n)} \sum_{i=1}^{m+n}\left(P_{\text{in}} T_i^{\text{in}} + P_{\text{think}} T_i^{\text{think}} + P_{\text{out}} T_i^{\text{out}}\right)$$

SwarmFoam achieved 84% pass rate overall (80% NL, 86.7% multi-modal). Relative to Foam-Agent, SwarmFoam matched pass rate while reducing token usage by 83.85%, attributed to the first-error-priority correction loop.
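The four metrics above are straightforward per-case averages; a sketch computing them from per-case records, where the price triple and all sample numbers are illustrative (per-10k-token rates, not the paper's actual pricing):

```python
def evaluate(cases, k_max=20, prices=(1.0, 2.0, 3.0)):
    """cases: list of dicts with keys k (iterations), passed (bool), and
    token counts t_in / t_think / t_out. prices = (P_in, P_think, P_out),
    illustrative per-10k-token rates."""
    N = len(cases)
    p_in, p_think, p_out = prices
    iterations = sum(min(c["k"], k_max) for c in cases) / N
    tokens = sum(c["t_in"] + c["t_think"] + c["t_out"] for c in cases) / N
    pass_rate = sum(c["passed"] for c in cases) / N          # n / (m + n)
    cost = sum(p_in * c["t_in"] + p_think * c["t_think"] + p_out * c["t_out"]
               for c in cases) / (10_000 * N)
    return iterations, tokens, pass_rate, cost

# Two toy cases: one success in 2 iterations, one failure that hit k_max.
metrics = evaluate([
    {"k": 2,  "passed": True,  "t_in": 1000, "t_think": 500,  "t_out": 300},
    {"k": 20, "passed": False, "t_in": 4000, "t_think": 2000, "t_out": 1200},
])
```

Clamping each $k_i$ at $k_{\max}$ mirrors the correction loop's iteration cap, so a non-converging case contributes exactly 20 iterations to the average.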

7. Representative Workflow Example

A prototypical run consists of:

  1. User submits a text instruction ("Please use pisoFoam to solve the flow field after airflow passes obstacle...") and an image (schematic with geometric specifications).
  2. Observer Agent parses geometry and physical parameters (domain, obstacle location, flow variables, inlet BCs, time parameters).
  3. Architect Agent retrieves matching case structure and emits file generation instructions.
  4. InputWriter Agent generates each file: blockMeshDict, controlDict, fvSchemes, fvSolution, transportProperties, velocity/pressure files, Allrun, leveraging RAG and LLM synthesis.
  5. Runner Agent executes Allrun; if error encountered (e.g., missing blockMeshDict), Reviewer Agent diagnoses and InputWriter generates the missing/corrected file.
  6. Iterative correction continues until simulation is successful.
  7. On completion and post-processing request, ParaMaster Agent produces ParaView visualizations via pvpython scripts, yielding publication-ready output images.

This workflow highlights SwarmFoam’s capacity for dual-modality case specification, automated error resolution, and end-to-end CFD pipeline automation (Yang et al., 12 Jan 2026).
