First-Order Formulas (FOFs)

Updated 14 January 2026

First-order formulas (FOFs) are finite syntactic constructs in first-order logic, built from relational, function, and constant symbols with quantifiers and Boolean connectives.
Restricting to prenex normal form and adopting an algebraic view of formulas gives them clear semantics and supports efficient invariant inference in formal verification.
Techniques such as orthogonal-slice enumeration and abstract interpretation reduce the search space, and variable-dependence analysis simplifies formulas in practical applications.

A first-order formula (FOF) is a finite syntactic object in first-order logic (FOL), constructed from a fixed signature of relation, function, and constant symbols, together with variables, Boolean connectives, and quantifiers. FOFs represent the core building blocks of classical model theory, logic-based program analysis, and automated invariant inference. Their algebraic, semantic, and computational properties are central to foundations of mathematics, computer science, and formal verification.

1. Syntactic and Semantic Foundations

A finite, many-sorted first-order signature is a tuple $\Sigma = (C, R, F, S)$, where $C$ is a finite set of constant symbols, $R$ is a finite set of predicate symbols (each with an arity $k$ and a sort vector), $F$ is a finite set of function symbols, and $S$ is a finite set of sorts. Terms are built inductively: $t ::= c \mid x \mid f(t_1, \dots, t_k)$, where $c \in C$, $x$ is a variable, and $f \in F$. Atoms are $p(t_1, \dots, t_k)$ with $p \in R$, or equalities $t_1 = t_2$. Literals are atoms or their negations, and formulas are obtained by closing literals under conjunction, disjunction, and quantification ($\forall x{:}s$, $\exists x{:}s$). Without loss of expressiveness for many applications, attention is often restricted to prenex normal form:

$Q_1 x_1{:}s_1 \dots Q_m x_m{:}s_m .\, M(x_1, ..., x_m),$

with $M$ a Boolean combination of literals (Yang et al., 7 Jan 2026, Frenkel et al., 2024).

The Tarski semantics for FOFs interprets formulas in a structure $\sigma = (U, I)$, where $U$ is a nonempty domain and $I$ interprets each symbol of $\Sigma$. An assignment $\mu: V \rightarrow U$ maps variables to elements of $U$, and $(\sigma, \mu) \models \varphi$ is defined inductively (Frenkel et al., 2024).
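The inductive definition of $(\sigma, \mu) \models \varphi$ can be illustrated with a minimal evaluator over a finite structure. This is only a sketch under simplifying assumptions: it is single-sorted, terms are restricted to variables, and the tuple encoding of formulas and the name `holds` are hypothetical choices made for this example, not taken from the cited works.

```python
# Formulas as nested tuples (a hypothetical encoding for illustration):
#   ("rel", name, (x1, ..., xk))   atom p(x1, ..., xk) over variables only
#   ("eq", x, y)                   equality between two variables
#   ("not", f), ("and", f, g), ("or", f, g)
#   ("forall", x, f), ("exists", x, f)

def holds(universe, interp, formula, mu):
    """Return True iff (sigma, mu) |= formula, with sigma = (universe, interp)."""
    tag = formula[0]
    if tag == "rel":
        _, name, args = formula
        return tuple(mu[x] for x in args) in interp[name]
    if tag == "eq":
        return mu[formula[1]] == mu[formula[2]]
    if tag == "not":
        return not holds(universe, interp, formula[1], mu)
    if tag == "and":
        return (holds(universe, interp, formula[1], mu)
                and holds(universe, interp, formula[2], mu))
    if tag == "or":
        return (holds(universe, interp, formula[1], mu)
                or holds(universe, interp, formula[2], mu))
    if tag in ("forall", "exists"):
        _, x, body = formula
        # Quantifiers range over the whole (finite) domain.
        results = (holds(universe, interp, body, {**mu, x: a}) for a in universe)
        return all(results) if tag == "forall" else any(results)
    raise ValueError(f"unknown formula tag: {tag}")

# Example structure: the strict order on {0, 1, 2}.
universe = [0, 1, 2]
interp = {"lt": {(0, 1), (0, 2), (1, 2)}}

# forall x. exists y. lt(x, y): fails because 2 has no larger element.
has_successor = ("forall", "x", ("exists", "y", ("rel", "lt", ("x", "y"))))
# exists x. forall y. (x = y or lt(x, y)): witnessed by x = 0.
has_least = ("exists", "x",
             ("forall", "y", ("or", ("eq", "x", "y"), ("rel", "lt", ("x", "y")))))

print(holds(universe, interp, has_successor, {}))  # False
print(holds(universe, interp, has_least, {}))      # True
```

Both example formulas are closed, so the initial assignment can be empty; free variables would have to be bound in `mu` before evaluation.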

2. Algebraic Structure and Universal Characterization

FOFs can be seen as operations on relations, forming multisorted first-order algebras. The algebraic framework introduces a sort for each arity, with sorts interpreted as "the set of all $n$-ary relations" on a base set. The algebraic signature includes constants $0_n, 1_n$, binary operations $\vee_n$, $\wedge_n$, unary $\neg_n$, existential projections $\exists_n$, and variable reindexing via substitutions $\alpha: n \to k$. Optional extensions allow universal quantifiers and explicit equality (Valby, 2014).
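A concrete first-order algebra of relations over a small base set can be sketched as follows. The function names and the direction chosen for reindexing (an $n$-ary relation is pulled back along $\alpha: n \to k$ to a $k$-ary one) are assumptions made for this illustration; the conventions in Valby (2014) may differ in detail.

```python
from itertools import product

U = (0, 1, 2)  # illustrative base set

def top(n):
    """1_n: the full n-ary relation on U."""
    return frozenset(product(U, repeat=n))

def bottom(n):
    """0_n: the empty n-ary relation (n is kept only for symmetry with top)."""
    return frozenset()

def join(r, s):
    return r | s

def meet(r, s):
    return r & s

def neg(r, n):
    """Complement of an n-ary relation inside U^n."""
    return top(n) - r

def exists_last(r):
    """Existential projection: forget the last coordinate (arity n -> n - 1)."""
    return frozenset(t[:-1] for t in r)

def reindex(r, alpha, k):
    """Pull an n-ary relation back along alpha: n -> k.

    alpha is a tuple of length n with entries in range(k); the result is the
    k-ary relation {t in U^k | (t[alpha[0]], ..., t[alpha[n-1]]) in r}.
    """
    return frozenset(t for t in product(U, repeat=k)
                     if tuple(t[i] for i in alpha) in r)

lt = frozenset({(0, 1), (0, 2), (1, 2)})  # binary relation x < y

has_larger = exists_last(lt)                           # exists y. x < y
below_all = neg(exists_last(neg(lt, 2)), 1)            # forall y. x < y
converse = reindex(lt, (1, 0), 2)                      # y < x as a relation in (x, y)
print(sorted(has_larger), sorted(below_all), sorted(converse))
```

The universal quantifier is obtained here as $\neg \exists \neg$, which is one way to read the remark that universal quantification is an optional extension of the signature.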

The universal axioms (bounded distributive lattice, substitution compatibility, compositionality, negation, existential projections, and so on) precisely characterize subalgebras embeddable into concrete first-order algebras. Reducts (e.g., quantifier-free, positive-existential, positive-quantifier-free fragments) are uniformly handled by omitting relevant operation-symbols and the corresponding axioms (Valby, 2014).

The completeness theorem in this algebraic setting states that every algebra satisfying these axioms embeds into a genuine first-order algebra of relations, and models of a theory correspond to algebra-homomorphisms from the free first-order algebra modulo the theory congruence (Valby, 2014).

3. Synthesis and Enumeration of First-Order Formulas

FOF synthesis is central to invariant inference for transition systems and other domains with logical structure. Given a finite, syntactically bounded search space $\Omega_0$ of closed FOFs over $\Sigma$ and a set of sample structures $\sigma = \{M_1, ..., M_k\}$ (here $\sigma$ denotes a set of structures, not a single structure as in Section 1), the synthesis problem seeks the maximally precise subset $\Phi \subseteq \Omega_0$ such that:

Every $\phi \in \Phi$ holds in every sample: $M \models \phi$ for all $M \in \sigma$.
No $\phi \in \Phi$ is a tautology: there exists $M \notin \sigma$ such that $M \not\models \phi$.
No formula in $\Phi$ strictly entails another formula in $\Phi$ (Yang et al., 7 Jan 2026).
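The three conditions can be sketched as a filter over a candidate set. The following is a toy illustration, not the method of the cited paper: candidates are plain Python predicates, and both tautology and entailment are approximated over a finite `pool` of structures, which is an assumption introduced here. When two surviving candidates are comparable, the sketch keeps the stronger one, which is one reading of "maximally precise".

```python
from itertools import product

def select_invariants(candidates, samples, pool):
    """Filter candidates according to the three synthesis conditions.

    candidates: dict mapping a name to a predicate structure -> bool
    samples:    structures that every selected formula must satisfy
    pool:       finite list of structures standing in for "all structures"
    """
    # Model set of each candidate, as indices into the pool.
    models = {name: frozenset(i for i, m in enumerate(pool) if phi(m))
              for name, phi in candidates.items()}
    everything = frozenset(range(len(pool)))
    # (1) holds on every sample; (2) is falsified somewhere in the pool.
    ok = [name for name, phi in candidates.items()
          if all(phi(m) for m in samples) and models[name] != everything]
    # (3) drop a candidate if another survivor strictly entails it.
    return [n for n in ok if not any(models[o] < models[n] for o in ok)]

# Structures: edge sets of directed graphs on the universe {0, 1}.
pairs = [(0, 0), (0, 1), (1, 0), (1, 1)]
pool = [frozenset(p for p, keep in zip(pairs, bits) if keep)
        for bits in product([False, True], repeat=4)]
samples = [frozenset({(0, 1)}), frozenset({(0, 1), (1, 0)})]

candidates = {
    "trivial": lambda E: True,
    "irreflexive": lambda E: all((x, x) not in E for x in (0, 1)),
    "nonempty": lambda E: len(E) > 0,
    "symmetric": lambda E: all((y, x) in E for x, y in E),
    "has_01": lambda E: (0, 1) in E,
}

print(select_invariants(candidates, samples, pool))  # ['irreflexive', 'has_01']
```

Here `trivial` is rejected as a tautology, `symmetric` fails on the first sample, and `nonempty` is dropped because `has_01` strictly entails it over the pool.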

Modern implementations use Answer Set Programming (ASP) to enumerate candidate FOFs, encoding quantifier prefix rules, atom generation, DNF cube assignments, and integrity constraints. Each answer set corresponds to a unique formula in a constrained DNF, typically of the form:

$\bigwedge_{i=1}^e \exists x_i \;\Bigl(\bigvee_{C=1}^{\ell}\;\bigwedge_{(P,A,Pol): \text{lit\_in}(C,P,A,Pol)} (\pm P(A))\Bigr).$
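To give a feel for the size and shape of such a search space, the quantifier-free DNF matrices can be enumerated directly. The sketch below uses plain Python generators rather than ASP, so it is a swapped-in technique for illustration only; it covers just the matrix (no quantifier prefix), and the bounds `max_cubes` and `max_lits` as well as the function names are hypothetical.

```python
from itertools import combinations

def cubes(atoms, max_lits):
    """Yield every nonempty cube with at most max_lits literals.

    A literal is (atom, polarity); a cube is a tuple of literals. Cubes that
    mention the same atom twice are excluded, mirroring an integrity constraint.
    """
    lits = [(a, pol) for a in atoms for pol in (True, False)]
    for size in range(1, max_lits + 1):
        for combo in combinations(lits, size):
            names = [a for a, _ in combo]
            if len(set(names)) == len(names):
                yield combo

def dnf_matrices(atoms, max_cubes, max_lits):
    """Yield every DNF matrix with 1..max_cubes distinct cubes."""
    all_cubes = list(cubes(atoms, max_lits))
    for n in range(1, max_cubes + 1):
        # combinations (not product) so each set of cubes is produced once
        yield from combinations(all_cubes, n)

def render(matrix):
    """Pretty-print a matrix, writing ~ for negation."""
    cube_strs = [" & ".join(a if pol else f"~{a}" for a, pol in cube)
                 for cube in matrix]
    return " | ".join(f"({c})" for c in cube_strs)

atoms = ["p(x)", "q(x)"]
matrices = list(dnf_matrices(atoms, max_cubes=2, max_lits=2))
print(len(list(cubes(atoms, 2))))  # 8
print(len(matrices))               # 36
print(render(matrices[-1]))
```

Even with two atoms and two cubes the space already has 36 matrices, which is why the pruning and slicing strategies described next matter in practice.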

Orthogonal-slice enumeration partitions the search space into "clause" and "full" DNF dimensions, supporting incremental candidate pruning and efficient slicing. This is reported to yield major reductions in enumeration complexity (e.g., from $O(2^n)$ down to $O(2^k) + O(2^m)$ with $k \ll n$) (Yang et al., 7 Jan 2026).

4. Formula Properties: Variable Dependence and Simplification

A critical property of FOFs is variable (non-)dependence, formalized as follows: for a formula $\varphi(x_1, ..., x_n)$ and a constraint $\psi(x_1, ..., x_n)$ in a structure $\mathcal{M}$, $\varphi$ is non-dependent on $x_i$ provided $\psi$ if and only if, for every assignment $s$ with $\mathcal{M}, s \models \psi$,

$\mathcal{M}, s[x_i \mapsto a] \models \varphi \iff \mathcal{M}, s[x_i \mapsto b] \models \varphi$

for all elements $a, b$ of the domain of $\mathcal{M}$ (Lefever et al., 28 Jan 2025). This notion is closed under Boolean connectives and quantifiers. Its main application is the syntactic simplification of convoluted formulas, especially those arising from mechanized theory translations, by safely pulling out and eliminating quantifiers when non-dependence is provable under side conditions (Lefever et al., 28 Jan 2025).

Concretely, when a translation introduces blocks of quantifiers over variables that do not affect the truth value under the given constraints, pulling those quantifiers out and eliminating them is justified. This simplifies both purely logical and domain-specific FOFs (Lefever et al., 28 Jan 2025).
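On a finite structure, non-dependence provided a constraint can be checked by brute force, which makes the definition concrete. The sketch below assumes that $\varphi$ and $\psi$ are given as Python predicates over assignments; the function name `non_dependent` and the example formulas are hypothetical, and the cited work establishes non-dependence by proof rather than by enumeration.

```python
from itertools import product

def non_dependent(universe, variables, phi, psi, xi):
    """Check that phi is non-dependent on variable xi provided psi.

    phi, psi: predicates taking an assignment (dict variable -> element).
    For every assignment s satisfying psi, the truth value of phi must be
    the same for all values substituted for xi.
    """
    for values in product(universe, repeat=len(variables)):
        s = dict(zip(variables, values))
        if not psi(s):
            continue
        # psi is tested on s only; phi is tested on every variant s[xi -> a].
        outcomes = {phi({**s, xi: a}) for a in universe}
        if len(outcomes) > 1:
            return False
    return True

universe = range(4)
variables = ["x", "y"]
phi = lambda s: s["x"] == 0 or s["y"] < 2   # phi(x, y): x = 0 or y < 2

# Under the constraint x = 0, phi is true whatever y is.
print(non_dependent(universe, variables, phi, lambda s: s["x"] == 0, "y"))  # True
# Without a constraint, phi depends on y (take x = 1).
print(non_dependent(universe, variables, phi, lambda s: True, "y"))         # False
```

In the first case a quantifier over `y` in front of `phi` could be dropped whenever the side condition `x = 0` is known to hold.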

5. Abstract Domains and Efficient Symbolic Manipulation

The scalability and automation of reasoning with FOFs require efficient data structures and algorithms for manipulating large sets of formulas. In the context of abstract interpretation, sets of FOFs (closed under particular quantifier alternation patterns) can be represented as antichains of canonical formulas, minimal with respect to a syntactic subsumption relation $\sqsubseteq_L$ that under-approximates semantic entailment (Frenkel et al., 2024). Canonicalization sorts subformulas, drops duplicates in conjunctions and disjunctions, and normalizes quantifier blocks.
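The antichain idea can be sketched on a much simpler language than the quantified one used in the cited work. The example below assumes formulas are quantifier-free clauses (disjunctions of literal strings), takes the canonical form to be a sorted, duplicate-free tuple, and uses literal-set inclusion as the syntactic subsumption relation; all names are hypothetical.

```python
def canonical(literals):
    """Canonical form of a clause: sorted literals with duplicates dropped."""
    return tuple(sorted(set(literals)))

def subsumes(c, d):
    """c is below d: every literal of c occurs in d, so c syntactically entails d."""
    return set(c) <= set(d)

def insert(antichain, clause):
    """Add a clause while keeping only subsumption-minimal elements."""
    c = canonical(clause)
    if any(subsumes(d, c) for d in antichain):
        return antichain  # c is redundant: something at least as strong is present
    # c is new and may make weaker clauses redundant.
    return {d for d in antichain if not subsumes(c, d)} | {c}

chain = set()
chain = insert(chain, ["q", "p", "p"])   # stored as ('p', 'q')
chain = insert(chain, ["r", "q", "p"])   # redundant: ('p', 'q') subsumes it
chain = insert(chain, ["s"])
print(sorted(chain))                     # [('p', 'q'), ('s',)]
chain = insert(chain, ["p"])             # stronger than ('p', 'q'), which is evicted
print(sorted(chain))                     # [('p',), ('s',)]
```

Inclusion of literal sets implies entailment but not conversely, which is the sense in which a syntactic subsumption relation under-approximates semantic entailment.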

Abstract join (meet in the lattice of upward-closed sets) is implemented by a "weakening" operation: given a formula $\phi$ and state $s$, $W_L(\phi, s)$ computes the minimal formulas $\psi \sqsupseteq_L \phi$ such that $s \models \psi$; in-place updates enable highly efficient fixpoint computation. Practical algorithms (e.g., LSet) support symbolic handling of spaces with $|L| \sim 10^{11}$ without explicit enumeration, as demonstrated on quantified invariants for the Paxos protocol (Frenkel et al., 2024).
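Weakening can be illustrated in a toy propositional setting. The sketch assumes formulas are clauses represented as frozensets of `(atom, polarity)` literals and states are dicts from atoms to truth values; a clause falsified by a state is weakened by adding one literal that the state makes true. This is a simplified stand-in for $W_L$, not the LSet algorithm, and the names are hypothetical.

```python
def satisfies(state, clause):
    """A state satisfies a clause if it agrees with at least one literal."""
    return any(state[a] == pol for a, pol in clause)

def weaken(clause, state):
    """Minimal clauses above `clause` (by literal inclusion) that hold in `state`."""
    if satisfies(state, clause):
        return {clause}
    # Add one literal true in the state; skip additions that create a tautology.
    return {clause | {(a, v)} for a, v in state.items()
            if (a, not v) not in clause}

def minimize(clauses):
    """Keep only inclusion-minimal clauses (an antichain)."""
    return {c for c in clauses if not any(d < c for d in clauses)}

def join_state(clauses, state):
    """Abstract join with one state: weaken every clause, then re-minimize."""
    return minimize({w for c in clauses for w in weaken(c, state)})

# Start from the empty clause (false) and absorb two observed states.
current = {frozenset()}
for state in ({"p": False, "q": True}, {"p": True, "q": True}):
    current = join_state(current, state)
    print(sorted(sorted(c) for c in current))
```

After the first state the set holds the unit clauses `~p` and `q`; the second state falsifies `~p`, whose only non-tautological weakening is subsumed by `q`, so only `q` remains as a candidate invariant of both states.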

6. Rank, Degree, and Genericity of First-Order Formulas

FOFs can be further classified by their expressive and structural complexity within classes of theories, structures, or isomorphism types. The Sudoplatov–Morley-style rank $\mathrm{RS}(T)$ and degree $\mathrm{ds}(T)$ generalize traditional Morley rank/degree to arbitrary definable sets of theories. For a property $P \subseteq \mathrm{Ty}$ and formula $\phi$,

$\mathrm{rank}_P(\phi) := \mathrm{RS}(P_\phi), \qquad \mathrm{deg}_P(\phi) := \mathrm{ds}(P_\phi),$

where $P_\phi = \{T \in P \mid \phi \in T\}$ (Sudoplatov, 2021). The (rank, degree) of $\phi$ measures the depth and width of definable branching induced by $\phi$ within $P$.

A formula is $P$-generic if it achieves maximal rank and, when the rank is finite, maximal degree. This framework enables a fine-grained analysis of FOFs and their role in stratifying spaces of models/theories, with applications to classification theory, finite model theory, and expressiveness quantification (Sudoplatov, 2021).

7. Applications, Integration, and Extensions

FOFs are foundational for invariant inference in program analysis and distributed protocol verification. State-of-the-art synthesis frameworks integrate data-driven FOF synthesis (e.g., by ASP) with symbolic abstract interpretation based on canonical formula representations, enabling scalable and extensible analyses. Orthogonal-slice enumeration and aggressive pruning can be composed with other inference tools (e.g., Flyvy, DuoAI), which allows modular optimizations and is reported to substantially reduce both candidate set sizes and invariant description length without sacrificing completeness (Yang et al., 7 Jan 2026).

Algebraic perspectives tie together syntactic descriptions, model-theoretic properties, and computational manipulation, while ongoing work on variable dependence, rank-degree stratification, and practical symbolic abstraction continues to expand the utility and scalability of first-order formula reasoning across logic, verification, and knowledge representation (Frenkel et al., 2024, Valby, 2014, Lefever et al., 28 Jan 2025, Sudoplatov, 2021).