FO(SUM) & IFP(SUM): Logic with Summation
- FO(SUM) is a logical framework that extends first-order logic by introducing a summation operator to aggregate numerical values based on logical conditions.
- IFP(SUM) enhances FO(SUM) with an inflationary fixed-point operator, enabling recursive, unbounded aggregations crucial for applications like neural network evaluation on graphs.
- The approach underpins efficient query evaluation in algebraic difference fields and recursive Datalog, balancing expressiveness with tractable computational complexity.
First-order logic with summation (FO(SUM)) and its recursive extension IFP(SUM) are foundational logics for querying weighted finite structures, notably enabling direct expression of numerical aggregation and iterative computation on graphs, difference fields, and machine learning models. FO(SUM) extends classical first-order logic by introducing a summation operator within terms, yielding aggregate values based on logical conditions. IFP(SUM) further extends FO(SUM) with the inflationary fixed-point operator, which permits unbounded recursion over aggregates. This is crucial for expressing computations such as neural network evaluation at arbitrary depth and complex recursive summation in algebraic domains.
1. Syntax and Semantics of FO(SUM) and IFP(SUM)
FO(SUM) augments standard first-order logic over a finite weighted vocabulary with built-in numeric constants, the arithmetic operations $+, -, \cdot, \div$, and the ordering $\le$. Its grammar distinguishes formulas and terms. The critical extension is the term-forming operator $\sum_{\bar{x} : \varphi} \theta$, which denotes the sum, over all tuples $\bar{x}$ in the universe satisfying the formula $\varphi$, of the values of the term $\theta$.
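The semantics of a summation term can be sketched in a few lines of Python. This is only an illustration under assumptions: the helper `sum_term`, the three-element universe, the edge relation `E`, and the weight function `w` are all hypothetical names chosen here, not notation from the cited work.

```python
from fractions import Fraction
from itertools import product


def sum_term(universe, arity, condition, value):
    """Sum value(t) over all tuples t of the given arity that satisfy condition(t)."""
    total = Fraction(0)
    for tup in product(universe, repeat=arity):
        if condition(*tup):
            total += value(*tup)
    return total


# Hypothetical weighted structure: a small directed graph with rational edge weights.
universe = ["a", "b", "c"]
w = {("a", "b"): Fraction(1, 2), ("a", "c"): Fraction(3), ("b", "c"): Fraction(2)}


def E(x, y):
    """Edge relation of the example graph."""
    return (x, y) in w


def out_weight(x):
    """The term 'sum over all y with E(x, y) of w(x, y)', with x as a free variable."""
    return sum_term(universe, 1, lambda y: E(x, y), lambda y: w[(x, y)])


# A closed term: the total weight of all edges.
total_weight = sum_term(universe, 2, E, lambda x, y: w[(x, y)])
```

An empty condition yields the sum 0, so `out_weight("c")` is defined even though `c` has no outgoing edges.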
IFP(SUM) introduces the operator $\mathrm{ifp}\big(F(\bar{x}) \leftarrow \theta\big)$. Here, $F$ is an intensional symbol updated via $\theta$ at each iteration. The inflationary semantics fix the values of $F$ once they become defined (non-$\bot$). This process, monotone and terminating on finite universes, yields the least inflationary fixed point, which ensures the definability of recursive aggregations (Grohe, 14 Jan 2026).
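The iteration behind an inflationary fixed point can be sketched as follows. This is a minimal illustration, assuming that "undefined" is modelled by a missing dictionary key and that the update rule is passed in as a Python function; `inflationary_fixed_point`, `step`, and the path example are hypothetical names, not the operator's formal definition.

```python
def inflationary_fixed_point(universe, step):
    """Iterate `step` until no further element receives a value.

    F maps elements to values; a missing key plays the role of "undefined".
    step(F, x) returns a value or None. Once F[x] is defined it is never changed.
    """
    universe = list(universe)
    F = {}
    while True:
        # All updates of one stage are computed from the previous stage only.
        new = {}
        for x in universe:
            if x not in F:
                v = step(F, x)
                if v is not None:
                    new[x] = v
        if not new:
            return F
        F.update(new)


# Hypothetical example: the directed path 0 -> 1 -> 2 -> 3, given by predecessors.
pred = {1: 0, 2: 1, 3: 2}


def step(F, x):
    """Distance from the source: 0 at the source, else predecessor's value plus 1."""
    if x not in pred:
        return 0
    p = pred[x]
    return F[p] + 1 if p in F else None


depth = inflationary_fixed_point(range(4), step)
```

Each round defines at least one new element or stops, so the loop runs at most one round more than the size of the universe.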
2. FO(SUM) and IFP(SUM) in Algebraic Difference Fields
In the context of P-recursive difference fields, FO(SUM) generalizes indefinite summation over sequences defined by linear recurrence relations. An element of the field is a rational function in both the base variable and the indeterminates that encode the recursion shifts. The summation problem, formalized as finding $g$ such that $\sigma(g) - g = f$ for a given $f$ (where $\sigma$ is the shift automorphism), is solved in the following steps:
- Factoring the denominator into "normal" and "special" polynomials (an irreducible factor is special if it divides one of its own nontrivial shifts, and normal otherwise).
- Predicting the normal denominator part via a gcd formula derived from dispersion properties.
- Addressing the numerator via a degree-bounded polynomial ansatz, yielding a coefficient system solved by linear algebra over the field of constants.
- Handling special denominator factors heuristically or by structural theorems (Galois-theoretic arguments show linearity in specific domains).
This methodology enables indefinite summation in nontrivial difference fields, closely paralleling the tower construction for parallel integration (Chen et al., 2024).
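The ansatz step can be illustrated in the simplest possible setting: the plain shift $x \mapsto x + 1$ on polynomials over the rationals. The sketch below is not the P-recursive algorithm of Chen et al.; it only shows how a degree-bounded ansatz for $g$ in $g(x+1) - g(x) = f(x)$ turns into a linear system for the unknown coefficients. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def polynomial_antidifference(f):
    """Return coefficients of a polynomial g with g(x+1) - g(x) = f(x).

    f is a coefficient list [f_0, f_1, ..., f_d], lowest degree first.
    Ansatz: g = a_1 x + ... + a_{d+1} x^{d+1}; the constant term is free and set to 0.
    """
    d = len(f) - 1
    a = [Fraction(0)] * (d + 2)
    # The coefficient of x^j in g(x+1) - g(x) is the sum over k > j of C(k, j) * a_k.
    # The resulting linear system is triangular, so solve from the top degree down.
    for j in range(d, -1, -1):
        rhs = Fraction(f[j]) - sum(comb(k, j) * a[k] for k in range(j + 2, d + 2))
        a[j + 1] = rhs / comb(j + 1, j)
    return a


def evaluate(coeffs, x):
    """Evaluate a polynomial given by its coefficient list at x."""
    return sum(c * x**i for i, c in enumerate(coeffs))


# f(x) = x^2 gives g(x) = x/6 - x^2/2 + x^3/3.
g = polynomial_antidifference([0, 0, 1])
```

Because the constant term of `g` is set to 0, `evaluate(g, n)` equals the sum of `f(0), ..., f(n-1)`, which is the usual reading of an indefinite sum.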
3. Recursive Aggregation and Datalog: Fixpoint Semantics
Recursive extensions of FO(SUM), notably IFP(SUM), relate directly to least-fixpoint semantics in recursive Datalog programs with aggregates. Aggregates like SUM, MIN, MAX, and COUNT are incorporated via monotonic "msum" and "mcount" constructions over set-containment lattices, where monotonic mappings guarantee a unique least fixpoint by the Knaster-Tarski theorem.
Optimization exploits monotonicity: partial sums are generated (via msum), and standard SUM is recovered by group-wise max. Implementation leverages semi-naïve differential evaluation and aggregate-pushing, where final max constraints are enforced in recursive rules, yielding efficient computation and good performance scaling, as demonstrated by BigDatalog on large industrial graphs (Zaniolo et al., 2017).
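The combination of semi-naïve evaluation and aggregate pushing can be sketched on the familiar shortest-path program. Note the substitution: this example pushes MIN rather than the max-over-msum construction described above, because it is the shortest instance of the same idea. It is a hand-written Python loop under the assumption of non-negative edge weights, not BigDatalog code, and all names are hypothetical.

```python
def shortest_paths_seminaive(edges, source):
    """Least-fixpoint evaluation of the recursive program

        dist(source, 0).
        dist(Y, min<D + W>) <- dist(X, D), edge(X, Y, W).

    with the MIN aggregate applied inside the recursion.
    Assumes non-negative edge weights.
    """
    best = {source: 0}
    delta = {source: 0}  # facts that are new or improved in the previous round
    while delta:
        new_delta = {}
        for (x, y, w) in edges:
            if x in delta:  # semi-naive: join only against the delta
                cand = delta[x] + w
                # Pushed aggregate: keep a candidate only if it beats the current minimum.
                if cand < best.get(y, float("inf")) and cand < new_delta.get(y, float("inf")):
                    new_delta[y] = cand
        best.update(new_delta)
        delta = new_delta
    return best


# Hypothetical example graph as (from, to, weight) triples.
edges = [("a", "b", 4), ("a", "c", 1), ("c", "b", 2), ("b", "d", 1)]
dist = shortest_paths_seminaive(edges, "a")
```

Without the pushed comparison, every path length would be materialized before a final MIN is applied; with it, dominated facts are discarded as soon as they are derived.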
4. Expressiveness and Illustrative Applications
FO(SUM) can express only queries of bounded recursion depth. IFP(SUM) extends this to unbounded recursion, which is crucial for evaluating deep feed-forward neural networks and recursive queries on weighted graphs. Notable examples include:
- Squaring-on-a-path: For a directed path, an IFP(SUM) query generates values that are doubly exponential in the path length via nested summation and squaring.
- Neural network evaluation: IFP(SUM) succinctly expresses node-wise value computation with ReLU activation using recursive aggregation over in-neighbors, allowing precise definition of network output irrespective of depth.
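The neural network example can be sketched as a stage-wise computation over in-neighbors. The code below assumes a network given as a weighted directed acyclic graph with ReLU applied at every non-input node (an assumption made here for brevity; real architectures often leave the output layer linear). The function `evaluate_fnn` and the example network are hypothetical.

```python
from fractions import Fraction


def evaluate_fnn(inputs, bias, weights):
    """Evaluate a feed-forward network given as a weighted DAG.

    inputs:  dict mapping each input node to its value
    bias:    dict mapping each non-input node to its bias
    weights: dict mapping each edge (u, v) to its weight
    """
    in_nbrs = {}
    for (u, v) in weights:
        in_nbrs.setdefault(v, []).append(u)

    val = dict(inputs)  # input nodes are defined at stage 0
    changed = True
    while changed:
        changed = False
        stage = dict(val)  # freeze the current stage
        for v in bias:
            if v in stage:
                continue  # inflationary: a defined value is never recomputed
            preds = in_nbrs.get(v, [])
            if all(u in stage for u in preds):
                pre = bias[v] + sum(weights[(u, v)] * stage[u] for u in preds)
                val[v] = max(Fraction(0), pre)  # ReLU activation
                changed = True
    return val


# Hypothetical network: two inputs, two hidden nodes, one output.
inputs = {"x1": Fraction(1), "x2": Fraction(-2)}
bias = {"h1": Fraction(0), "h2": Fraction(1), "out": Fraction(-1, 2)}
weights = {
    ("x1", "h1"): 1, ("x2", "h1"): 1,
    ("x1", "h2"): 2, ("x2", "h2"): Fraction(1, 2),
    ("h1", "out"): 3, ("h2", "out"): 1,
}
values = evaluate_fnn(inputs, bias, weights)
```

The loop never refers to the number of layers: a node receives its value in the first stage in which all of its in-neighbors are defined, which is why the same definition works for any depth.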
However, IFP(SUM) does not capture all polynomial-time queries in the classical sense; certain model-agnostic queries on feed-forward neural networks (FNNs) are not definable even with binary weights. Over ordered weighted structures, Grädel and Meer established that IFP(SUM) captures P in the Blum-Shub-Smale real-RAM model. The scalar fragment sIFP(SUM), which forbids intensional symbols in products or quotients, retains polynomial-time evaluation on Turing machines (Grohe, 14 Jan 2026).
5. Computational Complexity of Query Evaluation
For closed FO(SUM) expressions on rational weighted structures, the data complexity is in uniform $\mathsf{TC}^0$. This is formalized:
Theorem 4.10: For every closed FO(SUM) expression, its evaluation can be computed by a dlogtime-uniform family of threshold circuits of bounded depth and polynomial size (Grohe, 14 Jan 2026).
In contrast, unrestricted IFP(SUM) may iterate to doubly-exponential values, which cannot be computed in polynomial time for general inputs. Restriction to sIFP(SUM) ensures data complexity in P:
Theorem 5.11: The data complexity of sIFP(SUM) is in P (Grohe, 14 Jan 2026).
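A short bit-count makes the gap concrete. The first recursion squares an intensional value, as in the squaring-on-a-path example; the second only scales by a fixed integer constant $c \ge 2$. The second recursion is meant only to suggest the kind of growth that the scalar restriction leaves available. It is an illustrative assumption, not the argument behind Theorem 5.11.

```latex
\begin{align*}
a_0 &= 2, & a_{i+1} &= a_i^2 & &\Longrightarrow & a_n &= 2^{2^n}, & \text{bit length } & 2^n + 1, \\
b_0 &= 2, & b_{i+1} &= c \, b_i & &\Longrightarrow & b_n &= 2 c^n, & \text{bit length } & O(n \log c).
\end{align*}
```

After $n$ iterations the squared value needs exponentially many bits merely to be written down, whereas the scaled value stays polynomial in $n$.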
A plausible implication is that sIFP(SUM) represents the tractable fragment of IFP(SUM), guiding practical query design in database and ML contexts.
6. Extensions, Limitations, and Connections to Related Frameworks
The indefinite summation algorithm for P-recursive extensions is incomplete only in predicting the special denominator part; normal denominator prediction is guaranteed by a polynomial gcd formula. Special factor enumeration is generally feasible only when bounded (e.g., linear in indeterminates due to Galois structure). For recursive Datalog with SUM, monotonic aggregates and max-pushing achieve superior performance and scalability compared to naïve stratified evaluation, as evidenced empirically.
FO(SUM) and IFP(SUM) are part of a broader landscape connecting descriptive complexity, algebraic summation, and fixpoint computation. Their limitations, namely expressiveness (bounded in the classical sense), dependence on structural constraints for tractability, and incomplete handling of special factors, motivate continued investigation into recursion, aggregation, and query language power across domains (Chen et al., 2024, Zaniolo et al., 2017, Grohe, 14 Jan 2026).