Object-Centric FPR Evaluation
- The paper introduces an object-centric FPR evaluation that quantifies the fraction of false positive context-activity pairs allowed by the model but not observed in logs.
- It explains an algorithmic procedure involving context extraction, event grouping, breadth-first replay, and set comparison for precise FPR calculation.
- The study highlights practical recommendations to mitigate challenges like state-space explosion and log noise through sampling and object abstraction.
Object-centric false positive rate (FPR) evaluation is a rigorous approach to quantifying the quality of process models—specifically object-centric Petri nets—in settings where multiple, interacting case notions are present. Unlike traditional process mining, which relies on a single case notion and sequential events, object-centric frameworks accommodate concurrent, interdependent objects, necessitating more sophisticated metrics for assessing model precision and error rates. In this context, FPR reflects the proportion of possible (context, activity) pairs allowed by the process model that are not actually observed in the event log, thereby indicating the model’s tendency to permit unobserved (false positive) behavior (Adams et al., 2021).
1. Formal Definitions of Object-Centric Precision and FPR
Both the log and the model are characterized in terms of "one-step continuations" under a given context. An object-centric Petri net uses places colored by object types, variable arcs, an initial marking M_0, and a set of final markings. The model's behavioral language is defined as the set of all pairs (C, a) where, after replaying some binding sequence from M_0, a marking with context C is reached at which some enabled transition labeled a can fire.
The log language is the set of all pairs (C, a) such that some event with activity a in the object-centric event log occurs in the context C.
False positives are those (context, activity) pairs allowed by the model but not observed in the log, i.e., the difference between the model language and the log language. Precision is then the fraction of model-allowed pairs that also occur in the log, and FPR is its complement: the fraction of model-allowed pairs absent from the log. In practice an event-wise average is computed: for each event e, let en_L(e) denote the activities observed in the log under e's context and en_N(e) the activities the model enables in that context.
The per-event precision is then |en_L(e) ∩ en_N(e)| / |en_N(e)|, with per-event FPR as its complement, averaged over the event set (Adams et al., 2021).
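The per-event arithmetic can be sketched in a few lines of Python (function names are illustrative, not part of any published tooling); the example reuses the en_L/en_N values from the paper's running example:

```python
from typing import Dict, List, Set

def per_event_precision(en_log: Set[str], en_model: Set[str]) -> float:
    """Precision for one event: share of model-enabled activities
    in its context that the log also enables there."""
    if not en_model:
        return 0.0  # context not replayable; callers typically skip it
    return len(en_log & en_model) / len(en_model)

def average_fpr(events: List[Dict[str, Set[str]]]) -> float:
    """Event-wise average FPR = 1 - mean per-event precision."""
    precisions = [per_event_precision(e["log"], e["model"]) for e in events]
    return 1.0 - sum(precisions) / len(precisions)

# Single event matching the paper's running example:
e = {"log": {"Lift off"}, "model": {"Lift off", "Pick up @ dest"}}
print(average_fpr([e]))  # 0.5
```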
2. Algorithmic Procedure for FPR Calculation
The recommended computational workflow involves:
- Context Extraction: Compute each event’s context by constructing the event–object graph and extracting multisets of object prefixes as described in Definition 3 of the referenced paper.
- Event Grouping by Context: Aggregate events that share the same context, collating the corresponding visible binding sequences from the log.
- Breadth-First Replay: For each context and its binding sequence, perform a breadth-first replay on the object-centric net. This involves:
- Initializing from the initial marking M_0 and enqueueing the state (M_0, remaining visible bindings).
- While the pending binding list is nonempty, firing all silent transitions exhaustively; for each enabled visible binding, firing it and re-enqueueing the resulting state. When the visible binding list is empty, collecting all enabled visible transitions at each reachable marking.
- Set Comparison: For each context, compare the set of model-enabled activities with the set of log-enabled activities to compute per-context (event) precision and FPR.
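The replay loop can be illustrated on a deliberately simplified net. The sketch below ignores object types and variable arcs (a plain Petri net with one silent transition, with invented place and activity names), but shows the core pattern: breadth-first exploration of the silent-transition closure, collecting the visible labels enabled at each reachable marking:

```python
from collections import Counter, deque

# Transition = (label_or_None, pre_places, post_places); None marks a silent step.
TRANSITIONS = [
    (None, Counter({"p1": 1}), Counter({"p2": 1})),          # silent
    ("Lift off", Counter({"p2": 1}), Counter({"p3": 1})),
    ("Pick up @ dest", Counter({"p2": 1}), Counter({"p4": 1})),
]

def enabled(marking, pre):
    return all(marking[p] >= n for p, n in pre.items())

def fire(marking, pre, post):
    m = marking - pre  # Counter subtraction keeps only positive counts
    m.update(post)
    return m

def enabled_visible(marking):
    """BFS over the silent-transition closure of `marking`, returning
    every visible label enabled at some reachable marking."""
    seen, queue, labels = set(), deque([marking]), set()
    while queue:
        m = queue.popleft()
        key = frozenset(m.items())
        if key in seen:
            continue
        seen.add(key)
        for label, pre, post in TRANSITIONS:
            if not enabled(m, pre):
                continue
            if label is None:                  # silent: explore further
                queue.append(fire(m, pre, post))
            else:                              # visible: record label
                labels.add(label)
    return labels

print(sorted(enabled_visible(Counter({"p1": 1}))))  # ['Lift off', 'Pick up @ dest']
```

The `seen` set of frozen markings prevents silent loops from causing nontermination on this bounded example; on real object-centric nets the state space can still explode, as discussed below.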
Data structures essential for efficiency include:
- Hash maps from contexts to event lists.
- Queues of replay states (marking plus remaining bindings).
- Sets to store enabled activities per context.
Context building is the cheaper step, bounded in the worst case by the size of the event–object graph; replay complexity is exponential in the number of silent-transition loops and in the combinatorics of object-variable bindings (Adams et al., 2021).
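The grouping data structures can be sketched as follows; the context encoding here (frozensets of object-prefix tuples) is a hashable stand-in for the paper's multiset-of-prefixes definition, and all event data is invented for illustration:

```python
from collections import defaultdict

# Toy events: each carries an id, an activity, and a hashable context.
events = [
    {"id": "e1", "activity": "Lift off", "context": frozenset({("plane", ("Check in",))})},
    {"id": "e2", "activity": "Lift off", "context": frozenset({("plane", ("Check in",))})},
    {"id": "e3", "activity": "Load", "context": frozenset({("baggage", ())})},
]

# Hash map from context to the events sharing it.
by_context = defaultdict(list)
for e in events:
    by_context[e["context"]].append(e["id"])

# Log-enabled activities per context: the activities observed under it.
en_log = defaultdict(set)
for e in events:
    en_log[e["context"]].add(e["activity"])

print(len(by_context))  # 2 distinct contexts
```

Grouping first means each distinct context is replayed once, rather than once per event.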
3. Illustrative Example
A core exemplification involves the "flight-and-baggage" event log and the object-centric Petri net discovered from it. For a representative event ("Lift off" for plane p1):
- The computed context comprises prefixes for both plane and baggage objects.
- Replay of this context in the net results in four reachable markings due to silent transitions.
- At each marking, both "Lift off" and "Pick up @ dest" are model-enabled, while only "Lift off" is observed in the log given this context.
- Hence, for this event e, en_L(e) = {"Lift off"} and en_N(e) = {"Lift off", "Pick up @ dest"}, yielding per-event precision 0.5 and per-event FPR 0.5.
- Aggregation across all events of the log yields overall fitness 1.0 (complete replayability) and precision ≈ 0.89 (Adams et al., 2021).
4. Interpretability, Sensitivity, and Limitations
This object-centric FPR metric generalizes traditional escaping-edges precision in the single-case setting, admitting the interpretation that the FPR quantifies the fraction of model-allowed continuations not seen in the log. For logs with a single object type and one object per event, the definitions reduce to standard escaping-edges measures.
Notably, contexts that cannot be replayed on the model are skipped in the averaging, which may undermine the reliability of the metric, especially under high log noise or process deviations. A plausible implication is that extensive skipping makes the metric less representative of the model’s overall precision.
Limitations center on state-space explosion from silent transition loops and variable-arc combinatorics, which can render full replay computationally infeasible on large or highly nondeterministic models. The prevalence of generic or re-deployed objects (e.g., equipment used across disparate cases) can create uninformative long-prefix contexts, deflating measured precision.
Approximate replay or random binding sampling is proposed as mitigation, as is object abstraction to avoid overly generic objects. Object-centric alignments for contexts that fail strict replay are suggested as an area for further extension (Adams et al., 2021).
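Random binding sampling might be sketched as follows; this is an illustrative approximation under assumed data structures (silent transitions as pre/post place multisets), not the paper's algorithm. Instead of exhaustively closing over silent transitions, it takes a bounded number of random silent-firing walks:

```python
import random
from collections import Counter

def sample_reachable_markings(marking, silent_steps, n_walks=10, max_depth=5, seed=0):
    """Approximate the silent-transition closure of `marking` by
    n_walks random walks of at most max_depth silent firings each."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    found = {frozenset(marking.items())}
    for _ in range(n_walks):
        m = Counter(marking)
        for _ in range(max_depth):
            options = [(pre, post) for pre, post in silent_steps
                       if all(m[p] >= k for p, k in pre.items())]
            if not options:
                break
            pre, post = rng.choice(options)
            m = m - pre
            m.update(post)
            found.add(frozenset(m.items()))
    return found

# A silent loop between two places: exhaustive closure would revisit
# these markings forever without a seen-set; sampling stays bounded.
silent = [(Counter({"p1": 1}), Counter({"p2": 1})),
          (Counter({"p2": 1}), Counter({"p1": 1}))]
print(len(sample_reachable_markings(Counter({"p1": 1}), silent)))  # 2
```

The trade-off is the usual one for sampling: enabled activities may be missed at unexplored markings, biasing precision upward, so the sampling budget should be reported alongside the metric.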
5. Relation to Traditional Quality Metrics
When restricted to a single case notion, the object-centric FPR and precision collapse to their traditional analogues used in process mining, such as the escaping-edges metric. However, object-centric formalism uniquely accounts for multiple, interacting case notions and their induced context dependencies, which are not captured by single-case metrics. This supports fairer and more insightful evaluation of models discovered from real-world processes where multiple entities interact.
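The single-case reduction can be made concrete with a toy example: with one object per event, a context collapses to the trace prefix, and per-event precision becomes the classic escaping-edges ratio. The model table and activity names below are invented for illustration:

```python
from collections import defaultdict

# Toy deterministic model as a map from trace prefix to enabled activities.
model_enabled = {
    (): {"a"},
    ("a",): {"b", "c"},  # model allows b or c after a ...
}
log_traces = [("a", "b"), ("a", "b")]  # ... but the log only ever does b

# Log-enabled activities per prefix (the single-case "context").
log_enabled = defaultdict(set)
for trace in log_traces:
    for i, act in enumerate(trace):
        log_enabled[trace[:i]].add(act)

# Per-event precision = |en_L ∩ en_N| / |en_N|, averaged over all events.
precisions = []
for trace in log_traces:
    for i in range(len(trace)):
        prefix = trace[:i]
        en_n, en_l = model_enabled[prefix], log_enabled[prefix]
        precisions.append(len(en_l & en_n) / len(en_n))

print(sum(precisions) / len(precisions))  # 0.75
```

The unused branch "c" is the escaping edge: each event after prefix ("a",) scores 0.5, pulling the average precision below 1.0 exactly as an escaping-edges measure would.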
6. Practical Recommendations and Best Practices
Effective empirical application of object-centric FPR computation necessitates several practices:
- Pre-filtering or abstraction of ubiquitous objects to prevent artificial context inflation.
- Restriction or sampling of silent transition closures in highly looped models to manage computational demands.
- Supplementing skipped contexts with object-centric conformance alignments where feasible, enhancing metric robustness.
- Transparent reporting of both fitness and precision (or FPR), as well as the proportion of skipped contexts.
These practices allow for sound, reproducible quantification of false-positive behavior in object-centric process mining, enabling principled evaluation and improvement of process discovery techniques (Adams et al., 2021).
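A minimal reporting sketch, with illustrative field names (not from the paper), that keeps fitness, precision/FPR, and the skipped-context fraction together as recommended above:

```python
def report(event_precisions, n_skipped, n_total, fitness):
    """Bundle fitness, precision/FPR, and the share of contexts skipped
    because they could not be replayed, so none is reported in isolation."""
    precision = sum(event_precisions) / len(event_precisions)
    return {
        "fitness": fitness,
        "precision": round(precision, 3),
        "fpr": round(1 - precision, 3),
        "skipped_context_fraction": round(n_skipped / n_total, 3),
    }

print(report([1.0, 0.5, 1.0], n_skipped=1, n_total=4, fitness=1.0))
# {'fitness': 1.0, 'precision': 0.833, 'fpr': 0.167, 'skipped_context_fraction': 0.25}
```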