Parallel Semantics PDG (PS-PDG)
- Parallel Semantics PDG is a framework that extends traditional PDGs by incorporating hierarchical nodes and explicit node traits to accurately model parallel constructs.
- It introduces mechanisms for annotated data-selector semantics on edges and supports parallel semantic variables to handle reductions and privatization.
- Empirical evaluations demonstrate that PS-PDG expands scheduling options and reduces critical path lengths by up to twofold, improving parallel compiler optimization.
The Parallel Semantics Program Dependence Graph (PS-PDG) is a formal extension of the Program Dependence Graph (PDG) designed to model, analyze, and optimize explicitly parallel intermediate representations (IRs) in modern compilers. While classical PDGs capture the minimal set of control and data constraints needed to guarantee semantic equivalence in sequential programs, the PS-PDG introduces additional mechanisms to handle the semantics of parallel constructs, including fork–join, task and loop parallelism, reductions, atomics, and scoped or hierarchical contexts. This decouples the description of semantic constraints from the parallel execution plan, enabling compilers to generate a broader set of valid parallel schedules while strictly preserving program correctness (Homerding et al., 2024).
1. Formal Definition
The PS-PDG is formally defined as a 4-tuple

PS-PDG = (N, E, V, U)

where:
- N: Set of nodes. Each node is either a Plain node (grouping IR instructions) or a Hierarchical node (grouping child nodes), annotated with traits drawn from {atomic, orderless, singular} × C, where C is the set of context labels.
- E: Set of edges, partitioned into:
  - Directed edges (n_p, n_c, σ, c), where the data selector σ ∈ {any, last, all} specifies the data-instance relationship between producer n_p and consumer n_c in context c.
  - Undirected edges (n_1, n_2, c), enforcing mutual exclusion but not relative ordering.
- V: Set of parallel semantic variables. Each variable is annotated as privatizable (with possible reduction function) or reducible (user-supplied binary operator) in a given context.
- U: Use/def relations connecting variables in V to nodes in N.
Context labels index different semantic scopes (e.g., particular loops or parallel regions), allowing properties and constraints to be localized.
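The 4-tuple above can be sketched as plain data types. The following is a minimal illustration in C; all names (`Trait`, `Selector`, `Node`, and so on) are invented for this sketch, not taken from any PS-PDG implementation:

```c
/* Illustrative encoding of the PS-PDG 4-tuple (N, E, V, U).
   Names and layout are assumptions of this sketch, not the paper's API. */
#include <assert.h>
#include <stddef.h>

typedef enum { TRAIT_ATOMIC = 1, TRAIT_ORDERLESS = 2,
               TRAIT_SINGULAR = 4 } Trait;            /* node traits (bitmask) */
typedef enum { SEL_ANY, SEL_LAST, SEL_ALL } Selector; /* data selector on edges */

typedef struct Node {
    int id;
    int is_hierarchical; /* plain (groups IR insts) vs hierarchical (groups nodes) */
    unsigned traits;     /* bitmask of Trait values */
    int context;         /* context label in which the traits apply */
} Node;

typedef struct Edge {
    int producer, consumer;
    int directed;        /* 0 => mutual-exclusion edge: no relative ordering */
    Selector sel;        /* which producer instance(s) feed the consumer */
    int context;
} Edge;

typedef struct Var {     /* parallel semantic variable */
    const char *name;
    int privatizable;    /* thread-private copy allowed in this context */
    int reducible;       /* combined with a binary operator at region exit */
    int context;
} Var;
```

Use/def relations (U) would then be a list of (Var, Node) pairs; they are omitted here to keep the sketch short.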
2. Motivating Extensions and Comparison with Sequential PDG
The PS-PDG systematically extends the classical PDG to support explicit parallelism. The chief innovations are as follows:
- Hierarchical Nodes: Allow formation of higher-level regions (such as OpenMP critical or task blocks), supporting region-level semantics.
- Node Traits: Annotate nodes with atomic, orderless, or singular properties. This encodes, for instance, atomicity (critical/atomic regions), unordered execution (parallel sections), or single-execution constraints (OpenMP single).
- Context Sensitivity: Enables dependence and constraints to be local within specific regions/loops, not globally.
- Undirected Mutual Exclusion Edges: Capture “may not run in parallel” without requiring a happens-before order.
- Data-Selector Semantics on Edges: Allow specifying whether any, last, or all producer instances may provide data to a consumer (enabling accurate modeling of reductions and lastprivate).
- Parallel Semantic Variables and Use/Def Annotation: Annotate variables as privatizable or reducible, supporting explicit modeling of threadprivate state and reductions.
The following table, adapted from (Homerding et al., 2024), summarizes the differences:
| Feature | PDG | PS-PDG |
|---|---|---|
| Node Granularity | single instruction | instruction set / region (hierarchical node) |
| Atomic/Critical | N/A | trait |
| Loop-carried Indep. | N/A | trait w/ context |
| Single-execute | N/A | trait |
| Context sensitivity | global | per-region contexts |
| Must-precede edges | directed only | directed (with data selector) and undirected |
| Reductions | unsupported | reducible(var, f) in V |
| Privatization | unsupported | privatizable(var) in V |
These extensions collectively allow the PS-PDG to precisely express the legal set of parallelizations admitted by modern IRs—functionality out of reach for the classical PDG (Homerding et al., 2024).
3. Semantic Invariants and Scheduling Correctness
A schedule derived from a PS-PDG must observe several invariants:
- Directed-Edge Ordering: For any directed edge from producer p to consumer n with data selector σ in context c, every dynamic instance of n must wait for the producer instance(s) selected by σ in c to complete.
- Undirected-Edge Mutual Exclusion: For any undirected edge between nodes n1 and n2 in context c, no two instances of n1 and n2 may overlap in execution under c.
- Atomicity: If node n is atomic in context c, all dynamic instances of n under c execute without interleaving.
- Singularity: If n is singular, at most one dynamic instance of n exists in context c.
- Reduction Correctness: For reducible variables, the reduction operation must match sequential semantics.
- Privatization Cleanup: Privatized variables must either be properly reduced or discarded at parallel region boundaries.
These invariants define the legal set of parallel execution plans (schedules) that preserve the original semantics (Homerding et al., 2024).
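As a concrete reading of the first two invariants, a candidate schedule can be modeled as one time interval per dynamic instance and checked pairwise. The following is a hedged sketch: the `Interval` type and both predicates are illustrative names for this example, not part of the paper's formalism.

```c
/* Pairwise checks for two PS-PDG scheduling invariants on a candidate
   schedule. Each dynamic instance occupies a half-open time interval.
   Illustrative sketch; names are assumptions of this example. */
#include <assert.h>

typedef struct { int node; double start, end; } Interval;

/* Directed-edge ordering: the consumer instance may start only after
   the selected producer instance has completed. */
static int respects_ordering(Interval producer, Interval consumer) {
    return consumer.start >= producer.end;
}

/* Undirected-edge mutual exclusion: the two instances must not overlap,
   but either relative order is legal. */
static int respects_mutex(Interval a, Interval b) {
    return a.end <= b.start || b.end <= a.start;
}
```

A scheduler would apply `respects_ordering` across every directed edge and `respects_mutex` across every undirected edge (and every pair of instances of an atomic node) before accepting a plan.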
4. Example: OpenMP Parallel Loop with Reduction and Critical Section
Consider the OpenMP kernel:
```c
#pragma omp parallel for reduction(+:sum)
for (int i = 0; i < N; i++) {
    #pragma omp critical
    sum += A[i];
}
```
In the PS-PDG, this structure is represented as:
- Nodes: a hierarchical node n_loop (the for loop, introducing context c_loop) and a plain node n_add (the addition, atomic in c_loop).
- Directed edge: (n_loop, n_add) in c_loop.
- Undirected edge: (n_add, n_add) in c_loop, enforcing mutual exclusion among instances of the addition without ordering them.
- Variable: sum, reducible with operator + in c_loop; use and def edges connect sum to n_add.
This encoding grants the compiler more freedom than a classic PDG, where a loop-carried data dependence on sum would enforce strict serialization. The PS-PDG enables:
- Tree-reduction or fan-in parallelism.
- Non-sequential scheduling of the additions (the reducible annotation means no producer ordering is enforced).
- Possible elimination of the critical section by using a software reduction.
This illustrates PS-PDG’s ability to encode parallel semantics compactly while enlarging the set of valid schedules (Homerding et al., 2024).
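The tree-reduction freedom the PS-PDG admits can be sketched in plain C: instead of serializing every `sum += A[i]` behind a critical section, partial values are combined pairwise over roughly log2(N) rounds. The helper below is hypothetical and written sequentially for clarity; the point is that the combines within each round are mutually independent and could run in parallel.

```c
/* Tree reduction over an array, in place. Each outer iteration is one
   round; the inner combines of a round touch disjoint elements and are
   therefore independent. Illustrative sketch, not the paper's code. */
#include <assert.h>

static double tree_reduce(double *a, int n) {
    for (int stride = 1; stride < n; stride *= 2)
        for (int i = 0; i + stride < n; i += 2 * stride)
            a[i] += a[i + stride];   /* independent within a round */
    return a[0];                     /* the fan-in root holds the sum */
}
```

Because floating-point addition is only approximately associative, a compiler applying this schedule relies on the reduction annotation to license the reassociation.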
5. Quantitative Evaluation and Optimization Benefits
Empirical evaluation, implemented as an extension to the NOELLE LLVM-based auto-parallelizer, demonstrates substantial benefits:
- Exploration of Parallelization Options: PS-PDG increases the number of legal parallelization plans—on NAS C-benchmarks, PDG alone sees 12.3 options/loop, Jensen et al. workshare analysis raises this to 18.7, while PS-PDG enables 43.5 (over 3× increase).
- Reductions in Critical Path Length: On an idealized unbounded-core CPU, PS-PDG reduces the critical path by 1.82× on average (up to 2.1×) relative to the programmer-supplied plan, whereas the classic PDG achieves only 1.14×. See the following table:
| Benchmark | # Options (PDG) | # Options (PS-PDG) | CritPath Speedup (PDG) | CritPath Speedup (PS-PDG) |
|---|---|---|---|---|
| CG | 8 | 28 | 1.12× | 2.05× |
| IS | 14 | 55 | 1.08× | 1.85× |
| FT | 10 | 35 | 1.20× | 1.65× |
| MG | 7 | 21 | 1.05× | 1.50× |
| SP | 5 | 18 | 1.10× | 1.40× |
| Average | 12.3 | 43.5 | 1.14× | 1.82× |
PS-PDG thus substantially enlarges the legal scheduling space and enables significantly shorter critical paths while maintaining strict semantic equivalence (Homerding et al., 2024).
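The critical-path numbers above follow from longest-path analysis of the must-precede graph on an idealized machine. A minimal sketch (illustrative code, not NOELLE's implementation) that computes the critical path of a cost-weighted DAG given in topological order, showing how removing must-precede edges shortens it:

```c
/* Critical path = longest weighted path through the must-precede DAG.
   Nodes are assumed numbered in topological order; edges[i][j] != 0
   means node j must wait for node i. Illustrative sketch only. */
#include <assert.h>

#define MAXN 16

static int critical_path(int n, int edges[][MAXN], const int cost[]) {
    int finish[MAXN];
    int best = 0;
    for (int j = 0; j < n; j++) {
        int start = 0;                   /* earliest start: after all preds */
        for (int i = 0; i < j; i++)
            if (edges[i][j] && finish[i] > start)
                start = finish[i];
        finish[j] = start + cost[j];
        if (finish[j] > best)
            best = finish[j];
    }
    return best;
}
```

With four unit-cost nodes, the serial chain 0→1→2→3 yields a critical path of 4; dropping all must-precede edges (as a reducible annotation permits for the additions in Section 4) drops it to 1, mirroring the table's speedups in miniature.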
6. Significance and Implications
PS-PDG addresses the primary deficiency of existing parallel IRs and PDG-based optimization frameworks: the inability to explicitly capture the minimum necessary constraints on parallel execution for semantic equivalence. By precisely expressing not only control and data dependences but also fine-grained parallel traits and context-sensitive constraints, PS-PDG forms a foundational tool for parallelizing compilers.
A plausible implication is the facilitation of advanced optimization strategies—such as aggressive reduction tree scheduling, elimination or transformation of atomic/critical sections, and safe exploitation of orderless or singular regions—all with provable semantic preservation.
PS-PDG’s explicit separation of semantic constraints (“what must happen”) from the concrete parallel plan (“how it happens”) gives compilers broad latitude to retarget code to complex heterogeneous architectures or adapt to evolving parallel programming constructs, while providing strong correctness guarantees (Homerding et al., 2024).
7. Conclusion
The Parallel Semantics Program Dependence Graph (PS-PDG) generalizes the classical PDG to fully encompass the requirements of explicitly parallel IRs, introducing hierarchical nodes, fine-grained node traits, context-sensitive constraints, advanced edge semantics, and explicit support for reductions and privatization. By fundamentally extending the representational power of dependence graphs, the PS-PDG enables compilers to robustly optimize and schedule parallel programs, substantially expanding the space of provably correct schedules and enabling reductions in critical path length by up to a factor of two compared to classic approaches—all while strictly maintaining the original parallel semantics (Homerding et al., 2024).