
LTL Model Checking for Self-Modifying Code

Updated 15 February 2026
  • The paper introduces a self-modifying pushdown system (SM-PDS) that extends classical PDS to capture dynamic code mutations for LTL model checking.
  • It employs a symbolic saturation method and reduces the LTL checking problem to the emptiness problem for self-modifying Büchi pushdown systems (SM-BPDS) with proven EXPTIME-completeness.
  • Experimental evaluations demonstrate that the tool achieves 100% detection rate in malware benchmarks while outperforming traditional PDS-based analyzers in speed and resource usage.

Self-modifying code refers to programs capable of altering their own instructions dynamically during execution, a feature extensively leveraged in malware to obfuscate behaviors and evade analysis. The verification challenge posed by the evolving nature of self-modifying code necessitates formal models capable of capturing both stack-based control flow and dynamic program mutation. One approach, as formalized in (Touili et al., 2019), extends pushdown systems (PDS) to the class of self-modifying pushdown systems (SM-PDS) and addresses the problem of Linear Temporal Logic (LTL) model checking for such systems. The developed framework further establishes a reduction to the emptiness problem for self-modifying Büchi pushdown systems (SM-BPDS), with algorithms and tool support validated on self-modifying malware benchmarks.

1. Formal Model: Self-Modifying Pushdown Systems

A Self-Modifying Pushdown System (SM-PDS) is defined as a quadruple $\mathcal{P} = (P, \Gamma, \Delta, \Delta_c)$, where $P$ is a finite set of control-locations, $\Gamma$ is a finite stack alphabet, $\Delta \subseteq (P \times \Gamma) \times (P \times \Gamma^*)$ denotes standard pushdown rules, and $\Delta_c \subseteq P \times \Delta \times \Delta \times P$ encodes self-modifying rules.

The semantics operate over configurations $c = (\langle p, w \rangle, \theta)$, where $p \in P$, $w \in \Gamma^*$, and $\theta \subseteq \Delta \cup \Delta_c$ is the current active phase (set of rules). Execution proceeds via two step types:

  • Standard rule: $(\langle p, \gamma u \rangle, \theta) \Longrightarrow_{\mathcal{P}} (\langle p', w' u \rangle, \theta)$ for $r = \langle p, \gamma \rangle \hookrightarrow \langle p', w' \rangle \in \theta \cap \Delta$.
  • Self-modification: $(\langle p, u \rangle, \theta) \Longrightarrow_{\mathcal{P}} (\langle p', u \rangle, \theta')$ if $r = p\#(r_1, r_2)p' \in \theta \cap \Delta_c$ with $r_1 \in \theta$ and $\theta' = (\theta \setminus \{r_1\}) \cup \{r_2\}$.

Setting $\Delta_c = \varnothing$ yields a classical PDS.
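The two step types can be sketched in a few lines of Python. This is a minimal illustration of the semantics stated above, not the tool's implementation; the class names `Rule` and `ModRule`, the function `successors`, and the three example rules are hypothetical and chosen only for this sketch.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterator, Tuple, Union


@dataclass(frozen=True)
class Rule:
    """Standard pushdown rule <p, gamma> -> <p2, w>."""
    p: str
    gamma: str
    p2: str
    w: Tuple[str, ...]


@dataclass(frozen=True)
class ModRule:
    """Self-modifying rule p #(r1, r2) p2: replace r1 by r2 in the phase."""
    p: str
    r1: Rule
    r2: Rule
    p2: str


Phase = FrozenSet[Union[Rule, ModRule]]
Config = Tuple[str, Tuple[str, ...], Phase]  # (control location, stack, phase)


def successors(config: Config) -> Iterator[Config]:
    """Yield every configuration reachable from `config` in one step."""
    p, stack, theta = config
    for r in theta:
        if isinstance(r, Rule):
            # Standard step: rewrite the top of the stack, phase unchanged.
            if stack and r.p == p and r.gamma == stack[0]:
                yield (r.p2, r.w + stack[1:], theta)
        elif r.p == p and r.r1 in theta:
            # Self-modification: stack unchanged, r1 swapped for r2.
            yield (r.p2, stack, (theta - {r.r1}) | {r.r2})


# Hypothetical example: a push rule that gets rewritten into a pop rule.
r_push = Rule("p0", "a", "p1", ("a", "a"))
r_pop = Rule("p0", "a", "p1", ())
mod = ModRule("p1", r_push, r_pop, "p0")
theta0 = frozenset({r_push, mod})

c0 = ("p0", ("a",), theta0)
c1 = next(successors(c0))  # r_push fires; the phase is still theta0
c2 = next(successors(c1))  # mod fires; the phase now holds r_pop, not r_push
```

Because phases are immutable sets, a configuration is hashable and can be stored directly in a visited set during exploration. With no `ModRule` in the phase, `successors` behaves like an ordinary PDS step function.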

2. Büchi Acceptance and Emptiness for SM-BPDS

The notion of Büchi acceptance is incorporated by extending SM-PDS to SM-BPDS: $\mathcal{BP} = (P, \Gamma, \Delta, \Delta_c, G)$, with $G \subseteq P$ specifying Büchi-accepting control-locations. Runs are infinite sequences of configurations, and acceptance is defined by infinitely recurring visits to locations in $G$.

Head and Repetition: A head is a tuple $(\langle p, \gamma \rangle, \theta)$. A head is repeating if, for some stack-suffix $v$, $(\langle p, \gamma \rangle, \theta) \Rightarrow^r_{\mathcal{BP}} (\langle p, \gamma v \rangle, \theta)$ holds along a path that visits at least one configuration whose control-location is in $G$.

The emptiness problem for SM-BPDS amounts to deciding whether a repeating head is reachable from a given initial configuration.

3. Reduction of LTL Model Checking to Emptiness

For an SM-PDS $\mathcal{P}$ and labeling function $\nu: P \to 2^{At}$ (where $At$ is a set of atomic propositions), a configuration satisfies an LTL formula $\varphi$ if some run projected through $\nu$ yields a model of $\varphi$. Following the automata-theoretic approach, a nondeterministic Büchi automaton $\mathcal{B}_{\varphi} = (Q, 2^{At}, \eta, q_0, F)$ is constructed for $\varphi$. The product system, denoted $\mathcal{BP}_{\varphi}$, has states $P \times Q$ and transitions formed on-the-fly from the SM-PDS and $\mathcal{B}_{\varphi}$, including correct treatment of $\Delta_c$ rules.

Correctness (Theorem 4): A configuration of $\mathcal{P}$ satisfies $\varphi$ if and only if, in the product SM-BPDS, the corresponding initial configuration admits an infinite accepting run. Thus, LTL model-checking for SM-PDS reduces in polynomial time to the SM-BPDS emptiness problem (Touili et al., 2019).

4. Algorithmic Approach: The Saturation Method and Complexity

The emptiness problem is addressed via a finite head-reachability graph $\mathcal{G} = (V, \{0,1\}, E)$, where vertices are possible heads $(\langle p, \gamma \rangle, \theta)$ and edge labels indicate whether $G$ is visited during the path. Cycles with at least one “1”-labeled edge correspond to repeating heads.
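Once the head-reachability graph is available, finding repeating heads is a graph problem: a head is repeating exactly when it lies in a strongly connected component that contains a 1-labeled edge. The sketch below assumes the graph is given as a list of `(source, bit, target)` triples; the function name `repeating_heads` and the head names `h0` to `h3` are hypothetical, and building the graph itself (the saturation step) is not shown.

```python
from typing import Dict, Hashable, Iterable, List, Set, Tuple

Edge = Tuple[Hashable, int, Hashable]  # (source head, Büchi bit, target head)


def repeating_heads(edges: Iterable[Edge]) -> Set[Hashable]:
    """Return all heads lying on a cycle that contains a 1-labeled edge."""
    edges = list(edges)
    graph: Dict[Hashable, List[Hashable]] = {}
    for u, _, v in edges:
        graph.setdefault(u, []).append(v)
        graph.setdefault(v, [])  # register vertices without outgoing edges

    index: Dict[Hashable, int] = {}
    low: Dict[Hashable, int] = {}
    comp: Dict[Hashable, int] = {}  # vertex -> id of its SCC
    stack: List[Hashable] = []
    on_stack: Set[Hashable] = set()

    def visit(v: Hashable) -> None:
        # Recursive Tarjan SCC; adequate for small graphs.
        index[v] = low[v] = len(index)
        stack.append(v)
        on_stack.add(v)
        for w in graph[v]:
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:  # v is the root of an SCC
            while True:
                w = stack.pop()
                on_stack.discard(w)
                comp[w] = index[v]
                if w == v:
                    break

    for v in graph:
        if v not in index:
            visit(v)

    # An SCC is accepting if some 1-labeled edge stays inside it.
    accepting = {comp[u] for u, bit, v in edges if bit == 1 and comp[u] == comp[v]}
    return {v for v in comp if comp[v] in accepting}


example_edges = [
    ("h0", 0, "h1"),
    ("h1", 1, "h2"),
    ("h2", 0, "h1"),
    ("h2", 0, "h3"),
    ("h3", 0, "h3"),
]
found = repeating_heads(example_edges)
```

In this example only `h1` and `h2` are repeating: `h3` has a self-loop, but it carries label 0. Emptiness additionally requires that one of the repeating heads be reachable from the initial configuration, which this sketch does not check.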

Computation: A symbolic saturation (fixed-point) algorithm builds automata recognizing predecessor heads, annotated with Büchi bits. Transitions, arising either from standard rules or from self-modifications, are added until a global fixpoint is reached; the procedure is polynomial in $|P|$ and $|\Gamma|$ but exponential in $|\Delta| + |\Delta_c|$.

Complexity Bound: The overall time for deciding SM-PDS LTL model checking is $2^{O(|\varphi| + |\Delta| + |\Delta_c|)} \cdot \operatorname{poly}(|P|, |\Gamma|)$, which places the problem in EXPTIME; the problem is shown to be EXPTIME-complete.

5. Implementation and Practical Procedures

A prototype tool implements the SM-PDS LTL model checking approach, with the following architecture:

  • Disassembly and abstraction: Binaries are disassembled (via Jakstab), control-flow recovered, and a conservative approximation of indirect jump targets is computed to build the SM-PDS $(P, \Gamma, \Delta, \Delta_c)$ instance.
  • Automata construction: Büchi automata for LTL properties are generated using LTL2BA.
  • Symbolic product formation: Product SM-BPDS is built on the fly to avoid materializing exponential numbers of phases.
  • Saturation procedure: A fixpoint computation over symbolic automata determines head reachability, with stack and phase transitions stored in adjacency lists.
  • Cycle detection: 1-labeled cycles in the head-reachability graph are sought via Tarjan’s SCC algorithm augmented to detect Büchi visits.

The incremental symbolic nature of the algorithm avoids explicit enumeration of all $2^{|\Delta| + |\Delta_c|}$ phases, so in typical scenarios the tool runs well below the worst-case bound (Touili et al., 2019).
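A quick calculation shows why explicit enumeration is not an option. The sketch below plugs in the rule counts of the synthetic benchmark from the evaluation ($|\Delta| = 255$, $|\Delta_c| = 8$); the variable names are illustrative only.

```python
# Rule counts of the synthetic benchmark reported in the evaluation.
num_standard_rules = 255   # |Delta|
num_modifying_rules = 8    # |Delta_c|

# A phase is any subset of Delta united with Delta_c,
# so there are 2^(|Delta| + |Delta_c|) candidate phases.
num_phases = 2 ** (num_standard_rules + num_modifying_rules)

print(f"candidate phases: about {num_phases:.2e}")
print(f"decimal digits:   {len(str(num_phases))}")
```

The count is $2^{263}$, an 80-digit number on the order of $10^{79}$, so any approach that materializes one copy of the system per phase is infeasible even for this modest input.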

6. Experimental Evaluation: Malware Detection and Benchmarking

The tool was benchmarked in three principal settings:

| Setting | Samples/Remarks | Outcome |
|---|---|---|
| SM-PDS LTL checker vs. PDS+Moped | Synthetic PDSs ($\lvert\Delta\rvert = 255$, $\lvert\Delta_c\rvert = 8$) | SM-PDS tool runs in $<1$ second; PDS+Moped needs minutes to hours, and Moped often exhausts memory |
| Self-modifying malware detection | 892 binaries (VirusShare, MalShare, VX-Heaven, NGVCK, benign XP) | 100% detection of matching malware, 0 false positives (benign marked safe); runtime $<10$ min |
| Comparison with commercial antivirus | 205 fresh NGVCK self-modifying worms | No commercial AV detects all 205; tool detects 100% (Table IV of Touili et al., 2019) |

Included LTL properties formalize patterns such as registry-key injection, data-stealing, spy-worm activity, and appending virus behavior. Modeling self-modification in the SM-PDS semantics allows correct reachability analysis, which classical PDS-based (or static CFG) checkers cannot achieve when code mutation is present.

7. Illustrative Example: Encoding and Verification Workflow

As an explicit illustration, a toy self-modifying binary is mapped as follows. Addresses represent control-locations; for example, a `mov [0x2], 0x0c` instruction replaces the rule for `push 0x9` at address `0x2` with that for `jmp 0x9`, corresponding to a $\Delta_c$ rule $p_3\#(r_{\mathsf{push\ 0x9}}, r_{\mathsf{jmp\ 0x9}})p_4$. Stack symbols correspond to return addresses. The resulting SM-PDS, leveraging this explicit mutation semantics, allows the LTL checker to detect paths (such as one leading to a `CopyFileA` call introduced dynamically) that would otherwise be missed if self-modifying effects were ignored. This capacity to analyze dynamic behavioral patterns is critical for sound detection in security-centric applications (Touili et al., 2019).
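The effect can be reproduced on a hand-built encoding. The sketch below is a hypothetical toy, not the example binary from the paper: only the $\Delta_c$ rule $p_3\#(r_{\mathsf{push\ 0x9}}, r_{\mathsf{jmp\ 0x9}})p_4$ comes from the text, while the stack symbols `bot` and `a9`, the jump back from `p4` to `p2`, and the choice of `p9` as the location leading to the `CopyFileA` call are assumptions made so that the program can run. It compares the control-locations reached when the `mov` is modeled as a self-modifying rule with those reached when its rewriting effect is ignored.

```python
from collections import deque

# Hypothetical toy encoding; every rule below is an assumption for illustration.
# Standard rule:       ("std", p, gamma, p_next, pushed_word)
# Self-modifying rule: ("mod", p, r1, r2, p_next)
r_push = ("std", "p2", "bot", "p3", ("a9", "bot"))  # push 0x9 at address 0x2
r_jmp = ("std", "p2", "a9", "p9", ("a9",))          # jmp 0x9, present only after the mov
r_back = ("std", "p4", "a9", "p2", ("a9",))         # assumed jump from 0x4 back to 0x2
r_mov = ("mod", "p3", r_push, r_jmp, "p4")          # mov [0x2], 0x0c as p3 #(r_push, r_jmp) p4
r_nop = ("std", "p3", "a9", "p4", ("a9",))          # same mov, code-rewriting effect ignored


def reachable_locations(p0, stack0, theta0, limit=1000):
    """Explore configurations breadth-first; return the visited control locations."""
    seen = {(p0, stack0, theta0)}
    queue = deque(seen)
    while queue and len(seen) < limit:  # limit guards against unbounded stacks
        p, stack, theta = queue.popleft()
        for r in theta:
            if r[0] == "std" and stack and (r[1], r[2]) == (p, stack[0]):
                nxt = (r[3], r[4] + stack[1:], theta)
            elif r[0] == "mod" and r[1] == p and r[2] in theta:
                nxt = (r[4], stack, (theta - {r[2]}) | {r[3]})
            else:
                continue
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return {p for p, _, _ in seen}


with_mod = reachable_locations("p2", ("bot",), frozenset({r_push, r_back, r_mov}))
without_mod = reachable_locations("p2", ("bot",), frozenset({r_push, r_back, r_nop}))

print("p9 reachable with self-modification modeled:", "p9" in with_mod)
print("p9 reachable with self-modification ignored:", "p9" in without_mod)
```

With the `mov` modeled as a $\Delta_c$ rule, the second visit to `p2` executes the rewritten `jmp` and reaches `p9`; with the rewriting ignored, `p9` never appears in the explored set. This is plain forward exploration on a finite toy, not the symbolic saturation procedure used by the tool.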
