Mechanized semantics for the Clight subset of the C language

Published 23 Jan 2009 in cs.PL | (0901.3619v1)

Abstract: This article presents the formal semantics of a large subset of the C language called Clight. Clight includes pointer arithmetic, "struct" and "union" types, C loops and structured "switch" statements. Clight is the source language of the CompCert verified compiler. The formal semantics of Clight is a big-step operational semantics that observes both terminating and diverging executions and produces traces of input/output events. The formal semantics of Clight is mechanized using the Coq proof assistant. In addition to the semantics of Clight, this article describes its integration in the CompCert verified compiler and several ways by which the semantics was validated.

Summary

The paper, by Sandrine Blazy and Xavier Leroy, presents the formal semantics of Clight, a large subset of the C language that is the source language of the CompCert verified compiler. It gives a detailed big-step operational semantics for Clight, mechanized in the Coq proof assistant, and describes how that semantics is integrated into CompCert. The result supports rigorous reasoning about C programs and provides the starting point for verified compilation.

Key Components of the Clight Language

Clight covers a wide range of C features, including pointer arithmetic, struct and union types, loops, and structured switch statements. It excludes unstructured control flow such as goto statements. The formal semantics defines how expressions are evaluated, how statements are executed, and how whole programs behave, covering both terminating and diverging executions.
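To make the pointer arithmetic mentioned above more concrete, here is a minimal Python sketch that models a pointer as a pair of a memory block and a byte offset, which is the general shape of pointer values in CompCert-style memory models. The `Ptr` class, the `SIZEOF` table, and the helper names are assumptions introduced for illustration; they are not taken from the paper's Coq development.

```python
from dataclasses import dataclass

# Hypothetical element sizes in bytes, assuming a 32-bit target.
SIZEOF = {"char": 1, "short": 2, "int": 4, "double": 8}


@dataclass(frozen=True)
class Ptr:
    block: int   # identifier of the memory block pointed into
    offset: int  # byte offset within that block


def ptr_add(p: Ptr, n: int, elem_type: str) -> Ptr:
    """p + n for a pointer to elem_type: same block, offset moved by n elements."""
    return Ptr(p.block, p.offset + n * SIZEOF[elem_type])


def ptr_sub(p: Ptr, q: Ptr, elem_type: str) -> int:
    """p - q in elements; treated as undefined across different blocks."""
    if p.block != q.block:
        raise ValueError("undefined: pointers into different blocks")
    return (p.offset - q.offset) // SIZEOF[elem_type]


a = Ptr(block=1, offset=0)
b = ptr_add(a, 3, "int")     # points 3 ints (12 bytes) past a, in block 1
print(b, ptr_sub(b, a, "int"))
```

Keeping the block identifier separate from the offset means arithmetic can never move a pointer from one block into another, which is one way to give a precise meaning to the cases that C leaves undefined.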

Evaluation and Execution Semantics

The Clight semantics are outlined as follows:

Expressions: Evaluation distinguishes l-value contexts from r-value contexts. Unary and binary operators are resolved according to the types of their operands, so type-dependent behavior and conversions are made explicit.
Statements: Execution covers sequencing, conditionals, loops, and function calls. Because expressions are pure, their evaluation is deterministic. Each statement yields an outcome, such as normal completion, break, or return, that determines how execution continues.

These semantics are defined in Coq, so they can be used directly in machine-checked proofs.
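The idea of statement outcomes can be sketched as a small big-step interpreter. The following Python code is an illustration only: the tuple encoding of statements, the outcome names, and the use of Python callables for pure expressions are all assumptions made for this sketch, not the paper's Coq definitions. It also covers terminating executions only; a diverging program would simply loop forever here.

```python
# Statements are tuples; expressions are callables env -> value.
# Expressions only read env, modeling the assumption that they are pure.

NORMAL, BREAK, CONTINUE = ("normal",), ("break",), ("continue",)


def exec_stmt(s, env):
    """Execute statement s in the mutable environment env; return an outcome."""
    kind = s[0]
    if kind == "skip":
        return NORMAL
    if kind == "assign":                 # ("assign", name, expr)
        env[s[1]] = s[2](env)
        return NORMAL
    if kind == "seq":                    # ("seq", s1, s2)
        out = exec_stmt(s[1], env)
        # s2 runs only if s1 completed normally; otherwise propagate.
        return exec_stmt(s[2], env) if out == NORMAL else out
    if kind == "if":                     # ("if", cond, s_then, s_else)
        return exec_stmt(s[2] if s[1](env) else s[3], env)
    if kind == "while":                  # ("while", cond, body)
        while s[1](env):
            out = exec_stmt(s[2], env)
            if out == BREAK:
                return NORMAL            # break ends the loop normally
            if out[0] == "return":
                return out               # return propagates through the loop
            # NORMAL and CONTINUE both move on to the next iteration
        return NORMAL
    if kind in ("break", "continue"):
        return (kind,)
    if kind == "return":                 # ("return", expr)
        return ("return", s[1](env))
    raise ValueError(f"unknown statement: {kind}")


# i = 0; while (1) { if (i >= 3) break; i = i + 1; } return i * 10;
prog = ("seq",
        ("assign", "i", lambda e: 0),
        ("seq",
         ("while", lambda e: True,
          ("seq",
           ("if", lambda e: e["i"] >= 3, ("break",), ("skip",)),
           ("assign", "i", lambda e: e["i"] + 1))),
         ("return", lambda e: e["i"] * 10)))

env = {}
result = exec_stmt(prog, env)
print(result)
```

Returning an outcome from every statement lets the enclosing construct decide what to do next: a sequence stops early on anything other than normal completion, and a loop absorbs break and continue but lets return pass through.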

CompCert Compiler Integration

Clight is the source language of the CompCert compiler front-end, which translates it to the Cminor intermediate language; the back-end then compiles Cminor to PowerPC assembly code. The translation resolves operator overloading and decides where variables are placed, and it is proved in Coq to preserve semantics. This illustrates how mechanized semantics can support correctness proofs for complex compiler transformations.
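Resolving operator overloading means replacing a source-level operator such as `+` with an operation chosen from the static types of its operands. The Python sketch below shows the idea for addition only. The type encoding, the `SIZEOF` table, and the target operation names (`addint`, `addfloat`, `floatofint`, `addptr`, `mulint`) are hypothetical names chosen for this example and should not be read as CompCert's actual operators.

```python
# Types: "int", "double", or ("ptr", element_type).
SIZEOF = {"int": 4, "double": 8}  # hypothetical sizes in bytes
ARITH = ("int", "double")


def is_ptr(t):
    return isinstance(t, tuple) and t[0] == "ptr"


def to_float(e, t):
    """Insert an explicit int-to-float conversion where one is needed."""
    return e if t == "double" else ("floatofint", e)


def translate_add(e1, t1, e2, t2):
    """Translate the overloaded `e1 + e2` into a type-specific operation."""
    if t1 == "int" and t2 == "int":
        return ("addint", e1, e2)
    if t1 in ARITH and t2 in ARITH:
        # At least one operand is a double: convert the other if needed.
        return ("addfloat", to_float(e1, t1), to_float(e2, t2))
    if is_ptr(t1) and t2 == "int":
        # Pointer + integer: scale the integer by the element size.
        return ("addptr", e1, ("mulint", e2, ("const", SIZEOF[t1[1]])))
    if t1 == "int" and is_ptr(t2):
        return translate_add(e2, t2, e1, t1)
    raise TypeError(f"unsupported operand types for +: {t1}, {t2}")


print(translate_add("x", "int", "y", "int"))
print(translate_add("x", "int", "d", "double"))
print(translate_add("p", ("ptr", "int"), "i", "int"))
```

After this step the target expression no longer depends on type information: conversions and pointer scaling that were implicit in the source are spelled out as explicit operations.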

Validation and Trustworthiness

The authors acknowledge that it is hard to ensure the formal Clight semantics matches the C standard and programmers' expectations. They discuss several ways to build confidence in it:

Manual Reviews: The Coq specifications are large and hard to read, but rendering them as inference rules could make expert review easier.
Proven Correctness: Semantics-preserving translations and equivalence proofs between different semantic styles (e.g., operational and axiomatic) help validate both the Clight semantics and the CompCert compiler.
Executable Semantics: A planned reference interpreter that directly follows the operational semantics would allow Clight programs to be executed and the semantics to be tested.

Related Work

Mechanizing the semantics of C in proof assistants such as Coq and HOL is a relatively recent endeavor. Cholera and similar formalizations each cover a different subset of the language. Compared with these, Clight aims for a semantics that is manageable yet broad enough to support verified compilation in CompCert.

Implications and Future Directions

The mechanized semantics of Clight supports verified compiler construction and could also serve static analyzers, program provers, and other tools for the formal verification of C programs. Future research could extend Clight with constructs such as goto and refine its memory model, broadening its use in verification tools. More generally, enlarging the supported language subset while maintaining formal rigor remains an open goal for mechanized language semantics.