Over the last decades, software has taken on a growing importance in all kinds of systems. For instance, the design of the hardware and software parts of the control system of a transportation system typically represents 30% to 40% of the overall development cost; moreover, most of this fraction is due to the testing and debugging stages. Furthermore, complex systems require complex, intricate, and large software. As an example, the typical size of current designs for fly-by-wire control systems ranges from 100,000 to 1,000,000 LOC (lines of code), whereas it was typically ten times smaller ten years ago.
The consequence of this increasing complexity is that the probability of bugs and failures rises dramatically, unless the development process is extremely rigorous. Moreover, the consequences of a software failure range from an insignificant imprecision in a computation to the worst kinds of unexpected behavior, such as the crash of a whole system; in particular, a failure may cause great human or economic damage, and is therefore not acceptable.
The risk that bugs occur and cause major damage is not theoretical, and many famous examples can be cited. For instance, an integer overflow arising in a task of low importance caused both the main and the backup control systems to shut down 30 seconds after the take-off of the Ariane 501 launcher in 1996, resulting in the destruction of the rocket [ea96]. Imprecision in floating-point computations caused a Patriot anti-missile battery to fail to intercept an incoming missile in 1991, resulting in dozens of deaths. Even when they do not result in a dramatic failure, bugs may cause tremendous cost overruns: for example, multiple issues in the development of the baggage handling system of the Denver airport resulted in a two-year schedule overrun and a $116 million budget overrun, in 1995. Many other "software horror stories" can be found, e.g., at http://www.cs.tau.ac.il/~nachumd/horror.html, ranging from the most peculiar to the most dramatic reports.
Given the growing importance of software, we observe an increasing interest in software verification methods.
Over the last thirty years, a large range of software verification techniques has been developed to tackle various kinds of applications.
Properties: First, let us summarize the most common properties to be checked by software verification systems.
Safety properties express that programs "should not go wrong". In particular, the absence of runtime errors and the absence of undefined behaviors are safety properties. Obviously, such properties are of great interest when checking the design of critical systems, such as flight control systems. In particular, the failure of the Ariane 5 launch is the result of the violation of a simple safety property about integer conversions. A common approach to checking such properties consists in computing an over-approximation of the real behaviors of the program, and then using this approximation to check that the program can never violate the safety conditions.
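To make the idea concrete, here is a minimal, purely illustrative OCaml sketch (not taken from any existing analyzer; all names are hypothetical) in which the possible values of an index variable are over-approximated by an interval, and that interval is then used to check an array-bound safety condition:

    (* A minimal sketch of safety checking via over-approximation, assuming a
       simple interval abstraction: a variable is described by a range [lo, hi]
       containing all the values it may take in any execution. *)
    type itv = { lo : int; hi : int }

    (* Join: smallest interval containing both arguments; merges the
       information coming from the two branches of a conditional. *)
    let join a b = { lo = min a.lo b.lo; hi = max a.hi b.hi }

    (* Abstract counterpart of adding a constant. *)
    let add_const a c = { lo = a.lo + c; hi = a.hi + c }

    (* Safety condition: every value in the interval is a valid index
       into an array of length len. *)
    let index_in_bounds a len = 0 <= a.lo && a.hi < len

    let () =
      (* Over-approximation of "i := input in [0,9]; if ... then i := i + 2". *)
      let i0 = { lo = 0; hi = 9 } in
      let i = join i0 (add_const i0 2) in   (* i lies in [0, 11] in every run *)
      Printf.printf "safe for an array of length 16: %b\n" (index_in_bounds i 16);
      Printf.printf "safe for an array of length 10: %b\n" (index_in_bounds i 10)

When the check fails on the over-approximation, the program may still be safe: the alarm only indicates that this particular abstraction cannot prove the property.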
Another family of crucial properties are resource usage properties. Indeed, embedded software usually consists of real-time programs, so that overusing memory or time resources would result in the loss of the system. Functional properties state that the system should perform certain actions under certain conditions. For instance, liveness properties state that a program should eventually reach some "good condition". Security properties assert that non-authorized users should not be able to acquire any information about private computations or to corrupt any critical process. In this thesis, we focus on safety properties, and more particularly attempt to prove the absence of runtime errors. However, most of the algorithms described in this thesis would apply to other problems as well.
Verification methods: The verification of software designs used to consist mainly of testing and debugging. The idea is to run a program with various (randomly or manually generated) sets of inputs, and to check that the properties of interest are not violated in these "test runs". However, the drawback of this approach is that the number of possible real executions is nearly infinite, so not all situations can be tested. Moreover, the cost of testing is high; in particular, any change in the program should in principle be re-tested exhaustively.
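For concreteness, the following hypothetical sketch (shown only to illustrate the principle) runs a small function under test on randomly generated inputs and checks a property on each run; it also shows why test runs alone cannot establish the property, since a corner case that random generation essentially never draws still violates it:

    (* A minimal sketch of random testing, assuming the property of interest is
       expressed as a Boolean predicate over the input of the code under test. *)

    (* Code under test: absolute value on native integers. *)
    let abs_under_test x = if x < 0 then -x else x

    (* Property checked on every test run: the result is non-negative. *)
    let property x = abs_under_test x >= 0

    let () =
      Random.self_init ();
      let failures = ref 0 in
      for _ = 1 to 10_000 do
        let x = Random.int 2_000_001 - 1_000_000 in  (* random input in [-10^6, 10^6] *)
        if not (property x) then incr failures
      done;
      Printf.printf "failures observed on 10000 random runs: %d\n" !failures;
      (* The corner case min_int is essentially never drawn at random, yet it
         violates the property: test runs cannot cover all possible executions. *)
      Printf.printf "property holds on min_int: %b\n" (property min_int)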
As a consequence, automatic, formal methods were proposed, so as to increase the level of confidence in the results and to cut down the cost. The principle of static analysis based on Abstract Interpretation [CC77] is to elaborate a model of program executions, then to choose an over-approximation of the program behaviors, and last, to derive analyzers that compute such an over-approximation automatically. This method is sound, but not complete: if the over-approximation satisfies all safety conditions, then the program is proved correct; otherwise, we should investigate the reasons for the failure, and conclude either that there is a real bug or that the abstraction should be refined. In the last few years, several successful static analyzers have been implemented to verify memory properties [LAS00], the absence of runtime errors [BCC+02, BCC+03a], the absence of buffer overruns [DRS03], and the correctness of pointer operations [VB04].
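The following minimal sketch, written under the assumption of a simple interval abstraction and a single counter loop, illustrates the kind of fixpoint computation such an analyzer performs; the widening operator is what makes the iteration terminate, at the cost of precision. It is only an illustration of the principle, not the algorithm of any of the cited tools:

    (* A minimal sketch of the core of an interval analyzer: compute an
       over-approximation of the values of a loop counter by iterating the
       abstract loop body until stabilization.  Bounds are floats so that
       widening can jump an unstable bound to infinity, which enforces
       termination of the iteration. *)
    type itv = { lo : float; hi : float }

    let join a b = { lo = min a.lo b.lo; hi = max a.hi b.hi }

    (* Widening: any bound that is still growing is pushed to infinity. *)
    let widen a b =
      { lo = (if b.lo < a.lo then neg_infinity else a.lo);
        hi = (if b.hi > a.hi then infinity else a.hi) }

    (* Abstract inclusion: a is contained in b. *)
    let leq a b = b.lo <= a.lo && a.hi <= b.hi

    (* Abstract effect of one loop iteration: i := i + 1. *)
    let body i = { lo = i.lo +. 1.; hi = i.hi +. 1. }

    (* Iterate i := widen(i, i0 join body(i)) until a post-fixpoint is reached. *)
    let rec lfp i0 i =
      let i' = widen i (join i0 (body i)) in
      if leq i' i then i else lfp i0 i'

    let () =
      let i0 = { lo = 0.; hi = 0. } in
      let inv = lfp i0 i0 in
      (* [0, +inf) soundly covers every execution of the loop, although it is
         less precise than the exact set of values reached. *)
      Printf.printf "loop invariant for i: [%g, %g]\n" inv.lo inv.hi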
Model Checking is a technique based on the abstraction of a program into a (e.g., Boolean) model and on the application of SAT-solving methods to determine whether dangerous states are reachable. Modern developments in this area allow for a refinement of the model [CGJ+00] when the checking phase fails to establish the property of interest. A major difficulty of this approach is to synthesize the model from the program and the property to check; in particular, the size of the model is critical for the checking phase to remain practical. These techniques have been applied in many areas, such as hardware verification [DSC98] and software verification [BR01].
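As a deliberately simplified illustration of the reachability question these tools answer, the sketch below explores a tiny hand-written Boolean model by explicit enumeration rather than through a SAT solver or BDDs; the model and all names are hypothetical:

    (* A minimal sketch of the reachability check at the heart of model checking,
       on a tiny hand-written Boolean model with two state variables.  Real tools
       answer this question symbolically (SAT, BDDs) instead of enumerating the
       states explicitly as done here. *)
    module SS = Set.Make (struct
      type t = bool * bool
      let compare = compare
    end)

    (* States of the model: pairs (request, granted); initially nothing is set. *)
    let initial = (false, false)

    (* Transition relation: the environment may raise a request, and the
       controller may grant a pending request (earlier grants are kept). *)
    let successors (req, granted) =
      [ (true, granted); (req, req || granted) ]

    (* Dangerous states: a grant without a pending request. *)
    let dangerous (req, granted) = granted && not req

    (* Exploration of the reachable states, looking for a dangerous one. *)
    let rec explore visited = function
      | [] -> None
      | s :: rest ->
          if dangerous s then Some s
          else if SS.mem s visited then explore visited rest
          else explore (SS.add s visited) (rest @ successors s)

    let () =
      match explore SS.empty [ initial ] with
      | None -> print_endline "no dangerous state is reachable"
      | Some _ -> print_endline "a dangerous state is reachable"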
Another approach consists in using Theorem Proving methods in order to prove the correctness conditions of a program. The definition of the model is critical (if the model is wrong, the proof of correctness with respect to the model is useless), and it is difficult to automate. Furthermore, the automation of the proofs can be a major problem, whereas manual proofs incur major costs. Moreover, the adaptation of proofs to modified programs may also turn out to be very costly, compared to fully automatic methods. A major achievement of this approach was the generation of certified code following the B method [Abr89] for the most critical parts of the control system of the "Meteor" line of the Paris subway, despite a tremendous cost.
Perspectives: At this point, the use of formal methods in the development and verification of critical systems is not standard, even though we notice a growing interest in these areas.
In the last few years, various quality and safety standards have been elaborated for the most critical applications. For instance, the DO-178B regulation [TCoA99] requires the observance of strict rules in the design of aircraft software: the level of criticality of each part of the code should be determined, the relation between the results of successive development stages should be established, and the safety of the most critical sub-systems should be ensured and verified, if possible formally. As a consequence, we notice an increasing need for verification tools able to tackle large critical applications. Moreover, it seems that verification methods should apply to the real code (the validation of a model may not be considered a sufficient guarantee). In particular, the scalability of the analyses is crucial, due to the growing size of the applications. Last, it is extremely important that the methods integrate naturally into the development process. Indeed, the verification should help in the design of better software, and not impede the development. In practice, the verification method should preferably be automatic and provide immediately usable results (readable invariants, help in the alarm investigation process).
Table of Contents
1 Introduction
1.1 Software Verification
1.1.1 Need for Software Verification
1.1.2 Current Trends in Software Verification
1.1.3 Context of the Thesis
1.2 Outline of the Thesis
1.2.1 Traces Abstractions
1.2.2 Trace Partitioning
1.2.3 Alarms Diagnosis
1.2.4 Certification of Assembly Code
2 Semantics and Abstraction
2.1 Basic Mathematical Notations
2.2 Syntax and Semantics of a Simple Language
2.2.1 Syntax
2.2.2 Semantics
2.2.3 A Simple Language
2.2.4 Extension with Procedures
2.2.5 Extension to full C
2.2.6 Under-Specified Behaviors
2.3 Abstract Interpretation
2.3.1 Notion of Abstraction
2.3.2 Semantics as Fixpoints and Semantic Approximation
2.3.3 Enforcing Termination
2.3.4 Program Transformations
3 Abstractions of Sets of Traces
3.1 Static Analysis
3.1.1 The Abstraction
3.1.2 Abstract Interpretation of a Simple Semantics
3.1.3 Numerical Abstract Domains
3.1.4 Under-Specified Behaviors in the Standard Describing the Source Language
3.2 Denotational Abstraction
3.2.1 Denotational Semantics
3.2.2 Functions "From-To"
3.2.3 Functions "Along Paths"
3.2.4 Composition
3.2.5 Static Analysis
3.2.6 Symbolic Representation
3.3 Backward Semantics and Analysis
3.3.1 Backward Semantics
3.3.2 Backward Static Analysis
3.4 Projection Abstractions
3.4.1 Variables Projection
3.4.2 Control States Projection
3.4.3 General Case
3.4.4 Fixpoint-based Definition
3.5 Hierarchies of Abstractions