C Policy Framework

From FSL
Jump to: navigation, search

The C Policy Framework, or CPF, is a framework for rapidly developing analysis policies used for debugging and verification of C programs. Policies are developed using the Maude rewrite engine; parsing of C programs is performed mainly using a modified version of CIL, which is compiled using OCaml. Policies are differentiated by different policy tags; each policy has its own annotation language used to define specifications in annotations. Policies are actually checked using a policy-specific abstract semantics for C, which makes use of a large core of functionality shared across multiple policies.

Contents

Running CPF Online

NOTE: We fixed the problem with getting CIL to compile with the 3.11 version of OCaml installed on our server, and have now been able to update it to the same version available for download below. Note that the download version doesn't work with 3.11 either, although the changes needed to Makefile.in are described online here, just scroll down to see Xavier Leroy's fix. We plan to test this with earlier versions of OCaml to see if adding this fix breaks CIL in OCaml versions < 3.11. So far, the UNITS policy has been updated to work with the new changes, and we should have the NOTNULL policy available soon.

Online interfaces are now available for both the UNITS and NOTNULL policies. These allow for experimentation with the policies without actually having to install the software locally on your own machine. For both policies, you can either enter your own program, with annotations, or select from one of the existing examples. Near-term improvements will include adding more examples, hopefully with some descriptive text to indicate what each tests, and more concise output, since the current output is quite verbose.

Installation

We are currently coming up with improved methods for installing the CPF. Currently, you have to load OCaml, download our custom CIL, and build this custom version. We haven't tested this much yet, and are working on ways to improve it, which should be available soon. We plan to have an online parser available at some point in the future -- this will let you hand a C source file to the parser online, which will then hand back the generated Maude verification tasks. Until then, the easiest way to interact with the CPF is with one of the online interfaces linked above.

If you want to install locally, create a directory for CPF and copy all the linked files below into that directory. You then need to extract the CIL Source and CPF Utility Source. For CIL, you should be able to enter the cil directory and run "./configure ; ./make"; for the CPF Utility Source, you still need to go into each subdirectory and build the file manually, giving it the same name as the .C file but with no extension (unit2cpf.C builds to unit2cpf, just use the command g++ -o unit2cpf unit2cpf.C). To process a file, you should first use "./parseunits file.c" for the UNITS policy and "./parsenn file.c" for the NOTNULL policy, which will generate a file called output.maude. To actually perform the analysis, run "maude unit-policy.maude < output.maude" for the UNITS policy or "maude notnull-policy.maude < output.maude" for the NOTNULL policy.

Note that you will probably get some warnings from Maude about operators and preregularity. We are trying to remove these, but they are an artifact of the way we have modeled C's syntax and can safely be ignored.

Available Files

NOTE: There has been a fair amount of restructuring both in the policies and especially in CIL. So far everything should be updated except for the NotNull policy, which is currently being updated. Note that support for type invariants is still being added. Some of these changes are quite new (for instance, invariant support), so please contact us if you find any problems.

Examples can be found on the online interface pages. The easiest way to grab current examples is to go to one of the interfaces, select the file you are interested in, and either run it from there or copy the program text. A full set of examples will be posted here soon as well.

Current Versions of Supporting Software

The C Policy Framework (CPF) currently uses:

  • Maude 2.4 (2.2 and 2.3 should work)
  • OCaml 3.10.2 (older recent versions should work, 3.11 has problems as noted above -- basically, whatever works for CIL works for us)
  • CIL 1.3.6, part of the distribution given above, modified for CPF
  • Perl (any recent version)
  • a C++ compiler (we use g++, others should work fine)

Recent Changes

  • Annotations in CPF now are more standard, with symbols used more as functions or policy-generic predefined values starting with @, like @unit, @none, etc., and symbols used as policy-specific values starting with $, like $meter, $notnull, etc. Other identifiers are assumed to be names declared in the program text, like x, or p->next.
  • Instead of requiring the clever use of assumes and asserts for loops, an invariant statement is now available. This uses the same syntax as an assume statement.
  • A modifies clause is now available. This replaces the prior conservative practice of locking all globals and formals against changes. Any C object listed as being modified is given a new, system-generated "random" value using the havoc directive, which is similar to how other systems handle the equivalent situation. Note that we are currently adding support for aliasing annotations, since, now that variables listed as modified are not locked, the analysis could be unsound in certain cases with aliasing, as our current assumption is that all formal parameters and global variables are NOT aliased.
  • For anyone looking through the CIL code, annotations are now parsed in a lightweight manner, with the parsed form then used during the inlining process. This replaces the use of string replace techniques, which were becoming cumbersome because of the number of cases to maintain, especially as new annotations were added.

Handling Challenging C Constructs

As with many environments for analyzing C code, some constructs in C can make analysis difficult. One current limitation is that we assume type safety in the analysis, so pointer arithmetic is treated conservatively (warnings in the case of the UNITS policy, assumption of $null in the NOTNULL policy). Beyond that, two challenging issues are aliasing and the use of global variables. We are currently working to better integrate whole-program analysis results into CIL so we can provide supporting information for the CPF policy semantics, but this obviously conflicts with the goal of having a modular analysis, so this is left as an option. Currently, we have the following limitations related to these features:

  • We do not currently track aliasing at function boundaries, so we assume that the global variables and formal parameters of a function all start the function in distinct memory locations. This is only an issue when a global or formal is listed as being modified by a function, in which case the user is allowed to change it in the function body. Other globals or formals are locked against changes, with warnings issued when a user attempts to overwrite them. Note that support is being added for aliasing annotations, which should remove this problem.
  • Annotations are considered "complete", in the sense that once a precondition or postcondition for a specific policy is added to a function it is assumed that no other preconditions or postconditions are needed. This makes it possible to leave off a postcondition in cases where it is not needed. In cases where no pre- or post-conditions are specified for a function, a warning is issued at the call site, since the call could then make modifications not visible to the caller (this can be tightened in the future to occur just for those functions where we don't have the definition, but we currently do this for all called functions). This is especially an issue for library functions, since in those cases we don't have the code and have to rely on the given conditions, but there could be many cases where users have not annotated them. If the return value of the function is captured by the caller but no postcondition is present, we use a special value representing an unknown return value. The processing of this value is then handled in a policy-appropriate manner (in the UNITS policy, an unknown integer return value will yield a "fresh" unit which is incompatible with other units in those operations, like addition, where units have to be the same). NOTE: this also applies to the new modifies clause usable within CPF. If at least one other annotation is provided, but no modifies clause is given, this is equivalent to "@modifies(ALL): @none". The alternative, that the call modifies all formals and globals (potentially), can quickly increase the number of false positives.
  • For function pointers, we use alias analysis to determine which functions could be referenced by the pointer. If only one function could be referenced, we treat the call through the pointer as a normal function call and use the same inlining procedure. If multiple functions could be referenced, we currently issue a warning, since we cannot determine the preconditions and postconditions to check. Also, like in cases where no postcondition is specified, we return an unknown value if that value is captured. Note that the determination of which function a function pointer can reference is made using the results of an alias analysis, but this is not necessarily valid where a function is analyzed in isolation. To handle this, it is possible to pass a flag to the analysis that disallows using alias analysis results; this will then always result in a warning when a function pointer is used to make a function call. NOTE: Alias analysis is currently disabled in the parsing script, but can be easily re-enabled by passing the appropriate flag to CIL. Alias analysis is enabled in the online version.
  • setjmp and longjmp are currently not handled, with no plans to add this in the near future.

Other Design Decisions and Limitations

Beyond those mentioned above, there are several other design decisions or tool limitations that are important to be aware of:

  • Pre and post-conditions for library functions (pow, sqrt, etc) can be added by annotating function prototypes. You can see several examples of this in the online examples. One limitation is that, at this time, prototypes of a function with annotations must occur before prototypes for the same function without, so they must be included directly in the source file to be checked before the include that brings the same prototype in.
  • Varargs functions can be used, but do not support pre- and post-conditions -- this is being modified to allow them to support conditions on the non-varying part of the argument list.
  • In the UNITS policy, all array elements are assumed to have the same unit. This can cause false positives in algorithms which work "in place" over an array, which in practice may lead to unit-safe programs where different parts of arrays have different units. For the NOTNULL policy, we make a similar assumption -- an array of pointers will either have all pointers null, or all pointers notnull.
  • In the UNITS policy, only constants can be used in unit expressions as exponents; i.e., meter^3 is fine, meter^x, where x is an int var, is not.

Current Tasks

  • Integrate alias analysis results
  • Finish building test suite to verify coverage of C syntax and semantics
  • Finish annotating syntax and semantics with cross-references into the C standard (mainly done for declarations right now)

Available Publications

The RULE'08 publication highlights the UNITS policy, while the tech report provides a more general view of the C Policy Framework, including some information on an earlier version of the NOTNULL policy, a higher-level overview of the UNITS policy, and more discussion of the CPF.


Mark Hills, Feng Chen and Grigore Rosu
RULE'08, ENTCS, to appear, 2008
Abstract. Many C programs assume the use of implicit domain-specific information. A common example is units of measurement, where values can have both a standard C type and an associated unit. However, since there is no way in the C language to represent this additional information, violations of domain-specific policies, such as unit safety violations, can be difficult to detect. In this paper we present a static analysis, based on the use of an abstract C semantics defined using rewriting logic, for the detection of unit violations in C programs. In contrast to typed approaches, the analysis makes use of annotations present in C comments on function headers and in function bodies, leaving the C language unchanged. Initial evaluation results show that performance scales well, and that errors can be detected without imposing a heavy annotation burden.
PDF, RULE'08 slides, RULE'08, BIB

Mark Hills, Feng Chen and Grigore Rosu
Technical Report UIUCDCS-R-2008-2931, January 2008
Abstract. Many programs make implicit assumptions about data. Common assumptions include whether a variable has been initialized or can only contain non-null references. Domain-specific examples are also common, with many scientific programs manipulating values with implicit units of measurement. However, in languages like C, there is no language facility for representing these assumptions, making violations of these implicit program policies challenging to detect. In this paper, we present a framework for pluggable policies for the C language. The core of the framework is a shared rewrite-logic semantics of C designed for symbolic execution of C programs, and an annotation engine allowing for annotations across multiple policies to be added to C programs. Policies are created by providing a policy specific language, usable in annotations, and policy specific values and semantics. This provides a method to quickly develop new policies, taking advantage of the existing framework components. To illustrate the use of the framework, two case studies in policy development are presented: a basic null pointer analysis, and a more comprehensive analysis to verify unit of measurement safety.
PDF, TR@UIUC, BIB

Personal tools
Namespaces

Variants
Actions
Navigation