Test
From FSL
Contents |
2012
- On Safety Properties and Their Monitoring
- Grigore Rosu
- SACS, Volume 22(2), pp 327-365. 2012
- Abstract. This paper addresses the problem of runtime verification from a foundational perspective, answering questions like Is there a consensus among the various definitions of a safety property? (Answer: Yes), How many safety properties exist? (Answer: As many as real numbers), How difficult is the problem of monitoring a safety property? (Answer: Arbitrarily complex), Is there any formalism that can express all safety properties? (Answer: No), etc. Various definitions of safety properties as sets of execution traces have been proposed in the literature, some over finite traces, others over infinite traces, yet others over both finite and infinite traces. By employing cardinality arguments and a novel notion of persistence, this paper first establishes the existence of bijective correspondences between the various notions of safety property. It then shows that safety properties can be characterized as always past properties. Finally, it proposes a general notion of monitor, which allows to show that safety properties correspond precisely to the monitorable properties, and then to establish that monitoring a safety property is arbitrarily hard.
- PDF, MOP, DOI, SACS, BIB
- Checking Reachability using Matching Logic
- Grigore Rosu and Andrei Stefanescu
- OOPSLA'12, ACM, pp 555-574. 2012
- Abstract. This paper presents a verification framework that is parametric in a (trusted) operational semantics of some programming language. The underlying proof system is language-independent and consists of eight proof rules. The proof system is proved partially correct and relatively complete (with respect to the programming language configuration model). To show its practicality, the generic framework is instantiated with a fragment of C and evaluated with encouraging results.
- PDF, Slides(PPTX), Slides(PDF), Matching Logic, DOI, OOPSLA'12, BIB
- A Truly Concurrent Semantics for the K Framework Based on Graph Transformations
- Traian Florin Serbanuta and Grigore Rosu
- ICGT'12, LNCS 7562, pp 294-310. 2012
- Abstract. This paper gives a truly concurrent semantics with sharing of resources for the K semantic framework, an executable (term-)rewriting-based formalism for defining programming languages and calculi. Akin to graph rewriting rules, the K (rewrite) rules explicitly state what can be concurrently shared with other rules. The desired true concurrency is obtained by translating the K rules into a novel instance of term-graph rewriting with explicit sharing, and then using classical concurrency from the double-pushout (DPO) approach to graph rewriting. The resulting parallel term-rewriting relation is proved sound, complete, and serializable with respect to the jungle rewriting flavor of term-graph rewriting, and, therefore, also to term rewriting.
- PDF, Slides(PDF), K, DOI, ICGT'12, BIB
- Maximal Causal Models for Sequentially Consistent Systems
- Traian Florin Serbanuta and Feng Chen and Grigore Rosu
- RV'12, LNCS 7687, pp 136-150. 2012
- Abstract. This paper shows that it is possible to build a maximal and sound causal model for concurrent computations from a given execution trace. It is sound, in the sense that any program which can generate a trace can also generate all traces in its causal model. It is maximal (among sound models), in the sense that by extending the causal model of an observed trace with a new trace, the model becomes unsound: there exists a program generating the original trace which cannot generate the newly introduced trace. Thus, the maximal sound model has the property that it comprises all traces which all programs that can generate the original trace can also generate. The existence of such a model is of great theoretical value as it can be used to prove the soundness of non-maximal, and thus smaller, causal models.
- PDF, Slides(PDF), JPredictor, DOI, RV'12, BIB
- From Hoare Logic to Matching Logic Reachability
- Grigore Rosu and Andrei Stefanescu
- FM'12, LNCS 7436, pp 387-402. 2012
- Abstract. Matching logic reachability has been recently proposed as an alternative program verification approach. Unlike Hoare logic, where one defines a language-specific proof system that needs to be proved sound for each language separately, matching logic reachability provides a *language-independent* and *sound* proof system that directly uses the trusted operational semantics of the language as axioms. Matching logic reachability thus has a clear practical advantage: it eliminates the need for an additional semantics of the same language in order to reason about programs, and implicitly eliminates the need for tedious soundness proofs. What is not clear, however, is whether matching logic reachability is as powerful as Hoare logic. This paper introduces a technique to mechanically translate Hoare logic proof derivations into equivalent matching logic reachability proof derivations. The presented technique has two consequences: first, it suggests that matching logic reachability has no theoretical limitation over Hoare logic; and second, it provides a new approach to prove Hoare logics sound.
- PDF, Slides(PPTX), Slides(PDF), Matching Logic, DOI, FM'12, BIB
- Executing Formal Semantics with the K Tool
- David Lazar and Andrei Arusoaie and Traian Florin Serbanuta and Chucky Ellison and Radu Mereuta and Dorel Lucanu and Grigore Rosu
- FM'12, LNCS 7436, pp 267-271. 2012
- Abstract. This paper describes the K tool, a system for formally defining programming languages. Formal definitions created using the K tool automatically yield an interpreter for the language, as well as program analysis tools such as a state-space explorer. The modularity of K and the design of the tool allow one semantics to be used for several applications.
- PDF, Slides(PDF), K, DOI, FM'12, BIB
- Making Runtime Monitoring of Parametric Properties Practical
- Dongyun Jin
- PhD Thesis, University of Illinois at Urbana-Champaign. August 2012
- Abstract. Software reliability has become more important than ever in recent years, as a wide spectrum of software solutions are being used on various platforms. To this end, runtime monitoring is one of the most promising and feasible solutions for enhancing software reliability. In particular, runtime monitoring of parametric properties (parametric monitoring) has been receiving growing attention for its suitability in object-oriented systems. Despite many parametric monitoring approaches that have been proposed recently, they are still not widely used in real applications, implying that parametric monitoring is not sufficiently practical yet. In this dissertation, three perspectives for better practicality of parametric monitoring are proposed: expressiveness, efficiency, and scalability. A number of techniques on all three perspectives are developed and integrated to the JavaMOP framework, which is a formalism-independent, extensible runtime monitoring framework for parametric properties. One limitation in expressing parametric properties is that the first event must alway initiate all parameters. This limitation is removed in the proposed work to improve expressiveness of parametric monitoring. Further, a new logical formalism, PTCaRet, is introduced for describing properties of the call stack. As for efficiency, the `enable set optimization', the `indexing cache', and the `monitor garbage collection' are proposed for optimizing creation of monitors, access to monitors, and termination of monitors, respectively. In addition, several scalable parametric monitoring techniques are introduced. These techniques, for the first time, allow a large number of simultaneous parametric specifications to be monitored efficiently. The optimization techniques presented in this dissertation were implemented into the JavaMOP framework, yielding JavaMOP 3.0, the latest and most efficient version of JavaMOP. Thorough evaluations show that these techniques can improve runtime performance of JavaMOP by 3 times on average, and up to 63 times in some cases; as for memory usage, by 3 times on average. While Tracematches and the previous version of JavaMOP crashed on several cases due to out of memory errors, the newer version of JavaMOP did not crash on any case during the evaluations. Considering that the previous version of JavaMOP was one of the most efficient parametric monitoring frameworks in terms of runtime performance, the results presented in the dissertation can be argued significant.
- PDF, MOP, DOI, BIB
- Efficient, Expressive, and Effective Runtime Verification
- Patrick O'Neil Meredith
- PhD Thesis, University of Illinois at Urbana-Champaign. August 2012
- Abstract. Runtime Verification is a quickly growing technique for providing many of the guarantees of formal verification, but in a manner that is scalable. It useful information available from actual runs of programs to make verification decisions, rather than the purely static information used in formal verification. One of the main facets of Runtime Verification is runtime monitoring, where safety properties are checked against the execution of a program during (or in some cases after) its run. Prior work on efficient monitoring focused primarily on finite state properties. Non-finite state techniques existed, but added orders of magnitude of runtime overhead on the monitored system. The vast majority of runtime monitoring has also been limited to the application domain, with violations of safety properties only found on the actual trace of a given program. This thesis describes research that demonstrates that various logical formalisms, including those more powerful than finite logics, can be efficiently monitored in multiple monitoring domains. The demonstrated monitoring domains run the gamut from the application level with the Java programming language, to monitoring traces \emph{predicted} from a given run of a program, to hardware based monitors designed to ensure proper peripheral operation. The logical formalisms include multicategory finite state machines, extended regular expressions, past-time linear temporal logic with optimization for hardware based monitors, context-free grammars, linear temporal logic with both past and future operators, and string rewriting. This combination of domains and logical formalisms show that monitoring can be both expressive and efficient, regardless of the expressive power of the logical formalism, and that monitoring can be used not only for flat traces generated by software applications, but also in predictive traces and a hardware context.
- PDF, MOP, DOI, BIB
- Reachability Logic
- Grigore Rosu and Andrei Stefanescu and Stefan Ciobaca and Brandon Moore
- Technical Report http://hdl.handle.net/2142/32952, Jul 2012
- Abstract. This paper introduces *reachability logic*, a language-independent seven-rule proof system for deriving reachability properties of systems. The key ingredients of *reachability logic* are its sentences, which are called *reachability rules* and generalize the transitions of operational semantics and the Hoare triples of axiomatic semantics, and the *Circularity* proof rule, which generalizes invariant proof rules for iterative and recursive constructs in axiomatic semantics. The target transition system is described as a set of reachability rules, which are taken as axioms in a reachability logic proof. Typical definition styles which can be read as collections of reachability rules include conventional small-step and big-step operational semantics. The reachability logic proof system is shown sound (in the sense of partial correctness) and relatively complete. The soundness result has also been formalized in Coq, allowing to convert reachability logic proofs into proof certificates depending only on the operational semantics and the unavoidable domain reasoning. Reachability logic thus eliminates the need to independently define an axiomatic and an operational semantics for each language, and the non-negligible effort to prove the former sound and complete w.r.t the latter.
- PDF, Reachability Logic, DOI, BIB
- Towards a Unified Theory of Operational and Axiomatic Semantics
- Grigore Rosu and Andrei Stefanescu
- ICALP'12, LNCS 7392, pp 351-363. 2012
- Abstract. This paper presents a nine-rule *language-independent* proof system that takes an operational semantics as axioms and derives program reachability properties, including ones corresponding to Hoare triples. This eliminates the need for language-specific Hoare-style proof rules to verify programs, and, implicitly, the tedious step of proving such proof rules sound for each language separately. The key proof rule is *Circularity*, which is coinductive in nature and allows for reasoning about constructs with repetitive behaviors (e.g., loops). The generic proof system is shown sound and has been implemented in the MatchC verifier.
- PDF, Slides(PPTX), Slides(PDF), Matching Logic, DOI, ICALP'12, BIB
- A Formal Semantics of C with Applications
- Chucky Ellison
- PhD Thesis, University of Illinois. July 2012
- Abstract. This dissertation shows that complex, real programming languages can be completely formalized in the K Framework, yielding interpreters and analysis tools for testing and bug detection. This is demonstrated by providing, in K, the first complete formal semantics of the C programming language. With varying degrees of effort, tools such as interpreters, debuggers, and model-checkers, together with tools that check for memory safety, races, deadlocks, and undefined behavior are then generated from the semantics. Being executable, the semantics has been thoroughly tested against the GCC torture test suite and successfully passes 99.2\% of 776 test programs. The semantics is also evaluated against popular analysis tools, using a new test suite in addition to a third-party test suite. The semantics-based tool performs at least as well or better than the other tools tested.
- PDF, K, DOI, BIB
- Test-Case Reduction for C Compiler Bugs
- John Regehr and Yang Chen and Pascal Cuoq and Eric Eide and Chucky Ellison and Xuejun Yang
- PLDI'12, ACM, pp 335-346. 2012
- Abstract. To report a compiler bug, one must often find a small test case that triggers the bug. The existing approach to automated test-case reduction, delta debugging, works by removing substrings of the original input; the result is a concatenation of substrings that delta cannot remove. We have found this approach less than ideal for reducing C programs because it typically yields test cases that are too large or even invalid (relying on undefined behavior). To obtain small and valid test cases consistently, we designed and implemented three new, domain-specific test-case reducers. The best of these is based on a novel framework in which a generic fixpoint computation invokes modular transformations that perform reduction operations. This reducer produces outputs that are, on average, more than 25 times smaller than those produced by our other reducers or by the existing reducer that is most commonly used by compiler developers. We conclude that effective program reduction requires more than straightforward delta debugging.
- PDF, Slides(PPTX), C-Reduce, DOI, PLDI'12, BIB
- JavaMOP: Efficient Parametric Runtime Monitoring Framework
- Dongyun Jin and Patrick O'Neil Meredith and Choonghwan Lee and Grigore Rosu
- ICSE'12, IEEE, pp 1427-1430. 2012
- Abstract. Runtime monitoring is a technique usable in all phases of the software development cycle, from initial testing, to debugging, to actually maintaining proper function in production code. Of particular importance are parametric monitoring systems, which allow the specification of properties that relate objects in a program, rather than only global properties. In the past decade, a number of parametric runtime monitoring systems have been developed. Here we give a demonstration of our system, JavaMOP. It is the only parametric monitoring system that allows multiple differing logical formalisms. It is also the most efficient in terms of runtime overhead, and very competitive with respect to memory usage.
- PDF, MOP, DOI, ICSE'12, BIB
- Security-Policy Monitoring and Enforcement with JavaMOP
- Soha Hussein and Patrick Meredith and Grigore Rosu
- PLAS'12, ACM, pp 3:1-3:11. 2012
- Abstract. Software security attacks represent an ever growing problem. One way to make software more secure is to use Inlined Reference Monitors (IRMs), which allow security specifications to be inlined inside a target program to ensure its compliance with the desired security specifications. The IRM approach has been developed primarily by the security community. Runtime Verification (RV), on the other hand, is a software engineering approach, which is intended to formally encode system specifications within a target program such that those specifications can be later enforced during the execution of the program. Until now, the IRM and RV approaches have lived separate lives; in particular RV techniques have not been applied to the security domain, being used instead to aid program correctness and testing. This paper discusses the usage of a formalism-generic RV system, JavaMOP, as a means to specify IRMs, leveraging the careful engineering of the JavaMOP system for ensuring secure operation of software in an efficient manner.
- PDF, MOP, DOI, PLAS'12, BIB
- Defining the Undefinedness of C
- Chucky Ellison and Grigore Rosu
- Technical Report http://hdl.handle.net/2142/30780, April 2012
- Abstract. This paper investigates undefined behavior in C and offers a few simple techniques for operationally specifying such behavior formally. A semantics-based undefinedness checker for C is developed using these techniques, as well as a test suite of undefined programs. The tool is evaluated against other popular analysis tools, using the new test suite in addition to a third-party test suite. The semantics-based tool performs at least as well or better than the other tools tested.
- PDF, K, DOI, BIB
- Scalable Parametric Runtime Monitoring
- Dongyun Jin and Patrick O'Neil Meredith and Grigore Rosu
- Technical Report http://hdl.handle.net/2142/30757, April 2012
- Abstract. Runtime monitoring is an effective means to improve the reliability of systems. In recent years, parametric monitoring, which is highly suitable for object-oriented systems, has gained significant traction. Previous work on the performance of parametric runtime monitoring has focused on the performance of monitoring only one specification at a time. A realistic system, however, has numerous properties that need to be monitored simultaneously. This paper introduces scalable techniques to improve the performance of one of the fastest parametric monitoring systems, JavaMOP, in the presence of multiple simultaneous properties, resulting in average runtime overheads that are less than the summation of the overheads of the properties run in isolation. An extensive evaluation shows that these techniques, which were derived following a thorough investigation and analysis of the current bottlenecks in JavaMOP, improve its runtime performance in the presence of multiple properties by up to two times and the memory usage by 34\%.
- PDF, MOP, DOI, BIB
- Towards Categorizing and Formalizing the JDK API
- Choonghwan Lee and Dongyun Jin and Patrick O'Neil Meredith and Grigore Rosu
- Technical Report http://hdl.handle.net/2142/30006, March 2012
- Abstract. Formal specification of correct library usage is extremely useful, both for software developers and for the formal analysis tools they use, such as model checkers or runtime monitoring systems. Unfortunately, the process of creating formal specifications is time consuming, and, for the most part, even the libraries in greatest use, such as the Java Development Kit (JDK) standard library, are left wholly without formal specification. This paper presents a tool-supported approach to help writing formal specifications for Java libraries and creating documentation augmented with highlighting and formal specifications. The presented approach has been applied to systematically and completely formalize the runtime properties of three core and commonly used packages of the JDK API, namely java.io, java.lang and java.util, yielding 137 formal specifications. Indirectly, this paper also brings empirical evidence that parametric specifications may be sufficiently powerful to express virtually all desirable runtime properties of the JDK API, and that its informal documentation can be formalized.
- PDF, Java API, DOI, BIB
- Making Maude Definitions more Interactive
- Andrei Arusoaie and Traian Florin Serbanuta and Chucky Ellison and Grigore Rosu
- WRLA'12, LNCS 7571, pp 83-98. 2012
- Abstract. This paper presents an interface for achieving interactive executions of Maude terms by allowing console and file input/output (I/O) operations. This interface consists of a Maude API for I/O operations, a Java-based server offering I/O capabilities, and a communication protocol between the two implemented using the external objects concept and Maude's TCP sockets. This interface was evaluated as part of the K framework, providing interactive interpreter capabilities for executing and testing programs for multiple language definitions.
- PDF, K, DOI, WRLA'12, BIB
- K Framework Distilled
- Dorel Lucanu and Traian Florin Serbanuta and Grigore Rosu
- WRLA'12, LNCS 7571, pp 31-53. 2012
- Abstract. K is a rewrite-based executable semantic framework in which programming languages, type systems, and formal analysis tools can be defined using configurations, computations and rules. Configurations organize the state in units called cells, which are labeled and can be nested. Computations are special nested list structures sequentializing computational tasks, such as fragments of program. K (rewrite) rules make it explicit which parts of the term they read-only, write-only, read-write, or do not care about. This makes K suitable for defining truly concurrent languages even in the presence of sharing. Computations are like any other terms in a rewriting environment: they can be matched, moved from one place to another, modified, or deleted. This makes K suitable for defining control-intensive features such as abrupt termination, exceptions or call/cc. This paper presents an overview of K Framework and the K tool, focusing on the interaction between the K tool and Maude.
- PDF, K, DOI, WRLA'12, BIB
- Semantics and Algorithms for Parametric Monitoring
- Grigore Rosu and Feng Chen
- J.LMCS, Volume 8(1), pp 1-47. 2012
- Abstract. Analysis of execution traces plays a fundamental role in many program analysis approaches, such as runtime verification, testing, monitoring, and specification mining. Execution traces are frequently parametric, i.e., they contain events with parameter bindings. Each parametric trace usually consists of many meaningful trace slices merged together, each slice corresponding to one parameter binding. This gives a semantics-based solution to parametric trace analysis. A general-purpose parametric trace slicing technique is introduced, which takes each event in the parametric trace and dispatches it to its corresponding trace slices. This parametric trace slicing technique can be used in combination with any conventional, non-parametric trace analysis technique, by applying the later on each trace slice. As an instance, a parametric property monitoring technique is then presented. The presented parametric trace slicing and monitoring techniques have been implemented and extensively evaluated. Measurements of runtime overhead confirm that the generality of the discussed techniques does not come at a performance expense when compared with existing parametric trace monitoring systems.
- PDF, MOP, DOI, J.LMCS, BIB
- An Executable Formal Semantics of C with Applications
- Chucky Ellison and Grigore Rosu
- POPL'12, ACM, pp 533-544. 2012
- Abstract. This paper describes an executable formal semantics of C. Being executable, the semantics has been thoroughly tested against the GCC torture test suite and successfully passes 99.2\% of 776 test programs. It is the most complete and thoroughly tested formal definition of C to date. The semantics yields an interpreter, debugger, state space search tool, and model checker "for free". The semantics is shown capable of automatically finding program errors, both statically and at runtime. It is also used to enumerate nondeterministic behavior.
- PDF, Slides(PDF), Semantics, DOI, POPL'12, BIB
- Recursive Proofs for Inductive Tree Data-Structures
- Parthasarathy Madhusudan and Xiaokang Qiu and Andrei Stefanescu
- POPL'12, ACM, pp 123-136. 2012
- Abstract. We develop logical mechanisms and procedures to facilitate the verification of full functional properties of inductive tree data-structures using recursion that are sound, incomplete, but terminating. Our contribution rests in a new extension of first-order logic with recursive definitions called *Dryad*, a syntactical restriction on pre- and post-conditions of recursive imperative programs using *Dryad*, and a systematic methodology for accurately unfolding the footprint on the heap uncovered by the program that leads to finding simple recursive proofs using formula abstraction and calls to SMT solvers. We evaluate our methodology empirically and show that several complex tree data-structure algorithms can be checked against full functional specifications automatically, given pre- and post-conditions. This results in the first automatic terminating methodology for proving a wide variety of annotated algorithms on tree data-structures correct, including max-heaps, treaps, red-black trees, AVL trees, binomial heaps, and B-trees.
Andrei
- KEVM: A Complete Semantics of the Ethereum Virtual Machine
- Everett Hildenbrandt and Manasvi Saxena and Xiaoran Zhu and Nishant Rodrigues and Philip Daian and Dwight Guth and Brandon Moore and Yi Zhang and Daejun Park and Andrei Stefanescu and Grigore Rosu
- CSF 2018, IEEE, pp 204-217. 2018
- Abstract. A developing field of interest for the distributed systems and applied cryptography communities is that of smart contracts: self-executing financial instruments that synchronize their state, often through a blockchain. One such smart contract system that has seen widespread practical adoption is Ethereum, which has grown to a market capacity of 100 billion USD and clears an excess of 500,000 daily transactions. Unfortunately, the rise of these technologies has been marred by a series of costly bugs and exploits. Increasingly, the Ethereum community has turned to formal methods and rigorous program analysis tools. This trend holds great promise due to the relative simplicity of smart contracts and bounded-time deterministic execution inherent to the Ethereum Virtual Machine (EVM). Here we present KEVM, an executable formal specification of the EVM's bytecode stack-based language built with the \K{} Framework, designed to serve as a solid foundation for further formal analyses. We empirically evaluate the correctness and performance of KEVM using the official Ethereum test suite~\cite{ethereum-tests-url}. To demonstrate the usability, several extensions of the semantics are presented and two different-language implementations of the ERC20 Standard Token are verified against the ERC20 specification. These results are encouraging for the executable semantics approach to language prototyping and specification.
- PDF, KEVM, CSF 2018, BIB
- Semantics-Based Program Verifiers for All Languages
- Andrei Stefanescu and Daejun Park and Shijiao Yuwen and Yilong Li and Grigore Rosu
- OOPSLA'16, ACM, pp 74-91. 2016
- Abstract. We present a language-independent verification framework that can be instantiated with an operational semantics to automatically generate a program verifier. The framework treats both the operational semantics and the program correctness specifications as reachability rules between matching logic patterns, and uses the sound and relatively complete reachability logic proof system to prove the specifications using the semantics. We instantiate the framework with the semantics of one academic language, KernelC, as well as with three recent semantics of real-world languages, C, Java, and JavaScript, developed independently. We evaluate our approach empirically and show that the generated program verifiers can check automatically the full functional correctness of challenging heap-manipulating programs implementing operations on list and tree data structures, like AVL trees. This is the first approach that can turn the operational semantics of real-world languages into correct-by-construction automatic verifiers.
- PDF, Slides(PDF), Matching Logic, DOI, OOPSLA'16, BIB
- Language Definitions as Rewrite Theories
- Vlad Rusu and Dorel Lucanu and Traian Florin Serbanuta and Andrei Arusoaie and Andrei Stefanescu and Grigore Rosu
- J.LAMP, Volume 85(1, Part 1), pp 98-120. 2016
- Abstract. K is a formal framework for defining operational semantics of programming languages. The K-Maude compiler translates \K language definitions to Maude rewrite theories. The compiler enables program execution by using the Maude rewrite engine with the compiled definitions, and program analysis by using various Maude analysis tools. K supports symbolic execution in Maude by means of an automatic transformation of language definitions. The transformed definition is called the *\em symbolic extension* of the original definition. In this paper we investigate the theoretical relationship between K language definitions and their Maude translations, between symbolic extensions of K definitions and their Maude translations, and how the relationship between K definitions and their symbolic extensions is reflected on their respective representations in Maude. In particular, the results show how analysis performed with Maude tools can be formally lifted up to the original language definitions.
- PDF, project, DOI, J.LAMP, BIB
- KJS: A Complete Formal Semantics of JavaScript
- Daejun Park and Andrei Stefanescu and Grigore Rosu
- PLDI'15, ACM, pp 346-356. 2015
- Abstract. This paper presents KJS, the most complete and throughly tested formal semantics of JavaScript to date. Being executable, KJS has been tested against the ECMAScript 5.1 conformance test suite, and passes all 2,782 core language tests. Among the existing implementations of JavaScript, only Chrome V8's passes all the tests, and no other semantics passes more than 90\%. In addition to a reference implementation for JavaScript, KJS also yields a simple coverage metric for a test suite: the set of semantic rules it exercises. Our semantics revealed that the ECMAScript 5.1 conformance test suite fails to cover several semantic rules. Guided by the semantics, we wrote tests to exercise those rules. The new tests revealed bugs both in production JavaScript engines (Chrome V8, Safari WebKit, Firefox SpiderMonkey) and in other semantics. KJS is symbolically executable, thus it can be used for formal analysis and verification of JavaScript programs. We verified non-trivial programs and found a known security vulnerability.
- PDF, Slides(PDF), Semantics, DOI, PLDI'15, BIB
- All-Path Reachability Logic
- Andrei Stefanescu and Stefan Ciobaca and Radu Mereuta and Brandon Moore and Traian Florin Serbanuta and Grigore Rosu
- RTA'14, LNCS 8560, pp 425-440. 2014
- Abstract. This paper presents a language-independent proof system for reachability properties of programs written in non-deterministic (e.g. concurrent) languages, referred to as *all-path reachability logic*. It derives partial-correctness properties with all-path semantics (a state satisfying a given precondition reaches states satisfying a given postcondition on all terminating execution paths). The proof system takes as axioms any unconditional operational semantics, and is sound (partially correct) and (relatively) complete, independent of the object language; the soundness has also been mechanized (Coq). This approach is implemented in a tool for semantics-based verification as part of the K framework.
- PDF, Slides(PPTX), Matching Logic, DOI, RTA'14, BIB
- Language Definitions as Rewrite Theories
- Andrei Arusoaie and Dorel Lucanu and Vlad Rusu and Traian Florin Serbanuta and Andrei Stefanescu and Grigore Rosu
- WRLA'14, LNCS 8663, pp 97-112. 2014
- Abstract. K is a formal framework for defining the operational semantics of programming languages. It includes software tools for compiling K language definitions to Maude rewrite theories, for executing programs in the defined languages based on the Maude rewriting engine, and for analyzing programs by adapting various Maude analysis tools. A recent extension to the K tool suite is an automatic transformation of language definitions that enables the symbolic execution of programs, i.e., the execution of programs with symbolic inputs. In this paper we investigate the theoretical relationships between K language definitions and their translations to Maude, between symbolic extensions of K definitions and their Maude encodings, and how the relations between K definitions and their symbolic extensions are reflected on their respective representations in Maude. These results show, in particular, how analyses performed with Maude tools can be formally lifted up to the original language definitions.
- PDF, K, DOI, WRLA'14, BIB
- Natural Proofs for Structure, Data, and Separation
- Xiaokang Qiu and Pranav Garg and Andrei Stefanescu and Parthasarathy Madhusudan
- PLDI'13, ACM, pp 231-242. 2013
- Abstract. We propose natural proofs, for reasoning with programs that manipulate data-structures against complex specifications --- specifications that describe the structure of the heap, the data stored within it, and separation and framing of sub-structures. Natural proofs are a subclass of proofs that are amenable to completely automated reasoning, that provide sound but incomplete procedures, and that capture common reasoning tactics in program verification. We develop a dialect of separation logic over heaps, called Dryad, with recursive definitions that avoids explicit quantification. We develop ways to reason with heaplets using classical logic over the theory of sets, and develop natural proofs for reasoning using proof tactics involving disciplined unfoldings and formula abstractions. Natural proofs are encoded into decidable theories of first-order logic so as to be discharged using SMT solvers. We also implement the technique and show that a large class of more than 100 correct programs that manipulate data-structures are amenable to full functional correctness using the proposed natural proof method. These programs are drawn from a variety of sources including standard data-structures, the Schorr-Waite algorithm for garbage collection, a large number of low-level C routines from the Glib library, the OpenBSD library and the Linux kernel, and routines from a secure verified OS-browser project. Our work is the first that we know of that can handle such a wide range of full functional verification properties of heaps automatically, given pre/post and loop invariant annotations. We believe that this work paves the way for the deductive verification technology to be used by programmers who do not (and need not) understand the internals of the underlying logic solvers, significantly increasing their applicability in building reliable systems.
- PDF, Slides(PPTX), Dryad, DOI, PLDI'13, BIB
- One-Path Reachability Logic
- Grigore Rosu and Andrei Stefanescu and Stefan Ciobaca and Brandon Moore
- LICS'13, IEEE, pp 358-367. 2013
- Abstract. This paper introduces *reachability logic*, a language-independent proof system for program verification, which takes an operational semantics as axioms and derives *reachability rules*, which generalize Hoare triples. This system improves on previous work by allowing operational semantics given with *conditional* rewrite rules, which are known to support all major styles of operational semantics. In particular, Kahn's big-step and Plotkin's small-step semantic styles are newly supported. The reachability logic proof system is shown sound (i.e., partially correct) and (relatively) complete. Reachability logic thus eliminates the need to independently define an axiomatic and an operational semantics for each language, and the non-negligible effort to prove the former sound and complete w.r.t. the latter. The soundness result has also been formalized in Coq, allowing reachability logic derivations to serve as formal proof certificates that rely only on the operational semantics.
- PDF, Slides(PPTX), Reachability Logic, LICS'13, BIB
- Low-Level Program Verification using Matching Logic Reachability
- Dwight Guth and Andrei Stefanescu and Grigore Rosu
- LOLA'13. 2013
- Abstract. Matching logic reachability is an emerging verification approach which uses a language-independent proof system to prove program properties based on the operational semantics. In this paper we apply this approach in the context of a low-level real-time language with interrupts, in which each instruction takes a specified time to execute. In particular, we verify that if the interrupts are scheduled with large enough intervals, the program execution terminates yielding the correct result. Surprisingly, it turns out that matching logic reachability can handle the low-level and real-time features of the language just by using their operational semantics, and that language specific reasoning is unnecessary.
- PDF, Slides(PDF), Matching Logic, LOLA'13, BIB
- Checking Reachability using Matching Logic
- Grigore Rosu and Andrei Stefanescu
- OOPSLA'12, ACM, pp 555-574. 2012
- Abstract. This paper presents a verification framework that is parametric in a (trusted) operational semantics of some programming language. The underlying proof system is language-independent and consists of eight proof rules. The proof system is proved partially correct and relatively complete (with respect to the programming language configuration model). To show its practicality, the generic framework is instantiated with a fragment of C and evaluated with encouraging results.
- PDF, Slides(PPTX), Slides(PDF), Matching Logic, DOI, OOPSLA'12, BIB
- From Hoare Logic to Matching Logic Reachability
- Grigore Rosu and Andrei Stefanescu
- FM'12, LNCS 7436, pp 387-402. 2012
- Abstract. Matching logic reachability has been recently proposed as an alternative program verification approach. Unlike Hoare logic, where one defines a language-specific proof system that needs to be proved sound for each language separately, matching logic reachability provides a *language-independent* and *sound* proof system that directly uses the trusted operational semantics of the language as axioms. Matching logic reachability thus has a clear practical advantage: it eliminates the need for an additional semantics of the same language in order to reason about programs, and implicitly eliminates the need for tedious soundness proofs. What is not clear, however, is whether matching logic reachability is as powerful as Hoare logic. This paper introduces a technique to mechanically translate Hoare logic proof derivations into equivalent matching logic reachability proof derivations. The presented technique has two consequences: first, it suggests that matching logic reachability has no theoretical limitation over Hoare logic; and second, it provides a new approach to prove Hoare logics sound.
- PDF, Slides(PPTX), Slides(PDF), Matching Logic, DOI, FM'12, BIB
- Reachability Logic
- Grigore Rosu and Andrei Stefanescu and Stefan Ciobaca and Brandon Moore
- Technical Report http://hdl.handle.net/2142/32952, Jul 2012
- Abstract. This paper introduces *reachability logic*, a language-independent seven-rule proof system for deriving reachability properties of systems. The key ingredients of *reachability logic* are its sentences, which are called *reachability rules* and generalize the transitions of operational semantics and the Hoare triples of axiomatic semantics, and the *Circularity* proof rule, which generalizes invariant proof rules for iterative and recursive constructs in axiomatic semantics. The target transition system is described as a set of reachability rules, which are taken as axioms in a reachability logic proof. Typical definition styles which can be read as collections of reachability rules include conventional small-step and big-step operational semantics. The reachability logic proof system is shown sound (in the sense of partial correctness) and relatively complete. The soundness result has also been formalized in Coq, allowing to convert reachability logic proofs into proof certificates depending only on the operational semantics and the unavoidable domain reasoning. Reachability logic thus eliminates the need to independently define an axiomatic and an operational semantics for each language, and the non-negligible effort to prove the former sound and complete w.r.t the latter.
- PDF, Reachability Logic, DOI, BIB
- Towards a Unified Theory of Operational and Axiomatic Semantics
- Grigore Rosu and Andrei Stefanescu
- ICALP'12, LNCS 7392, pp 351-363. 2012
- Abstract. This paper presents a nine-rule *language-independent* proof system that takes an operational semantics as axioms and derives program reachability properties, including ones corresponding to Hoare triples. This eliminates the need for language-specific Hoare-style proof rules to verify programs, and, implicitly, the tedious step of proving such proof rules sound for each language separately. The key proof rule is *Circularity*, which is coinductive in nature and allows for reasoning about constructs with repetitive behaviors (e.g., loops). The generic proof system is shown sound and has been implemented in the MatchC verifier.
- PDF, Slides(PPTX), Slides(PDF), Matching Logic, DOI, ICALP'12, BIB
- Recursive Proofs for Inductive Tree Data-Structures
- Parthasarathy Madhusudan and Xiaokang Qiu and Andrei Stefanescu
- POPL'12, ACM, pp 123-136. 2012
- Abstract. We develop logical mechanisms and procedures to facilitate the verification of full functional properties of inductive tree data-structures using recursion that are sound, incomplete, but terminating. Our contribution rests in a new extension of first-order logic with recursive definitions called *Dryad*, a syntactical restriction on pre- and post-conditions of recursive imperative programs using *Dryad*, and a systematic methodology for accurately unfolding the footprint on the heap uncovered by the program that leads to finding simple recursive proofs using formula abstraction and calls to SMT solvers. We evaluate our methodology empirically and show that several complex tree data-structure algorithms can be checked against full functional specifications automatically, given pre- and post-conditions. This results in the first automatic terminating methodology for proving a wide variety of annotated algorithms on tree data-structures correct, including max-heaps, treaps, red-black trees, AVL trees, binomial heaps, and B-trees.
- PDF, Dryad, DOI, POPL'12, BIB
- MatchC: A Matching Logic Reachability Verifier Using the K Framework
- Andrei Stefanescu
- K'11, ENTCS 304. 2011
- Abstract. This paper presents MatchC, a matching logic reachability verifier using the K framework. K is a rewriting-based framework for defining and analyzing programming languages. Matching logic is a logic designed to state and reason about structural properties over arbitrary program configurations. Matching logic reachability is a unifying framework for operational and axiomatic semantics of programing languages. The MatchC verifier (http://matching-logic.org/) checks reachability properties of programs written in a deterministic fragment of C and is implemented in the K framework. This paper discusses the correctness of the implementation of the matching logic reachability proof system in MatchC. The main contributions of this paper are the implementation of the verifier, with emphasis on using K for program verification, and the evaluation of the tool on a large number of programs, including complex ones, like programs implementing the AVL trees data structure and the Schorr-Waite graph marking algorithm.
- PDF, Matching Logic, DOI, BIB
- Matching Logic: A New Program Verification Approach (NIER Track)
- Grigore Rosu and Andrei Stefanescu
- ICSE'11, ACM, pp 868-871. 2011
- Abstract. Matching logic is a new program verification logic, which builds upon operational semantics. Matching logic specifications are constrained symbolic program configurations, called patterns, which can be matched by concrete configurations. By building upon an operational semantics of the language and allowing specifications to directly refer to the structure of the configuration, matching logic has at least three benefits that could be key factors in its potential usability: (1) One's familiarity with the formalism reduces to one's familiarity with the operational semantics of the language, that is, with the language itself; (2) The verification process proceeds the same way as the execution of the program, making debugging failed proof attempts manageable because one can always see the ``current configuration and ``what went wrong, same like in a debugger; and (3) Nothing is lost in translation, that is, there is no gap between the language itself and its verifier. Moreover, direct access to the structure of the configuration facilitates defining sub-patterns that one may reason about, such as disjoint lists or trees in the heap, as well as supporting framing in various components of the configuration at no additional costs.
- PDF, Slides(PPTX), Slides(PDF), Matching Logic, DOI, ICSE'11, BIB
- Matching Logic: A New Program Verification Approach
- Grigore Rosu and Andrei Stefanescu
- UV'10. 2010
- Abstract. Matching logic is a new program verification logic, which builds upon operational semantics. Matching logic specifications are constrained symbolic program configurations, called *patterns*, which can be *matched* by concrete configurations. By building upon an operational semantics of the language and allowing specifications to directly refer to the structure of the configuration, matching logic has at least three benefits that could be key factors in its usability: (1) One's familiarity with the formalism reduces to one's familiarity with the operational semantics of the language, that is, with the language itself; (2) The verification process proceeds the same way as the execution of the program, making debugging failed proof attempts manageable because one can always see the ``current configuration and ``what went wrong, same like in a debugger; and (3) Nothing is lost in translation, that is, there is no gap between the language itself and its verifier. Moreover, direct access to the structure of the configuration facilitates defining sub-patterns that one may reason about, such as disjoint lists or trees in the heap, as well as supporting framing in various components of the configuration at no additional costs.
FSL
- Applicative Matching Logic
- Xiaohong Chen and Grigore Rosu
- Technical Report http://hdl.handle.net/2142/104616, July 2019
- Abstract. This paper proposes a novel logic for programming languages, which is both simple and expressive, to serve as a foundation for language semantics frameworks. Matching mu-logic has been recently proposed as a unifying foundation for programming languages, specification and verification. It has been shown to capture several logics important for programming languages, including first-order logic with least fixpoints, separation logic, temporal logics, modal mu-logic, and importantly, reachability logic, a language-independent logic for program verification that subsumes Hoare logic. This paper identifies a fragment of matching mu-logic called applicative matching logic (AML), which is much simpler and thus more appealing from a foundational perspective, yet as expressive as matching mu-logic. Several additional logical frameworks fundamental for programming languages are shown to be faithfully captured by AML, including many- and order-sorted algebras, lambda-calculus, (dependent) type systems, evaluation contexts, and rewriting. Finally, it is shown how all these make AML an appropriate underlying logic foundation for complex language semantics frameworks, such as K.
- PDF, Matching Logic, DOI, BIB
- A Complete Formal Semantics of x86-64 User-Level Instruction Set Architecture
- Sandeep Dasgupta and Daejun Park and Theodoros Kasampalis and Vikram S. Adve and Grigore Rosu
- PLDI'19, ACM, pp 1133-1148. 2019
- Abstract. We present the most complete and thoroughly tested formal semantics of x86-64 to date. Our semantics faithfully formalizes all the non-deprecated, sequential user-level instructions of the x86-64 Haswell instruction set architecture. This totals 3155 instruction variants, corresponding to 774 mnemonics. The semantics is fully executable and has been tested against more than 7,000 instruction-level test cases and the GCC torture test suite. This extensive testing paid off, revealing bugs in both the x86-64 reference manual and other existing semantics. We also illustrate potential applications of our semantics in different formal analyses, and discuss how it can be useful for processor verification.
- PDF, Slides(PDF), Semantics, DOI, PLDI'19, BIB
- Techniques for Evolution-Aware Runtime Verification
- Owolabi Legunsen and Yi Zhang and Milica Hadzi-Tanovic and Grigore Rosu and Darko Marinov
- ICST 2019, IEEE, pp 300-311. 2019
- Abstract. Runtime Verification (RV) can help find bugs by monitoring program executions against formal properties. De- velopers should ideally use RV whenever they run tests, to find more bugs earlier. Despite tremendous research progress, RV still incurs high overhead in (1) machine time to monitor properties, and (2) developer time to wait for and inspect violations from test executions that do not satisfy the properties. Moreover, all prior RV techniques consider only one program version and wastefully re-monitor unaffected properties and code as software evolves. We present the first evolution-aware RV techniques that re- duce RV overhead across multiple program versions. Regression Property Selection (RPS) re-monitors only properties that can be violated in parts of code affected by changes, reducing machine time and developer time. Violation Message Suppression (VMS) simply shows only new violations to reduce developer time; it does not reduce machine time. Regression Property Prioritization (RPP) splits RV in two phases: properties more likely to find bugs are monitored in a critical phase to provide faster feedback to the developers; the rest are monitored in a background phase. We compare our techniques with the evolution-unaware (base) RV when monitoring test executions in 200 versions of 10 open- source projects. RPS and the RPP critical phase reduce the average RV overhead from 9.4x (for base RV) to 1.8x, without missing any new violations. VMS reduces the average number of violations 540x, from 54 violations per version (for base RV) to one violation per 10 versions.
- PDF, Slides(PPTX), JavaMOP, DOI, ICST 2019, BIB
- Matching mu-Logic
- Xiaohong Chen and Grigore Rosu
- Technical Report http://hdl.handle.net/2142/102281, January 2019
- Abstract. Matching logic is a logic for specifying and reasoning about structure by means of patterns and pattern matching. This paper makes two contributions. First, it proposes a sound and complete proof system for matching logic in its full generality. Previously, sound and complete deduction for matching logic was known only for particular theories providing equality and membership. Second, it proposes matching mu-logic, an extension of matching logic with a least fixpoint mu-binder. It is shown that matching mu-logic captures as special instances many important logics in mathematics and computer science, including first-order logic with least fixpoints, modal mu-logic as well as dynamic logic and various temporal logics such as infinite/finite-trace linear temporal logic and computation tree logic, and notably reachability logic, the underlying logic of the K framework for programming language semantics and formal analysis. Matching mu-logic therefore serves as a unifying foundation for specifying and reasoning about fixpoints and induction, programming languages and program specification and verification.
- PDF, Matching Logic, DOI, BIB
- Matching mu-Logic
- Xiaohong Chen and Grigore Rosu
- LICS'19. 2019. To appear
- Abstract. Matching logic is a logic for specifying and reasoning about structure by means of patterns and pattern matching. This paper makes two contributions. First, it proposes a sound and complete proof system for matching logic in its full generality. Previously, sound and complete deduction for matching logic was known only for particular theories providing equality and membership. Second, it proposes matching mu-logic, an extension of matching logic with a least fixpoint mu-binder. It is shown that matching mu-logic captures as special instances many important logics in mathematics and computer science, including first-order logic with least fixpoints, modal mu-logic as well as dynamic logic and various temporal logics such as infinite/finite-trace linear temporal logic and computation tree logic, and notably reachability logic, the underlying logic of the K framework for programming language semantics and formal analysis. Matching mu-logic therefore serves as a unifying foundation for specifying and reasoning about fixpoints and induction, programming languages and program specification and verification.
- PDF, Matching Logic, LICS'19, BIB
- IELE: A Rigorously Designed Language and Tool Ecosystem for the Blockchain
- Kasampalis, Theodoros and Guth, Dwight and Moore, Brandon and Serbanuta, Traian Florin and Zhang, Yi and Filaretti, Daniele and Serbanuta, Virgil and Johnson, Ralph and Rosu, Grigore
- FM 2019. 2019. To appear
- Abstract. This paper proposes IELE, an LLVM-style language, together with a tool ecosystem for implementing and formally reasoning about smart contracts on the blockchain. IELE was designed by specifying its semantics formally in the K framework. Its implementation, a IELE virtual machine (VM), as well as a formal verification tool for IELE smart contracts, were automatically generated from the formal specification. The automatically generated formal verification tool allows us to formally verify smart contracts without any gap between the verifier and the actual VM. A compiler from Solidity, the predominant high- level language for smart contracts, to IELE has also been (manually) implemented, so Ethereum contracts can now also be executed on IELE.
- PDF, BIB
- A Language-Independent Program Verification Framework
- Xiaohong Chen and Grigore Rosu
- ISoLA'18, Springer, pp 92-102. 2018
- Abstract. This invited paper describes an approach to language-independent deductive verification using the K semantics framework, in which an operational semantics of a language is defined and a program verifier together with other language tools are generated automatically, correct-by-construction.
- PDF, Slides(PPTX), DOI, ISoLA'18, BIB
- A Language-Independent Approach to Smart Contract Verification
- Xiaohong Chen and Daejun Park and Grigore Rosu
- ISoLA'18, Springer, pp 405-413. 2018
- Abstract. This invited paper reports the current progress of smart contract verification with the K framework in a language-independent style.
- PDF, Slides(PPTX), Formally Verified Smart Contracts, DOI, ISoLA'18, BIB
- A Formal Verification Tool for Ethereum VM Bytecode
- Daejun Park and Yi Zhang and Manasvi Saxena and Philip Daian and Grigore Rosu
- ESEC/FSE'18, ACM, pp 912-915. 2018
- Abstract. In this paper, we present a formal verification tool for the Ethereum Virtual Machine (EVM) bytecode. To precisely reason about all possible behaviors of the EVM bytecode, we adopted KEVM, a complete formal semantics of the EVM, and instantiated the K-framework's reachability logic theorem prover to generate a correct-by-construction deductive verifier for the EVM. We further optimized the verifier by introducing EVM-specific abstractions and lemmas to improve its scalability. Our EVM verifier has been used to verify various high-profile smart contracts including the ERC20 token, Ethereum Casper, and DappHub MakerDAO contracts.
- PDF, Slides(PDF), Formally Verified Smart Contracts, DOI, ESEC/FSE'18, BIB
- IELE: An Intermediate-Level Blockchain Language Designed and Implemented Using Formal Semantics
- Theodoros Kasampalis and Dwight Guth and Brandon Moore and Traian Serbanuta and Virgil Serbanuta and Daniele Filaretti and Grigore Rosu and Ralph Johnson
- Technical Report http://hdl.handle.net/2142/100320, July 2018
- Abstract. Most languages are given an informal semantics until they are implemented, so the formal semantics comes later. Consequently, there are usually inconsistencies among the informal semantics, the implementation, and the formal semantics. IELE is an LLVM-like language for the blockchain that was specified formally and its implementation, a virtual machine, generated from the formal specification. Moreover, its design was based on problems observed formalizing the semantics of the Ethereum Virtual Machine (EVM) and from formally specifying and verifying EVM programs (also called "smart contracts"), so even the design decisions made for IELE are based on formal specifications. A compiler from Solidity, the predominant high-level language for smart contracts, to IELE has also been implemented, so Ethereum contracts can now also be executed on IELE. The virtual machine automatically generated from the semantics of IELE is shown to be competitive in terms of performance with the state of the art and hence can stand as the de facto implementation of the language in a production setting. Indeed, IOHK, a major blockchain company, is currently experimenting with the IELE VM in order to deploy it as its computational layer in a few months. This makes IELE the first practical language that is designed and implemented as a formal specification. It took only 10 man-months to develop IELE, which demonstrates that the programming language semantics field has reached a level of maturity that makes it appealing over the traditional, adhoc approach even for pragmatic reasons.
- PDF, IELE, DOI, BIB
- Program Verification by Coinduction
- Brandon Moore and Lucas Pena and Grigore Rosu
- ESOP'18, Springer, pp 589-618. 2018
- Abstract. We present a novel program verification approach based on coinduction, which takes as input an operational semantics. No intermediates like axiomatic semantics or verification condition generators are needed. Specifications can be written using any state predicates. We implement our approach in Coq, giving a certifying language-independent verification framework. Our proof system is implemented as a single module imported unchanged into language-specific proofs. Automation is reached by instantiating a generic heuristic with language-specific tactics. Manual assistance is also smoothly allowed at points the automation cannot handle. We demonstrate the power and versatility of our approach by verifying algorithms as complicated as Schorr-Waite graph marking and instantiating our framework for object languages in several styles of semantics. Finally, we show that our coinductive approach subsumes reachability logic, a recent language-independent sound and (relatively) complete logic for program verification that has been instantiated with operational semantics of languages as complex as C, Java and JavaScript.
- PDF, Matching Logic, DOI, ESOP'18, BIB
- P4K: A Formal Semantics of P4 and Applications
- Ali Kheradmand and Grigore Rosu
- Technical Report https://arxiv.org/abs/1804.01468, April 2018
- Abstract. Programmable packet processors and P4 as a programming language for such devices have gained significant interest, because their flexibility enables rapid development of a diverse set of applications that work at line rate. However, this flexibility, combined with the complexity of devices and networks, increases the chance of introducing subtle bugs that are hard to discover manually. Worse, this is a domain where bugs can have catastrophic consequences, yet formal analysis tools for P4 programs / networks are missing. We argue that formal analysis tools must be based on a formal semantics of the target language, rather than on its informal specification. To this end, we provide an executable formal semantics of the P4 language in the K framework. Based on this semantics, K provides an interpreter and various analysis tools including a symbolic model checker and a deductive program verifier for P4. This paper overviews our formal K semantics of P4, as well as several P4 language design issues that we found during our formalization process. We also discuss some applications resulting from the tools provided by K for P4 programmers and network administrators as well as language designers and compiler developers, such as detection of unportable code, state space exploration of P4 programs and of networks, bug finding using symbolic execution, data plane verification, program verification, and translation validation.
- PDF, P4K, DOI, BIB
- KEVM: A Complete Semantics of the Ethereum Virtual Machine
- Everett Hildenbrandt and Manasvi Saxena and Xiaoran Zhu and Nishant Rodrigues and Philip Daian and Dwight Guth and Brandon Moore and Yi Zhang and Daejun Park and Andrei Stefanescu and Grigore Rosu
- CSF 2018, IEEE, pp 204-217. 2018
- Abstract. A developing field of interest for the distributed systems and applied cryptography communities is that of smart contracts: self-executing financial instruments that synchronize their state, often through a blockchain. One such smart contract system that has seen widespread practical adoption is Ethereum, which has grown to a market capacity of 100 billion USD and clears an excess of 500,000 daily transactions. Unfortunately, the rise of these technologies has been marred by a series of costly bugs and exploits. Increasingly, the Ethereum community has turned to formal methods and rigorous program analysis tools. This trend holds great promise due to the relative simplicity of smart contracts and bounded-time deterministic execution inherent to the Ethereum Virtual Machine (EVM). Here we present KEVM, an executable formal specification of the EVM's bytecode stack-based language built with the \K{} Framework, designed to serve as a solid foundation for further formal analyses. We empirically evaluate the correctness and performance of KEVM using the official Ethereum test suite~\cite{ethereum-tests-url}. To demonstrate the usability, several extensions of the semantics are presented and two different-language implementations of the ERC20 Standard Token are verified against the ERC20 specification. These results are encouraging for the executable semantics approach to language prototyping and specification.
- PDF, KEVM, CSF 2018, BIB
- Matching logic
- Grigore Rosu
- LMCS, Volume 13(4), pp 1-61. 2017
- Abstract. This paper presents matching logic, a first-order logic (FOL) variant for specifying and reasoning about structure by means of patterns and pattern matching. Its sentences, the patterns, are constructed using variables, symbols, connectives and quantifiers, but no difference is made between function and predicate symbols. In models, a pattern evaluates into a power-set domain (the set of values that match it), in contrast to FOL where functions and predicates map into a regular domain. Matching logic uniformly generalizes several logical frameworks important for program analysis, such as: propositional logic, algebraic specification, FOL with equality, modal logic, and separation logic. Patterns can specify separation requirements at any level in any program configuration, not only in the heaps or stores, without any special logical constructs for that: the very nature of pattern matching is that if two structures are matched as part of a pattern, then they can only be spatially separated. Like FOL, matching logic can also be translated into pure predicate logic with equality, at the same time admitting its own sound and complete proof system. A practical aspect of matching logic is that FOL reasoning with equality remains sound, so off-the-shelf provers and SMT solvers can be used for matching logic reasoning. Matching logic is particularly well-suited for reasoning about programs in programming languages that have an operational semantics, but it is not limited to this.
- PDF, project, DOI, LMCS, BIB
- Open Problems and Challenges 2017
- Grigore Rosu
- Technical Report http://hdl.handle.net/2142/95775, March 2017
- Abstract. Here is a 2017 update of the list of open problems and challenges that I (Grigore Rosu) am interested in solving, in no particular order. While we are doing our best to keep this list actual, it may well be the case that some of the problems have been solved in the meanwhile or that we have found a different way to approach them. In case you are interested in working on any of these problems, please send me a note at grosu@illinois.edu to make sure that the problem is still actual and nobody is already working on it. The reason I decided to create and maintain this list of challenges is not only that it helps me keep track of them, but more importantly, that is also helps my students understand these topics better and put things at their place in the big picture. There are more problems here than one person can finish in a life-time. If you choose to work on a problem and believe that I can help, please let me know and we may work together on it. If you are a student in my FSL group, then you are actually expected to work together with me, and possibly other students, on one or more of these problems.
- PDF, Webpage, DOI, BIB
- K - A Semantic Framework for Programming Languages and Formal Analysis Tools
- Grigore Rosu
- Marktoberdorf'16, NATO Science for Peace and Security. 2017. To appear
- Abstract. We give an overview of the K framework, following the lecture notes presented by the author at the Marktoberdorf Summer School in year 2016.
- PDF, K, BIB
- Semantics-Based Program Verifiers for All Languages
- Andrei Stefanescu and Daejun Park and Shijiao Yuwen and Yilong Li and Grigore Rosu
- OOPSLA'16, ACM, pp 74-91. 2016
- Abstract. We present a language-independent verification framework that can be instantiated with an operational semantics to automatically generate a program verifier. The framework treats both the operational semantics and the program correctness specifications as reachability rules between matching logic patterns, and uses the sound and relatively complete reachability logic proof system to prove the specifications using the semantics. We instantiate the framework with the semantics of one academic language, KernelC, as well as with three recent semantics of real-world languages, C, Java, and JavaScript, developed independently. We evaluate our approach empirically and show that the generated program verifiers can check automatically the full functional correctness of challenging heap-manipulating programs implementing operations on list and tree data structures, like AVL trees. This is the first approach that can turn the operational semantics of real-world languages into correct-by-construction automatic verifiers.
- PDF, Slides(PDF), Matching Logic, DOI, OOPSLA'16, BIB
- Finite-Trace Linear Temporal Logic: Coinductive Completeness
- Grigore Rosu
- RV'16, LNCS 10012, pp 333-350. 2016
- Abstract. Linear temporal logic (LTL) is suitable not only for infinite-trace systems, but also for finite-trace systems. The completeness of LTL with only infinite or with both infinite and finite traces has been extensively studied, but similar direct results for LTL with only finite traces are missing. This paper proposes a sound and complete proof system for finite-trace LTL. The axioms and proof rules are natural and expected, except for one rule of coinductive nature, reminiscent of the Goedel-Loeb axiom. A direct decision procedure for finite-trace LTL satisfiability, a PSPACE-complete problem, is also obtained as a corollary.
- PDF, Slides(PPTX), Matching Logic, DOI, RV'16, BIB
- Runtime Verification at Work: A Tutorial
- Philip Daian and Dwight Guth and Chris Hathhorn and Yilong Li and Edgar Pek and Manasvi Saxena and Traian Florin Serbanuta and Grigore Rosu
- RV'16, LNCS 10012, pp 46-67. 2016
- Abstract. We present a suite of runtime verification tools developed by Runtime Verification Inc.: RV-Match, RV-Predict, and RV-Monitor. RV-Match is a tool for checking C programs for undefined behavior and other common programmer mistakes. It is extracted from the most complete formal semantics of the C11 language and beats many similar tools in its ability to catch a broad range of undesirable behaviors. RV-Predict is a dynamic data race detector for Java and C/C++ programs. It is perhaps the only tool that is both sound and maximal: it only reports real races and it can find all races that can be found by any other sound data race detector analyzing the same execution trace. RV-Monitor is a runtime monitoring tool that checks and enforces safety and security properties during program execution. Our tools focus on reporting no false positives and are free for non-commercial use.
- PDF, Slides(PPTX), RV Products, DOI, RV'16, BIB
- How Good are the Specs? A Study of the Bug-Finding Effectiveness of Existing Java API Specifications
- Owolabi Legunsen and Wajih Ul Hassan and Xinyue Xu and Grigore Rosu and Darko Marinov
- ASE 2016, IEEE/ACM, pp 602-613. 2016
- Abstract. Runtime verification can be used to find bugs early, during software development, by monitoring test executions against formal specifications (specs). The quality of runtime verification depends on the quality of the specs. While previous research has produced many specs for the Java API, manually or through automatic mining, there has been no large-scale study of their bug-finding effectiveness. We present the first in-depth study of the bug-finding effectiveness of previously proposed specs. We used JavaMOP to monitor 182 manually written and 17 automatically mined specs against more than 18K manually written and 2.1M automatically generated tests in 200 open-source projects. The average runtime overhead was under 4.3x. We inspected 652 violations of manually written specs and (randomly sampled) 200 violations of automatically mined specs. We reported 95 bugs, out of which developers already accepted or fixed 74. However, most violations, 82.81% of 652 and 97.89% of 200, were false alarms. Our empirical results show that (1) runtime verification technology has matured enough to incur tolerable runtime overhead during testing, and (2) the existing API specifications can find many bugs that developers are willing to fix; however, (3) the false alarm rates are worrisome and suggest that substantial effort needs to be spent on engineering better specs and properly evaluating their effectiveness.
- PDF, Slides(PDF), JavaMOP, DOI, ASE 2016, BIB
- RV-Match: Practical Semantics-Based Program Analysis
- Dwight Guth and Chris Hathhorn and Manasvi Saxena and Grigore Rosu
- CAV'16, LNCS 9779, pp 447-453. 2016
- Abstract. We present RV-Match, a tool for checking C programs for undefined behavior and other common programmer mistakes. Our tool is extracted from the most complete formal semantics of the C11 language. Previous versions of this tool were used primarily for testing the correctness of the semantics, but we have improved it into a tool for doing practical analysis of real C programs. It beats many similar tools in its ability to catch a broad range of undesirable behaviors. We demonstrate this with comparisons based on a third-party benchmark.
- PDF, RV-Match, DOI, CAV'16, BIB
- A Language-Independent Proof System for Full Program Equivalence
- Stefan Ciobaca and Dorel Lucanu and Vlad Rusu and Grigore Rosu
- J.FAOC, Volume 28(3), pp 469-497. 2016
- Abstract. Two programs are fully equivalent if, for the same input, either they both diverge or they both terminate with the same result. Full equivalence is an adequate notion of equivalence for programs written in deterministic languages. It is useful in many contexts, such as capturing the correctness of program transformations within the same language, or capturing the correctness of compilers between two different languages. In this paper we introduce a language-independent proof system for full equivalence, which is parametric in the operational semantics of two languages and in a state-similarity relation. The proof system is sound: a proof tree establishes the full equivalence of the programs given to it as input. We illustrate it on two programs in two different languages (an imperative one and a functional one), that both compute the Collatz sequence. The Collatz sequence is an interesting case study since it is not known weather the sequence terminates or not; nevertheless, our proof system shows that the two programs are fully equivalent (even if we cannot establish termination or divergence of either one).
- PDF, Matching Logic, DOI, J.FAOC, BIB
- RV-ECU: Maximum Assurance In-Vehicle Safety Monitoring
- Philip Daian and Bhargava Manja and Shinichi Shiraishi and Akihito Iwai and Grigore Rosu
- SAE'16, SAE International 2016-01-0126, pp 1-13. 2016
- Abstract. The Runtime Verification ECU (RV-ECU) is a new development platform for checking and enforcing the safety of automotive bus communications and software systems. RV-ECU uses runtime verification, a formal analysis subfield geared at validating and verifying systems as they run, to ensure that all manufacturer and third-party safety specifications are complied with during the operation of the vehicle. By compiling formal safety properties into code using a certifying compiler, the RV-ECU executes only provably correct code that checks for safety violations as the system runs. RV-ECU can also recover from violations of these properties, either by itself in simple cases or together with safe message-sending libraries implementable on third-party control units on the bus. RV-ECU can be updated with new specifications after a vehicle is released, enhancing the safety of vehicles that have already been sold and deployed. Currently a prototype, RV-ECU is meant to eventually be deployed as global and local ECU safety monitors, ultimately responsible for the safety of the entire vehicle system. We describe its overall architecture and implementation, and demonstrate monitoring of safety specifications on the CAN bus. We use past automotive recalls as case studies to demonstrate the potential of updating the RV-ECU as a cost effective and practical alternative to software recalls, while requiring the development of rigorous, formal safety specifications easily sharable across manufacturers, OEMs, regulatory agencies and even car owners.
- PDF, Slides(PPT), RV-ECU, DOI, SAE'16, BIB
- Open Problems and Challenges 2016
- Grigore Rosu
- Technical Report http://hdl.handle.net/2142/88925, February 2016
- Abstract. Here is a list of open problems and challenges that I (Grigore Rosu) am interested in solving, in no particular order. While we are doing our best to keep this list actual, it may well be the case that some of the problems have been solved in the meanwhile or that we have found a different way to approach them. In case you are interested in working on any of these problems, please send me a note at grosu@illinois.edu to make sure that the problem is still actual and nobody is already working on it. The reason I decided to create and maintain this list of challenges is not only that it helps me keep track of them, but more importantly, that is also helps my students understand these topics better and put things at their place in the big picture. There are more problems here than one person can finish in a life-time. If you choose to work on a problem and believe that I can help, please let me know and we may work together on it. If you are a student in my FSL group, then you are actually expected to work together with me, and possibly other students, on one or more of these problems.
- PDF, Webpage, DOI, BIB
- Language Definitions as Rewrite Theories
- Vlad Rusu and Dorel Lucanu and Traian Florin Serbanuta and Andrei Arusoaie and Andrei Stefanescu and Grigore Rosu
- J.LAMP, Volume 85(1, Part 1), pp 98-120. 2016
- Abstract. K is a formal framework for defining operational semantics of programming languages. The K-Maude compiler translates \K language definitions to Maude rewrite theories. The compiler enables program execution by using the Maude rewrite engine with the compiled definitions, and program analysis by using various Maude analysis tools. K supports symbolic execution in Maude by means of an automatic transformation of language definitions. The transformed definition is called the *\em symbolic extension* of the original definition. In this paper we investigate the theoretical relationship between K language definitions and their Maude translations, between symbolic extensions of K definitions and their Maude translations, and how the relationship between K definitions and their symbolic extensions is reflected on their respective representations in Maude. In particular, the results show how analysis performed with Maude tools can be formally lifted up to the original language definitions.
- PDF, project, DOI, J.LAMP, BIB
- Towards a Kool Future
- Dorel Lucanu and Traian Florin Serbanuta and Grigore Rosu
- Boer's Festschrift, LNCS 9660, pp 325-343. 2016
- Abstract. The K framework was successfully used for defining formal semantics for several practical languages, e.g. C, Java, Java Script, but no language with distributed concurrent objects was defined in K up to now. In this paper we investigate how the model of asynchronous method calls, using the so-called futures for handling the return values, can be added to an existing K definition using the ideas from the Complete Guide to the Future paper. As the running example we use the K definition of Kool, a pedagogical and research language that captures the essence of the object-oriented programming paradigm. This is a first step toward a generic methodology for modularly adding future-based mechanisms to allow asynchronous method calls.
- PDF, K, DOI, Boer's Festschrift, BIB
- From Rewriting Logic, to Programming Language Semantics, to Program Verification
- Grigore Rosu
- Meseguer's Festschrift, LNCS 9200, pp 598-616. 2015
- Abstract. Rewriting logic has proven to be an excellent formalism to define executable semantics of programming languages, concurrent or not, and then to derive formal analysis tools for the defined languages with very little effort, such as model checkers. In this paper we give an overview of recent results obtained in the context of the rewriting logic semantics framework K, such as complete semantics of large programming languages like C, Java, JavaScript, Python, and deductive program verification techniques that allow us to verify programs in these languages using a common verification infrastructure.
- PDF, Slides(PPTX), K, DOI, Meseguer's Festschrift, BIB
- RV-Android: Efficient Parametric Android Runtime Verification, a Brief Tutorial
- Philip Daian and Ylies Falcone and Patrick Meredith and Traian Florin Serbanuta and Shinichi Shiraishi and Akihito Iwai and Grigore Rosu
- RV'15, Lecture Notes in Computer Science 9333, pp 342-357. 2015
- Abstract. RV-Android is a new freely available open source runtime library for monitoring formal safety properties on Android. RV-Android uses the commercial RV-Monitor technology as its core monitoring library generation technology, allowing for the verification of safety properties during execution and operating entirely in userspace with no kernel or operating system modifications required. RV-Android improves on previous Android monitoring work by replacing the JavaMOP framework with RV-Monitor, a more advanced monitoring library generation tool with core algorithmic improvements that greatly improve resource consumption, efficiency, and battery life considerations. We demonstrate the developer usage of RV-Android with the standard Android build process, using instrumentation mechanisms effective on both Android binaries and source code. Our method allows for both property development and advanced application testing through runtime verification. We showcase the user frontend of RV-Monitor, which is available for public demo use and requires no knowledge of RV concepts. We explore the extra expressiveness the MOP paradigm provides over simply writing properties as aspects through two sample security properties, and show an example of a real security violation mitigated by RV-Android on-device. Lastly, we propose RV as an extension to the next-generation Android permissions system debuting in Android M.
- PDF, Slides(PDF), RV-Android, DOI, RV'15, BIB
- Matching Logic --- Extended Abstract
- Grigore Rosu
- RTA'15, Leibniz International Proceedings in Informatics (LIPIcs) 36, pp 5-21. 2015
- Abstract. This paper presents matching logic, a first-order logic (FOL) variant for specifying and reasoning about structure by means of patterns and pattern matching. Its sentences, the patterns, are constructed using variables, symbols, connectives and quantifiers, but no difference is made between function and predicate symbols. In models, a pattern evaluates into a power-set domain (the set of values that match it), in contrast to FOL where functions and predicates map into a regular domain. Matching logic uniformly generalizes several logical frameworks important for program analysis, such as: propositional logic, algebraic specification, FOL with equality, and separation logic. Patterns can specify separation requirements at any level in any program configuration, not only in the heaps or stores, without any special logical constructs for that: the very nature of pattern matching is that if two structures are matched as part of a pattern, then they can only be spatially separated. Like FOL, matching logic can also be translated into pure predicate logic, at the same time admitting its own sound and complete proof system. A practical aspect of matching logic is that FOL reasoning remains sound, so off-the-shelf provers and SMT solvers can be used for matching logic reasoning. Matching logic is particularly well-suited for reasoning about programs in programming languages that have a rewrite-based operational semantics.
- PDF, Slides(PPTX), Matching Logic, DOI, RTA'15, BIB
- Defining the Undefinedness of C
- Chris Hathhorn and Chucky Ellison and Grigore Rosu
- PLDI'15, ACM, pp 336-345. 2015
- Abstract. We present a negative semantics of the C11 language---a semantics that does not just give meaning to correct programs, but also rejects undefined programs. We investigate undefined behavior in C and discuss the techniques and special considerations needed for formally specifying it. We have used these techniques to modify and extend a semantics of C into one that captures undefined behavior. The amount of semantic infrastructure and effort required to achieve this was unexpectedly high, in the end nearly doubling the size of the original semantics. From our semantics, we have automatically extracted an undefinedness checker, which we evaluate against other popular analysis tools, using our own test suite in addition to a third-party test suite. Our checker is capable of detecting examples of all 77 categories of core language undefinedness appearing in the C11 standard, more than any other tool we considered. Based on this evaluation, we argue that our work is the most comprehensive and complete semantic treatment of undefined behavior in C, and thus of the C language itself.
- PDF, C Semantics, DOI, PLDI'15, BIB
- KJS: A Complete Formal Semantics of JavaScript
- Daejun Park and Andrei Stefanescu and Grigore Rosu
- PLDI'15, ACM, pp 346-356. 2015
- Abstract. This paper presents KJS, the most complete and throughly tested formal semantics of JavaScript to date. Being executable, KJS has been tested against the ECMAScript 5.1 conformance test suite, and passes all 2,782 core language tests. Among the existing implementations of JavaScript, only Chrome V8's passes all the tests, and no other semantics passes more than 90\%. In addition to a reference implementation for JavaScript, KJS also yields a simple coverage metric for a test suite: the set of semantic rules it exercises. Our semantics revealed that the ECMAScript 5.1 conformance test suite fails to cover several semantic rules. Guided by the semantics, we wrote tests to exercise those rules. The new tests revealed bugs both in production JavaScript engines (Chrome V8, Safari WebKit, Firefox SpiderMonkey) and in other semantics. KJS is symbolically executable, thus it can be used for formal analysis and verification of JavaScript programs. We verified non-trivial programs and found a known security vulnerability.
- PDF, Slides(PDF), Semantics, DOI, PLDI'15, BIB
- GPredict: Generic Predictive Concurrency Analysis
- Jeff Huang and Qingzhou Luo and Grigore Rosu
- ICSE'15, ACM, pp 847-857. 2015
- Abstract. Predictive trace analysis (PTA) is an effective ap- proach for detecting subtle bugs in concurrent programs. Existing PTA techniques, however, are typically based on adhoc algorithms tailored to low-level errors such as data races or atomicity violations, and are not applicable to high-level properties such as "a resource must be authenticated before use" and "a collection cannot be modified when being iterated over". In addition, most techniques assume as input a globally ordered trace of events, which is expensive to collect in practice as it requires synchronizing all threads. In this paper, we present GPredict: a new technique that realizes PTA for generic concurrency properties. Moreover, GPredict does not require a global trace but only the local traces of each thread, which incurs much less runtime overhead than existing techniques. Our key idea is to uniformly model violations of concurrency properties and the thread causality as constraints over events. With an existing SMT solver, GPredict is able to precisely predict property violations allowed by the causal model. Through our evaluation using both benchmarks and real world applications, we show that GPredict is effective in expressing and predicting generic property violations. Moreover, it reduces the runtime overhead of existing techniques by 54\% on DaCapo benchmarks on average.
- PDF, Slides(PDF), jPredictor, DOI, ICSE'15, BIB
- Evolution-Aware Monitoring-Oriented Programming
- Owolabi Legunsen and Darko Marinov and Grigore Rosu
- ICSE NIER'15, ACM, pp 615-618. 2015
- Abstract. Monitoring-Oriented Programming (MOP) helps develop more reliable software by means of monitoring against formal specifications. While MOP showed promising results, all prior research has focused on checking a single version of a target software application. We propose to extend MOP to support multiple software versions and thus be more relevant in the context of rapid software evolution. Our approach, called eMOP, is inspired by regression test selection---a well studied, evolution-centered technique. The key idea in eMOP is to monitor only the parts of code that changed between versions. We illustrate eMOP by means of a running example, and show the results of preliminary experiments. eMOP opens up a new line of research on MOP---it can significantly improve usability and performance when applied across multiple versions of an application, and is complementary to algorithmic MOP advances on single versions.
- PDF, Slides(PDF), JavaMOP, DOI, ICSE NIER'15, BIB
- Term-Generic Logic
- Andrei Popescu and Grigore Rosu
- Journal of Theoretical Computer Science, Volume 577(1), pp 1-24. 2015
- Abstract. We introduce term-generic logic (TGL), a first-order logic parameterized with terms defined axiomatically (rather than constructively), by requiring terms to only provide free variable and substitution operators satisfying some reasonable axioms. TGL has a notion of model that generalizes both first-order models and Henkin models of the lambda-calculus. The abstract notions of term syntax and model are shown to be sufficient for obtaining the completeness theorem of a Gentzen system generalizing that of first-order logic. Various systems featuring bindings and contextual reasoning, ranging from pure type systems to the pi-calculus, are captured as theories inside TGL. For two particular, but rather typical instances---untyped lambda-calculus and System F---the general-purpose TGL models are shown to be equivalent with standard ad hoc models.
- PDF, project, DOI, Journal of Theoretical Computer Science, BIB
- Program Verification by Coinduction
- Brandon Moore and Grigore Rosu
- Technical Report http://hdl.handle.net/2142/73177, February 2015
- Abstract. We present a program verification framework based on coinduction, which makes it feasible to verify programs directly against an operational semantics, without requiring intermediates like axiomatic semantics or verification condition generators. Specifications can be written and proved using any predicates on the state space of the operational semantics. We implement our approach in Coq, giving a certifying language-independent verification framework. The core proof system is implemented as a single module imported unchanged into proofs of programs in any semantics. A comfortable level of automation is provided by instantiating a simple heuristic with tactics for language-specific tasks such as finding the successor of a symbolic state, and for domain-specific reasoning about the predicates used in a particular specification. This approach also smoothly allows manual assistance at points the automation cannot handle. We demonstrate the power of our approach by verifying algorithms as complicated as Schorr-Waite graph marking, and the versatility by instantiating it for object languages in several styles of semantics. Despite the greater flexibility and generality of our approach, proof size and proof/certificate-checking time compare favorably with Bedrock, another Coq-based certifying program verification framework.
- PDF, Matching Logic, DOI, BIB
- K-Java: A Complete Semantics of Java
- Denis Bogdanas and Grigore Rosu
- POPL'15, ACM, pp 445-456. 2015
- Abstract. This paper presents K-Java, a complete executable formal semantics of Java 1.4. K-Java was extensively tested with a test suite developed alongside the project, following the Test Driven Development methodology. In order to maintain clarity while handling the great size of Java, the semantics was split into two separate definitions -- a static semantics and a dynamic semantics. The output of the static semantics is a preprocessed Java program, which is passed as input to the dynamic semantics for execution. The preprocessed program is a valid Java program, which uses a subset of the features of Java. The semantics is applied to model-check multi-threaded programs. Both the test suite and the static semantics are generic and ready to be used in other Java-related projects.
- PDF, Slides(PDF), K-Java, DOI, POPL'15, BIB
- A Language-Independent Proof System for Mutual Program Equivalence
- Stefan Ciobaca and Dorel Lucanu and Vlad Rusu and Grigore Rosu
- ICFEM'14, LNCS 8829, pp 75-90. 2014
- Abstract. Two programs are mutually equivalent if they both diverge or they both terminate with the same result. In this paper we introduce a language-independent proof system for mutual equivalence, which is parametric in the operational semantics of two languages and in a state-similarity relation. We illustrate it on two programs in two different languages (an imperative one and a functional one), that both compute the Collatz sequence.
- PDF, Slides(PDF), Matching Logic, DOI, ICFEM'14, BIB
- ROSRV: Runtime Verification for Robots
- Jeff Huang and Cansu Erdogan and Yi Zhang and Brandon Moore and Qingzhou Luo and Aravind Sundaresan and Grigore Rosu
- RV'14, LNCS 8734, pp 247-254. 2014
- Abstract. We present ROSRV, a runtime verification framework for robotic applications on top of the Robot Operating System (ROS), a widely used open-source framework for robot software development. ROSRV aims to address the safety and security issues of robots by providing a transparent monitoring infrastructure that intercepts and monitors the commands and messages passing through the system. Safety and security properties can be defined in a formal specification language, and are ensured by automatically generated monitors. ROSRV integrates seamlessly with ROS---no change in ROS nor the application code is needed. ROSRV has been applied and evaluated on a commercial robot.
- PDF, Slides(PPTX), Slides(PDF), ROSRV, DOI, RV'14, BIB
- RV-Monitor: Efficient Parametric Runtime Verification with Simultaneous Properties
- Qingzhou Luo and Yi Zhang and Choonghwan Lee and Dongyun Jin and Patrick O'Neil Meredith and Traian Florin Serbanuta and Grigore Rosu
- RV'14, LNCS 8734, pp 285-300. 2014
- Abstract. Runtime verification can effectively increase the reliability of software systems. In recent years, parametric runtime verification has gained a lot of traction, with several systems proposed. However, lack of real specifications and prohibitive runtime overhead when checking numerous properties simultaneously prevent developers or users from using runtime verification. This paper reports on more than 150 formal specifications manually derived from the Java API documentation of commonly used packages, as well as a series of novel techniques which resulted in a new runtime verification system, RV-Monitor. Experiments show that these specifications are useful for finding bugs and bad software practice, and RV-Monitor is capable of monitoring all our specifications simultaneously, and runs substantially faster than other state-of-the-art runtime verification systems.
- PDF, Slides(PPTX), JavaMOP, DOI, RV'14, BIB
- Converting Parallel Code from Low-Level Abstractions to Higher-Level Abstractions
- Semih Okur and Cansu Erdogan and Danny Dig
- ECOOP'14, Springer, pp 515-540. 2014
- Abstract. Parallel libraries continuously evolve from low-level to higher-level abstractions. However, developers are not up-to-date with these higher-level abstractions, thus their parallel code might be hard to read, slow, and unscalable. Using a corpus of 880 open-source C# applications, we found that developers still use the old Thread and ThreadPool abstractions in 62% of the cases when they use parallel abstractions. Converting code to higher-level abstractions is (i) tedious and (ii) error-prone. e.g., it can harm performance and silence the uncaught exceptions. We present two automated migration tools, Taskifier and Simplifier that work for C# code. The first tool transforms old style Thread and ThreadPool abstractions to Task abstractions. The second tool transforms code with Task abstractions into higher-level design patterns. Using our code corpus, we have applied these tools 3026 and 405 times, respectively. Our empirical evaluation shows that the tools (i) are highly applicable, (ii) reduce the code bloat, (iii) are much safer than manual transformations. We submitted 66 patches generated by our tools, and the open-source developers accepted 53.
- PDF, Taskifier, DOI, ECOOP'14, BIB
- All-Path Reachability Logic
- Andrei Stefanescu and Stefan Ciobaca and Radu Mereuta and Brandon Moore and Traian Florin Serbanuta and Grigore Rosu
- RTA'14, LNCS 8560, pp 425-440. 2014
- Abstract. This paper presents a language-independent proof system for reachability properties of programs written in non-deterministic (e.g. concurrent) languages, referred to as *all-path reachability logic*. It derives partial-correctness properties with all-path semantics (a state satisfying a given precondition reaches states satisfying a given postcondition on all terminating execution paths). The proof system takes as axioms any unconditional operational semantics, and is sound (partially correct) and (relatively) complete, independent of the object language; the soundness has also been mechanized (Coq). This approach is implemented in a tool for semantics-based verification as part of the K framework.
- PDF, Slides(PPTX), Matching Logic, DOI, RTA'14, BIB
- Maximal Sound Predictive Race Detection with Control Flow Abstraction
- Jeff Huang and Patrick Meredith and Grigore Rosu
- PLDI'14, ACM, pp 337-348. 2014
- Abstract. Despite the numerous static and dynamic program analysis techniques in the literature, data races remain one of the most common bugs in modern concurrent software. Further, the techniques that do exist either have limited detection capability or are unsound, meaning that they report false positives. We present a sound race detection technique that achieves a provably higher detection capability than existing sound techniques. A key insight of our technique is the inclusion of abstracted control flow information into the execution model, which increases the space of the causal model permitted by classical happens-before or causally-precedes based detectors. By encoding the control flow and a minimal set of feasibility constraints as a group of first-order logic formulae, we formulate race detection as a constraint solving problem. Moreover, we formally prove that our formulation achieves the maximal possible detection capability for any sound dynamic race detector with respect to the same input trace under the sequential consistency memory model. We demonstrate via extensive experimentation that our technique detects more races than the other state-of-the-art sound race detection techniques, and that it is scalable to executions of real world concurrent applications with tens of millions of critical events. These experiments also revealed several previously unknown races in real systems (e.g., Eclipse) that have been confirmed or fixed by the developers. Our tool is also adopted by Eclipse developers.
- PDF, Slides(PPT), Slides(PDF), jPredictor, DOI, PLDI'14, BIB
- K Overview and SIMPLE Case Study
- Grigore Rosu and Traian Florin Serbanuta
- K'11, ENTCS 304, pp 3-56. 2014
- Abstract. This paper gives an overview of the tool-supported K framework for semantics-based programming language design and formal analysis. K provides a convenient notation for modularly defining the syntax and the semantics of a programming language, together with a series of tools based on these, including a parser and an interpreter. A case study is also discussed, namely the K definition of the dynamic and static semantics of SIMPLE, a non-trivial imperative programming language. The material discussed in this paper was presented in an invited talk at the K'11 workshop.
- PDF, K, DOI, K'11, BIB
- Abstract Semantics for K Module Composition
- Codruta Girlea and Grigore Rosu
- K'11, ENTCS 304, pp 127-149. 2014
- Abstract. A structured K definition is easier to write, understand and debug than one single module containing the whole definition. Furthermore, modularization makes it easy to reuse work between definitions that share some principles or features. Therefore, it is useful to have a semantics for module composition operations that allows the properties of the resulting modules to be well understood at every step of the composition process. This paper presents an abstract semantics for a module system proposal for the K framework. It describes K modules and module transformations in terms of institution-based model theory introduced by Goguen and Burstall.
- PDF, K, DOI, K'11, BIB
- The K Primer (version 3.3)
- Traian Florin Serbanuta and Andrei Arusoaie and David Lazar and Chucky Ellison and Dorel Lucanu and Grigore Rosu
- K'11, ENTCS 304, pp 57-80. 2014
- Abstract. This paper serves as a brief introduction to the K tool, a system for formally defining programming languages. It is shown how sequential or concurrent languages can be defined in K simply and modularly. These formal definitions automatically yield an interpreter for the language, as well as program analysis tools such as a state-space explorer.
- PDF, K, DOI, K'11, BIB
- On the Complexity of Stream Equality
- Joerg Endrullis and Dimitri Hendriks and Rena Bakhshi and Grigore Rosu
- Journal of Functional Programming, Volume 24(2-3), pp 166-217. 2014
- Abstract. We study the complexity of deciding the equality of streams specified by systems of equations. There are several notions of stream models in the literature, each generating a different semantics of stream equality. We pinpoint the complexity of each of these notions in the arithmetical or analytical hierarchy. Their complexity ranges from low levels of the arithmetical hierarchy such as $\cpi{0}{2}$ for the most relaxed stream models, to levels of the analytical hierarchy such as $\cpi{1}{1}$ and up to subsuming the entire analytical hierarchy for more restricted but natural stream models. Since all these classes properly include both the semi-decidable and co-semi-decidable classes, it follows that regardless of the stream semantics employed, there is no complete proof system or algorithm for determining equality or inequality of streams. We also discuss several related problems, such as the existence and uniqueness of stream solutions for systems of equations, as well as the equality of such solutions.
- PDF, CIRC, DOI, Journal of Functional Programming, BIB
- Language Definitions as Rewrite Theories
- Andrei Arusoaie and Dorel Lucanu and Vlad Rusu and Traian Florin Serbanuta and Andrei Stefanescu and Grigore Rosu
- WRLA'14, LNCS 8663, pp 97-112. 2014
- Abstract. K is a formal framework for defining the operational semantics of programming languages. It includes software tools for compiling K language definitions to Maude rewrite theories, for executing programs in the defined languages based on the Maude rewriting engine, and for analyzing programs by adapting various Maude analysis tools. A recent extension to the K tool suite is an automatic transformation of language definitions that enables the symbolic execution of programs, i.e., the execution of programs with symbolic inputs. In this paper we investigate the theoretical relationships between K language definitions and their translations to Maude, between symbolic extensions of K definitions and their Maude encodings, and how the relations between K definitions and their symbolic extensions are reflected on their respective representations in Maude. These results show, in particular, how analyses performed with Maude tools can be formally lifted up to the original language definitions.
- PDF, K, DOI, WRLA'14, BIB
- Behavioral Rewrite Systems and Behavioral Productivity
- Grigore Rosu and Dorel Lucanu
- Futatsugi Festschrift 2014, LNCS 8373, pp 296-314. 2014
- Abstract. This paper introduces behavioral rewrite systems, where rewriting is used to evaluate experiments, and behavioral productivity, which says that each experiment can be fully evaluated, and investigates some of their properties. First, it is shown that, in the case of (infinite) streams, behavioral productivity generalizes and may bring to a more basic rewriting setting the existing notion of stream productivity defined in the context of infinite rewriting and lazy strategies; some arguments are given that in some cases one may prefer the behavioral approach. Second, a behavioral productivity criterion is given, which reduces the problem to conventional term rewrite system termination, so that one can use off-the-shelf termination tools and techniques for checking behavioral productivity in general, not only for streams. Finally, behavioral productivity is shown to be equivalent to a proof-theoretic (rather than model-theoretic) notion of behavioral well-specifiedness, and its difficulty in the arithmetic hierarchy is shown to be -complete.
- PDF, Slides(PPTX), CIRC, DOI, Futatsugi Festschrift 2014, BIB
- Optimizing SYB is Easy!
- Michael D. Adams and Andrew Farmer and Jose Pedro Magalhaes
- PEPM'14, ACM, pp 71-82. 2014
- Abstract. The most widely used generic-programming system in the Haskell community, Scrap Your Boilerplate (SYB), also happens to be one of the slowest. Generic traversals in SYB are often an order of magnitude slower than equivalent handwritten, non-generic traversals. Thus while SYB allows the concise expression of many traversals, its use incurs a significant runtime cost. Existing techniques for optimizing other generic-programming systems are not able to eliminate this overhead. This paper presents an optimization that completely eliminates this cost. Essentially, it is a partial evaluation that takes advantage of domain-specific knowledge about the structure of SYB. It optimizes SYB-style traversals to be as fast as handwritten, non-generic code, and benchmarks show that this optimization improves the speed of SYB-style code by an order of magnitude or more.
- PDF, Optimizing SYB, DOI, PEPM'14, BIB
- Matching Logic: A Logic for Structural Reasoning
- Grigore Rosu
- Technical Report http://hdl.handle.net/2142/47004, Jan 2014
- Abstract. Matching logic is a first-order logic (FOL) variant to reason about structure. Its sentences, called patterns, are constructed using variables, symbols, connectives and quantifiers, but no difference is made between function and predicate symbols. In models, a pattern evaluates into a power-set domain (the set of values that match it), in contrast to FOL where functions, predicates and connectives map into a domain. Matching logic generalizes several logical frameworks important for program analysis, such as: propositional logic, algebraic specification, FOL with equality, and separation logic. Patterns allow for specifying separation requirements at any level in any program configuration, not only in the heaps or stores, without any special logical constructs for that: the very nature of pattern matching is that if two structures are matched as part of a pattern, then they can only be spatially separated. Like FOL, matching logic can also be translated into pure predicate logic with equality, but it also admits its own sound and complete proof system.
- PDF, Matching Logic, DOI, BIB
- The Rewriting Logic Semantics Project: A Progress Report
- Jose Meseguer and Grigore Rosu
- Information and Computation, Volume 231(1), pp 38-69. 2013
- Abstract. Rewriting logic is an executable logical framework well suited for the semantic definition of languages. Any such framework has to be judged by its effectiveness to bridge the existing gap between language definitions on the one hand, and language implementations and language analysis tools on the other. We give a progress report on how researchers in the rewriting logic semantics project are narrowing the gap between theory and practice in areas such as: modular semantic definitions of languages; scalability to real languages; support for real time; semantics of software and hardware modeling languages; and semantics-based analysis tools such as static analyzers, model checkers, and program provers.
- PDF, K, DOI, Information and Computation, BIB
- Specifying Languages and Verifying Programs with K
- Grigore Rosu
- SYNASC'13, IEEE/CPS, pp 28-31. 2013
- Abstract. K is a rewrite-based executable semantic framework for defining languages. The K framework is designed to allow implementing a variety of generic tools that can be used with any language defined in K, such as parsers, interpreters, symbolic execution engines, semantic debuggers, test-case generators, state-space explorers, model checkers, and even deductive program verifiers. The latter are based on matching logic for expressing static properties, and on reachability logic for expressing dynamic properties. Several large languages have been already defined or are being defined in K, including C, Java, Python, Javascript, and LLVM.
- PDF, Slides(PPTX), K, DOI, SYNASC'13, BIB
- EnforceMOP: A Runtime Property Enforcement System for Multithreaded Programs
- Qingzhou Luo and Grigore Rosu
- ISSTA'13, ACM, pp 156-166. 2013
- Abstract. Multithreaded programs are hard to develop and test. In order for programs to avoid unexpected concurrent behaviors at runtime, for example data-races, synchronization mechanisms are typically used to enforce a safe subset of thread interleavings. Also, to test multithreaded programs, developers need to enforce the precise thread schedules that they want to test. These tasks are nontrivial and error prone. This paper presents EnforceMOP, a framework for specifying and enforcing complex properties in multithreaded Java programs. A property is enforced at runtime by blocking the threads whose next actions would violate it. This way the remaining threads, whose execution is safe, can make global progress until the system eventually reaches a global state in which the blocked threads can be safely unblocked and allowed to execute. Users of EnforceMOP can specify the properties to be enforced using the expressive MOP multi-formalism notation, and can provide code to be executed at deadlock (when no thread is safe to continue). EnforceMOP was used in two different kinds of applications. First, to enforce general properties in multithreaded programs, as a formal, semantic alternative to the existing rigid and sometimes expensive syntactic synchronization mechanisms. Second, to enforce testing desirable schedules in unit testing of multithreaded programs, as an alternative to the existing limited and often adhoc techniques. Results show that EnforceMOP is able to effectively express and enforce complex properties and schedules in both scenarios.
- PDF, Slides(PPTX), EnforceMOP, DOI, ISSTA'13, BIB
- One-Path Reachability Logic
- Grigore Rosu and Andrei Stefanescu and Stefan Ciobaca and Brandon Moore
- LICS'13, IEEE, pp 358-367. 2013
- Abstract. This paper introduces *reachability logic*, a language-independent proof system for program verification, which takes an operational semantics as axioms and derives *reachability rules*, which generalize Hoare triples. This system improves on previous work by allowing operational semantics given with *conditional* rewrite rules, which are known to support all major styles of operational semantics. In particular, Kahn's big-step and Plotkin's small-step semantic styles are newly supported. The reachability logic proof system is shown sound (i.e., partially correct) and (relatively) complete. Reachability logic thus eliminates the need to independently define an axiomatic and an operational semantics for each language, and the non-negligible effort to prove the former sound and complete w.r.t. the latter. The soundness result has also been formalized in Coq, allowing reachability logic derivations to serve as formal proof certificates that rely only on the operational semantics.
- PDF, Slides(PPTX), Reachability Logic, LICS'13, BIB
- Low-Level Program Verification using Matching Logic Reachability
- Dwight Guth and Andrei Stefanescu and Grigore Rosu
- LOLA'13. 2013
- Abstract. Matching logic reachability is an emerging verification approach which uses a language-independent proof system to prove program properties based on the operational semantics. In this paper we apply this approach in the context of a low-level real-time language with interrupts, in which each instruction takes a specified time to execute. In particular, we verify that if the interrupts are scheduled with large enough intervals, the program execution terminates yielding the correct result. Surprisingly, it turns out that matching logic reachability can handle the low-level and real-time features of the language just by using their operational semantics, and that language specific reasoning is unnecessary.
- PDF, Slides(PDF), Matching Logic, LOLA'13, BIB
- Efficient Parametric Runtime Verification with Deterministic String Rewriting
- Patrick Meredith and Grigore Rosu
- ASE'13, IEEE/ACM, pp 70-80. 2013
- Abstract. Early efforts in runtime verification show that parametric regular and temporal logic specifications can be monitored efficiently. These approaches, however, have limited expressiveness: their specifications always reduce to monitors with finite state. More recent developments showed that parametric context-free properties can be efficiently monitored with overheads generally lower than 12-15%. While context-free grammars are more expressive than finite-state languages, they still do not allow every computable safety property. This paper presents a monitor synthesis algorithm for string rewriting systems (SRS). SRSs are well known to be Turing complete, allowing for the formal specification of any computable safety property. Earlier attempts at Turing complete monitoring have been relatively inefficient. This paper demonstrates that monitoring parametric SRSs is practical. The presented algorithm uses a modified version of Aho-Corasick string searching for quick pattern matching with an incremental rewriting approach that avoids reexamining parts of the string known to contain no redexes.
- PDF, Slides(PPT), Slides(KEY), Slides(PDF), MOP, DOI, ASE'13, BIB
- On Safety Properties and Their Monitoring
- Grigore Rosu
- SACS, Volume 22(2), pp 327-365. 2012
- Abstract. This paper addresses the problem of runtime verification from a foundational perspective, answering questions like Is there a consensus among the various definitions of a safety property? (Answer: Yes), How many safety properties exist? (Answer: As many as real numbers), How difficult is the problem of monitoring a safety property? (Answer: Arbitrarily complex), Is there any formalism that can express all safety properties? (Answer: No), etc. Various definitions of safety properties as sets of execution traces have been proposed in the literature, some over finite traces, others over infinite traces, yet others over both finite and infinite traces. By employing cardinality arguments and a novel notion of persistence, this paper first establishes the existence of bijective correspondences between the various notions of safety property. It then shows that safety properties can be characterized as always past properties. Finally, it proposes a general notion of monitor, which allows to show that safety properties correspond precisely to the monitorable properties, and then to establish that monitoring a safety property is arbitrarily hard.
- PDF, MOP, DOI, SACS, BIB
- Checking Reachability using Matching Logic
- Grigore Rosu and Andrei Stefanescu
- OOPSLA'12, ACM, pp 555-574. 2012
- Abstract. This paper presents a verification framework that is parametric in a (trusted) operational semantics of some programming language. The underlying proof system is language-independent and consists of eight proof rules. The proof system is proved partially correct and relatively complete (with respect to the programming language configuration model). To show its practicality, the generic framework is instantiated with a fragment of C and evaluated with encouraging results.
- PDF, Slides(PPTX), Slides(PDF), Matching Logic, DOI, OOPSLA'12, BIB
- A Truly Concurrent Semantics for the K Framework Based on Graph Transformations
- Traian Florin Serbanuta and Grigore Rosu
- ICGT'12, LNCS 7562, pp 294-310. 2012
- Abstract. This paper gives a truly concurrent semantics with sharing of resources for the K semantic framework, an executable (term-)rewriting-based formalism for defining programming languages and calculi. Akin to graph rewriting rules, the K (rewrite) rules explicitly state what can be concurrently shared with other rules. The desired true concurrency is obtained by translating the K rules into a novel instance of term-graph rewriting with explicit sharing, and then using classical concurrency from the double-pushout (DPO) approach to graph rewriting. The resulting parallel term-rewriting relation is proved sound, complete, and serializable with respect to the jungle rewriting flavor of term-graph rewriting, and, therefore, also to term rewriting.
- PDF, Slides(PDF), K, DOI, ICGT'12, BIB
- Maximal Causal Models for Sequentially Consistent Systems
- Traian Florin Serbanuta and Feng Chen and Grigore Rosu
- RV'12, LNCS 7687, pp 136-150. 2012
- Abstract. This paper shows that it is possible to build a maximal and sound causal model for concurrent computations from a given execution trace. It is sound, in the sense that any program which can generate a trace can also generate all traces in its causal model. It is maximal (among sound models), in the sense that by extending the causal model of an observed trace with a new trace, the model becomes unsound: there exists a program generating the original trace which cannot generate the newly introduced trace. Thus, the maximal sound model has the property that it comprises all traces which all programs that can generate the original trace can also generate. The existence of such a model is of great theoretical value as it can be used to prove the soundness of non-maximal, and thus smaller, causal models.
- PDF, Slides(PDF), JPredictor, DOI, RV'12, BIB
- From Hoare Logic to Matching Logic Reachability
- Grigore Rosu and Andrei Stefanescu
- FM'12, LNCS 7436, pp 387-402. 2012
- Abstract. Matching logic reachability has been recently proposed as an alternative program verification approach. Unlike Hoare logic, where one defines a language-specific proof system that needs to be proved sound for each language separately, matching logic reachability provides a *language-independent* and *sound* proof system that directly uses the trusted operational semantics of the language as axioms. Matching logic reachability thus has a clear practical advantage: it eliminates the need for an additional semantics of the same language in order to reason about programs, and implicitly eliminates the need for tedious soundness proofs. What is not clear, however, is whether matching logic reachability is as powerful as Hoare logic. This paper introduces a technique to mechanically translate Hoare logic proof derivations into equivalent matching logic reachability proof derivations. The presented technique has two consequences: first, it suggests that matching logic reachability has no theoretical limitation over Hoare logic; and second, it provides a new approach to prove Hoare logics sound.
- PDF, Slides(PPTX), Slides(PDF), Matching Logic, DOI, FM'12, BIB
- Executing Formal Semantics with the K Tool
- David Lazar and Andrei Arusoaie and Traian Florin Serbanuta and Chucky Ellison and Radu Mereuta and Dorel Lucanu and Grigore Rosu
- FM'12, LNCS 7436, pp 267-271. 2012
- Abstract. This paper describes the K tool, a system for formally defining programming languages. Formal definitions created using the K tool automatically yield an interpreter for the language, as well as program analysis tools such as a state-space explorer. The modularity of K and the design of the tool allow one semantics to be used for several applications.
- PDF, Slides(PDF), K, DOI, FM'12, BIB
- Making Runtime Monitoring of Parametric Properties Practical
- Dongyun Jin
- PhD Thesis, University of Illinois at Urbana-Champaign. August 2012
- Abstract. Software reliability has become more important than ever in recent years, as a wide spectrum of software solutions are being used on various platforms. To this end, runtime monitoring is one of the most promising and feasible solutions for enhancing software reliability. In particular, runtime monitoring of parametric properties (parametric monitoring) has been receiving growing attention for its suitability in object-oriented systems. Despite many parametric monitoring approaches that have been proposed recently, they are still not widely used in real applications, implying that parametric monitoring is not sufficiently practical yet. In this dissertation, three perspectives for better practicality of parametric monitoring are proposed: expressiveness, efficiency, and scalability. A number of techniques on all three perspectives are developed and integrated to the JavaMOP framework, which is a formalism-independent, extensible runtime monitoring framework for parametric properties. One limitation in expressing parametric properties is that the first event must alway initiate all parameters. This limitation is removed in the proposed work to improve expressiveness of parametric monitoring. Further, a new logical formalism, PTCaRet, is introduced for describing properties of the call stack. As for efficiency, the `enable set optimization', the `indexing cache', and the `monitor garbage collection' are proposed for optimizing creation of monitors, access to monitors, and termination of monitors, respectively. In addition, several scalable parametric monitoring techniques are introduced. These techniques, for the first time, allow a large number of simultaneous parametric specifications to be monitored efficiently. The optimization techniques presented in this dissertation were implemented into the JavaMOP framework, yielding JavaMOP 3.0, the latest and most efficient version of JavaMOP. Thorough evaluations show that these techniques can improve runtime performance of JavaMOP by 3 times on average, and up to 63 times in some cases; as for memory usage, by 3 times on average. While Tracematches and the previous version of JavaMOP crashed on several cases due to out of memory errors, the newer version of JavaMOP did not crash on any case during the evaluations. Considering that the previous version of JavaMOP was one of the most efficient parametric monitoring frameworks in terms of runtime performance, the results presented in the dissertation can be argued significant.
- PDF, MOP, DOI, BIB
- Efficient, Expressive, and Effective Runtime Verification
- Patrick O'Neil Meredith
- PhD Thesis, University of Illinois at Urbana-Champaign. August 2012
- Abstract. Runtime Verification is a quickly growing technique for providing many of the guarantees of formal verification, but in a manner that is scalable. It useful information available from actual runs of programs to make verification decisions, rather than the purely static information used in formal verification. One of the main facets of Runtime Verification is runtime monitoring, where safety properties are checked against the execution of a program during (or in some cases after) its run. Prior work on efficient monitoring focused primarily on finite state properties. Non-finite state techniques existed, but added orders of magnitude of runtime overhead on the monitored system. The vast majority of runtime monitoring has also been limited to the application domain, with violations of safety properties only found on the actual trace of a given program. This thesis describes research that demonstrates that various logical formalisms, including those more powerful than finite logics, can be efficiently monitored in multiple monitoring domains. The demonstrated monitoring domains run the gamut from the application level with the Java programming language, to monitoring traces \emph{predicted} from a given run of a program, to hardware based monitors designed to ensure proper peripheral operation. The logical formalisms include multicategory finite state machines, extended regular expressions, past-time linear temporal logic with optimization for hardware based monitors, context-free grammars, linear temporal logic with both past and future operators, and string rewriting. This combination of domains and logical formalisms show that monitoring can be both expressive and efficient, regardless of the expressive power of the logical formalism, and that monitoring can be used not only for flat traces generated by software applications, but also in predictive traces and a hardware context.
- PDF, MOP, DOI, BIB
- Reachability Logic
- Grigore Rosu and Andrei Stefanescu and Stefan Ciobaca and Brandon Moore
- Technical Report http://hdl.handle.net/2142/32952, Jul 2012
- Abstract. This paper introduces *reachability logic*, a language-independent seven-rule proof system for deriving reachability properties of systems. The key ingredients of *reachability logic* are its sentences, which are called *reachability rules* and generalize the transitions of operational semantics and the Hoare triples of axiomatic semantics, and the *Circularity* proof rule, which generalizes invariant proof rules for iterative and recursive constructs in axiomatic semantics. The target transition system is described as a set of reachability rules, which are taken as axioms in a reachability logic proof. Typical definition styles which can be read as collections of reachability rules include conventional small-step and big-step operational semantics. The reachability logic proof system is shown sound (in the sense of partial correctness) and relatively complete. The soundness result has also been formalized in Coq, allowing to convert reachability logic proofs into proof certificates depending only on the operational semantics and the unavoidable domain reasoning. Reachability logic thus eliminates the need to independently define an axiomatic and an operational semantics for each language, and the non-negligible effort to prove the former sound and complete w.r.t the latter.
- PDF, Reachability Logic, DOI, BIB
- Towards a Unified Theory of Operational and Axiomatic Semantics
- Grigore Rosu and Andrei Stefanescu
- ICALP'12, LNCS 7392, pp 351-363. 2012
- Abstract. This paper presents a nine-rule *language-independent* proof system that takes an operational semantics as axioms and derives program reachability properties, including ones corresponding to Hoare triples. This eliminates the need for language-specific Hoare-style proof rules to verify programs, and, implicitly, the tedious step of proving such proof rules sound for each language separately. The key proof rule is *Circularity*, which is coinductive in nature and allows for reasoning about constructs with repetitive behaviors (e.g., loops). The generic proof system is shown sound and has been implemented in the MatchC verifier.
- PDF, Slides(PPTX), Slides(PDF), Matching Logic, DOI, ICALP'12, BIB
- A Formal Semantics of C with Applications
- Chucky Ellison
- PhD Thesis, University of Illinois. July 2012
- Abstract. This dissertation shows that complex, real programming languages can be completely formalized in the K Framework, yielding interpreters and analysis tools for testing and bug detection. This is demonstrated by providing, in K, the first complete formal semantics of the C programming language. With varying degrees of effort, tools such as interpreters, debuggers, and model-checkers, together with tools that check for memory safety, races, deadlocks, and undefined behavior are then generated from the semantics. Being executable, the semantics has been thoroughly tested against the GCC torture test suite and successfully passes 99.2\% of 776 test programs. The semantics is also evaluated against popular analysis tools, using a new test suite in addition to a third-party test suite. The semantics-based tool performs at least as well or better than the other tools tested.
- PDF, K, DOI, BIB
- Test-Case Reduction for C Compiler Bugs
- John Regehr and Yang Chen and Pascal Cuoq and Eric Eide and Chucky Ellison and Xuejun Yang
- PLDI'12, ACM, pp 335-346. 2012
- Abstract. To report a compiler bug, one must often find a small test case that triggers the bug. The existing approach to automated test-case reduction, delta debugging, works by removing substrings of the original input; the result is a concatenation of substrings that delta cannot remove. We have found this approach less than ideal for reducing C programs because it typically yields test cases that are too large or even invalid (relying on undefined behavior). To obtain small and valid test cases consistently, we designed and implemented three new, domain-specific test-case reducers. The best of these is based on a novel framework in which a generic fixpoint computation invokes modular transformations that perform reduction operations. This reducer produces outputs that are, on average, more than 25 times smaller than those produced by our other reducers or by the existing reducer that is most commonly used by compiler developers. We conclude that effective program reduction requires more than straightforward delta debugging.
- PDF, Slides(PPTX), C-Reduce, DOI, PLDI'12, BIB
- JavaMOP: Efficient Parametric Runtime Monitoring Framework
- Dongyun Jin and Patrick O'Neil Meredith and Choonghwan Lee and Grigore Rosu
- ICSE'12, IEEE, pp 1427-1430. 2012
- Abstract. Runtime monitoring is a technique usable in all phases of the software development cycle, from initial testing, to debugging, to actually maintaining proper function in production code. Of particular importance are parametric monitoring systems, which allow the specification of properties that relate objects in a program, rather than only global properties. In the past decade, a number of parametric runtime monitoring systems have been developed. Here we give a demonstration of our system, JavaMOP. It is the only parametric monitoring system that allows multiple differing logical formalisms. It is also the most efficient in terms of runtime overhead, and very competitive with respect to memory usage.
- PDF, MOP, DOI, ICSE'12, BIB
- Security-Policy Monitoring and Enforcement with JavaMOP
- Soha Hussein and Patrick Meredith and Grigore Rosu
- PLAS'12, ACM, pp 3:1-3:11. 2012
- Abstract. Software security attacks represent an ever growing problem. One way to make software more secure is to use Inlined Reference Monitors (IRMs), which allow security specifications to be inlined inside a target program to ensure its compliance with the desired security specifications. The IRM approach has been developed primarily by the security community. Runtime Verification (RV), on the other hand, is a software engineering approach, which is intended to formally encode system specifications within a target program such that those specifications can be later enforced during the execution of the program. Until now, the IRM and RV approaches have lived separate lives; in particular RV techniques have not been applied to the security domain, being used instead to aid program correctness and testing. This paper discusses the usage of a formalism-generic RV system, JavaMOP, as a means to specify IRMs, leveraging the careful engineering of the JavaMOP system for ensuring secure operation of software in an efficient manner.
- PDF, MOP, DOI, PLAS'12, BIB
- Defining the Undefinedness of C
- Chucky Ellison and Grigore Rosu
- Technical Report http://hdl.handle.net/2142/30780, April 2012
- Abstract. This paper investigates undefined behavior in C and offers a few simple techniques for operationally specifying such behavior formally. A semantics-based undefinedness checker for C is developed using these techniques, as well as a test suite of undefined programs. The tool is evaluated against other popular analysis tools, using the new test suite in addition to a third-party test suite. The semantics-based tool performs at least as well or better than the other tools tested.
- PDF, K, DOI, BIB
- Scalable Parametric Runtime Monitoring
- Dongyun Jin and Patrick O'Neil Meredith and Grigore Rosu
- Technical Report http://hdl.handle.net/2142/30757, April 2012
- Abstract. Runtime monitoring is an effective means to improve the reliability of systems. In recent years, parametric monitoring, which is highly suitable for object-oriented systems, has gained significant traction. Previous work on the performance of parametric runtime monitoring has focused on the performance of monitoring only one specification at a time. A realistic system, however, has numerous properties that need to be monitored simultaneously. This paper introduces scalable techniques to improve the performance of one of the fastest parametric monitoring systems, JavaMOP, in the presence of multiple simultaneous properties, resulting in average runtime overheads that are less than the summation of the overheads of the properties run in isolation. An extensive evaluation shows that these techniques, which were derived following a thorough investigation and analysis of the current bottlenecks in JavaMOP, improve its runtime performance in the presence of multiple properties by up to two times and the memory usage by 34\%.
- PDF, MOP, DOI, BIB
- Towards Categorizing and Formalizing the JDK API
- Choonghwan Lee and Dongyun Jin and Patrick O'Neil Meredith and Grigore Rosu
- Technical Report http://hdl.handle.net/2142/30006, March 2012
- Abstract. Formal specification of correct library usage is extremely useful, both for software developers and for the formal analysis tools they use, such as model checkers or runtime monitoring systems. Unfortunately, the process of creating formal specifications is time consuming, and, for the most part, even the libraries in greatest use, such as the Java Development Kit (JDK) standard library, are left wholly without formal specification. This paper presents a tool-supported approach to help writing formal specifications for Java libraries and creating documentation augmented with highlighting and formal specifications. The presented approach has been applied to systematically and completely formalize the runtime properties of three core and commonly used packages of the JDK API, namely java.io, java.lang and java.util, yielding 137 formal specifications. Indirectly, this paper also brings empirical evidence that parametric specifications may be sufficiently powerful to express virtually all desirable runtime properties of the JDK API, and that its informal documentation can be formalized.
- PDF, Java API, DOI, BIB
- Making Maude Definitions more Interactive
- Andrei Arusoaie and Traian Florin Serbanuta and Chucky Ellison and Grigore Rosu
- WRLA'12, LNCS 7571, pp 83-98. 2012
- Abstract. This paper presents an interface for achieving interactive executions of Maude terms by allowing console and file input/output (I/O) operations. This interface consists of a Maude API for I/O operations, a Java-based server offering I/O capabilities, and a communication protocol between the two implemented using the external objects concept and Maude's TCP sockets. This interface was evaluated as part of the K framework, providing interactive interpreter capabilities for executing and testing programs for multiple language definitions.
- PDF, K, DOI, WRLA'12, BIB
- K Framework Distilled
- Dorel Lucanu and Traian Florin Serbanuta and Grigore Rosu
- WRLA'12, LNCS 7571, pp 31-53. 2012
- Abstract. K is a rewrite-based executable semantic framework in which programming languages, type systems, and formal analysis tools can be defined using configurations, computations and rules. Configurations organize the state in units called cells, which are labeled and can be nested. Computations are special nested list structures sequentializing computational tasks, such as fragments of program. K (rewrite) rules make it explicit which parts of the term they read-only, write-only, read-write, or do not care about. This makes K suitable for defining truly concurrent languages even in the presence of sharing. Computations are like any other terms in a rewriting environment: they can be matched, moved from one place to another, modified, or deleted. This makes K suitable for defining control-intensive features such as abrupt termination, exceptions or call/cc. This paper presents an overview of K Framework and the K tool, focusing on the interaction between the K tool and Maude.
- PDF, K, DOI, WRLA'12, BIB
- Semantics and Algorithms for Parametric Monitoring
- Grigore Rosu and Feng Chen
- J.LMCS, Volume 8(1), pp 1-47. 2012
- Abstract. Analysis of execution traces plays a fundamental role in many program analysis approaches, such as runtime verification, testing, monitoring, and specification mining. Execution traces are frequently parametric, i.e., they contain events with parameter bindings. Each parametric trace usually consists of many meaningful trace slices merged together, each slice corresponding to one parameter binding. This gives a semantics-based solution to parametric trace analysis. A general-purpose parametric trace slicing technique is introduced, which takes each event in the parametric trace and dispatches it to its corresponding trace slices. This parametric trace slicing technique can be used in combination with any conventional, non-parametric trace analysis technique, by applying the later on each trace slice. As an instance, a parametric property monitoring technique is then presented. The presented parametric trace slicing and monitoring techniques have been implemented and extensively evaluated. Measurements of runtime overhead confirm that the generality of the discussed techniques does not come at a performance expense when compared with existing parametric trace monitoring systems.
- PDF, MOP, DOI, J.LMCS, BIB
- An Executable Formal Semantics of C with Applications
- Chucky Ellison and Grigore Rosu
- POPL'12, ACM, pp 533-544. 2012
- Abstract. This paper describes an executable formal semantics of C. Being executable, the semantics has been thoroughly tested against the GCC torture test suite and successfully passes 99.2\% of 776 test programs. It is the most complete and thoroughly tested formal definition of C to date. The semantics yields an interpreter, debugger, state space search tool, and model checker "for free". The semantics is shown capable of automatically finding program errors, both statically and at runtime. It is also used to enumerate nondeterministic behavior.
- PDF, Slides(PDF), Semantics, DOI, POPL'12, BIB
- Recursive Proofs for Inductive Tree Data-Structures
- Parthasarathy Madhusudan and Xiaokang Qiu and Andrei Stefanescu
- POPL'12, ACM, pp 123-136. 2012
- Abstract. We develop logical mechanisms and procedures to facilitate the verification of full functional properties of inductive tree data-structures using recursion that are sound, incomplete, but terminating. Our contribution rests in a new extension of first-order logic with recursive definitions called *Dryad*, a syntactical restriction on pre- and post-conditions of recursive imperative programs using *Dryad*, and a systematic methodology for accurately unfolding the footprint on the heap uncovered by the program that leads to finding simple recursive proofs using formula abstraction and calls to SMT solvers. We evaluate our methodology empirically and show that several complex tree data-structure algorithms can be checked against full functional specifications automatically, given pre- and post-conditions. This results in the first automatic terminating methodology for proving a wide variety of annotated algorithms on tree data-structures correct, including max-heaps, treaps, red-black trees, AVL trees, binomial heaps, and B-trees.
- PDF, Dryad, DOI, POPL'12, BIB
- Maximal Causal Models for Sequentially Consistent Systems
- Traian Florin Serbanuta and Feng Chen and Grigore Rosu
- Technical Report http://hdl.handle.net/2142/27708, October 2011
- Abstract. This paper shows that it is possible to build a maximal and sound causal model for concurrent computations from a given execution trace. It is sound, in the sense that any program which can generate a trace can also generate all traces in its causal model. It is maximal (among sound models), in the sense that by extending the causal model of an observed trace with a new trace, the model becomes unsound: there exists a program generating the original trace which cannot generate the newly introduced trace. Thus, the maximal sound model has the property that it comprises all traces which all programs that can generate the original trace can also generate. The existence of such a model is of great theoretical value. First, it can be used to prove the soundness of non-maximal, and thus smaller, causal models. Second, since it is maximal, the proposed model allows for natural and causal-model-independent definitions of trace-based properties; this paper proposes maximal definitions for causal dataraces and causal atomicity. Finally, although defined axiomatically, the set of traces comprised by the proposed model are shown to be effectively constructed from the original trace. Thus, maximal causal models are also amenable for developing practical analysis tools.
- PDF, JPredictor, DOI, BIB
- Improved Multithreaded Unit Testing
- Vilas Jagannath and Milos Gligoric and Dongyun Jin and Qingzhou Luo and Grigore Rosu and Darko Marinov
- FSE'11, ACM, pp 223-233. 2011
- Abstract. Multithreaded code is notoriously hard to develop and test. A multithreaded test exercises the code under test with two or more threads. Each test execution follows some schedule/interleaving of the multiple threads, and different schedules can give different results. Developers often want to enforce a particular schedule for test execution, and to do so, they use time delays (sleep in Java). Unfortunately, this approach can produce false positives or negatives, and can result in unnecessarily long testing time. This paper presents IMUnit, a novel approach to specifying and executing schedules for multithreaded tests. A new language is introduced that allows explicitly specifying schedules as constraints on the events during test execution. A tool that automatically controls the code to execute the specified schedule is provided, as well as a tool that helps developers to migrate their legacy, sleep-based tests into event-based tests in IMUnit. The latter tool uses novel techniques for inferring events and schedules from the executions of sleep-based tests. Experience in migrating over 200 tests is described. The inference techniques have high precision and recall, of over 75%. IMUnit reduces testing time compared to sleep-based tests on average 3.41x.
- PDF, DOI, FSE'11, BIB
- MatchC: A Matching Logic Reachability Verifier Using the K Framework
- Andrei Stefanescu
- K'11, ENTCS 304. 2011
- Abstract. This paper presents MatchC, a matching logic reachability verifier using the K framework. K is a rewriting-based framework for defining and analyzing programming languages. Matching logic is a logic designed to state and reason about structural properties over arbitrary program configurations. Matching logic reachability is a unifying framework for operational and axiomatic semantics of programing languages. The MatchC verifier (http://matching-logic.org/) checks reachability properties of programs written in a deterministic fragment of C and is implemented in the K framework. This paper discusses the correctness of the implementation of the matching logic reachability proof system in MatchC. The main contributions of this paper are the implementation of the verifier, with emphasis on using K for program verification, and the evaluation of the tool on a large number of programs, including complex ones, like programs implementing the AVL trees data structure and the Schorr-Waite graph marking algorithm.
- PDF, Matching Logic, DOI, BIB
- The Rewriting Logic Semantics Project: A Progress Report
- Jose Meseguer and Grigore Rosu
- FCT'11, Lecture Notes in Computer Science 6914, pp 1-37. 2011
- Abstract. Rewriting logic is an executable logical framework well suited for the semantic definition of languages. Any such framework has to be judged by its effectiveness to bridge the existing gap between language definitions on the one hand, and language implementations and language analysis tools on the other. We give a progress report on how researchers in the rewriting logic semantics project are narrowing the gap between theory and practice in areas such as: modular semantic definitions of languages; scalability to real languages; support for real time; semantics of software and hardware modeling languages; and semantics-based analysis tools such as static analyzers, model checkers, and program provers.
- PDF, K, DOI, FCT'11, BIB
- An Executable Formal Semantics of C with Applications
- Chucky Ellison and Grigore Rosu
- Technical Report http://hdl.handle.net/2142/25816, July 2011
- Abstract. This paper describes an executable formal semantics of C. Being executable, the semantics has been thoroughly tested against the GCC torture test suite and successfully passes 770 of 776 test programs. It is the most complete and thoroughly tested formal definition of C to date. The semantics yields an interpreter, debugger, state space search tool, and model checker "for free". The semantics is shown capable of automatically finding program errors, both statically and at runtime. It is also used to enumerate nondeterministic behavior.
- PDF, C Semantics, DOI, BIB
- Towards Semantics-Based WCET Analysis
- Mihail Asavoae and Dorel Lucanu and Grigore Rosu
- WCET'11. 2011
- Abstract. This paper proposes the use of formal semantics as a basis for worst-case execution time (WCET) analysis. Specifically, the semantics of a RISC assembly language is formally defined using recent advances in rewrite-based semantics, and then is used to discover and eliminate erroneous execution paths in the context of WCET analysis. This paper makes two novel contributions: (1) it shows that using a formal semantics of the employed language can be practically feasible in WCET analysis (not only theoretically desirable); and (2) it shows that the discovery and elimination of erroneous execution paths can not only improve the WCET estimation, but can also be achieved using off-the-shelf technology for rewrite-based semantics.
- PDF, Slides(PDF), Matching Logic, WCET'11, BIB
- Garbage Collection for Monitoring Parametric Properties
- Dongyun Jin and Patrick O'Neil Meredith and Dennis Griffith and Grigore Rosu
- PLDI'11, ACM, pp 415-424. 2011
- Abstract. Parametric properties are behavioral properties over program events that depend on one or more parameters. Parameters are bound to concrete data or objects at runtime, which makes parametric properties particularly suitable for stating multi-object relationships or protocols. Monitoring parametric properties independently of the employed formalism involves slicing traces with respect to parameter instances and sending these slices to appropriate non-parametric monitor instances. The number of such instances is theoretically unbounded and tends to be enormous in practice, to an extent that how to efficiently manage monitor instances has become one of the most challenging problems in runtime verification. The previous formalism-independent approach was only able to do the obvious, namely to garbage collect monitor instances when all bound parameter objects were garbage collected. This led to pathological behaviors where unnecessary monitor instances were kept for the entire length of a program. This paper proposes a new approach to garbage collecting monitor instances. Unnecessary monitor instances are collected lazily to avoid creating undue overhead. This lazy collection, along with some careful engineering, has resulted in RV, the most efficient parametric monitoring system to date. Our evaluation shows that the average overhead of RV in the DaCapo benchmark is 15%, which is two times lower than that of JavaMOP and orders of magnitude lower than that of Tracematches.
- PDF, Slides(PPTX), Slides(PDF), MOP, DOI, PLDI'11, BIB
- An Overview of the MOP Runtime Verification Framework
- Patrick O'Neil Meredith and Dongyun Jin and Dennis Griffith and Feng Chen and Grigore Rosu
- J.STTT, Volume 14(3), pp 249-289. 2011
- Abstract. This article gives an overview of the Monitoring Oriented Programming framework (MOP). In MOP, runtime monitoring is supported and encouraged as a fundamental principle for building reliable systems. Monitors are automatically synthesized from specified properties and are used in conjunction with the original system to check its dynamic behaviors. When a specification is violated or validated at runtime, user-defined actions will be triggered, which can be any code, such as information logging or runtime recovery. Two instances of MOP are presented: JavaMOP (for Java programs) and BusMOP (for monitoring PCI bus traffic). The architecture of MOP is discussed, and an explanation of parametric trace monitoring and its implementation is given. A comprehensive evaluation of JavaMOP attests to its efficiency, especially in comparison with similar systems. The implementation of BusMOP is discussed in detail. In general, BusMOP imposes no runtime overhead on the system it is monitoring.
- PDF, MOP, DOI, J.STTT, BIB
- Matching Logic: A New Program Verification Approach (NIER Track)
- Grigore Rosu and Andrei Stefanescu
- ICSE'11, ACM, pp 868-871. 2011
- Abstract. Matching logic is a new program verification logic, which builds upon operational semantics. Matching logic specifications are constrained symbolic program configurations, called patterns, which can be matched by concrete configurations. By building upon an operational semantics of the language and allowing specifications to directly refer to the structure of the configuration, matching logic has at least three benefits that could be key factors in its potential usability: (1) One's familiarity with the formalism reduces to one's familiarity with the operational semantics of the language, that is, with the language itself; (2) The verification process proceeds the same way as the execution of the program, making debugging failed proof attempts manageable because one can always see the ``current configuration and ``what went wrong, same like in a debugger; and (3) Nothing is lost in translation, that is, there is no gap between the language itself and its verifier. Moreover, direct access to the structure of the configuration facilitates defining sub-patterns that one may reason about, such as disjoint lists or trees in the heap, as well as supporting framing in various components of the configuration at no additional costs.
- PDF, Slides(PPTX), Slides(PDF), Matching Logic, DOI, ICSE'11, BIB
- Mining Parametric Specifications
- Choonghwan Lee and Feng Chen and Grigore Rosu
- ICSE'11, ACM, pp 591-600. 2011
- Abstract. Specifications carrying formal parameters that are bound to concrete data at runtime can effectively and elegantly capture multi-object behaviors or protocols. Unfortunately, parametric specifications are not easy to formulate by non-experts and, consequently, are rarely available. This paper presents a general approach for mining parametric specifications from program executions, based on a strict separation of concerns: (1) a trace slicer first extracts sets of independent interactions from parametric execution traces; and (2) the resulting non-parametric trace slices are then passed to any conventional non-parametric property learner. The presented technique has been implemented in jMiner, which has been used to automatically mine many meaningful and non-trivial parametric properties of OpenJDK 6.
- PDF, Slides(PPTX), JMiner, DOI, ICSE'11, BIB
- A Rewriting Approach to Concurrent Programming Language Design and Semantics
- Traian Florin Serbanuta
- PhD Thesis, University of Illinois at Urbana-Champaign. December 2010
- Abstract. A plethora of programming languages have been and continue to be developed to keep pace with hardware advancements and the ever more demanding requirements of software development. As these increasingly sophisticated languages need to be well understood by both programmers and implementors, precise specifications are increasingly required. Moreover, the safety and adequacy with respect to requirements of programs written in these languages needs to be tested, analyzed, and, if possible, proved. This dissertation proposes a rigorous approach to define programming languages based on rewriting, which allows to easily design and test language extensions, and to specify and analyze safety and adequacy of program executions. To this aim, this dissertation describes the K Framework, an executable semantic framework inspired from rewriting logic but specialized and optimized for programming languages. The K Framework consists of three components: (1) a language definitional technique; (2) a specialized notation; and (3) a resource-sharing concurrent rewriting semantics. The language definitional technique is a rewriting technique built upon the lessons learned from capturing and studying existing operational semantics frameworks within rewriting logic, and upon attempts to combine their strengths while avoiding their limitations. The specialized notation makes the technical details of the technique transparent to the language designer, and enhances modularity, by allowing the designer to specify the minimal context needed for a semantic rule. Finally, the resource-sharing concurrent semantics relies on the particular form of the semantic rules to enhance concurrency, by allowing overlapping rule instances (e.g., two threads writing in different locations in the store, which overlap on the store entity) to apply concurrently as long as they only overlap on the parts they do not change. The main contributions of the dissertation are: (1) a uniform recasting of the major existing operational semantics techniques within rewriting logic; (2) an overview description of the K Framework and how it can be used to define, extend and analyze programming languages; (3) a semantics for K concurrent rewriting obtained through an embedding in graph rewriting; and (4) a description of the K-Maude tool, a tool for defining programming languages using the K technique on top of the Maude rewriting language.
- PDF, K, DOI, BIB
- Automating Coinduction with Case Analysis
- Eugen-Ioan Goriac and Dorel Lucanu and Grigore Rosu
- ICFEM'10, Springer, pp 220-236. 2010
- Abstract. Coinduction is a major technique employed to prove behavioral properties of systems, such as behavioral equivalence. Its automation is highly desirable, despite the fact that most behavioral problems are $\Pi_2^0$-complete. Circular coinduction, which is at the core of the $\CIRC$ prover, automates coinduction by systematically deriving new goals and proving existing ones until, hopefully, all goals are proved. Motivated by practical examples, circular coinduction and $\CIRC$ have been recently extended with several features, such as special contexts, generalization and simplification. Unfortunately, none of these extensions eliminates the need for case analysis and, consequently, there are still many natural behavioral properties that $\CIRC$ cannot prove automatically. This paper presents an extension of circular coinduction with case analysis constructs and reasoning, as well as its implementation in $\CIRC$. To uniformly prove the soundness of this extension, as well as of past and future extensions of circular coinduction and $\CIRC$, this paper also proposes a general correct-extension technique based on equational interpolants.
- PDF, Circ, DOI, ICFEM'10, BIB
- A Formal Semantics of C with Applications
- Chucky Ellison and Grigore Rosu
- Technical Report http://hdl.handle.net/2142/17414, November 2010
- Abstract. This paper describes an executable formal semantics of C expressed using a formalism based on term rewriting. Being executable, the semantics has been thoroughly tested against the GCC torture test suite and successfully passes over 96\% of 715 test programs. It is the most complete and thoroughly tested formal definition of C to date. The semantics yields an interpreter, debugger, and state space search tool "for free". The semantics is shown capable of automatically finding program errors, both statically and at runtime. It is also used to enumerate nondeterministic behavior. These techniques together allow the tool to identify undefined programs.
- PDF, C Semantics, DOI, BIB
- Runtime Verification with the RV System
- Patrick Meredith and Grigore Rosu
- RV'10, Springer, pp 136-152. 2010
- Abstract. The RV system is the first system to merge the benefits of Runtime Monitoring with Predictive Analysis. The Runtime Monitoring portion of RV is based on the successful Monitoring Oriented Programming system developed at the University of Illinois [6, 7, 9, 21, 5], while the Predictive Analysis capability is a vastly expanded version of the jPredictor System also developed at the University of Illinois [11, 14]. With the RV system, runtime monitoring is supported and encouraged as a fundamental principle for building reliable software: monitors are automatically synthesized from specified properties and integrated into the original system to check its dynamic behaviors. When certain conditions of interest occur, such as a violation of a specification, user-defined actions will be triggered, which can be any code from information logging to runtime recovery. The RV system supports the monitoring of parametric properties that may specify a relationship between objects. Properties may be defined using one of several logical formalisms, such as: extended regular languages, context-free patterns, deterministic finite state machines, linear temporal logic, and past time linear temporal logic. The system is designed in such a way that adding new logical formalisms is a relatively simple task. The predictive capabilities allow any of these monitoring specifications to be extended to checking not just the actual runtime traces of program execution, but any trace that may be inferred from a constructed casual model. The Predictive Analysis also features built in algorithms for race detection and atomicity violations, that are both highly useful in concurrent system design and difficult to specify in terms of formal specification languages.
- PDF, Slides(PPT), Slides(PDF), RV, DOI, RV'10, BIB
- Matching Logic: A New Program Verification Approach
- Grigore Rosu and Andrei Stefanescu
- UV'10. 2010
- Abstract. Matching logic is a new program verification logic, which builds upon operational semantics. Matching logic specifications are constrained symbolic program configurations, called *patterns*, which can be *matched* by concrete configurations. By building upon an operational semantics of the language and allowing specifications to directly refer to the structure of the configuration, matching logic has at least three benefits that could be key factors in its usability: (1) One's familiarity with the formalism reduces to one's familiarity with the operational semantics of the language, that is, with the language itself; (2) The verification process proceeds the same way as the execution of the program, making debugging failed proof attempts manageable because one can always see the ``current configuration and ``what went wrong, same like in a debugger; and (3) Nothing is lost in translation, that is, there is no gap between the language itself and its verifier. Moreover, direct access to the structure of the configuration facilitates defining sub-patterns that one may reason about, such as disjoint lists or trees in the heap, as well as supporting framing in various components of the configuration at no additional costs.
- Slides(PPTX), Slides(PDF), Matching Logic, UV'10
- Maximal Causal Models for Sequentially Consistent Multithreaded Systems
- Traian Florin Serbanuta and Feng Chen and Grigore Rosu
- Technical Report http://hdl.handle.net/2142/17336, September 2010
- Abstract. This paper shows that it is possible to build a theoretically maximal and sound causal model for concurrent computations from a given execution trace. For an observed execution, the proposed model comprises all consistent executions which can be derived from it using only knowledge about the execution machine. The existence of such a model is of great theoretical value. First, by comprising all feasible executions, it can be used to prove soundness of other causal models: indeed, several models underlying existing techniques are shown to be embedded into the maximal model, so all these models are sound. Second, since it is maximal, the proposed model allows for natural and causal-model-independent definitions of trace-based properties; this paper proposes maximal definitions for causal dataraces and causal atomicity. Finally, although defined axiomatically, the set of traces comprised by the proposed model are shown to be effectively constructed from an initial observed trace. Thus, maximal causal models are not only theoretically relevant, but they are also amenable for developing practical analysis tools.
- PDF, JPredictor, DOI, BIB
- Matching Logic: An Alternative to Hoare/Floyd Logic
- Grigore Rosu and Chucky Ellison and Wolfram Schulte
- AMAST'10, Springer, pp 142-162. 2010
- Abstract.
- PDF, Slides(PPTX), Slides(PDF), Matching Logic, DOI, AMAST'10, BIB
- K-Maude: A Rewriting Based Tool for Semantics of Programming Languages
- Traian Florin Serbanuta and Grigore Rosu
- WRLA'10, Springer, pp 104-122. 2010
- Abstract. K is a rewriting-based framework for defining programming languages. K-Maude is a tool implementing K on top of Maude. K-Maude provides an interface accepting K modules along with regular Maude modules and a collection of tools for transforming K language definitions into Maude rewrite theories for execution or analysis, or into LaTeX for documentation purposes. The current K-Maude prototype was successfully used in defining several languages and language analysis tools, both for research and for teaching purposes. This paper describes the K-Maude tool, both from a user and from an implementer perspective.
- PDF, Slides(PDF), K, DOI, WRLA'10, BIB
- Matching Logic --- Extended Report
- Grigore Rosu and Wolfram Schulte
- Technical Report http://hdl.handle.net/2142/10790, January 2009
- Abstract. Hoare logics rely on the fact that logic formulae can encode, or specify, program states, including environments, stacks, heaps, path conditions, data constraints, and so on. Such formula encodings tend to lose the structure of the original program state and thus to be complex in practice, making it difficult to relate formal systems and program correctness proofs to the original programming language and program, respectively. Worse, since programs often manipulate mathematical objects such as lists, trees, graphs, etc., one needs to also encode, as logical formulae, the process of identifying these objects in the encoded program state. This paper proposes atching logic, an alternative to Hoare logics in which the state structure plays a crucial role. Program states are represented as algebraic data-types called (concrete) configurations, and program state specifications are represented as configuration terms with variables and constraints on them, called (configuration) patterns. A pattern specifies those configurations that match it. Patterns can bind variables to their scope, allowing both for pattern abstraction and for expressing loop invariants. Matching logic is tightly connected to rewriting logic semantics (RLS): matching logic formal systems can systematically be obtained from executable RLS of languages. This relationship allows to prove soundness of matching logic formal systems w.r.t. complementary, testable semantics. All notions are exemplified using KernelC, a fragment of C with dynamic memory allocation/deallocation.
- PDF, Matching Logic, DOI, BIB
- Monitoring Algorithms for Metric Temporal Logic
- Prasanna Thati and Grigore Rosu
- RV'05, Electronic Notes in Theoretical Computer Science 113, pp 145-162. 2005
- Abstract. Program execution traces can be so large in practical testing and monitoring applications that it would be very expensive, if not impossible, to store them for detailed analysis. Monitoring execution traces {\em without storing} them, can be a nontrivial matter for many specification formalisms, because complex formulae may require a considerable amount of information about the past. {\em Metric temporal logic} (MTL) is an extension of propositional linear temporal logic with discrete-time-bounded temporal operators. In MTL, one can specify time limits within which certain temporal properties must hold, thus making it very suitable to express real-time monitoring requirements. In this paper, we present monitoring algorithms for checking timestamped execution traces against formulae in MTL or certain important sublogics of it. We also present lower bounds for the monitoring problem, showing that the presented algorithms are asymptotically optimal.
- PDF, MOP, DOI, RV'05, BIB
- Rewriting-Based Techniques for Runtime Verification
- Grigore Rosu and Klaus Havelund
- J.ASE, Volume 12(2), pp 151-197. 2005
- Abstract. Techniques for efficiently evaluating future time Linear Temporal Logic (abbreviated LTL) formulae on finite execution traces are presented. While the standard models of LTL are infinite traces, finite traces appear naturally when testing and/or monitoring real applications that only run for limited time periods. A finite trace variant of LTL is formally defined, together with an immediate executable semantics which turns out to be quite inefficient if used directly, via rewriting, as a monitoring procedure. Then three algorithms are investigated. First, a simple synthesis algorithm for monitors based on dynamic programming is presented; despite the efficiency of the generated monitors, they unfortunately need to analyze the trace backwards, thus making them unusable in most practical situations. To circumvent this problem, two rewriting-based practical algorithms are further investigated, one using rewriting directly as a means for online monitoring, and the other using rewriting to generate automata-like monitors, called binary transition tree finite state machines (and abbreviated BTT-FSMs). Both rewriting algorithms are implemented in Maude, an executable specification language based on a very efficient implementation of term rewriting. The first rewriting algorithm essentially consists of a set of equations establishing an executable semantics of LTL, using a simple formula transforming approach. This algorithm is further improved to build automata on-the-fly via caching and reuse of rewrites (called memoization), resulting in a very efficient and small Maude program that can be used to monitor program executions. The second rewriting algorithm builds on the first one and synthesizes provably minimal BTT-FSMs from LTL formulae, which can then be used to analyze execution traces online without the need for a rewriting system. The presented work is part of an ambitious runtime verification and monitoring project at NASA Ames, called {\sc PathExplorer}, and demonstrates that rewriting can be a tractable and attractive means for experimenting and implementing logics for program monitoring.
- PDF, MOP, DOI, J.ASE, BIB
- CS322 - Programming Language Design: Lecture Notes
- Grigore Rosu
- Technical Report http://hdl.handle.net/2142/11385, December 2003
- Abstract. This report contains the complete lecture notes for CS322, Programming Language Design, taught by Grigore Rosu in the Fall 2003 semester at the University of Illinois at Urbana Champaign. This large PDF document has been generated automatically from the CS322's website at: http://fsl.cs.uiuc.edu/~grosu/classes/2003/fall/cs322/. Of particular importance is the novel technique for defining concurrent languages that starts at page 673, based on a first-order representation of computations (called "continuations" for simplicity there, though only their suffix is an actual "continuation structure").
- PDF, Matching Logic, DOI, BIB
- Generating Optimal Linear Temporal Logic Monitors by Coinduction
- Koushik Sen and Grigore Rosu and Gul Agha
- ASIAN'03, Lecture Notes in Computer Science (LNCS) 2896, pp 260-275. 2003
- Abstract. A coinduction-based technique to generate an optimal monitor from a Linear Temporal Logic (LTL) formula is presented in this paper. Such a monitor receives a sequence of states (one at a time) from a running process, checks them against a requirements specification expressed as an LTL formula, and determines whether the formula has been violated or validated. It can also say whether the LTL formula is not monitorable any longer, i.e., that the formula can in the future neither be violated nor be validated. A Web interface for the presented algorithm adapted to extended regular expressions is available.
- PDF, MOP, DOI, ASIAN'03, BIB
- Certifying Measurement Unit Safety Policy
- Grigore Rosu and Feng Chen
- ASE'03, IEEE, pp 304-309. 2003
- Abstract. Measurement unit safety policy checking is a topic in software analysis concerned with ensuring that programs do not violate basic principles of units of measurement. Such violations can hide significant domain-specific errors which are hard or impossible to find otherwise. Measurement unit analysis by means of automatic deduction is addressed in this paper. We draw general design principles for measurement unit certification tools and discuss our prototype for the C language, which includes both dynamic and static checkers. Our approach is based on assume/assert annotations of code, which are properly interpreted by our deduction-based tools and ignored by standard compilers. We do not modify the language in order to support units. The approach can be extended to incorporate other safety policies without great efforts.
- PDF, MOP, DOI, ASE'03, BIB
- Certifying and Synthesizing Membership Equational Proofs
- Grigore Rosu and Steven Eker and Patrick Lincoln and Jose Meseguer
- FME'03, Lecture Notes in Computer Science (LNCS) 2805, pp 359-380. 2003
- Abstract. As the systems we have to specify and verify become larger and more complex, there is a mounting need to combine different tools and decision procedures to accomplish large proof tasks. The problem, then, is how to be sure that we can trust the correctness of a heterogeneous proof. In this work we focus on certification and synthesis of equational proofs, that are pervasive in most proof tasks and for which many tools are poorly equipped. Fortunately, equational proof engines like ELAN and Maude can perform millions of equational proof steps per second which, if certified by proof objects, can be trusted by other tools. We present a general method to certify and synthesize proofs in membership equational logic, where the synthesis may involve generating full proofs from proof traces modulo combinations of associativity, commutativity, and identity axioms. We propose a simple representation for proof objects and give algorithms that can synthesize space-efficient, machine-checkable proof objects from proof traces.
- PDF, Slides(PPT), Maude, DOI, FME'03, BIB
- Generating Optimal Monitors for Extended Regular Expressions
- Koushik Sen and Grigore Rosu
- RV'03, Electronic Notes in Theoretical Computer Science 89(2), pp 226-245. 2003
- Abstract. Ordinary software engineers and programmers can easily understand regular patterns, as shown by the immense interest in and the success of scripting languages like Perl, based essentially on regular expression pattern matching. We believe that regular expressions provide an elegant and powerful specification language also for {\em monitoring requirements}, because an execution trace of a program is in fact a string of states. Extended regular expressions (EREs) add complementation to regular expressions, which brings additional benefits by allowing one to specify patterns that must {\em not} occur during an execution. Complementation gives one the power to express patterns on strings more compactly. In this paper we present a technique to generate {\em optimal monitors} from EREs. Our monitors are deterministic finite automata (DFA) and our novel contribution is to generate them using a modern coalgebraic technique called {\em coinduction}. Based on experiments with our implementation, which can be publicly tested and used over the web, we believe that our technique is more efficient than the simplistic method based on complementation of automata which can quickly lead to a highly-exponential state explosion.
- PDF, MOP, DOI, RV'03, BIB
- Testing Extended Regular Language Membership Incrementally by Rewriting
- Grigore Rosu and Mahesh Viswanathan
- RTA'03, Lecture Notes in Computer Science (LNCS) 2706, pp 499-514. 2003
- Abstract. In this paper we present lower bounds and rewriting algorithms for testing membership of a word in a regular language described by an extended regular expression. Motivated by intuitions from monitoring and testing, where the words to be tested (execution traces) are typically much longer than the size of the regular expressions (patterns or requirements), and by the fact that in many applications the traces are only available incrementally, on an event by event basis, our algorithms are based on an event-consumption idea: a just arrived event is ``consumed by the regular expression, i.e., the regular expression modifies itself into another expression discarding the event. We present an exponential space lower bound for monitoring extended regular expressions and argue that the presented rewriting-based algorithms, besides their simplicity and elegance, are practical and almost as good as one can hope. We experimented with and evaluated our algorithms in Maude.
- PDF, Slides(PPT), MOP, DOI, RTA'03, BIB
- Rule-Based Analysis of Dimensional Safety
- Feng Chen and Grigore Rosu and Ram prasad Venkatesan
- RTA'03, Lecture Notes in Computer Science (LNCS) 2706, pp 197-207. 2003
- Abstract. Dimensional safety policy checking is an old topic in software analysis concerned with ensuring that programs do not violate basic principles of units of measurement. Scientific and/or navigation software is routinely dimensional and violations of measurement unit safety policies can hide significant domain-specific errors which are hard or impossible to find otherwise. Dimensional analysis of programs written in conventional programming languages is addressed in this paper. We draw general design principles for dimensional analysis tools and then discuss our prototypes, implemented by rewriting, which include both dynamic and static checkers. Our approach is based on assume/assert annotations of code which are properly interpreted by our tools and ignored by standard programming language compilers/interpreters. The output of our prototypes consists of warnings that list those expressions violating the unit safety policy. These prototypes are implemented in the rewriting system Maude, using more than 2,000 rewriting rules. This paper presents a non-trivial application of rewriting techniques to software analysis.
- PDF, MOP, DOI, RTA'03, BIB
- Towards Monitoring-Oriented Programming: A Paradigm Combining Specification and Implementation
- Feng Chen and Grigore Rosu
- RV'03, Electronic Notes in Theoretical Computer Science 89(2), pp 108-127. 2003
- Abstract. With the explosion of software size, checking conformance of implementation to specification becomes an increasingly important but also hard problem. Current practice based on ad-hoc testing does not provide correctness guarantees, while highly confident traditional formal methods like model checking and theorem proving are still too expensive to become common practice. In this paper we present a paradigm for combining formal specification with implementation, called monitoring-oriented programming (MoP), providing a light-weighted formal method to check conformance of implementation to specification at runtime. System requirements are expressed using formal specifications given as annotations inserted at various user selected places in programs. Efficient monitoring code using the same target language as the implementation is then automatically generated during a pre-compilation stage. The generated code has the same effect as a logical checking of requirements and can be used in any context, in particular to trigger user defined actions, when requirements are violated. Our proposal is language- and logic- independent, and we argue that it smoothly integrates other interesting system development paradigms, such as design by contract and aspect oriented programming. A prototype has been implemented for Java, which currently supports requirements expressed using past time and future time linear temporal logics, as well as extended regular expressions.
- PDF, MOP, DOI, RV'03, BIB
- Inductive Behavioral Proofs by Unhiding
- Grigore Rosu
- CMCS'03, Volume 82(1), pp 285-302. 2003
- Abstract. We show that for any behavioral -specification there is an ordinary algebraic specification over a larger signature, such that a model behaviorally satisfies iff it satisfies, in the ordinary sense, the -theorems of . The idea is to add machinery for contexts and experiments (sorts, operations and equations), use it, and then hide it. We develop a procedure, called unhiding, which takes a finite and produces a finite . The practical aspect of this procedure is that one can use any standard equational inductive theorem prover to derive behavioral theorems, even if neither equational reasoning nor induction is sound for behavioral satisfaction.
OOPSLA'12
- Checking Reachability using Matching Logic
- Grigore Rosu and Andrei Stefanescu
- OOPSLA'12, ACM, pp 555-574. 2012
- Abstract. This paper presents a verification framework that is parametric in a (trusted) operational semantics of some programming language. The underlying proof system is language-independent and consists of eight proof rules. The proof system is proved partially correct and relatively complete (with respect to the programming language configuration model). To show its practicality, the generic framework is instantiated with a fragment of C and evaluated with encouraging results.
Technical Report
- Checking Reachability using Matching Logic
- Grigore Rosu and Andrei Stefanescu
- Technical Report http://hdl.handle.net/2142/33771, August 2012
- Abstract. This paper presents a verification framework that is parametric in a (trusted) operational semantics of some programming language. The underlying proof system is language-independent and consists of eight proof rules. The proof system is proved partially correct and relatively complete (with respect to the programming language configuration model). To show its practicality, the generic framework is instantiated with a fragment of C and evaluated with encouraging results.
- PDF, Matching Logic, DOI, BIB
J.LMCS
- Semantics and Algorithms for Parametric Monitoring
- Grigore Rosu and Feng Chen
- J.LMCS, Volume 8(1), pp 1-47. 2012
- Abstract. Analysis of execution traces plays a fundamental role in many program analysis approaches, such as runtime verification, testing, monitoring, and specification mining. Execution traces are frequently parametric, i.e., they contain events with parameter bindings. Each parametric trace usually consists of many meaningful trace slices merged together, each slice corresponding to one parameter binding. This gives a semantics-based solution to parametric trace analysis. A general-purpose parametric trace slicing technique is introduced, which takes each event in the parametric trace and dispatches it to its corresponding trace slices. This parametric trace slicing technique can be used in combination with any conventional, non-parametric trace analysis technique, by applying the later on each trace slice. As an instance, a parametric property monitoring technique is then presented. The presented parametric trace slicing and monitoring techniques have been implemented and extensively evaluated. Measurements of runtime overhead confirm that the generality of the discussed techniques does not come at a performance expense when compared with existing parametric trace monitoring systems.
OOPSLA'12
- Checking Reachability using Matching Logic
- Grigore Rosu and Andrei Stefanescu
OOPSLA'12, ACM, pp 555-574. 2012
PDF, Slides(PPTX), Slides(PDF), Matching Logic, DOI, OOPSLA'12, BIB
- Checking Reachability using Matching Logic
- Grigore Rosu and Andrei Stefanescu
OOPSLA'12, ACM, pp 555-574. 2012
PDF, Slides(PPTX), Slides(PDF), Matching Logic, DOI, OOPSLA'12, BIB
- Checking Reachability using Matching Logic
- Grigore Rosu and Andrei Stefanescu
OOPSLA'12, ACM, pp 555-574. 2012
PDF, Slides(PPTX), Slides(PDF), Matching Logic, DOI, OOPSLA'12, BIB
OOPSLA'12
- Checking Reachability using Matching Logic
- Grigore Rosu and Andrei Stefanescu
OOPSLA'12, ACM, pp 555-574. 2012
PDF, Slides(PPTX), Slides(PDF), Matching Logic, DOI, OOPSLA'12, BIB
OOPSLA'12
- Checking Reachability using Matching Logic
- Grigore Rosu and Andrei Stefanescu
OOPSLA'12, ACM, pp 555-574. 2012
PDF, Slides(PPTX), Slides(PDF), Matching Logic, DOI, OOPSLA'12, BIB