# Xiaohong's Research Diary

## Contents |

## Dec 10, 2016

Classic Mistakes in Testing says that "A tester's itch to start breaking the program is as strong as a programmer's itch to start writing code - and it has the same effect: design work is skimped, and quality suffers." I understand that itch. So is there a way to help programmers write their design down?

## Oct 18, 2016

I had a long discussion with Daejun in the afternoon. We found a very subtle and tricky thing about the Boolean expressions in matching logic. The nonlogical symbol "and" `∧` for Boolean expression is different from the built-in one `∧`, which will be interpreted with intersection of sets. Such difference will make a big difference when they appear together in a formula, and that often causes lots of confusion.

## Oct 14, 2016

Berdine et al studied in their *FSTTCS'04* paper what they called *A Decidable Fragment of Separation Logic*. The heap assertions are not allowed to overlap, and only one data structure of linked lists is concerned. Over such constraints, a complete proof system can be shown in a very nontrivial way. The most interesting fact is that one can reduce the list predicate by only considering two cases. One is when the list is an empty list, and the other is when the list contains only two elements. I haven't get the point very clearly, but it seems to be the breaking point of the paper.

## Oct 12, 2016

I had some more words to say about the reachability property in the first order graph theory. The first order graph theory is a first order language L with a new relation symbol E that is interpreted as the edge relation of a graph. The theory contains one nonlogical axiom

∀x.¬E(x,x)

that forbids self-loops. One wants to define reachability `Reach(x,y)`in our formal theory, which could be inductively defined as

Reach0(x,y) ≡ x = y

Reach1(x,y) ≡ ∃z.E(x,z) ∧ Reach0(z,y)

Reach2(x,y) ≡ ∃z.E(x,z) ∧ Reach1(z,y)

...

Reach (x,y) ≡ ∃n.Reachn(x,y)

Unfortunately, `Reach(x,y)` defined in this way is not a legal well formed formula. In fact, there does not exist a formula `φ(x,y)` with two free variables `x` and `y` such that a model `G` satisfies `φ` in an interpretation that maps `x` to a node `u` and `y` to a node `v` in `G`, whenever `v` is reachable from `u` in `G`.

Assume there exist such a formula `φ(x,y)`, then we build the following three formulas

ψ1 ≡ ∀x∀y.φ(x,y) /* G is strongly connected */

ψ2 ≡ ∀x∃y.E(x,y) ∧ ∀x∀y∀z.(E(x,y)∧E(x,z)→(y=z)) /* For any node, there is one outgoing edge */

ψ3 ≡ ∀x∃y.E(y,x) ∧ ∀x∀y∀z.(E(y,x)∧E(z,x)→(y=z)) /* For any node, there is one ingoing edge */

Therefore, the conjunctive formula `ψ ≡ ψ1 ∧ ψ2 ∧ ψ3` means the graph `G` is a circle. Every circle is a model of `ψ`, and circles can get arbitrarily large, so there must exist an infinite model (based on a theorem in logic), and leads to the contradictory that there does not exist an infinite circle.

## Oct 7, 2016

I had a fruitful discussion with Grigore today. The most interesting thing I learned is what we mean by saying a logic is *sound and complete*. Suppose we have a formal language `L` in which we have some logical axioms comes with it. In many cases, we want to prove such a logic is sound and complete, and once we done that, we think we are fine. Because the logic is complete, then for any formula `P`,

|- P iff |= P

The fact, however, is that whenever we use this formal language in practice, such completeness does not help us a lot, because we will be working with *some model* `M` of the logic, who may have arbitrarily complex properties, say, one of which is `Q`. The property `Q` will not, as it should not, be captured by our logic `L`, because `Q` is so unique that it only belongs to the real world `M` that we are working in. As a result, even if our logic is complete, we will not be able to give `Q` a formal proof.

## Oct 5, 2016

It was interesting to know that in first order graph theory, some P problems cannot be represented as first order formulas. One such typical example is the readability problem, which asks whether a vertex `u` can reach the other vertex ` v ` in a directed graph. That is part of the reason why there were some researchers studied a *decidable* fragment of some logic [1].

I was not very sure why specifications need to be reusable, as is suggested in *runtime verification* [2].

## Oct 4, 2016

I was looking for papers on separation logic provers because I want to have a clearer picture of what a matching logic prover should be like. I soon found a milestone paper by Berdine et al. in 2004 which proposed an aim to "provide algorithms and tools to transfer the simplicity of handwritten proofs with separation logic to an automatic setting".

## Oct 3, 2016

When I went through the tutorial of K, I got interested in the lambda calculus and found a very concise introduction by Barendregt and Barendsen. There are a couple of notation abbreviation conversion that I obviously need to take care of, but the first nontrivial thing is the fixed point theorem, which says that for any `F` there exists an `X = Y F` such that `F X = X`. The lambda term `Y` plays a similar role as an infinite while-loop, that is,

Y F = F(F(F(...))).

The `Y` term plays an important role in recursively defining things.

Dr. David Garlan gave a talk on *self-adaptive system*, which seems to be a control system over software projects. The idea is to get rid of ad-hoc low level programming techniques including exceptions, timeouts, etc., and replace then with an integrated "smart" control mechanism (a control panel) that senses and affects the executing system.

## Sep 25, 2016

Ciobaca et al. shows in their *A Language-Independent Proof System for Full Program Equivalence* that a specification of a piece of code is a reachability rule.

## Sep 20, 2016

I reviewed some basic concepts in first-order logic. I believe it is always a good practice to use a word when you are 100% sure what it means.

A first order *language* consists of two categories of symbols. There are logical symbols that include variables, logical connectives, logic quantifiers and the equality symbol. There are also nonlogical symbols that consist of the *signature* of the language. Nonlogical symbols contain constant symbols, function symbols and relation symbols. If we pick some formula in the language and make them *axioms*, then we will have a first-order theory.

A *structure* or *interpretation* of the language contain a universe as the domain where variables are interpreted. It also associates constant symbols, function symbols and relation symbols with constants, functions and relations over the domain.

When it comes to separation logic, things become more complicated because we have one *sort* of variables that are interpreted as integers, and another sort of variables that are interpreted as heaps. Therefore, the first-order language that we use to represent separation logic is a two-sorted languages, in which we have both *registers variables* and *heap variables*. An assertion in separation logic will have to be transformed to be considered as a well-formed formula in the two-sorted first order language.

## Sep 19, 2016

I reviewed some basic concepts in mathematical logic. An inference rule

premise

-------

conclusion

simply means that if (under an interpretation) the premise holds, then the conclusion holds. In other words, the conclusion of any rule of inference (should) be a tautological consequence of its premises. Reynolds' posted such an inference rule in the separation logic paper that says

p1 => p2 q1 => q2

------------------

p1 * q1 => p2 * q2

which is not valid unless we provide more restrictions on ` p1, p2, q1, q2 `.

## Sep 16, 2016

I found a couple of technical mistakes in Reynolds' *Separation Logic: A Logic for Shared Mutable Data Structures*. Besides that, the paper indeed propose an interesting example and show a sound motivation. The example is the following algorithm that reverse a link list.

j := nil;

while i != nil do

i, j, [i+1] := [i+1], i, j;

We can easily guess an invariant candidate for the while-loop, but in order to prove that the candidate is indeed an invariant, we need a formal definition of what we mean by saying "`i` points to a list". Further, we need to specify that `i` and `j` will not *reach* the same fragment of memory except nil. Such sanity conditions are essential to prove the invariant condition but are tedious and not scalable. That is why we want to have separation of concern when reasoning programs that manipulate data structure in a dynamic way.

## Sep 12, 2016

I had a discussion with Grigore about why and how matching logic is a game changer. Take Hoare logic as an example. Hoare logic has many good properties (such as compositional reasoning) because the semantic model of the logic, i.e., the IMP language itself is compositional reasonable. And IMP is a very simple programming language. If we want something similar in other programming language, for example, Java, we need to manipulate and modify the Hoare logic to some “JHoare logic”. Similarly, separation logic has a semantic domain of IMP with a heap. Separation logic endorses the frame rule because IMP with a heap endorses the frame rule. If we want to use separation logic in other languages, say, Java, we need to modify the logic to some "JSepaLogic".

What matching logic does is to forget all J-xxx staff. There is only one logic called matching logic. If we are using IMP with a heap, then the semantics of the language should make us able to *proof* a (meta-)theorem that serves as the frame rule. If we are using Java, then the semantics of Java should make us able to *proof* a (meta-)theorem that serves as the frame rule, or other interesting rules. If we are using some odd languages that do not satisfy the frame rule, we still might proof some other interesting (meta-)theorem about its nature. One logic, any semantics.

## Sep 11, 2016

I personally found O'Hearn's *Separation Logic Tutorial* very useful as a link source to other papers. The second section "Model theory and proof theory" contains many interesting papers on separation logic and more. From there, I found some papers by Yang discussing the so-called "local reasoning", which highlights the difference between specifications in separation logic and the Hoare triples. The main difference seems to be *the* main contribution that separation logic brings to us, that is, to overcome the frame problem. Frame problem is stated informally as follows. In mathematical logic, especially first-order logic, it is easy to define an action by giving its pre- and post-conditions. An action might change one or some aspects of the system and keep the irrelevant ones unchanged. Frame problem is all about those unchanged aspects. Obviously, we want to prevent verbosely specify in the specification of an action that it will not change any irrelevant attributes of the system, otherwise the logic would become too verbose to use in practice. Separation logic solves the problem by introducing a separation conjunction operator and together with a frame rule.

## Aug 16, 2016

I will be doing a short-term individual project on separation logic and matching logic with Grigore Rosu. The research currently being undertaken in FSL has two tracks. One is the K framework and the other is runtime verification. The future plan is to merge those two tracks, to which I will contribute.