Skip to Content

Specification Refinement

When teaching formal methods, I often get asked how we can connect the specifications we write to the code we produce. One way we do this is with refinement. All examples are in TLA+.1

Imagine we’re designing a multithreaded counter. We don’t know how we’re going to implement it yet, so we start by specifying the abstract behavior.2

---- MODULE abstract ----
EXTENDS Integers, Sequences

VARIABLES pc, counter
vars == <<pc, counter>>

\* Two threads
Threads == 1..2

\* State transition action
Trans(thread, from, to) ==
  /\ pc[thread] = from
  /\ pc' = [pc EXCEPT ![thread] = to]

\* Both threads start in the `start` state
Init ==
  /\ pc = [t \in Threads |-> "start"]
  /\ counter = 0

Next ==
  \* Pick one thread
  \/ \E t \in Threads:
     \* Move it to `done`
     /\ Trans(t, "start", "done")
     \* Increment counter
     /\ counter' = counter + 1

Spec == Init /\ [][Next]_vars

====

We find out that our implementation system can’t do atomic updates. If we want to increment the counter, we have to nonatomically read the value into a local variable and write a new value in a separate step. We add that to a new specification:

---- MODULE bad ----
EXTENDS Integers, Sequences

VARIABLES pc, counter, tmp
vars == <<pc, counter, tmp>>

Threads == 1..2

States == {"start", "inc", "done"}

Trans(thread, from, to) ==
  /\ pc[thread] = from
  /\ pc' = [pc EXCEPT ![thread] = to]

Init ==
  /\ pc = [t \in Threads |-> "start"]
  /\ counter = 0
  /\ tmp = [t \in Threads |-> 0]

GetCounter(t) ==
  /\ tmp' = [tmp EXCEPT ![t] = counter]
  /\ UNCHANGED counter

IncCounter(t) ==
  /\ counter' = tmp[t] + 1
  /\ UNCHANGED tmp

Next ==
  \/ \E t \in Threads:
    \* Step to get the counter value
    \/ /\ Trans(t, "start", "inc")
       /\ GetCounter(t)
    \* Step to increment the counter
    \/ /\ Trans(t, "inc", "done")
       /\ IncCounter(t)

Spec == Init /\ [][Next]_vars
====

We now have an additional thread state and a local tmp per thread. This is trying to do the same thing as abstract with a more fleshed-out implementation. I want to encode that “trying to do” as a property. Is bad a faithful implementation of abstract?

Refinement

TLA+ specifications are “just” temporal logic predicates, or boolean functions. If we have

Spec == Init /\ [][Next]_vars

Then Spec is the predicate “Init is true for the initial state and [Next]_vars is true of all subsequent states.” This specifies our system: a sequence of states counts as a system behavior if-and-only-if Spec is true for that sequence.

When we tell TLC to check Inv as an invariant of Spec, that’s equivalent to verifying that Spec => []Inv. If Spec is true of a sequence of states, then Inv is true for all states in that behavior. That makes Inv an invariant. Liveness properties, like “L is eventually true”, work the same way.

We say Impl refines Abstract if

ImplSpec => AbstractSpec

If ImplSpec is true, then AbstractSpec is also true. But AbstractSpec is the predicate that describes valid behaviors of Abstract. So if this statement is true, any behavior of Impl is also a behavior of Abstract. If we make Impl a more detailed version of Abstract, we can use this to “close the gap” between our high-level behavior abstractions and our low-level code.

I think of it like a game: we initialize both specs and make “moves” in the Impl state space. After every move, we make an “equivalent” move in the Abstract state space. If we can always make an “equivalent” move, then Impl refines Abstract. Otherwise it’s possible for Impl to do something that Abstract cannot do, so Impl cannot be a more detailed implementation of Abstract.3

To prove refinement, we need to specify what counts as an “equivalent” move. That means finding a way to transform bad behaviors into abstract behaviors. This transformation is called a refinement mapping.

Checking Correspondence

bad is more detailed than abstract: it has one more thread state and an additional variable. As part of the refinement mapping we want to say that these are “implementation details” that don’t matter to the refinement. We do this the INSTANCE keyword. This lets us import another module into our current one, but replace the values of state variables with expressions.

Mapping == 
  INSTANCE abstract WITH
    counter <- counter,
    pc <- [t \in Threads |-> 
      IF pc[t] = "inc" THEN "start"
      ELSE pc[t]
      ]

Refinement == Mapping!Spec

To understand what’s going on here, let’s look at a step where pc[1] transitions from "start" to "inc". This ends us in a valid state for bad, but and invalid one for abstract, which only has threads states "start" and "done". However, in our refinement mapping, we replace all instances of "inc" with "start". So Mapping!pc[1] = "start", which is a valid state for abstract.

Now that we have a refinement mapping, we can test Refinement as a temporal property. 4

Error: Action property of module abstract is violated.
Error: The behavior up to this point is:
State 1: 
/\ counter = 0
/\ pc = <<"start", "start">>
/\ tmp = <<0, 0>>

State 2: 
/\ counter = 0
/\ pc = <<"start", "inc">>
/\ tmp = <<0, 0>>

State 3:
/\ counter = 0
/\ pc = <<"inc", "inc">>
/\ tmp = <<0, 0>>

State 4:
/\ counter = 1
/\ pc = <<"done", "inc">>
/\ tmp = <<0, 0>>

State 5:
/\ counter = 1
/\ pc = <<"done", "done">>
/\ tmp = <<0, 0>>

This error trace is a valid behavior for bad, but it is not one for abstract. There is no way to take the abstract spec and end with both threads in "done" and counter = 1. Therefore, we conclude that our implementation is not a valid refinement of our design.5

Fixing this

How can we fix the race condition? By implementing a lock, of course!

---- MODULE good ----
EXTENDS Integers, Sequences

VARIABLES pc, counter, tmp, lock
vars == <<pc, counter, tmp, lock>>

Threads == 1..2

States == {"start", "inc", "done"}

Trans(thread, from, to) ==
  /\ pc[thread] = from
  /\ pc' = [pc EXCEPT ![thread] = to]

Init ==
  /\ pc = [t \in Threads |-> "start"]
  /\ counter = 0
  /\ tmp = <<0, 0>>
  /\ lock = 0

AcquireLock(t) ==
  /\ lock = 0
  /\ lock' = t

ReleaseLock(t) ==
  /\ lock = t
  /\ lock' = 0

GetCounter(t) ==
  /\ tmp' = [tmp EXCEPT ![t] = counter]
  /\ UNCHANGED counter

IncCounter(t) ==
  /\ counter' = tmp[t] + 1
  /\ UNCHANGED tmp

Next ==
  \/ \E t \in Threads:
    \/ /\ Trans(t, "start", "inc")
       /\ GetCounter(t)
       /\ AcquireLock(t)
    \/ /\ Trans(t, "inc", "done")
       /\ IncCounter(t)
       /\ ReleaseLock(t)

Spec == Init /\ [][Next]_vars

Mapping == 
  INSTANCE abstract WITH
    counter <- counter,
    pc <- [t \in Threads |-> 
      IF pc[t] = "inc" THEN "start"
      ELSE pc[t]
      ]

Refinement == Mapping!Spec

====

This passes the model checker: good is a valid refinement of abstract. We can then refine good into a more detailed spec, perhaps closer to our machine bytecode, and eventually refine the spec into executable code. There isn’t any tooling to do this in TLA+, but it’s possible. I’ll talk a bit more about spec-to-code refinements later.

Refinement Guarantees in TLA+

Surprisingly, good also a valid refinement of bad! You can think of the refinement as showing that the implementation behaviors map to a subset of the abstract behaviors. good maps to a subset of bad’s behaviors, the subset of bad that also conforms with abstract.

One consequence of this: we know that a given Impl behavior is also an Abstract behavior, but we don’t know if a given Abstract behavior is possible in Impl. We can’t guarantee that every state of Abstract is reachable by our refinement. But we can guarantee all of Abstract’s safety and liveness properties:

  • If Abstract will never do a “bad” thing, then Impl will never do it, either.
  • If Abstract is guaranteed to do a “good” thing, then Impl is guaranteed to do it, too.

good can have additional properties beyond abstract, such as safety and liveness properties on tmp and counter. In general, Impl will have a strict superset of Abstract’s properties.

Refining is Hard

I’ve not seen much industrial use of refinement. Part of this is a lack of knowledge and resources, but there are also a few technical challenges with refining specifications.

The first is that implementations are much larger than specifications, so refinements of a spec will be larger too. As a rule of thumb, the difficulty to write and verify a spec is exponential on its length. That’s why most specs are either high-level models of large systems or low-level models of small systems. With refinement you have the worst of both worlds.

Additionally, all refinements (in TLA+) need to totally describe the behavior of Abstract. Since all state variables need to be refined, we can’t declare any behavioral properties as unnecessary for the refinement.6 Since good simultaneously updates lock and tmp, all further refinements of good must also simultaneously change those two variables. We couldn’t write a spec which sets the lock and updates tmp as two separate actions.7

Finally, part of the reason we use formal specifications in the first place is to get away from the messiness of machine code. If you try to refine all the way from abstract spec to implementable code, at some point you’ll have to deal with Unicode and floating point semantics and filesystems and all that jazz. For these reasons, while refining specifications to code is possible, refining specifications to more detailed specifications is (relatively) more popular.

Ultimately, while I’m optimistic about broader industry adoption of formal methods, I don’t think many people will use refinement. It adds too many essential difficulties on top of what’s already a challenging discipline.8

Refinement in other formalisms

This article was TLA-centric because that’s the specification language I know best. But user-friendly spec-to-code refinement is one of the holy grails of formal methods, so lot of formalisms explore it in some way. To my understanding, the most “refinement-oriented” specification language is Event-B, which I say entirely because refinement is part of the core syntax. It also has a lot of third party tooling for refining specifications to code. But I haven’t used Event-B enough to know how much either happens in practice.

You can also refine code to other code. This isn’t often done because if you already have the code, you can just use the code. But it can make sense if you need to convert between languages. Most famously, seL4 was first specified in Isabelle/HOL, then refined into Haskell, and then refined again into C.

Thanks to Ron Pressler for feedback and corrections. Any further errors are mine alone. I shared the first draft of this essay on my newsletter. If you like my writing, why not subscribe?


  1. Disclaimer, I’m only talking about refinement as it relates to formal specification of systems. So I’m not covering things like stepwise refinement or the refinement calculus. Nothing against them, they’re just out of scope here. [return]
  2. If you’re following along with the TLA+ toolbox, you’ll need to disable deadlock checking. When the algorithm terminates, there are no possible next states, so the model “deadlocks”. [return]
  3. “Formal methods as adversarial game” has a long history and is often used to prove bisimulation. [return]
  4. It’s a temporal property and not an invariant because we’re looking at both states and state transitions here. See Action Properties. [return]
  5. Or, more precisely, that Mapping is not a correct refinement mapping. It could be that we just misunderstood the relationship between our design and implementation, and a different transformation would show us a valid refinement. However, seeing the refinement mapping fail should make us less confident that there is a refinement. [return]
  6. TLA+ has a mechanism for variable hiding, the temporal existential operator (\EE), which lets us remove state variables from refinement mappings. But the model checker can’t check specifications which use it. [return]
  7. This gets especially annoying when you’re using auxilary variables to check complex properties. The refinement also needs to map the aux vars, even though they have no bearing on behavior whatsoever. [return]
  8. One place I think refinement could be very useful: refining partial specifications. That’d be when Abstract is an incomplete specification that covers only a small part of the system’s state. You could, for example, write a specification for a stack ADT and use refinement to verify that some state variables in your system behave like stacks. This would be one way to make reusable libraries and get around the composition problem. [return]