Formally Specifying a Package Manager

This post was commissioned by Graham Christensen as part of the Great Slate fundraiser. Thank you!

I wanted to help out the Great Slate fundraiser and donated two technical essays of the purchaser’s choice. Graham bought one and asked for

Something that has to do with Nix or NixOS

This was a bit of a challenge, given that I didn’t even know what Nix was. I had never used it and still haven’t, nor can I use it for the near future, because my only computer right now is a Windows box.¹ Fortunately, “something” has a lot of leeway and I quickly found a topic I could write about. Nix claims it’s a “fully-functional” package manager. What does that actually mean? Buckle up everyone, it’s time for some formal methods. We’re going to specify a bunch of theoeretical package managers and show what benefits (and possibly drawbacks) a “fully-functional” package manager has.

Readers of this blog will know my spec langs of choice are TLA+, which excels for temporal systems, and Alloy, which is excels for relational data. This kind of straddles both domains, so neither immediately seems the right choice. In the end, I went with Alloy because

Better visualizations
The Alloy 5 Beta just came out
I spend all my time with TLA+ anyway and need a break

Okay, let’s begin. You can download Alloy here if you want to follow along. If this is your first experience with Alloy, I’d recommend reading my introduction first, since it explains all of the terms I’m using here.

A Naive Package Manager

We’ll start with the following assumptions:

All packages are divided into two types: Programs are packages we try to install. Requirements are packages that a program requires. Requirements don’t require anything.
All packages may or may not have an upgrade. Upgrades are linear and acyclic: at most one thing upgrades to any program, a program can’t eventually upgrade into itself, etc. Upgrading cannot change a requirement into a package or vice versa. The set of all programs in a given list of upgrades is called the program version.
There’s no horrible cycle nonsense: a program can’t require something and its upgrade, require its own upgrade, etc. Most of these are implicit in the prior two assumptions, but if we loosen them, this one should still hold.

All of these assumptions may not be valid in practice, and if they aren’t, the validity of the spec doesn’t hold. However, it is still useful for exploring less pathological cases, and a more comprehensive model may not be as informative to laypeople like me.² Let’s get them represented:

open util/ordering[Time]
sig Time {}

abstract sig Package {
  requires: set Requirement,
  installed: set Time,
  upgrade: lone Package
} {
  lone @upgrade.this
  no requires & requires.^@upgrade
}

sig Program extends Package {} { upgrade in Program }
sig Requirement extends Package {} { 
  no requires 
  upgrade in Requirement
}

fun other_versions: Package -> set Package {
  ^upgrade + ^~upgrade
}

fun versions: Package -> set Package {
  other_versions + (Program <: iden)
}

fact NoCycleNonsense {
  all p: Package | p not in p.other_versions
  no (^requires + ^~requires) & versions
}

Time will be our trace. Package has two facts on it: at most one thing upgrades to it, and it can’t require its own upgrade. @ is a workaround to avoid expansion in signature, since otherwise Alloy will replace upgrade with this.upgrade. The functions just give us versions as a niladic relation instead of a monadic function. And we enforce the cycle restrictions as facts, so they cannot occur in any solutions.

For the naive package manager, we’ll say it works as follows:

It can only get one package in a single step.
For a given step, it picks a package that’s not on the system. It attempts to install or update to that package by:
1. If there are any missing requirements, either update or install them. This may take several steps.
2. If there is no version of that package installed, install the package.
3. If there is a prevision version of the package installed, “upgrade” by removing the old package and installing the new one.

This means that if a package has two requirements, we can get it in three steps: install requirement A, install requirement B, install package. Our rules also mean you can only have one version of package at a time. Some of you are already cringing. Everybody else is going to see why this is a bad idea soon enough.

pred install (t, t': Time, p: Package) {
  no p.versions & installed.t
  p.requires in installed.t  

  installed.t' = (installed.t + p)
}


pred upgrade_to (t, t': Time, p: Package) {
  p not in installed.t
  p.requires in installed.t
  some p': Package | {
    p' in (installed.t & (^upgrade).p) //old version    
    installed.t' = (installed.t + p - p')
  }
}

fact Trace {
  no installed.first
  all t: Time - last | let t' = t.next |
    some p: Program | {
      install[t, t', p] or upgrade_to[t, t', p]
      or some p': p.requires | {
        install[t, t', p'] or upgrade_to[t, t', p']
      }
    }
}

Finally, we have an assertion we want to check. The one I will use is that if a package is installed, then all of its requirements must also be installed at all times. Otherwise, we have a requirement breakage.

assert AllRequirements {
  all t: Time | requires[installed.t] in installed.t
}

check AllRequirements for 4

You can probably see where this breaks.

A failure

P1 requires R1 but P0 requires R0. Once we upgrade R1 -> R0 then P1 no longer has all of its requirements installed.³ This is why ~~most~~ some package managers don’t actually upgrade files that way. Instead, they simply install both versions.

pred install (t, t': Time, p: Package) {
  p.requires in installed.t
  installed.t' = (installed.t + p)
}

fact Trace {
  no installed.first
  all t: Time - last | let t' = t.next |
    some p: Program | {
      install[t, t', p]
      or some p': p.requires | install[t, t', p']
    }
}

With this, the problem goes away. This is the basic specification we will be comparing against Nix. For the future, we’ll simplify the spec by saying that the package manager can install both a package and all of the requirements in a single time step. In other words:

pred install (t, t': Time, p: Package) {
-  p.requires in installed.t
-  installed.t' = (installed.t + p)
+  p not in installed.t
+  installed.t' = (installed.t + p + p.requires)

fact Trace {
  no installed.first
  all t: Time - last | let t' = t.next |
    some p: Program | {
      install[t, t', p]
-      or some p': p.requires | install[t, t', p']
    }
}

Completeness

One of the claimed advantages of Nix is that, as a purely functional package manager, it’s pure: the success or failure of a build does not depend on prior builds.⁴ To see why this is an advantage, let’s explore how a build is actually produced. The developer will manually list the requirements a package needs and then upload it; when the user installs the package, it also installs the requirements. But can we guarantee that, if it works on the developer’s machine, it works on the users?

One way we can check is to step outside the bounds of simply representing the package manager and start representing a “simulation” of the wider environment. What we’re going to do is implicitly define another user by adding a dependent inventory. We also create an explicit relation that is whatever packages the developer had to install to match the requirements. First the developer installs the requirements. Whatever she installs will be the explicit for the package. When the user tries to install the package, she also installs everything in explicit. In all cases, this shouldn’t cause a bug. Let’s represent it.

abstract sig Package {
+  explicit: set Requirement,
+  dependent: set Time,
} {
+  explicit in requires
}

pred install (t, t': Time, p: Package) {
+  p.explicit = p.requires & (installed.t' - installed.t)
+  dependent.t' = dependent.t
}

+ pred install_to_dependent (t, t': Time, p: Package) {
+   p not in dependent.t
+   p in installed.t //to enforce that explicit is defined
+   dependent.t' = dependent.t + p + p.explicit
+   installed.t' = installed.t
+ }

fact Trace {
+  no dependent.first
  all t: Time - last | let t' = t.next |
    some p: Program | {
+      install[t, t', p] or install_to_dependent[t, t', p]
    }
} 

assert AllRequirements {
  all t: Time | {
+    requires[dependent.t] in dependent.t
  }
}

I want to call special attention the p.explicit = ... line in install. It looks like an assignment, but that’s me abusing the notation. Predicates don’t set values. They’re just statements that are true or false. If install[t, t', p] is true, then p.explicit = ... must also be true. That ‘acts’ like an assignment. If install[t, t', p] is false, then we don’t know anything about p.explicit. It might be nothing, or it could be every single Requirement in the model. Alloy is free to do whatever it wants with it.

That’s why we have p in installed.t in install_to_dependent. Based on our model, p in installed.t is only true when install[t, t', p] is true, which is only true if p.explicit is behaving nicely. So by adding that extra line to install_to_dependent, we’re only checking the model when we have the expected explicit.

This model also fails.

Another failure

Developer builds P1, which consists of R0 and R1. P1.explicit = R0 + R1.
Developer builds P0. While P0 requires R1, the developer does not add it to explicit, because it was already on her computer via P1. P0.explicit = {}
User installs P0. She’s missing a requirement, and our assertion is violated.

One way to fix this is to simply set explicit = requires. But that’s a cop-out: it presumes the developer knows all of the packages needed in advance. We’re trying to make sure this works even when the developer makes mistakes!

Let’s try a completely different tack. Instead of changing how we calculate the requirements, we can change how the package manager tracks installations. Up until now we’ve simply made installed pairs of packages and times. Let’s instead make it a triple: one package, one program, and a time. The program is whichever one the package was installed for.

-  installed: set Time,
-  dependent: set Time,
+  installed: Package -> set Time,
+  dependent: Package -> set Time,

Now we have to change what install and install_to_dependent do:

pred install (t, t': Time, p: Package) {
-  p not in installed.t
-  installed.t' = (installed.t + p + p.requires)
+  p -> p not in installed.t
+  installed.t' = installed.t + p -> (p + p.requires)

//...

pred install_to_dependent (t, t': Time, p: Package) {
-  p not in dependent.t
-  p in installed.t //to enforce that explicit is defined
+  p -> p not in dependent.t
+  p -> p in installed.t //to enforce that explicit is defined
+  dependent.t' = dependent.t + p -> (p + p.explicit)

-> here is the relation operator: p -> p is the relation that maps p to itself. p -> (p + p.requires) is equivalent to writing (p -> p) + p -> p.requires. Finally, we have to change the assertion to accurately pull the requirements out.

assert AllRequirements {
  all t: Time | {
-      requires[installed.t] in installed.t
-      requires[dependent.t] in dependent.t
+      Program.(installed.t).requires in Program.installed.t
+      Program.(dependent.t).requires in Program.dependent.t
  }
}

When we run this, there’s no more errors! This is because the developer can’t get away with having just the requirements on her computer. Since we’re installing requirements as coupled to the appropriate program, she must install a complete copy of every requirement in order to get the program. This means her explicit dependencies always match the actual requirements.

This is exactly how Nix does it. By keeping all of the requirements isolated, you have a deterministic build. If it works on your machine, it should (ideally) work on other machines regardless of their current state.

Here’s the final spec:

open util/ordering[Time]
sig Time {}

abstract sig Package {
  requires: set Requirement,
  explicit: set Requirement,
  installed: Package -> set Time,
  dependent: Package -> set Time,
  upgrade: lone Package
} {
  lone @upgrade.this
  no requires & requires.^@upgrade
  explicit in requires
}

sig Program extends Package {} { upgrade in Program }
sig Requirement extends Package {} { 
  no requires
  upgrade in Requirement
}

fun other_versions: Package -> set Package {
  ^upgrade + ^~upgrade
}

fun versions: Package -> set Package {
  other_versions + (Program <: iden)
}

fact NoCycleNonsense {
  all p: Package | p not in p.other_versions
  no (^requires + ^~requires) & versions
}


pred install (t, t': Time, p: Package) {
  p -> p not in  installed.t
  installed.t' = installed.t + p -> (p + p.requires)
  p.explicit = p.requires & p.(installed.t' - installed.t)
  dependent.t' = dependent.t
}

pred install_to_dependent (t, t': Time, p: Package) {
  p -> p not in dependent.t
  p -> p in installed.t //to enforce that explicit is defined
  dependent.t' = dependent.t + p -> (p + p.explicit)
  installed.t' = installed.t
}

fact Trace {
  no installed.first
  no dependent.first
  all t: Time - last | let t' = t.next |
    some p: Program | {
      install[t, t', p] or install_to_dependent[t, t', p]
    }
} 

assert AllRequirements {
  all t: Time | {
      Program.(installed.t).requires in Program.installed.t
      Program.(dependent.t).requires in Program.dependent.t
  }
}

check AllRequirements for 4

Conclusion

By specifying the system, we could see a couple of common failure modes of package managers and how Nix fixes one of them. Formal methods are a pretty powerful tool to find bugs! And, as I’ve said before, they also are really good for understanding your problem domain. I feel like I have a much better sense of what Nix does and why after I did this exercise.⁵

Nix can be found here and Alloy, of course, can be found here. If you’re interested in playing more with this model, here’s a few of exercises you can try:

What happens if we allow our Requirements to have requires? Does everything still work? If not, how can we fix it?
Nix has a feature called profiles. If you have two packages for python, the user might only want to see one at a time, so they can do python bar.py and not vp969yhdq-python bar.py. How could you represent this in Alloy?
Nix also has a feature called rollback, where you can return to a previous Time. What properties are worth checking about rollbacks? Does Nix’s structure help here at all? Remember that we can create predicates on t.prev or t.next.next.
Programs often have ranges as requirements. For example, it might work with any one of versions 2, 3, 4. How could you represent that? It might be similar to something we do here.

I didn’t actually try any of these so they might be trivial or impossible. Have fun!

Update

I’ve been informed by several people that most package managers don’t actually handle upgrades well. Many apparently just try to find a version that satisfies all requirement ranges and hope for the best. I had three immediate thoughts on this:

aaaaaaaaaaaaaaaa
Huh I guess that’s another benefit to using Nix
aaaaaaaaaaaaaaaa

But I think this also underscores just how powerful formal methods are. The “upgrades are tricky” case came up within a few minutes of me starting the spec. I’ve noticed similar things with TLA+ and even informal graphviz modeling: just having a way to clearly think about your system makes a huge difference in the overall quality.

On one hand, it’s slow and nothing works. On the other hand, AutoHotKey! ^[return]
I’m not investing 50+ hours into understanding package management problems for a 2,000 word post. ^[return]
If you’re seeing something much messier try projecting over Time and hiding versions and other_versions. ^[return]
This is being used sliiiiiightly loosely, as it does change the state of your system. ^[return]
The other lesson I’m learning is man, I really need to make an Alloy cheatsheet. Searching the physical book for syntax questions is a pain. ^[return]