This was originally written as a tutorial for Hypothesis and will eventually be reproduced there, but it might still be useful for people using other property-testing libraries. This essay assumes some familiarity with Hypothesis. If you haven’t used it, a better introduction is here.
Once you learn the basics, there are two hard parts to writing property-based tests:
- What are good properties to test?
- How do I generate complex inputs?
The second problem is just as important as the first. Often we’re working with data that has lots of preconditions and we want to generate inputs that satisfy the preconditions. We also often need to create data that depends on other data, or data with extra conditions, or independent-but-similar data. We need some way to build more complex strategies. This is neglected by most PBT literature, which focuses on the finding properties side of the problem. Let’s fix that.
We have an Exam which consists of a set of questions and multiple-choice answers. Students take ExamInstances, which consists of the exam they took and the answers they gave. For each question they may either list an answer or leave it blank. There are many things we could test here, but the first step is actually generating the data.
For the purposes of this tutorial we will make the data classes as simple as possible:
import typing as t from dataclasses import field, dataclass @dataclass class Exam: """The exam template. Students take ExamInstances.""" name: str answer_key: t.List[int] @dataclass class ExamInstance: """The instantiation of the exam.""" student: int exam: Exam # vvv must be same length as exam.answer_key answers: t.List[t.Optional[int]] # ^^^ but students can skip answers
For our property tests we need to generate both Exams and ExamInstances. We have some restrictions on the Exam, and we need the ExamInstances to match the Exam in terms of answer length. This makes it somewhat more complex than the usual “generate an integer” case.
Note: all examples use pytest.
We don’t have to write a special generator for Exam. We’ve already type-annotated every field of Exam, so Hypothesis is smart enough to generate valid Exams via the
from hypothesis import given import hypothesis.strategies as s @given(s.from_type(Exam)) def test_1(exam): ...
In fact, it can even generate valid ExamInstances, as all those fields are typed, too:
@given(s.from_type(ExamInstance)) def test_2(ei): ...
Sometimes this is good enough. In our case it is not for a couple of reasons:
- The state space is huge. Hypothesis is using the default generators for each of the fields, when for our purposes they should be more constrained. The more narrowly we can constrain the search space the more likely we are to get interesting edge cases.
- Hypothesis is going to generate a lot of “nonsensical” exams. You might get something where the answer key is
[-45, 0, 800002]. We want to use stricter strategies for the fields of Exam.
If our requirements are simple and we know that most generated inputs will be correct, we can usually get away with using
filter is a method on all strategies, while
assume goes in the body of the test itself. Both of them tell hypothesis to discard invalid data and make a new draw.
@given(s.from_type(Exam).filter(lambda x: x.answer_key and min(x.answer_key) >= 0)) def test_2(exam): assume(max(exam.answer_key) <= 5) ...
The upside of
assume is that they’re easy to write and clearly express what our constraints are. The downside is that we are still generating bad inputs- we’re just throwing it away right after. This makes the testing slower and Hypothesis might not find enough valid inputs.
We’re better off writing a customized generator that always generates good inputs. Fortunately that’s a lot easier than it sounds.
builds strategy takes in an object and a set of initialization strategies, draws the corresponding values, and passes them into the object’s
__init__. Let’s say we want to make sure that
answer_key only uses numbers between 1 and 5. We can use
builds to do this.
@given(s.builds(Exam, answer_key=s.lists(s.integers(min_value=1, max_value=5)))) def test_3(exam): for x in exam.answer_key: assert x in range(6)
We haven’t specified what
name is, so Hypothesis will use the default generator for text. If we want to always use the same name, we can use the
@given(s.builds(Exam, name=s.just(""), answer_key=s.lists(s.integers(min_value=1, max_value=5)))) def test_3(exam): # same
At this point we should probably pull it into its own function:
def exam_strategy(): return s.builds(Exam, name=s.just(""), answer_key=s.lists(s.integers(min_value=1, max_value=5))) @given(exam_strategy()) def test_3(exam): # same
Now that we have a custom builder, we can tell Hypothesis to always use it when inferring Exams. We do this with
def exam_strategy(): ... s.register_type_strategy(Exam, exam_strategy())
This will now be used any time Hypothesis infers that it needs to build an Exam, even if it needs to do so as part of generating something else, like an ExamInstance.
@given(s.from_type(Exam)) def test_4(exam): ...
That takes care of the Exam. But we still have a problem with the ExamInstance: its
answer must have the same length as its exam’s
answer_key. But there’s no way to guarantee that with our builder. We can’t link strategies to each other.
# This will fail @given(s.from_type(ExamInstance)) def test_5(ei): assert len(ei.answers) == len(ei.exam.answer_key)
In order to combine multiple strategies we need to use something a little more powerful:
Composite strategies are user-written functions “lifted” into full strategies. They can be a little bit unintuitive at first, so let’s start with a simple example. This composite strategy returns a list of numbers where the first element of the list is always the smallest number in the list:
@s.composite def example(draw): i = draw(s.integers()) l = draw(s.lists(s.integers(min_value=i))) return [i] + l
Let’s break down what’s going on here. First we have the inner function. The inner function does not return a strategy. It returns regular Python values. The
@composite decorator is what converts this function into a strategy. In addition to its usual parameters, the inner function also has a
draw parameter. Calling this function on a strategy draws a concrete value. We need this because, again, the inner function only returns values, not strategies. We do not explicitly include
draw when calling the full function; the decorator takes care of that. So we would call this function in a test like:
@given(example()) def test_example(l): assert min(l) == l
We can do anything we want inside the body of the composite function, making it an exceptionally powerful tool. Here’s how we can make sure that our Exam and ExamInstance have the same length:
# Helper function to make the following examples terser def ei_answers_strategy(exam): return s.lists( s.one_of(s.none(), s.integers(min_value=1, max_value=5)), min_size=len(exam.answer_key), max_size=len(exam.answer_key), ) @s.composite def exam_instance_strategy(draw): exam = draw(s.from_type(Exam)) answers = ei_answers_strategy(exam) return ExamInstance(student=1, exam=exam, answers=draw(answers))
We use it like any other strategy.1
@given(exam_instance_strategy()) #pylint: disable=no-value-for-parameter def test_6(ei): assert len(ei.answers) == len(ei.exam.answer_key)
Composite strategies are just functions and can be called in other strategies. This means you can use it to create complex connected data. Let’s say we want to generate several
ExamInstances for the same student and exam. We can break this into several composable pieces:
@s.composite def exam_instance_for_exam(draw, exam, student=""): answers = ei_answers_strategy(exam) return ExamInstance(student=student, exam=exam, answers=draw(answers)) @s.composite def many_exam_instances(draw, student=""): exam = draw(s.from_type(Exam)) exams = s.lists(exam_instance_for_exam(exam, student), min_size=1) return draw(exams) @given(many_exam_instances("brian")) def test_7(eis): ...
In this case we’re only passing raw values into the composite. This means we can’t randomize the name of the student. In some cases this is an advantage, as it restricts the state space we have to search. In other cases this is a disadvantage, as it goes against the spirit of property testing. If you want the more thorough testing, you can pass in strategies instead into the composite and draw values in the body of the function.
@s.composite def exam_instance_for_exam(draw, exam, student=s.just("")): student = draw(student) exam = draw(exam) answers = ei_answers_strategy(exam) return ExamInstance(student=student, exam=exam, answers=draw(answers)) # many_exam_instances is the same... for now
There’s a usability problem with this, though. Imagine we have
many_exam_instances pass in complex strategies for
student. How do we know what values we drew? In this particular case we can extract them from the returned ExamInstance, but we can’t always rely on that, especially if we do more complex transformations. For this reason it’s usually a good idea to pass back all the draws from the composite.
@s.composite def exam_instance_for_exam(draw, exam, student=s.just("")): - return ExamInstance(student=student, exam=exam, answers=draw(answers)) + return exam, student, ExamInstance(student=student, exam=exam, answers=draw(answers))
We also have to adjust
many_exam_instances and our test. In particular we can’t use the
lists strategy anymore, as that assumed
exam_instance_for_exam only returned an ExamInstance. Now that we’re returning a tuple of data it gets a bit messier.
@s.composite def many_exam_instances(draw, student=s.just("")): number = draw(s.integers(min_value=1, max_value=5)) exam = draw(s.from_type(Exam)) student = draw(student) instances =  # Getting the values in a loop for _ in range(number): ei_strategy = exam_instance_for_exam(exam=s.just(exam), student=s.just(student)) _, _, ei = draw(ei_strategy) instances.append(ei) return number, exam, student, instances @given(many_exam_instances(student=s.characters())) def test_8(stuff): _, exam, _, eis = stuff ...
As you can see, this leads to a lot more boilerplate. We also had to change from using the
lists strategy to using a loop, which is slower and less clear. Whether the tradeoffs are worth it depends on your specific case.2
The last way to generate complex inputs is the
data strategy. Like
composite, it gives us a
draw function that we can use to interactively pick values. Unlike
composite, we can use it in the test body! Instead of
exam_instance_strategy, we can do this instead:
@given(s.data(), s.from_type(Exam)) def test_data(data, exam): answers = data.draw(ei_answers_strategy(exam)) ei = ExamInstance(student=1, exam=exam, answers=answers) assert len(ei.answers) == len(ei.exam.answer_key)
The main benefit of
data is that you can customize your strategy based on the behavior inside the test. Something like:
if f(x): y = data.draw(st.integers(min_value=1, max_value=10)) else: y = data.draw(st.integers(min_value=6, max_value=20))
The downside is that
data doesn’t play well with other Hypothesis features, like examples and error reporting. If you can generate all your data ahead of time then you’re better off using composites.
Hypothesis can infer a lot using
from_type. If most random inputs are valid and you just need to rule out rare edge cases, then
filter are simple and effective. Past that, you can use
builds to generate data with complex invariants. The
composite strategy gives you even more control and lets you relate multiple draws to each other, making it possible to create lots of interdependent data. Finally, the
data strategy lets you interactively pick values in the test itself. And all custom strategies can be associated with a type via
Hypothesis provides a lot of mechanisms to create complex data. Learning how to use them well is a bit of an art, but well worth it. Hopefully this makes “what to use when” a little more clear.
Thanks to Zac Hatfield-Dodds and Oskar Wickström for feedback.
- Pylint might complain at you here since we’re not passing in an argument for
draw. There’s no actual bug; the
@compositedecorator handles that argument. You can disable the warning by adding
# pylint: disable=no-value-for-parameterto the line. [return]
- A good compromise is to have the “root” composite take strategies and all its helper strategies take raw values. [return]