QCCore QuickChick
The G Monad
QuickChick provides a number of primitives for building generators. First, returnGen takes a constant value and yields a generator that always returns this value.
Check returnGen.
===> returnGen : ?A -> G ?A
(* Sample (returnGen 42). *)
===> [42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42]
Next, given a random generator for A and a function f taking an A and yielding a random generator for B, we can plumb the two together into a generator for B that works by internally generating a random A and applying f to it.
Check bindGen.
===> bindGen : G ?A -> (?A -> G ?B) -> G ?B
With these two primitives in hand, we can make G an instance of the Monad typeclass.
Instance GMonad : `{Monad G} | 3 :=
{
ret := @returnGen;
bind := @bindGen
}.
{
ret := @returnGen;
bind := @bindGen
}.
Primitive generators
Check @choose.
===> @choose : forall A : Type, ChoosableFromInterval A -> A * A -> G A
The ChoosableFromInterval typeclass describes primitive types A, like natural numbers and integers (Z), for which it makes sense to randomly generate elements from a given interval.
Print ChoosableFromInterval.
===> Record ChoosableFromInterval (A : Type) : Type := Build_ChoosableFromInterval { super : Ord A; randomR : A * A -> RandomSeed -> A * RandomSeed; ... }.
(* Sample (choose (0,10)). *)
===> [ 1, 2, 1, 9, 8, 10, 3, 6, 0, 1, 8 ]
Lists
Check listOf.
(* Sample (listOf (choose (0,4))). *)
===> [ [ 0, 3, 2, 0 ], [ 1, 3, 4, 1, 0, 3, 0, 2, 2, 3, 2, 2, 2, 0, 4, 2, 3, 0, 1 ], [ 3, 4, 3, 1, 2, 4, 4, 1, 0, 3, 4, 3, 2, 2, 4, 4, 1 ], [ 0 ], [ 4, 2, 3 ], [ 3, 3, 4, 0, 1, 4, 3, 2, 4, 1 ], [ 0, 4 ], [ ], [ 1, 0, 1, 3, 1 ], [ 0, 0 ], [ 1, 4 ], [ 4, 3, 2, 0, 2, 2, 4, 0 ], [ 1, 1, 0, 0, 1, 4 ], [ 2, 0, 2, 1, 3, 3 ], [ 4, 3, 3, 0, 1 ], [ 3, 3, 3 ], [ 3, 2, 4 ], [ 1, 2 ], [ ], [ ] ]
The second combinator, vectorOf, receives an additional numeric argument n, the length of the list to be generated.
Check vectorOf.
===> vectorOf : nat -> G ?A -> G (list ?A)
(* Sample (vectorOf 3 (choose (0,4))). *)
===> [ [0, 1, 4], [1, 1, 0], [3, 3, 3], [0, 2, 1], [1, 3, 2], [3, 3, 0], [3, 0, 4], [2, 3, 3], [3, 2, 4], [1, 2, 3], [2, 0, 4] ]
This raises a question. It's clear how vectorOf decides how big to make its lists (we tell it!), but how does listOf do it? The answer is hidden inside G.
Internally, G A is just a synonym for nat → RandomSeed → A.
Inductive G (A:Type) : Type :=
| MkG : (nat → RandomSeed → A) → G A.
| MkG : (nat → RandomSeed → A) → G A.
When it is searching for counterexamples, QuickChick progressively
tries larger and larger values for the size bound n, to
gradually explore deeper into the search space.
Each generator can choose to interpret the size bound however it
wants, and there is no enforced guarantee that generators pay
any attention to it at all; however, it is good practice to
respect this bound when programming generators.
Custom Generators
Inductive color := Red | Green | Blue | Yellow.
Instance show_color : Show color :=
{| show c :=
match c with
| Red ⇒ "Red"
| Green ⇒ "Green"
| Blue ⇒ "Blue"
| Yellow ⇒ "Yellow"
end
|}.
Instance show_color : Show color :=
{| show c :=
match c with
| Red ⇒ "Red"
| Green ⇒ "Green"
| Blue ⇒ "Blue"
| Yellow ⇒ "Yellow"
end
|}.
To generate a random color, we just need to pick one of the constructors Red, Green, Blue, or Yellow. This is done using elems_.
Check elems_.
===> elems_ : ?A -> list ?A -> G ?A
Definition genColor' : G color :=
elems_ Red [ Red ; Green ; Blue ; Yellow ].
(* Sample genColor'. *)
elems_ Red [ Red ; Green ; Blue ; Yellow ].
(* Sample genColor'. *)
To make the common case smoother, QuickChick provides convenient
notations that automatically extract the default element.
Armed with elems, we can write a color generator the way we'd
hope.
" 'elems' [ x ] " := elems_ x (cons x nil) " 'elems' [ x ; y ] " := elems_ x (cons x (cons y nil)) " 'elems' [ x ; y ; .. ; z ] " := elems_ x (cons x (cons y (.. (cons z nil)))) " 'elems' ( x ;; l ) " := elems_ x (cons x l)
Definition genColor : G color :=
elems [ Red ; Green ; Blue ; Yellow ].
(* Sample genColor. *)
elems [ Red ; Green ; Blue ; Yellow ].
(* Sample genColor. *)
===> [Red, Green, Blue, Blue, Red, Yellow, Blue, Red, Blue, Blue, Red]
For more complicated ADTs, QuickChick provides more combinators.
Inductive Tree A :=
| Leaf : Tree A
| Node : A → Tree A → Tree A → Tree A.
Instance showTree {A} `{_ : Show A} : Show (Tree A) :=
{| show := let fix aux t :=
match t with
| Leaf ⇒ "Leaf"
| Node x l r ⇒
"Node (" ++ show x ++ ") ("
++ aux l ++ ") ("
++ aux r ++ ")"
end
in aux
|}.
| Leaf : Tree A
| Node : A → Tree A → Tree A → Tree A.
Instance showTree {A} `{_ : Show A} : Show (Tree A) :=
{| show := let fix aux t :=
match t with
| Leaf ⇒ "Leaf"
| Node x l r ⇒
"Node (" ++ show x ++ ") ("
++ aux l ++ ") ("
++ aux r ++ ")"
end
in aux
|}.
Check oneOf_.
===> oneOf_ : G ?A -> list (G ?A) -> G ?A
Next, Coq's termination checker will save us from shooting ourselves in the foot!
Fixpoint genTree {A} (g : G A) : G (Tree A) := oneOf [ ret Leaf ;; liftM3 Node g (genTree g) (genTree g) ].
The solution is to use the standard "fuel" idiom that Coq users are familiar with. We add a natural number sz as a parameter. We decrease this size in each recursive call, and when it reaches O, we always generate Leaf. Thus, the initial sz parameter serves as a bound on the depth of the tree.
Fixpoint genTreeSized {A} (sz : nat) (g : G A) : G (Tree A) :=
match sz with
| O ⇒ ret Leaf
| S sz' ⇒
oneOf [
ret Leaf ;
liftM3 Node g (genTreeSized sz' g) (genTreeSized sz' g)
]
end.
match sz with
| O ⇒ ret Leaf
| S sz' ⇒
oneOf [
ret Leaf ;
liftM3 Node g (genTreeSized sz' g) (genTreeSized sz' g)
]
end.
(* Sample (genTreeSized 3 (choose(0,3))). *)
===> [ Leaf, Leaf, Node (3) (Node (0) (Leaf) (Leaf)) (Node (2) (Leaf) (Node (3) (Leaf) (Leaf))), Leaf, Leaf, Leaf, Node (1) (Leaf) (Node (1) (Leaf) (Node (0) (Leaf) (Leaf))), Leaf, Node (3) (Leaf) (Leaf), Node (1) (Leaf) (Leaf), Leaf, Leaf, Node (0) (Leaf) (Node (0) (Leaf) (Node (2) (Leaf) (Leaf))), Node (0) (Node (2) (Node (3) (Leaf) (Leaf)) (Leaf)) (Leaf), Node (0) (Leaf) (Leaf), Leaf, Leaf, Leaf, Leaf, Leaf ]
We can obtain bigger trees more often if we skew the distribution towards Nodes using a more expressive QuickChick combinator, freq_.
Check freq_.
===> freq_ : G ?A -> seq (nat * G ?A) -> G ?A
As with oneOf, we usually use a convenient derived notation, called freq, that takes a list of generators, each tagged with a natural number that serves as the weight of that generator. For example, in the following generator, a Leaf will be generated 1 / (sz + 1) of the time and a Node the remaining sz / (sz + 1) of the time.
Fixpoint genTreeSized' {A} (sz : nat) (g : G A) : G (Tree A) :=
match sz with
| O ⇒ ret Leaf
| S sz' ⇒
freq [ (1, ret Leaf) ;
(sz, liftM3 Node g (genTreeSized' sz' g)
(genTreeSized' sz' g))
]
end.
match sz with
| O ⇒ ret Leaf
| S sz' ⇒
freq [ (1, ret Leaf) ;
(sz, liftM3 Node g (genTreeSized' sz' g)
(genTreeSized' sz' g))
]
end.
(* Sample (genTreeSized' 3 (choose(0,3))). *)
===> [ Node (3) (Node (1) (Node (3) (Leaf) (Leaf)) (Leaf)) (Node (0) (Leaf) (Node (3) (Leaf) (Leaf))), Leaf, Node (2) (Node (1) (Leaf) (Leaf)) (Leaf), Node (0) (Leaf) (Node (0) (Node (2) (Leaf) (Leaf)) (Node (0) (Leaf) (Leaf))), Node (1) (Node (2) (Leaf) (Node (0) (Leaf) (Leaf))) (Leaf), Node (0) (Node (0) (Leaf) (Node (3) (Leaf) (Leaf))) (Node (2) (Leaf) (Leaf)), Node (1) (Node (3) (Node (2) (Leaf) (Leaf)) (Node (3) (Leaf) (Leaf))) (Node (1) (Leaf) (Node (2) (Leaf) (Leaf))), Node (0) (Node (0) (Node (0) (Leaf) (Leaf)) (Node (1) (Leaf) (Leaf))) (Node (2) (Node (3) (Leaf) (Leaf)) (Node (0) (Leaf) (Leaf))), Node (2) (Node (2) (Leaf) (Leaf)) (Node (1) (Node (2) (Leaf) (Leaf)) (Node (2) (Leaf) (Leaf))), Node (2) (Node (3) (Node (2) (Leaf) (Leaf)) (Leaf)) (Node (0) (Node (2) (Leaf) (Leaf)) (Leaf)), Leaf, Node (2) (Node (3) (Node (3) (Leaf) (Leaf)) (Leaf)) (Leaf), Leaf, Node (1) (Leaf) (Leaf), Leaf, Node (1) (Node (2) (Leaf) (Node (3) (Leaf) (Leaf))) (Node (0) (Leaf) (Node (1) (Leaf) (Leaf))), Leaf, Node (3) (Node (0) (Node (0) (Leaf) (Leaf)) (Leaf)) (Node (0) (Leaf) (Node (2) (Leaf) (Leaf))), Node (2) (Node (2) (Node (0) (Leaf) (Leaf)) (Leaf)) (Node (1) (Leaf) (Node (2) (Leaf) (Leaf))), Leaf ]
Checkers
Fixpoint mirror {A : Type} (t : Tree A) : Tree A :=
match t with
| Leaf ⇒ Leaf
| Node x l r ⇒ Node x (mirror r) (mirror l)
end.
match t with
| Leaf ⇒ Leaf
| Node x l r ⇒ Node x (mirror r) (mirror l)
end.
To formulate a property about mirror, we need a structural equality test on trees. We can obtain that with minimal effort using the Dec typeclass and the dec_eq tactic.
Instance eq_dec_tree (t1 t2 : Tree nat) : Dec (t1 = t2).
Proof. constructor; dec_eq. Defined.
Proof. constructor; dec_eq. Defined.
Definition mirrorP (t : Tree nat) := (mirror (mirror t)) = t?.
Now we want to use our generator to create a lot of random trees and, for each one, check whether mirrorP returns true or false.
First, we need a type of test results -- let's call it Result. For the moment, think of Result as just an enumerated type with constructors Success and Failure.
Inductive Result := Success | Failure.
Instance showResult : Show Result :=
{
show r := match r with Success ⇒ "Success" | Failure ⇒ "Failure" end
}.
Instance showResult : Show Result :=
{
show r := match r with Success ⇒ "Success" | Failure ⇒ "Failure" end
}.
Definition Checker := G Result.
That is, a Checker embodies some way of performing a randomized
test of the truth of some proposition, which, when applied to a
random seed, will yield either Success (meaning that the
proposition survived this test) or Failure (meaning that this
test demonstrates that the proposition was false).
Sampling a Checker many times with different seeds will cause
many different tests to be performed.
Actually, we will be wanting to build Checkers based on many
different types. So let's begin by defining a typeclass
Checkable, where an instance for Checkable A will provide a
way of converting an A into a Checker.
To check mirrorP, we'll need a way to build a Checker out of a function from trees to booleans.
Class Checkable A :=
{
checker : A → Checker
}.
{
checker : A → Checker
}.
Instance checkableBool : Checkable bool :=
{
checker b := if b then ret Success else ret Failure
}.
{
checker b := if b then ret Success else ret Failure
}.
That is, the boolean value true passes every test we might
subject it to (i.e., the result of checker is a G that returns
true for every seed), while false fails all tests.
Let's see what happens if we sample checkers for our favorite booleans, true and false.
(* Sample (CheckerPlayground1.checker true). *)
===> [Success, Success, Success, Success, Success, Success, Success, Success, Success, Success, Success]
(* Sample (CheckerPlayground1.checker false). *)
===> [Failure, Failure, Failure, Failure, Failure, Failure, Failure, Failure, Failure, Failure, Failure]
A decidable Prop is not too different from a boolean, so we should be able to build a checker from that too.
Instance checkableDec `{P : Prop} `{Dec P} : Checkable P :=
{
checker p := if P? then ret Success else ret Failure
}.
{
checker p := if P? then ret Success else ret Failure
}.
(The definition looks a bit strange since it doesn't use its
argument p. The intuition is that all the information in p is
already encoded in P!)
Now suppose we pose a couple of (decidable) conjectures:
Conjecture c1 : 0 = 42.
Conjecture c2 : 41 + 1 = 42.
Conjecture c2 : 41 + 1 = 42.
The somewhat astononishing thing about the Checkable instance
for decidable Props is that, even though these are
conjectures (we haven't proved them, so the "evidence" that Coq
has for them internally is just an uninstantiated "evar"), we can
still build checkers from them and sample from these
checkers! (Why? Technically, it is because the Checkable
instance for decidable properties does not look at its
argument.)
(* Sample (CheckerPlayground1.checker CheckerPlayground2.c1). *)
===> [Failure, Failure, Failure, Failure, Failure, Failure, Failure, Failure, Failure, Failure, Failure]
(* Sample (CheckerPlayground1.checker CheckerPlayground2.c2). *)
===> [Success, Success, Success, Success, Success, Success, Success, Success, Success, Success, Success]
Now let's go back to mirrorP.
Definition forAll {A B : Type} `{Checkable B}
(g : G A) (f : A → B)
: Checker :=
a <- g ;;
checker (f a).
(g : G A) (f : A → B)
: Checker :=
a <- g ;;
checker (f a).
Let's try this out!
Definition isRed c :=
match c with
Red ⇒ true
| _ ⇒ false
end.
match c with
Red ⇒ true
| _ ⇒ false
end.
Since we can generate elements of color and we have a
Checkable instance for bool, we can apply forAll to isRed
and sample from the resulting Checker to run some tests.
(* Sample (CheckerPlayground3.forAll genColor isRed). *)
===> [Success, Failure, Failure, Failure, Success, Failure, Failure, Success, Failure, Failure, Success]
Now back to a more realistic example. What about mirrorP?
(*
Sample (CheckerPlayground3.forAll
(genTreeSized' 3 (choose(0,3)))
mirrorP).
*)
Sample (CheckerPlayground3.forAll
(genTreeSized' 3 (choose(0,3)))
mirrorP).
*)
===> [Success, Success, Success, Success, Success, Success, Success, Success, Success, Success, Success]
Let's instead try defining a bad property and see if we can detect that it's bad...
Definition faultyMirrorP (t : Tree nat) := (mirror t) = t ?.
(*
Sample (CheckerPlayground3.forAll
(genTreeSized' 3 (choose(0,3)))
faultyMirrorP).
*)
(*
Sample (CheckerPlayground3.forAll
(genTreeSized' 3 (choose(0,3)))
faultyMirrorP).
*)
===> [Failure, Success, Failure, Success, Success, Success, Failure, Success, Failure, Failure, Success]
There's only one little issue: What are the tests that are failing? We can tell by looking at the samples that the property is bad, but we can't see the counterexamples!
Inductive Result :=
| Success : Result
| Failure : ∀ {A} `{Show A}, A → Result.
Instance showResult : Show Result :=
{
show r := match r with
Success ⇒ "Success"
| Failure a ⇒ "Failure: " ++ show a
end
}.
Definition Checker := G Result.
| Success : Result
| Failure : ∀ {A} `{Show A}, A → Result.
Instance showResult : Show Result :=
{
show r := match r with
Success ⇒ "Success"
| Failure a ⇒ "Failure: " ++ show a
end
}.
Definition Checker := G Result.
The failure cases in the bool and Dec checkers don't need to
record anything except the Failure, so we put tt (the sole
value of type unit) as the "failure reason."
Instance checkableBool : Checkable bool :=
{
checker b := if b then ret Success else ret (Failure tt)
}.
Instance checkableDec `{P : Prop} `{Dec P} : Checkable P :=
{
checker p := if P? then ret Success else ret (Failure tt)
}.
{
checker b := if b then ret Success else ret (Failure tt)
}.
Instance checkableDec `{P : Prop} `{Dec P} : Checkable P :=
{
checker p := if P? then ret Success else ret (Failure tt)
}.
The interesting case is the forAll combinator. Here, we do have some useful information to record in the failure case -- namely, the argument that caused the failure.
Definition forAll {A B : Type} `{Show A} `{Checkable B}
(g : G A) (f : A → B)
: Checker :=
a <- g ;;
r <- checker (f a) ;;
match r with
Success ⇒ ret Success
| Failure b ⇒ ret (Failure (a,b))
end.
(g : G A) (f : A → B)
: Checker :=
a <- g ;;
r <- checker (f a) ;;
match r with
Success ⇒ ret Success
| Failure b ⇒ ret (Failure (a,b))
end.
Note that, rather than just returning Failure a, we package up a together with b, which explains the reason for the failure of f a. This allows us to write several forAlls in sequence and capture all of their results in a nested tuple.
(*
Sample (CheckerPlayground4.forAll
(genTreeSized' 3 (choose(0,3)))
faultyMirrorP).
*)
Sample (CheckerPlayground4.forAll
(genTreeSized' 3 (choose(0,3)))
faultyMirrorP).
*)
===> [Failure: (Node (2) (Node (3) (Node (2) (Leaf) (Leaf)) (Leaf)) (Node (0) (Node (2) (Leaf) (Leaf)) (Leaf)), tt), Success, Failure: (Node (2) (Node (3) (Node (3) (Leaf) (Leaf)) (Leaf)) (Leaf), tt), Success, Success, Success, Failure: (Node (1) (Node (2) (Leaf) (Node (3) (Leaf) (Leaf))) (Node (0) (Leaf) (Node (1) (Leaf) (Leaf))), tt), Success, Failure: (Node (3) (Node (0) (Node (0) (Leaf) (Leaf)) (Leaf)) (Node (0) (Leaf) (Node (2) (Leaf) (Leaf))), tt), Failure: (Node (2) (Node (2) (Node (0) (Leaf) (Leaf)) (Leaf)) (Node (1) (Leaf) (Node (2) (Leaf) (Leaf))), tt), Success]
Sampling repeatedly from a generator is just what the QuickChick command does, except that, instead of running a fixed number of tests and returning their results in a list, it runs tests only until the first counterexample is found.
(*
QuickChick
(forAll
(genTreeSized' 3 (choose(0,3)))
faultyMirrorP).
*)
QuickChick
(forAll
(genTreeSized' 3 (choose(0,3)))
faultyMirrorP).
*)
===> QuickChecking (forAll (genTreeSized' 3 (choose (0, 3))) faultyMirrorP) Node (0) (Node (0) (Node (2) (Leaf) (Leaf)) (Node (1) (Leaf) (Leaf))) (Node (1) (Node (0) (Leaf) (Leaf)) (Leaf)) *** Failed after 1 tests and 0 shrinks. (0 discards)
However, these counterexamples themselves still leave something to be desired: they are all much larger than is really needed to illustrate the bad behavior of faultyMirrorP.
Shrinking
Here is a shrinker for colors.
Instance shrinkColor : Shrink color :=
{
shrink c :=
match c with
| Red ⇒ [ ]
| Green ⇒ [ Red ]
| Blue ⇒ [ Red; Green ]
| Yellow ⇒ [ Red; Green; Blue ]
end
}.
{
shrink c :=
match c with
| Red ⇒ [ ]
| Green ⇒ [ Red ]
| Blue ⇒ [ Red; Green ]
| Yellow ⇒ [ Red; Green; Blue ]
end
}.
Writing a shrinking instance for trees is equally straightforward: we don't shrink Leafs, while for Nodes we can either return the left or right subtree, or shrink the payload, or shrink one of the subtrees.
Open Scope list.
Fixpoint shrinkTreeAux {A}
(s : A → list A) (t : Tree A)
: list (Tree A) :=
match t with
| Leaf ⇒ []
| Node x l r ⇒ [l] ++ [r] ++
map (fun x' ⇒ Node x' l r) (s x) ++
map (fun l' ⇒ Node x l' r) (shrinkTreeAux s l) ++
map (fun r' ⇒ Node x l r') (shrinkTreeAux s r)
end.
Instance shrinkTree {A} `{Shrink A} : Shrink (Tree A) :=
{| shrink x := shrinkTreeAux shrink x |}.
Fixpoint shrinkTreeAux {A}
(s : A → list A) (t : Tree A)
: list (Tree A) :=
match t with
| Leaf ⇒ []
| Node x l r ⇒ [l] ++ [r] ++
map (fun x' ⇒ Node x' l r) (s x) ++
map (fun l' ⇒ Node x l' r) (shrinkTreeAux s l) ++
map (fun r' ⇒ Node x l r') (shrinkTreeAux s r)
end.
Instance shrinkTree {A} `{Shrink A} : Shrink (Tree A) :=
{| shrink x := shrinkTreeAux shrink x |}.
With shrinkTree in hand, we can use the forAllShrink property combinator, a variant of forAll that takes a shrinker as an additional argument, to test properties like faultyMirrorP.
(*
QuickChick
(forAllShrink
(genTreeSized' 5 (choose (0,5)))
shrink
faultyMirrorP).
*)
QuickChick
(forAllShrink
(genTreeSized' 5 (choose (0,5)))
shrink
faultyMirrorP).
*)
===> Node (0) (Leaf) (Node (0) (Leaf) (Leaf)) *** Failed! After 1 tests and 8 shrinks
Putting it all Together
Class Gen (A : Type) :=
{
arbitrary : G A
}.
{
arbitrary : G A
}.
Intuitively, Gen is a way of uniformly packaging up the generators for various types so that we do not need to remember their individual names -- we can just call them all arbitrary.
Instance gen_color : Gen color :=
{
arbitrary := genColor
}.
{
arbitrary := genColor
}.
Next, for convenience we package Gen and Shrink together into an Arbitrary typeclass that is a subclass of both.
Class Arbitrary (A : Type) `{Gen A} `{Shrink A}.
(The empty "body" of Arbitrary is elided.)
We can use the top-level QuickChick command on quantified propositions with generatable and decidable conclusions, stating just the property and letting the typeclass machinery figure out the rest.
Conjecture every_color_is_red : ∀ c, c = Red.
Since we have already defined Gen and Shrink instances for
color, we automatically get an Arbitrary instance. The Gen
part is used by the checker instance for "forall" propositions to
generate random color arguments.
To show that the conclusion is decidable, we need to define a
Dec instance for equality on colors.
Instance eq_dec_color (x y : color) : Dec (x = y).
Proof. dec_eq. Defined.
Proof. dec_eq. Defined.
(* QuickChick every_color_is_red. *)
===> QuickChecking every_color_is_red Green *** Failed after 1 tests and 1 shrinks. (0 discards)
Sized Generators
Definition sized {A : Type} (f : nat → G A) : G A :=
MkG
(fun n r ⇒
match f n with
MkG g ⇒ g n r
end).
MkG
(fun n r ⇒
match f n with
MkG g ⇒ g n r
end).
To streamline assembling generators, it is convenient to introduce one more typeclass, GenSized, whose instances are sized generators.
Class GenSized (A : Type) :=
{
arbitrarySized : nat → G A
}.
{
arbitrarySized : nat → G A
}.
We can then define a generic Gen instance for types that have a
GenSized instance, using sized:
Instance GenOfGenSized {A} `{GenSized A} : Gen A :=
{
arbitrary := sized arbitrarySized
}.
{
arbitrary := sized arbitrarySized
}.
Instance genTree {A} `{Gen A} : GenSized (Tree A) :=
{| arbitrarySized n := genTreeSized' n arbitrary |}.
{| arbitrarySized n := genTreeSized' n arbitrary |}.
Finally, with the Arbitrary instance for trees, we can supply just faultyMirrorP to the QuickChick command.
(* QuickChick faultyMirrorP. *)
Automation
Derive Arbitrary for Tree.
===> GenSizedree is defined ===> ShrinkTree is defined
Print GenSizedTree.
Print ShrinkTree.
Derive Show for Tree.
Print ShrinkTree.
Derive Show for Tree.
===> ShowTree is defined
Print ShowTree.
Collecting Statistics
Check @collect.
===> @collect : forall A prop : Type, Show A -> Checkable prop -> A -> prop -> Checker
For example, suppose we measure the size of Trees like this:
Fixpoint size {A} (t : Tree A) : nat :=
match t with
| Leaf ⇒ O
| Node _ l r ⇒ 1 + size l + size r
end.
match t with
| Leaf ⇒ O
| Node _ l r ⇒ 1 + size l + size r
end.
We can write a dummy property treeProp to collect the sizes of
the trees we are generating.
Definition treeProp (g : nat → G nat → G (Tree nat)) n :=
forAll (g n (choose (0,n))) (fun t ⇒ collect (size t) true).
(* QuickChick (treeProp genTreeSized 5). *)
forAll (g n (choose (0,n))) (fun t ⇒ collect (size t) true).
(* QuickChick (treeProp genTreeSized 5). *)
===> 4947 : 0 1258 : 1 673 : 2 464 : 6 427 : 5 393 : 3 361 : 7 302 : 4 296 : 8 220 : 9 181 : 10 127 : 11 104 : 12 83 : 13 64 : 14 32 : 15 25 : 16 16 : 17 13 : 18 6 : 19 5 : 20 2 : 21 1 : 23 +++ OK, passed 10000 tests
Compare this with genTreeSized'.
(* QuickChick (treeProp genTreeSized' 5). *)
===> 1624 : 0 571 : 10 564 : 12 562 : 11 559 : 9 545 : 8 539 : 14 534 : 13 487 : 7 487 : 15 437 : 16 413 : 6 390 : 17 337 : 5 334 : 1 332 : 18 286 : 19 185 : 4 179 : 20 179 : 2 138 : 21 132 : 3 87 : 22 62 : 23 19 : 24 10 : 25 6 : 26 2 : 27 +++ OK, passed 10000 tests
Exercise: 2 stars, standard (genTreeSized'')
Write a generator genTreeSized'' that generates fewer empty trees. Check your results using collect.
(* FILL IN HERE *)
☐
☐
Dealing with Preconditions
Fixpoint sorted (l : list nat) :=
match l with
| [] ⇒ true
| x::xs ⇒ match xs with
| [] ⇒ true
| y :: ys ⇒ (x <=? y) && (sorted xs)
end
end.
Fixpoint insert (x : nat) (l : list nat) :=
match l with
| [] ⇒ [x]
| y::ys ⇒ if x <=? y then x :: l
else y :: insert x ys
end.
match l with
| [] ⇒ true
| x::xs ⇒ match xs with
| [] ⇒ true
| y :: ys ⇒ (x <=? y) && (sorted xs)
end
end.
Fixpoint insert (x : nat) (l : list nat) :=
match l with
| [] ⇒ [x]
| y::ys ⇒ if x <=? y then x :: l
else y :: insert x ys
end.
Definition insert_spec (x : nat) (l : list nat) :=
sorted l ==> sorted (insert x l).
Definition test_insert_spec :=
forAll (choose (0,10)) (fun x ⇒
forAll (listOf (choose (0,10))) (fun l ⇒
insert_spec x l)).
(* QuickChick test_insert_spec. *)
sorted l ==> sorted (insert x l).
Definition test_insert_spec :=
forAll (choose (0,10)) (fun x ⇒
forAll (listOf (choose (0,10))) (fun l ⇒
insert_spec x l)).
(* QuickChick test_insert_spec. *)
===> QuickChecking test_insert_spec +++ Passed 10000 tests (17325 discards)
But the wasted effort is the least of our problems! Let's take a peek at the distribution of the lengths of generated lists using collect.
Definition insert_spec' (x : nat) (l : list nat) :=
collect (List.length l) (insert_spec x l).
Definition test_insert_spec' :=
forAll (choose (0,10)) (fun x ⇒
forAll (listOf (choose (0,10))) (fun l ⇒
insert_spec' x l)).
collect (List.length l) (insert_spec x l).
Definition test_insert_spec' :=
forAll (choose (0,10)) (fun x ⇒
forAll (listOf (choose (0,10))) (fun l ⇒
insert_spec' x l)).
(* QuickChick test_insert_spec'. *)
===> QuickChecking insert_spec' 3652 : (Discarded) 7 3582 : 0 3568 : (Discarded) 6 3542 : 1 3469 : (Discarded) 5 3200 : (Discarded) 4 2830 : (Discarded) 3 1842 : 2 1606 : (Discarded) 2 716 : 3 248 : 4 61 : 5 8 : 6 1 : 7 +++ Passed 10000 tests (18325 discards)The vast majority of inputs have length 2 or less, while the larger lists are almost always discarded!
When dealing with properties with preconditions, it is common practice to write custom generators for well-distributed random data that satisfy the property.
Fixpoint genSortedList (low high : nat) (size : nat)
: G (list nat) :=
match size with
| O ⇒ ret []
| S size' ⇒
if high <? low then
ret []
else
freq [ (1, ret []) ;
(size, x <- choose (low, high);;
xs <- genSortedList x high size';;
ret (x :: xs)) ] end.
: G (list nat) :=
match size with
| O ⇒ ret []
| S size' ⇒
if high <? low then
ret []
else
freq [ (1, ret []) ;
(size, x <- choose (low, high);;
xs <- genSortedList x high size';;
ret (x :: xs)) ] end.
We use a size parameter as usual to control the length of generated
lists.
Finally, we can use forAllShrink to define a property using the new generator:
Definition insert_spec_sorted (x : nat) :=
forAllShrink
(genSortedList 0 10 10)
shrink
(fun l ⇒ insert_spec' x l).
forAllShrink
(genSortedList 0 10 10)
shrink
(fun l ⇒ insert_spec' x l).
QuickChick insert_spec_sorted.
===> QuickChecking insert_spec_sorted 947 : 0 946 : 4 946 : 10 938 : 6 922 : 9 916 : 2 900 : 7 899 : 3 885 : 8 854 : 5 847 : 1 +++ Passed 10000 tests (0 discards)
Does this mean we are happy?
Exercise: 5 stars, standard, optional (uniform_sorted)
Using "collect", find out whether generating a sorted list of numbers between 0 and 5 is uniform in the frequencies with which different numbers are found in the generated lists.
(* FILL IN HERE *)
☐
☐
Another Precondition: Binary Search Trees
Fixpoint isBST (low high: nat) (t : Tree nat) :=
match t with
| Leaf ⇒ true
| Node x l r ⇒ (low <? x) && (x <? high)
&& (isBST low x l) && (isBST x high r)
end.
match t with
| Leaf ⇒ true
| Node x l r ⇒ (low <? x) && (x <? high)
&& (isBST low x l) && (isBST x high r)
end.
Fixpoint insertBST (x : nat) (t : Tree nat) :=
match t with
| Leaf ⇒ Node x Leaf Leaf
| Node x' l r ⇒ if x <? x' then Node x' (insertBST x l) r
else Node x' l (insertBST x r)
end.
match t with
| Leaf ⇒ Node x Leaf Leaf
| Node x' l r ⇒ if x <? x' then Node x' (insertBST x l) r
else Node x' l (insertBST x r)
end.
We would expect that if we insert an element that is within
the bounds low and high into a binary search tree, then
the result is also a binary search tree.
Definition insertBST_spec (low high : nat) (x : nat) (t : Tree nat) :=
(low <? x) ==>
(x <? high) ==>
(isBST low high t) ==>
isBST low high (insertBST x t).
(low <? x) ==>
(x <? high) ==>
(isBST low high t) ==>
isBST low high (insertBST x t).
(* QuickChick insertBST_spec. *)
===> QuickChecking insertBST_spec 0 5 4 Node (4) (Leaf) (Leaf) *** Failed after 85 tests and 1 shrinks. (1274 discards)
However we are wasting too much testing effort. Indeed, if we fix the bug ...
Fixpoint insertBST' (x : nat) (t : Tree nat) :=
match t with
| Leaf ⇒ Node x Leaf Leaf
| Node x' l r ⇒ if x <? x' then Node x' (insertBST' x l) r
else if x' <? x then Node x' l (insertBST' x r)
else t
end.
Definition insertBST_spec' (low high : nat) (x : nat) (t : Tree nat) :=
(low <? x) ==> (x <? high) ==> (isBST low high t) ==>
isBST low high (insertBST' x t).
match t with
| Leaf ⇒ Node x Leaf Leaf
| Node x' l r ⇒ if x <? x' then Node x' (insertBST' x l) r
else if x' <? x then Node x' l (insertBST' x r)
else t
end.
Definition insertBST_spec' (low high : nat) (x : nat) (t : Tree nat) :=
(low <? x) ==> (x <? high) ==> (isBST low high t) ==>
isBST low high (insertBST' x t).
... and try again...
(* QuickChick insertBST_spec'. *)