Level up your property based testing library

In the previous post, I talked about implementing a minimal property-based testing library. You can check it if you want to know how QuickCheck and PropEr work. This post is a follow-up where I’ll try to explain how some auxiliary features are implemented. Keep in mind that the code snippets are simplifications of the original ones.

Generating functions with Quickcheck

Quickchek can check higher-order functions. These are functions that take other functions as arguments or that return new functions as results. Let’s say we have a function foldBelowThreshold. It takes a function Integer -> Integer, an array of integers, and a threshold number. It applies the function to each element of the array then it sums the results. It returns the sum only if it’s smaller than the threshold.

A property for foldBelowThreshold can be:

propCheckBelowThreshold :: Fun Integer Integer -> Bool
propCheckBelowThreshold (Fun _ f) = foldBelowThreshold(f, [8, 23, 3], 200) < 200

To check this, QuickCheck generates many functions f and checks the property with each of them.

Another example from the library tests is prop:

prop :: Fun String Integer -> Bool
prop (Fun _ f) = f "snake" == f "tiger" || f "tiger" == f "elephant"

It checks whether every function f :: String -> Integer that takes a string and returns an integer satisfies an equality check.

Running quickCheck gives us:

*** Failed! Falsified (after 3 tests and 134 shrinks):
{"elephant"->1, "snake"->1, _->0}

The counterexample is a function that returns 1 when the input is "elephant" or "snake", and 0 when it’s "tiger".

The philosophy

A function is a set of input/output pairs, or range/domain pairs if we are to use the right terms. Creating a new function is all about generating a new such set. This set can be very big when the input space is huge. Think about a String or a list as an input type. What we can do is to partition the space. We say that one-character strings will return 1. Two-character strings will return 4, and so on…

Creating functions this way won’t be practical enough for testing. It’s too simple to expose bugs. We need variability. QuickCheck simplifies these types using a divide-and-conquer approach. A string is a list of characters. A character can be represented as a Word8, which is between 0 and 255. Word8 is an 8-bit bounded unsigned integer that can be mapped to an integer. It might not be mapped to a static integer, but it can be an input to a random numbers generator (and yes random number generators need variability too).

Given a string, a generated function will go through it character by character. Each character contributes to the return value. A character is mapped to a Word8. This number increases the variability of the return value.

We reduce the number of output values during shrinking by making more and more inputs return the same output. A maximally shrunk function will return 0 for every value of String. Shrinking the function is all about shrinking the table that maps every input to an output.

A function that takes a one-character string can be thought of as a function that takes a character and a table of characters/outputs. It looks for the character in the table and returns the value associated with it.

The same function, when it takes a two-character string, can be thought of as a function that also takes a one-character string (the first one) and a table of characters/outputs. It does the same. It looks for the occurrence of the one-character string in the table and it returns the value associated with it. But this time, the value will be itself a table. It’s the table that we pass to the function that accepts one-character strings. Same for multiple-character strings, the function is always the same. It’s the table that differs.

To implement all this, QuickCheck structures functions using the required type. f, from the definition of prop above, is the reification of a function structure. And then to shrink a function, we shrink the structure. We pass a null table (a table that returns the same result for all the characters) instead of the multi-output normal table. A shrunk function structure will produce a function that ignores the last characters of the string. f "Hello World!" will be the same as f "Hello".

If the function continues failing as the structure shrinks, we’ll end up with a function structure that is equivalent to the null table. It’ll always return f "" for any string.

Generating a function

The function operator in Haskell is ->. We can write a function type as a -> b or as ((->) a) b. For QuickCheck to be able to create instances for a type, the latter should implement the class Arbitrary. It should define two methods arbitrary and shrink that create and shrink instances.

a -> b implements Arbitrary:

instance (CoArbitrary a, Arbitrary b) => Arbitrary (a -> b) where
  arbitrary = do
    eval <- delay
    return (liftM eval (\a -> coarbitrary a arbitrary))

The implementation creates a function (\a -> ...) that builds a b generator using arbitrary. Then, it calls coarbitrary a to build another b generator.

This inlined function, whose type is a -> Gen b is lifted using eval into a function whose type is a -> b.

delay generates functions of the type Gen b -> b:

delay :: Gen (Gen b -> b)
delay = MkGen (\r n g -> unGen g r n)

eval is one such function:

eval :: Gen b -> b
eval = \g -> unGen g aRandomNumGenerator aSize

This is the value of eval we’re using for the lifting with liftM. If we write a -> w as (((->) a) w), we can consider ((->) a) a Monad whose type argument is w.

The type of liftM is:

liftM :: (Gen b -> b) -> (((-> ) a) Gen b) -> ((-> a) b)

We can simplify the notation:

liftM :: (Gen b -> b) -> (a -> Gen b) -> (a -> b)

To sum up, we create a Gen b then we integrate a value of a into it. We make it depends on a. We define a function a -> Gen b in-line to provide the value of a. The value of a will be provided by the user. This value will be used to create a value of b through Gen b. We transform a -> Gen b into a -> b. Then, we call return to make it Gen (a -> b).

The last two steps are kind of “type plumbing”. We’re just fitting the structure to the type. We “delay” the evaluation of b from getting a first then getting a random number generator, to getting a random number generator, a size, and then a value of a.

This idea of delaying evaluation is widespread in functional codebases. It’s implemented either by changing the order of the arguments or by introducing proxy functions, that is functions that take some arguments and pass them as they are to another function.

It’s quite an interesting pattern of state encapsulation. Have the values, do not use them directly (keep them attached to a monad for example), wait for a trigger (another value for example), then inject and evaluate. You can also create other “delays” or wrap the whole process inside other monads. Such monads can modify the encapsulated values, or they can transform/ignore the trigger we’re waiting for.

It’s interesting also to compare this to the OOP idioms of implementing state encapsulation. There, we keep the state local and we handle async messages. The boundaries are explicit. And whoever manages the state decides what should be done: Fewer things move under the carpet, which might mean more code, but also fewer structure transformations that serve only the purpose of adhering to an interface, which means less time debugging and exploring the logic.

It’s a question of how we want to handle the logic and the representation that falls outside our mental model and outside the implicit design decisions we make.

Back to coding!

To implement what we talked about previously, a implements CoArbitrary. The motivation behind CoArbitrary a is to create functions a -> b that return different output values for different input values.

CoArbitrary a should define:

coarbitrary :: a -> Gen b -> Gen b

No matter which type b is, coarbitrary takes a value of a and a generator of b. It returns a new generator of b that assimilates a.

A question might arise here. If we already have the value of a, and we want the value of b for it, why isn’t the signature:

coarbitrary :: a -> Gen b -> b

?

and why not:

coarbitrary :: Gen b -> a -> b

?

The return type is Gen b instead of b because a alone is not enough to create b. We need to generate a b value from the b generator. The generation requires a random number generator.

It’s the delay pattern again.

If we want a b result, we can define coarbitrary as:

coarbitrary :: a -> QCGen -> Gen b -> b

With this definition, coarbitrary needs to peek inside Gen definition and call unGen to generate a b. This is more-or-less what delay and eval are doing. They’re providing b generator with a random number generator and a size.

All coarbitrary definitions for the types handled by QuickCheck follow the same pattern. Each tries to deconstruct a given value recursively.

During each recursive iteration, it returns:

variant n (coarbitrary simplifiedValue givenGenerator)

The last iteration returns:

variant n givenGenerator

simplifiedValue is the next simplified value. Given an array, the definition takes the head and calls itself with the tail. Given an Either or a Maybe, it calls itself with the wrapped value.

n is usually 0 or 1. It depends on the variants of the type. For Either, it’s 0 for Left and 1 for Right. For a list, it’s 0 for an empty list and 1 for a non-empty one.

The documentation says:

  -- You should use 'variant' to perturb the random generator;
  -- the goal is that different values for the first argument will
  -- lead to different calls to 'variant'.
  -- ...
  -- The logic behind 'variant'. Given a random number seed, and an integer, uses
  -- splitting to transform the seed according to the integer.

It’s defined as:

variant :: Integral n => n -> Gen a -> Gen a
variant k (MkGen g) = MkGen (\r n -> g (integerVariant (toInteger k) $! r) n)

The body of variant creates a generator whose unGen returns a value generated by the given generator g. The tricky part is the generation of the random number generator that’ll be passed.

It’s not the regular generator.

integerVariant takes the variation number and the regular generator.

 -- The $! operator ensures that the second argument r is evaluated to weak head
 -- normal form (WHNF) before being passed to the function integerVariant.
 -- This means that the value of r is fully evaluated (to the outermost constructor)
 -- before the function is applied.

Making sure r, the random number generator, is evaluated to WHNF guarantees that the same string will give the same returned value. Splitting it in the same way will give the same random numbers.

integerVariant documentation says:

  -- Use one bit to encode the sign, then use Elias gamma coding
  -- (https://en.wikipedia.org/wiki/Elias_gamma_coding) to do the rest.
  -- Actually, the first bit encodes whether n >= 1 or not;
  -- this has the advantage that both 0 and 1 get short codes.

A random number generator can be split into two generators. This is a feature of RandomGen in Haskell.

Elias gamma coding works for numbers greater or equal to 1. integerVariant uses 1 - n if the value of n we pass in the first argument is less than 1.

The goal of this function is to inject an amount of variability equivalent to the number we pass in. Elias gamma coding transforms a number into a bit stream. The function takes the given generator and follows the bit stream, splitting the generator for each bit, taking the left split if the bit is 0, and taking the right split if it’s 1. The last split is returned.

Structuring a function

Function a is a class of types a that can be an input to a function structure.

To instantiate Function for a type a, we define a function named function for it. It’ll structure a function a -> b into a :-> b:

function :: (a -> b) -> (a :-> b)

b can be any type. It does not matter as long as we can calculate it by providing the right value of a to the input function a -> b.

:-> is a data type that takes two type arguments, a and b. We can write it either as a :-> b or as ((:-> ) a) b. The result is a structure a :-> b that generates as many functions a -> b as we want.

QuickCheck offers six constructors to create values of such type:

data a :-> c where
  Pair  :: (a :-> (b :-> c)) -> ((a,b) :-> c)
  (:+:) :: (a :-> c) -> (b :-> c) -> (Either a b :-> c)
  Unit  :: c -> (() :-> c)
  Nil   :: a :-> c
  Table :: Eq a => [(a,c)] -> (a :-> c)
  Map   :: (a -> b) -> (b -> a) -> (b :-> c) -> (a :-> c)

If a is String and c is Integer, we can create a value of String :-> Integer:

Table [("Hello", 5), ("world", 8)]

This is not how QuickCheck structures a function String :-> Integer. A function structure can be a huge recursive construct where the leaves are non-recursive values created with Unit, Nil, or Table.

The library structures a function that takes a String and returns an Integer as a Map.

The Map constructor maps a structure a :-> b to a new one y :-> b. y is a different, but equivalent, type that represents a. String :-> Integer is mapped to Either () (Char, String) :-> Integer. The left side of Either handles empty strings. Its right side handles non-empty ones. A string is represented as a pair of its first character and its remaining characters.

You can take a look again at the definition of Map constructor above. We create it by providing a function String -> Either () (Char, String), a function Either () (Char, String) -> String, and a structure Either () (Char, String) :-> Integer.

We create a function structure by transforming a function b -> c into b :-> c. Quickcheck needs a way to map a String into an Either value. It uses this mapping during reification. It calls the reification of b :-> c with the result of transforming a value of a using the first function.

But, the function we want to structure initially is String -> Integer. Even the function Either () (Char, String) -> Integer should get its returned value from the initial one. That’s why we pass the second function. Internally, the function we use to create the structure b :-> c maps an Either to a String and calls the initial function.

To structure Either () (Char, String) :-> Integer, function structures each of () and (Char, String) separately. It builds two structures: one from an empty string to an integer () :-> Integer, and one from a non-empty string to an integer (Char, String) :-> Integer. Then, it joins them with a :+: constructor.

That is:

structureOfEmptyString :+: structureOfNonEmptyString

To structure () :-> Integer, function createas uses Unit, that is Unit f (). f is the function String -> Integer, or a -> b, we pass to function in the beginning.

To structure(Char, String) :-> Integer, function creates a Pair structure. As the definition of Pair states, we need a structure Char :-> (String :-> Integer) with three types.

Such a wrapped structure is created with Table constructor. The elements are pairs. Each left side contains a character between Word8 minBound and its maxBound. Each right side contains a structure String :-> Integer.

This latter is the same Map structure we started with in the first place. Again, we’ll map it into a structure Either () (Char, String) :-> Integer. Then, we’ll split this one into () :-> Integer and (Char, String) :-> Integer. The difference is in the function we’ll use now to build these two structures.

During the first iteration, it was f (h b). f is the initial function we generate and pass function. h transforms an Either () (Char, String) into a String. b is the input of the function Either () (Char, String) -> Integer.

Here, it’ll be f1 (h b). h and b are the same. f1 is currentCharacter -> f (firstCharcter ++ currentCharacter). firstCharacter is the character we used in the previous structure. That is the first character of the string.

The structure we get at the end is huge and recursive, an infinite big tree. Each character creates a table with all the possible characters. And inside each element of that table, there’s a new table, also with all possible characters. And so it on.

Reifying a function structure

Fun is defined as:

data Fun a b = Fun (a :-> b, b, Shrunk) (a -> b)

mkFun creates Fun values:

mkFun :: (a :-> b) -> b -> Fun a b
mkFun p d = Fun (p, d, NotShrunk) (abstract p d)

The first argument (a:->b, b, Shrunk) is used to display the function and to orient shrinking. We need it because a -> b is not showable.

The second argument is the reified function. We extract and use it in the body of the property. abstract takes a function structure a :-> b and a value of the return type b. It returns a function a -> b.

abstract, when the given a Map structure, is:

abstract (Map g _ p) d x = abstract p d (g x)

Map is a bridge between the function we want to structure and an equivalent function that QuickCheck knows how to structure.

\x -> abstract p d (g x) is the function returned by abstract p d. It takes a string x and returns an integer. g x maps the string input into a value Either () (Char, String). If x is the string "go!", g x will be Right ("g", "o!"). If it’s an empty string, g x will be Left (). abstract p d is the function that reifies the structure Either () (Char, String) :-> Integer. It’s a function that takes an Either () (Char, String) and returns an integer.

For :+:, which structures a function that takes a value of Either, abstract is:

abstract (p :+: q)   d exy   = either (abstract p d) (abstract q d) exy

It depends on exy, the value of type Either. If it’s Left, the first structure is reified. If it’s Right, the second one is picked for reification.

For a Pair, it’s:

abstract (Pair p)    d (x,y) = abstract (fmap (\q -> abstract q d y) p) d x

The type of p is a :-> (b :-> c). The body first reifies the inner structure b :-> c and gets a function b -> c. Then, it reifies the structure a :-> (b -> c) and gets a function a -> (b -> c).

The implementation of fmap for :-> applies the function to the second argument. Using fmap f on a structure a :-> b gives a result of type a :-> (f b). fmap (\q -> abstract q d y) p takes care of the inner function.

Other implementations are:

abstract (Unit c)    _ _     = c
abstract Nil         d _     = d
abstract (Table xys) d x     = head ([y | (x',y) <- xys, x == x'] ++ [d])

The second argument d, a value of the return type, is used only when the structure is Nil. A nil structure returns a randomly generated value.

For Unit, the value inside the structure is returned no matter what the input is.

For Table, we search for a pair whose left side is the input and we return its right side.

Initially (before shrinking), the function returned by abstract is equivalent to the generated function f.

For our structure String :-> Integer, the reified function will be a similarly huge recursive structure. But, as only the needed path is evaluated, for the input "go!", the runtime evaluation will be:

result = (\char3Input -> char2Input -> char1Input -> f (char1Input:(char2Input:(char3Input:[])))) "!" "o" "g"

This is because inside abstract, when the structure is Pair, the inner structure is reified and called with the tail of the string (y in the definition). Then the outer structure is reified and called with the first character of the string, x in the definition.

This is another occurrence of the delay pattern. Here it’s implemented with recursion instead of monads. The evaluation is delayed until the characters of the string are collected. When all characters are there, the function is evaluated.

Checking a property (How it all fits together)

For QuickCheck to generate an instance of Fun for testing, Fun needs to instantiate the class Arbitrary.

It should provide:

  • a function arbitrary that returns a generated instance Gen (Fun a b)
  • a function shrink that returns a list of simplified Fun values given a Fun instance that fails the property.

The complete definition of Arbitrary for Fun is:

instance (Function a, CoArbitrary a, Arbitrary b) => Arbitrary (Fun a b) where
  arbitrary =
    do p <- arbitrary
       d <- arbitrary
       return (mkFun p d)

  shrink (Fun (p, d, s) f) =
    [ mkFun p' d' | (p', d') <- shrink (p, d) ] ++ [ Fun (p, d, Shrunk) f | s == NotShrunk ]

Inside arbitrary, we compute p and d by calling arbitrary twice. p creates a value Gen (a :-> b), or Gen (String :-> Integer). d creates a value Gen Integer.

Then, we create a result Gen (Fun String Integer) using mkFun.

shrink can be written also as:

  shrink (Fun (p, d, s) f) =
    [ Fun (p', d', NotShrunk) (abstract p' d') | (p', d') <- shrink (p, d) ]
    ++
    [ Fun (p, d, Shrunk) f | s == NotShrunk ]

In deconstructs a given instance of Fun into:

  • p, generated a :-> b instance
  • d, generated b instance
  • s, Shrunk/NotShrunk value,
  • f, generated function a -> b

To shrink an instance, QuickCheck creates two lists, one for NotShrunk instances, that is instances that can be shrunk further and one for Shrunk instances.

The second list is a singleton list. It contains the value we want to shrink as-is but with Shrunk instead of NotShrunk as a third argument.

To create the first list, the function shrinks the pair (p, d). This returns a list of pairs, each containing a simplified function structure a :-> b and a simpler value for the result type b. The library then shrinks each pair and creates a Fun from each result of pair-shrinking.

For shrinking the pair (p', d'), QuickCheck first shrinks p', gets a list of a :-> b, and creates a list where each of these shrunk structures is paired with the original d'. Then it does the same for d'. It creates an array of shrunk values then it pairs each of them with the original value of p'.

Shrinking functions structures is all about inserting Nil in the middle of recursive structures to create smaller trees.

To shrink a :+: value:

shrinkFun shr (p :+: q) =
  [ p .+. Nil | not (q == Nil) ] ++
  [ Nil .+. q | not (p == Nil) ] ++
  [ p  .+. q' | q' <- shrinkFun shr q ] ++
  [ p' .+. q  | p' <- shrinkFun shr p ]
 where
  Nil .+. Nil = Nil
  p   .+. q   = p :+: q

For a value that can be either p or q, it generates many shrunk values. One with p and Nil, one with Nil and q, some with the initial p and with different shrunk values of q, and some with shrunk values of q and with the initial value of q.

For Unit:

shrinkFun shr (Unit c) = [ Nil ] ++ [ Unit c' | c' <- shr c ]

The result is an array of Unit values where each returns a shrunk value of the initial returned value.

Targeted testing with PropEr

PropEr follows the same logic for both normal and targeted testing. It generates, tests, shrinks, and loops. The abstractions are solid. The inputs and the outputs of each step are the same. The way PropEr represents types, generators, and shrinkers is good enough to model both scenarios seamlessly.

During normal testing, the library generates instances of candidates for testing randomly. It checks the property with big numbers, small and very small integers, and then big again, … During targeted Property-Based Testing, it follows a generation strategy. A set of rules pushes the values toward greater chances of finding failed instances.

The philosophy of targeted testing

The main components of targeted testing are the state and the target.

The state describes where we are: the current step, the size of steps when updating the state, and the current temperature value. The temperature is a value between 0 and 1. The number of steps is a value between 0 and the maximum number, which is 1000 by default. As the number of steps grows, the temperature approaches 0.

The target contains the generated instances. It orients the next generations.

To generate an integer, for example, we call:

make_inrange(LastGeneratedValue, Offset, Min, Max)

The next integer will either be LastGeneratedValue + Offset or LastGeneratedValue - Offset, as long as the result is in the range Min..Max. Offset is a function of the temperature. Min and Max are stored inside the integer type definition. We can override them.

The library provides a function to create Offset from a temperature, a scaling function. It maps a number between 0 and 1 to an offset we can add or subtract from a generated value. We can also define our scaling function.

The temperature, and thus the offset, changes when the fitness changes. We move the temperature to 0 by updating the fitness. If we provide a valid value for the fitness, the temperature is updated. We can think of fitness as the value we use to move the temperature.

Whether a given fitness is valid or not depends on the strategy. For hill-climbing, a value is valid if it’s bigger than the last value we used for the update. This is why, in the implementation of MAXIMAZE update, the fitness value is the last generated value.

Generation is managed by a neighborhood function instead of a generator. Such a function calls the raw type generator internally. However, it takes into consideration the last generated value and the current temperature when returning an instance.

To shrink a value during targeted testing, we either shrink it using the normal-testing shrinker, or we provide generators that might also take the last generated value into account.

Targeted testing can be done with one or multiple workers. Multiple workers need communication. They need to synchronize the generation of instances so that each generation, from whichever worker, pushes the values toward a target. Such communication is performed with a server component, a shared worker that generates the values for all the testing workers.

A bridge between normal testing and a search strategy

Targeted testing is managed by a module named proper_target. This module implements a gen_server.

This behavior module provides the server of a client-server relation. A generic server process (gen_server) implemented using this module has a standard set of interface functions and includes functionality for tracing and error reporting. It also fits into an OTP supervision tree. For more information, see section gen_server Behaviour in OTP Design Principles.

As opposed to a normal worker, which acts as a peer in a distributed network, the target testing module acts as a server in a client/server architecture. It manages the access to a shared resource.

If we perform testing with one worker, we won’t need it to be a server. A worker with a local state or a module with stateless functions that take the state as arguments will be enough.

The server encapsulates the current state of generation. It maintains:

  • a structure named target that contains the current generation state
  • a structure named data that encapsulates some variables that guide the generation and a reference toward a strategy module.

data is initialized to:

#sa_data{k_max = Steps, p = get_acceptance_function(), temp_func = get_temperature_function()}

k_max is the maximum step count. It’s 1000 by default. Each temperature update is a new step. Steps after the maximum one are ignored.

The acceptance function is returned by get_acceptance_function() and it’s kept inside p. Such a function checks whether a new state (a new value of Fitness and Temperature) can build the next state or not. The temperature function, returned by get_temperature_function(), calculates the next temperature value. We’ll get back to these functions at the end.

target is initialized to:

{ok, InitialValue} = proper_gen:safe_generate(First),
#sa_target{first = First, next = Next, last_generated = InitialValue}

first is the input type of the property. This is the type before wrapping. last_generated contains the last generated instance. It’s initiated to a generated instance InitialValue.

next contains the function that’ll generate the next instance. It takes the last generated instance and the current temperature and returns a new instance. The last generated value orients the testing.

The targeted testing server acts as a bridge between normal testing logic and a search strategy. quickcheck() takes a search strategy as an option. It can be a reference to a module or an atom describing a primitive strategy. The default value is proper_sa. It denotes an internal module that implements Simulated Annealing.

Each message the server gets is a request for a value. It has an equivalent function in the strategy module. To respond to a request, the targeted testing module calls a function from the strategy module, passes the current state, and expects a return value and an updated state back. It returns the value and updates the local state.

Targeted generation

Before generating a new instance, PropEr wraps the initial input type using:

Target = proper_target:targeted(RawType),

Generation itself is similar during both normal and targeted testing. There’s always an input type that’s passed to proper_gen:safe_generate which generates an instance that PropEr uses to check the proprety. During normal testing, this is the type of the property input. We’ll call it the “raw type”. During targeted testing, it’s a custom type that wraps the raw type. We’ll call it the “wrapper type”.

This custom type delegates both generation and shrinking to the targeted testing server. Its generator looks like this:

TargetserverPid = get('$targetserver_pid'),
gen_server:call(TargetserverPid, gen)

The library sends a message gen to the targeted testing module and returns the response. $targetserver_pid is the pid of the server.

In the handler of a gen message, the targeted testing module uses the next function from the target structure to generate an instance.

This next function is created before the testing starts. To create it, the module looks for a replacer generator that works with the property raw type inside the list of generators it maintains. A “replacer generator”, or a “neighborhood function”, is the generator of the wrapper type.

The replacer generator of integers looks like this:

#{max := MaxD} = get(?GEN_NEXT_DEPTH),
Temp = if
          MaxD =:= 1 -> Temperature;
          true       -> 0.25 + 0.25 * 1/(1-MaxD) * Temperature * (Depth-1);
       end,
OffsetLimit = trunc(abs(Min - Max) * Temp * 0.1) + 1,
Offset = proper_arith:rand_int(-OffsetLimit, OffsetLimit),
make_inrange(Base, Offset, Min, Max)

This code fetches the values of Min and Max from the raw type structure, finds an offset, and then pushes the last generated value up or down within the offset range.

make_inrange takes a value, an offset, and a range. It returns either value + offset when value value + offset stays inside the range. It returns value - offset when value + offset falls outside. And if any of these additions falls outside the range, the lower or the higher bound of the range is returned.

Offset is a random number between -OffsetLimit and OffsetLimit. OffsetLimit grows as Temp gets bigger. That is, as we make more steps.

Temperature in the second line is Data#sa_data.temperature. It’s the current temperature in the server state.

Depth is the order of the current replacer generator in the recursive generation. A replacer generator might call other replacer generators. For example, to generate a list of integers, its replacer generator calls the integer replacer to generate each element. Depth is initially 1. It can incremented before the list generator generates a new element in the list. And, it’s always incremented at the beginning of the inner type replacer generator.

GEN_NEXT_DEPTH is a macro that evaluates to the atom proper_gen_next_cache_depth. It’s defined as:

-define(GEN_NEXT_DEPTH,  proper_gen_next_cache_depth).

The macro is a shorthand for the whole atom name. This reduces the probability of conflicts because the name is big and because it’s stored inside the process dictionary.

GEN_NEXT_DEPTH is initialized to 1. Temp will be the value of Temperature for the first generation. It’s incremented if we adjust the temperature by calling adjust_temperature. The latter is often called when a replacer generator uses a replacer generator of another type. To create a list of integers, the replacer generator of a list increases GEN_NEXT_DEPTH and calls the replacer generator of an integer to generate an element.

Each replacer generator creates a new type, we’ll call it the “replacer type”. To sum up, the generator inside the “wrapper type” uses the current temperature and the last instance to create a “replacer type”. The replacer generates an instance of this latter using proper_gen:safe_generate. The returned value is used to check the property.

The creation of a replacer type is done in two steps, one at the beginning during initialization and one during the generation of an instance.

The first looks like this:

{ok, Replacer} = get_replacer(InitialType),
UnrestrictedGenerator = Replacer(InitialType),
RestrictedGenerator = apply_constraints(UnrestrictedGenerator, InitialType),
TemperaturedGenerator = apply_temperature_scaling(RestrictedGenerator),

Replacer is the replacer selected from the maintained list. UnrestrictedGenerator is the generator that takes an instance and a temperature and returns a new instance.

apply_constraints makes sure the generated instance satisfies the constraints of the raw type. It creates a function that executes the replacer generator. If the value returned by this generator does not satisfy all the constraints. It tries to execute it again and get a new value. If it does satisfy, it returns it.

apply_temperature_scaling takes the generator and generates a new one itself. The new generator calls RestrictedGenerator with an incremented depth if this generation is not the root generation. It sets the depth to 1 otherwise.

The second step looks like this:

BaseType = new_type({ generator, () -> TemperaturedGenerator(Base, Temperature) }, wrapper),
{ok, Parameters} = proper_types:find_prop(parameters, InitialType),
Type = proper_types:with_parameters(Parameters, BaseType)

Base is the last generated value. Temperature is the current temperature. Type is the replacer type. TemperaturedGenerator is the generator created in the first step.

Targeted shrinking

The “wrapper type” is created as:

?SHRINK(
  proper_types:exactly(?LAZY(targeted_gen())),
  [get_shrinker(Type)]
)

The first argument of ?SHRINK is the “wrapper type”. The second argument is the list of shrinkers we’ll include in this type. That list contains only get_shrinker(Type) here.

Here are some examples from shrinking tests:

{?SHRINK(pos_integer(),[0]), ...},
{?SHRINK(float(),[integer(),atom()]), ...},

The second argument of ?SHRINK is the list of alternative generators.

%%%   ...the generators in `<List_of_alt_gens>' are first run to produce
%%%   hopefully simpler instances of the type. Thus, the generators in the
%%%   second argument should be simpler than the default. The simplest ones
%%%   should be at the front of the list, since those are the generators
%%%   preferred by the shrinking subsystem. Like the main `<Generator>', the
%%%   alternatives may also evaluate to a type, which is generated recursively.

get_shrinker is the response of sending the message shrinker to the targeted testing module:

TargetserverPid = get('$targetserver_pid'),
gen_server:call(TargetserverPid, shrinker)

The handler of this message returns either the raw type of the property or a new type created by evaluating user_nf property. user_nf is an option we can pass initially to quickcheck() if we want to override the default replacer generator/neighborhood function.

For example, when the user neighborhood function is:

?USERNF(exactly(0), fun (Base, _) -> Base + 1 end)

The first generated instance will be 0, the next will be 1, and so on. Each newly generated instance is an incremented value of the previous one.

Here is another occurrence of the “delay” pattern. The alternative types are created when shrinking. They’re selected at the moment of creating the “wrapper type”. For this operation to handle the list similarly, the alternative generator is encapsulated as the return of a function.

This macro is used:

-define(DELAY(X), fun() -> X end).

Another way to implement this is to create an aggregator. We put each value inside the moment it’s ready.

The type returned by get_shrinker is not used directly during shrinking.

The “wrapper type” shrinkers are unwrap_shrinker and alternate_shrinker. They unwrap the “wrapper type”, get a list of types, then look for the type of the target instance in this list. The alternative generator is the first element in the list. They are followed by the raw type of the property.

The type both shrinkers look for is called the “plausible type”. It splits the list of types in two.

unwrap_shrinker follows the same shrinking approach as normal testing. The “plausible type” is a type structure that contains some shrinkers, usually simple shrinkers. unwrap_shrinker uses them to simplify the failing instance.

alternate_shrinker generates values using the generators of the “plausible type” and the types that preceded it in the list of types generator. Each generated instance of these types is a candidate shrinking value of the wrapped type.

Manually pushing the values up

prop_int() ->
  ?FORALL_TARGETED(
    I,
    int_user_nf(),
    (I) ->
      ?MAXIMIZE(I),
      I < 500
  ).

?MAXIMIZE is called at the beginning of the body to push the generated values of I up. It sends a message update_fitness() to the targeted testing server. All interactions with targeted testing go through a message to the server. The handler calls update_fitness() to update the structure that contains the temperature and the fitness:

{ NewTarget, NewData } = Strategy:update_fitness(Fitness, Target, Data),

Target is the existing target structure. Data is the existing data structure. Fitness is the body of the message update_fitness. This value is set by the test author when he calls MAXIMIZE or MINIMIZE in the test body.

A test where we change the fitness might also look like this:

prop_target() ->                 % Try to check that
  ?EXISTS(Input, Params,         % some input exists
          begin                  % that fulfills the property.
            UV = SUT:run(Input), % Do so by running SUT with Input
            ?MAXIMIZE(UV),       % and maximize its Utility Value
            UV < Threshold       % up to some Threshold.
          end).

Here, Fitness is the value of UV.

update_fitness checks whether a state with this new fitness value is valid or not using the acceptance function. If the state is valid or during the first call to update_fitness, it creates a new temperature with the temperature function and updates the last generated value, the temperature, and the current step. If it’s not, it creates a new temperature with the temperature function and updates the temperature and the current step.

The acceptance function differs depending on the search strategy. During hill-climbing, it returns true when the new fitness value is bigger than the old. Otherwise, it returns false. During simulated annealing, it returns true when the new fitness value is bigger or when the probabilistic acceptance is bigger than a random probability threshold.

The random probabilistic value is a fixed value generated initially (rand:uniform())[https://www.erlang.org/doc/man/rand#uniform-0]. The acceptance value is:

1 / (1 + math:exp(abs(EnergyCurrent - EnergyNew) / Temperature))

The default temperature function, that is the one that’ll be used if no such function is set explicitly, handles both accepted and rejected states similarly. It moves the temperature to 0 as the current step gets bigger, and it increments the current step by 1:

{1.0 - min(1, K_Current / K_Max), K_Current + 1}.

K_Current is the current step. K_MAX is the number of search steps.

PropErp type server

Types in PropEr are structures. Due to the good meta-programming support, they can be passed also as strings.

%%% PropEr can parse types expressed in Erlang's type language and convert them
%%% to its own type format. Such expressions can be used instead of regular type
%%% constructors in the second argument of `?FORALL's.

The module implements a server as multiple testing workers can parse spec tests. Spec tests are signature-based properties. The signature of spec is parsed as a String. It’s then transformed into type structures by this module.

At the core, creating a spec test is:

Test = ?FORALL(
  Args,
  FinType,
  apply_spec_test(MFA, FunRepr, SpecTimeout, FalsePositiveMFAs, Args)
)
  • FinType is the result of converting the input type of the property to a PropEr type structure.
  • MFA is a structure that contains the module, the name, and the arity of the spec to test.
  • FunRepr is the entry inside the state exp_specs that parses the target spec.
  • SpecTimeout is a timeout that ensures the spec execution time is bounded.
  • FalsePositiveMFAs is a function that’s called with the result of the spec. The returned value of the function that implements the spec might have a different type than the spec return type. If such a value is tested and marked valid by FalsePositiveMFAs, the check succeeds.

apply_spec_test uses ?DELAY to create a function that takes no arguments and that evaluates apply(Mod, Fun, Args). The created function returns true when the type of the outcome of this application is the same as spec return type, or when FalsePositiveMFAs considers it valid if it has a different type.

The type server starts before checking a property and stops at the end. It maintains a state that contains the translated/parsed types, named cached. It’s a dictionary where the key is the raw type and the value is the final type. Given a type to translate, the module checks the cache first.

The state contains another dictionary for modules, named exp_types. The keys are module names. The values are references to the types inside the modules. Each type reference is composed of a type name (an atom) and an arity value. The intersection of module/name/arity identifies a type.

Erlang offers meta-programming utilities such as the functions inside erl_scan and erl_parse. erl_scan:string takes a string and tokenizes it. erl_parse:parse_form transforms the tokens into an abstract syntax tree.

The given string might be a valid expression but still not a type. parse_type tokenizes then parses a fake type definition created with the type:

"-type mytype() :: " ++ GivenString ++ "."

It deconstructs the AST tree returned by this and extracts the type expression.

The core logic of translating a string describing a type is:

{Mod,Str} = ImmType,
{ok,TypeForm} = parse_type(Str),
{ok,NewState} = add_module(Mod, State),
{ok,FinType, #state{cached = Cached} = FinalState} = convert(Mod, TypeForm, NewState),

add_module checks whether the module exists in the current state of the server. That is, if there’s an element in the dictionary exp_types for it.

If not, add_module parses the module. It saves its exported types and spec tests to its state. Using the same utilities used to parse a type, the library parses the module source code and searches for exported expressions.

It iterates over the abstract code using a foldl operator. It tries to match each element with one of the expected structures.

An exported type is detected using the first atom attribute:

{attribute,_Line,export_type,TypesList}

Here TypesList is added to the list of exported types inside the state.

A spec has the atom spec as a third argument:

{attribute,_Line,spec,{RawFunRef,[RawFirstClause | _Rest]}}

The function name and its arity are extracted from RawFunRef. Its range and domain are extracted from RawFunRef. Then a new entry is added to the dictionary of specs where the key is the pair {Name,Arity} and the value is the pair {Domain,Range}.

convert computes the updated state and the final type structure. The letter is a structure we can use to generate instances for a property using proper_gen:safe_generate(). The type server deconstructs the AST node TypeForm. Depending on the atom that represents the type name, it uses the same constructs the main module uses to define types.

If the given type is a list of integers, the server expects TypeForm to be:

{type,_,list,[ElemForm]}

and ElemFrom to be:

{integer,_,_Int}

To create a type structure, it uses:

proper_types:list(proper_types:exactly(Int))

which is inlined to:

new_type(
  [
    {generator, {typed, fun list_gen/2}},
    {is_instance, {typed, fun list_is_instance/2}},
    {
      internal_type,
      new_type([{env, Int}, {generator, {typed, fun exactly_gen/1}}], basic)
    },
  ],
  container
)
  • Say Hi
  • If you want to get a notification when I write a post, join my newsletter: