Template Training

Assertive Defensive Programming

Overview

  • Teaching: 10 min
  • Exercises: 5 min

Questions

  • How can we compare observed and expected values?

Objectives

  • Assertions are one line tests embedded in code.
  • Assertions can halt execution if something unexpected happens.
  • Assertions are the building blocks of tests.

From your cloned library on notebooks.azure.com, run the library and launch a new Python 3.6 notebook.

Assertions are the simplest type of test. They are used as a tool for bounding acceptable behavior during runtime. The assert keyword in Python has the following behavior:

In [29]:
assert True == False
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-29-2baf78a17125> in <module>
----> 1 assert True == False

AssertionError: 

assert raises an AssertionError if the test condition is not true:

In [30]:
assert False
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-30-a871fdc9ebee> in <module>
----> 1 assert False

AssertionError: 

That is, assertions halt code execution immediately if the condition is false. If the condition is true, the assert passes silently:

In [31]:
assert True == True
assert False == False
assert True
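An assert statement can also carry an optional message, written after a comma, which is attached to the AssertionError. The short sketch below illustrates this; the variable name and the message are made up for the example, and the try/except is there only so that the cell runs to completion and the message can be printed.

```python
# Hypothetical example value; any non-positive number triggers the assertion.
temperature_kelvin = -5.0

try:
    # The expression after the comma becomes the AssertionError message.
    assert temperature_kelvin > 0, "Temperature in kelvin must be positive."
except AssertionError as error:
    message = str(error)
    print("AssertionError:", message)
```

Without the try/except, execution would stop at the assert line and the message would appear at the end of the traceback.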

Assertions are therefore a very good tool for guarding a function against foolish (e.g. human) input. Let's consider a variation on the mean function that we created earlier:

In [32]:
def mean(sample):
    '''
    Takes a list of numbers, sample
    
    and returns the mean.
    '''
    
    sample_mean = sum(sample) / len(sample)
    return sample_mean

Let's perform a check to see whether the function calculates the mean as we would expect:

In [33]:
numbers = [1, 2, 3, 4, 5]

mean(numbers)
Out[33]:
3.0

Now let's create an empty list, and pass it to our function. What is going to happen and why?

In [34]:
no_numbers = []
mean(no_numbers)
---------------------------------------------------------------------------
ZeroDivisionError                         Traceback (most recent call last)
<ipython-input-34-73a4370edc7d> in <module>
      1 no_numbers = []
----> 2 mean(no_numbers)

<ipython-input-32-4a639206b527> in mean(sample)
      6     '''
      7 
----> 8     sample_mean = sum(sample) / len(sample)
      9     return sample_mean

ZeroDivisionError: division by zero

We receive a ZeroDivisionError: division by zero error and an indication of which line the error occurs on.

Use assert to catch divide by zero

With a short piece of code it is relatively trivial to track down the source of this error; in more complicated code this may not be the case. To reduce the time we spend tracking down errors, we can use assert to flag the problem where it arises. Rewrite the mean function using an assert statement so that it raises an AssertionError instead of a ZeroDivisionError.

Solution

The function now exits with an assertion error and indicates where the assertion happened and provides a message to help identify what caused the problem.
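One possible way to see this behaviour is sketched below. It assumes the assertion checks the length of the list and reuses the message that appears later in this episode. The try/except is not part of the solution; it is only there so that the message can be printed without stopping the notebook.

```python
def mean(sample):
    '''
    Takes a list of numbers, sample

    and returns the mean.
    '''
    # Guard against an empty list before dividing by its length.
    assert len(sample) != 0, "Unable to take the mean of an empty list."
    return sum(sample) / len(sample)


try:
    mean([])
except AssertionError as error:
    caught_message = str(error)
    print("AssertionError:", caught_message)
```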

The advantage of assertions is their ease of use. They are rarely more than one line of code. The disadvantage is that assertions halt execution indiscriminately and the helpfulness of the resulting error message can be quite limited.

Also, input checking may require descending a rabbit hole of exceptional cases. What happens when the input provided to the mean function contains a string, rather than being a list of numbers?

First let's make sure that we have the most recent version of our mean function defined:

In [35]:
def mean(sample):
    '''
    Takes a list of numbers, sample
    
    and returns the mean.
    '''
    
    assert len(sample) != 0, "Unable to take the mean of an empty list."
    sample_mean = sum(sample) / len(sample)
    return sample_mean

Now create a new list containing numbers and string(s) and call the function on it:

In [36]:
word_and_numbers = [1, 2, 3, 4, "apple"]

print(mean(word_and_numbers))
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-36-c96bf4e677cc> in <module>
      1 word_and_numbers = [1, 2, 3, 4, "apple"]
      2 
----> 3 print(mean(word_and_numbers))

<ipython-input-35-f53d0118bf0c> in mean(sample)
      7 
      8     assert len(sample) != 0, "Unable to take the mean of an empty list."
----> 9     sample_mean = sum(sample) / len(sample)
     10     return sample_mean

TypeError: unsupported operand type(s) for +: 'int' and 'str'

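Note that we now see a TypeError rather than an AssertionError: the assertion only checks that the list is not empty, so the string is not caught until sum tries to add it to a number.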

Checking type

It would be useful to perform a check that the list contains only integers.

Hint: One option is to use the isinstance function.

Using isinstance or otherwise, modify your mean function so that it raises an AssertionError and provides a suitable advisory message.

Solution
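The sketch below is one possible solution, not the only one. It assumes that floats should be accepted as well as integers, since the docstring speaks of a list of numbers; the wording of the second message is invented for this example.

```python
def mean(sample):
    '''
    Takes a list of numbers, sample

    and returns the mean.
    '''
    assert len(sample) != 0, "Unable to take the mean of an empty list."
    # Check each element in turn; int and float are both accepted here (an assumption).
    for value in sample:
        assert isinstance(value, (int, float)), f"Expected a number but found {value!r}."
    return sum(sample) / len(sample)


word_and_numbers = [1, 2, 3, 4, "apple"]

try:
    mean(word_and_numbers)
except AssertionError as error:
    type_message = str(error)
    print("AssertionError:", type_message)
```

To accept integers only, as the exercise text suggests, replace `(int, float)` with `int`.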

Asserting with numbers (Integers)

Try the following:

In [37]:
assert 2 == 2
assert 2 + 2 == 4
assert 2 + 2 == 5
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-37-83186924aa22> in <module>
      1 assert 2 == 2
      2 assert 2 + 2 == 4
----> 3 assert 2 + 2 == 5

AssertionError: 

This is exactly what we would expect to happen, as long as we are not living in a 1984-inspired dystopia. These assertions are comparing integer values.

Asserting with numbers (Floats)

Let's now try a similar experiment with floats:

In [38]:
assert 0.1 == 0.1
assert 0 + 0.1 == 0.1
assert 0.1 + 0.1 == 0.2
assert 0.1 + 0.2 == 0.3
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-38-4dd6fff0e6c8> in <module>
      2 assert 0 + 0.1 == 0.1
      3 assert 0.1 + 0.1 == 0.2
----> 4 assert 0.1 + 0.2 == 0.3

AssertionError: 

Perhaps we are living in a dystopian post-truth world after all. Actually, something more subtle is going on, to do with how a computer represents numbers in memory. We mentioned earlier that the limited precision of floats can lead to unexpected behaviour, and this is an example: values such as 0.1 and 0.2 have no exact binary representation, so their sum is not exactly 0.3. This makes direct comparison of floating point numbers dangerous, which is explored further in the remainder of this episode.
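To see the rounding error directly, we can print the sum at full precision. The short sketch below uses only built-in Python; the tolerance of 1e-9 is an arbitrary choice for illustration.

```python
total = 0.1 + 0.2

# The full representation exposes the rounding error in the last digit.
print(repr(total))        # 0.30000000000000004
print(total == 0.3)       # False

# The two values differ by far less than the tolerance chosen here.
difference = abs(total - 0.3)
print(difference < 1e-9)  # True
```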

Testing 'near equality': Numpy to the rescue

Thankfully we do not have to work out how to deal with numerical precision ourselves. The wonderful numpy developers have provided a module to help us.

When we want to compare two floats we must test not 'equality' but 'proximity'. We can do this with numpy's assert_almost_equal function:

In [39]:
from numpy.testing import assert_almost_equal
assert_almost_equal(0.1, 0.1)
assert_almost_equal(0 + 0.1, 0.1)
assert_almost_equal(0.1 + 0.1, 0.2)
assert_almost_equal(0.1 + 0.2, 0.3)
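How close is 'almost'? assert_almost_equal takes a decimal argument (7 by default) that sets roughly how many decimal places must agree. The sketch below varies it using made-up example values. It also shows assert_allclose, from the same module, which compares using a relative tolerance and is the function the numpy documentation recommends for more consistent floating point comparisons.

```python
from numpy.testing import assert_allclose, assert_almost_equal

# Passes: the values agree to 2 decimal places.
assert_almost_equal(3.14159, 3.14, decimal=2)

# Fails: the values do not agree to 5 decimal places.
try:
    assert_almost_equal(3.14159, 3.14, decimal=5)
    stricter_check_failed = False
except AssertionError:
    stricter_check_failed = True
print(stricter_check_failed)  # True

# assert_allclose with an explicit relative tolerance (1e-9 is an arbitrary choice).
assert_allclose(0.1 + 0.2, 0.3, rtol=1e-9)
```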

Key Points

  • Assertions are one line tests embedded in code.
  • The assert keyword is used to set an assertion.
  • Assertions halt execution if the argument is false.
  • Assertions do nothing if the argument is true.
  • Assertions allow us to provide an error message to help track down the source(s) of errors.
  • The numpy.testing module provides tools for numeric testing.
  • Assertions are the building blocks of tests.