ICS 32 Winter 2022
Notes and Examples: Test-Driven Development
What is test-driven development?
Test-driven development encourages you to build a program one small feature at a time, taking small steps from one piece of stable ground to another. The notion of "small feature" is open to debate, though a good guideline is to prefer features as simple as "The size of a newly created collection of songs is zero" over features as complex as "A class to represent a collection of songs" or "A graphical user interface." The goal is to write a test that verifies the behavior of the new feature, then to write the code that implements the feature, using the test as a guide to indicate when you're done. At this point, you'll have a feature that is complete and tested, which means you've taken a step onto stable ground; more importantly, you have a test that you can keep until the feature's required behavior changes, which you'll be able to run repeatedly to ensure that your feature still works as you make other changes and add new features to your program. (Contrast this approach with the one you've taken as you've worked on your programs to date. With your current approach, how do you know that some part of your program is finished? How do you ensure that it continues to work correctly as you continue to make changes to your program? The answer, for most students, is some form of rote, mechanical testing and repeated re-testing.)
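To give a sense of how small one of these features can be, here is a minimal sketch of the "size of a newly created collection is zero" feature: the test, followed by just enough code to make it pass. The size() method name is an assumption made for illustration; these notes name the SongCollection class, but not that method.

```python
import unittest


# Written first: one test that describes one small feature.
class SongCollectionTest(unittest.TestCase):
    def test_new_collection_has_size_zero(self):
        collection = SongCollection()
        self.assertEqual(collection.size(), 0)


# Written second: only as much code as the test demands.
# The size() method name is hypothetical.
class SongCollection:
    def size(self):
        # No test requires anything more than this yet.
        return 0
```

If this were saved in a module, running python -m unittest on that module would run the test. Returning a hard-coded 0 looks like cheating, but it is all the current test asks for; a later test (say, one that adds a song) would force a more complete implementation.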
In lecture, we went through a step-by-step example as a group, developing portions of a SongCollection class using a test-driven development process. We did our best to follow all of the steps, though we sometimes forgot (or took liberties in the interest of time). Because it's so different from the programming style we're accustomed to, it takes a little time to adjust and get into the rhythm of test-driven development. But don't let the learning curve chase you away! It doesn't take long to get adjusted, and the benefits are higher-quality code — in terms of both how well it works and how well it's designed — and the ability to make changes to your program with confidence.
The steps in the test-driven development process are as follows.
After going through one iteration of this process, you'll have added one new feature to your program, verified that the feature works as expected, and cleaned up any brewing design problems before they become significantly bigger problems later. Each subsequent iteration adds new functionality, while verifiably preserving old functionality. Meanwhile, your design will likely need to be pretty clean — unit testing demands a design in which the individual pieces are broken down and know as little as possible about one another, which is a good goal — and the tests will form a lasting record of your understanding of how your code is supposed to work.
Test-driven development is most likely very different from what you've done in the past, but it leads to a very different kind of result, too.
What is unit testing?
Unit testing is one kind of testing that you might perform on a program you're writing, with the goal of verifying that small, individual pieces of its behavior are correct, outside of the effect of all the other pieces around them. We focus our attention not just on individual modules in a Python program, but on individual behaviors; more so than just individual functions, we focus on each way that the functions may behave (i.e., there are usually multiple unit tests that contribute to the testing of one function).
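As a sketch of what "multiple unit tests for one function" might look like, the example below tests a single function with four separate tests, one per behavior. The format_duration function is hypothetical and was invented for this illustration; it is not part of the lecture example.

```python
import unittest


def format_duration(seconds):
    """Format a whole number of seconds as 'minutes:seconds'.

    Hypothetical helper, used only to illustrate the idea.
    """
    if seconds < 0:
        raise ValueError('seconds cannot be negative')

    minutes, remaining = divmod(seconds, 60)
    return f'{minutes}:{remaining:02d}'


class FormatDurationTest(unittest.TestCase):
    # Each test covers one way the function may behave.
    def test_zero_seconds(self):
        self.assertEqual(format_duration(0), '0:00')

    def test_seconds_are_padded_to_two_digits(self):
        self.assertEqual(format_duration(65), '1:05')

    def test_whole_minutes(self):
        self.assertEqual(format_duration(180), '3:00')

    def test_negative_seconds_raise_an_exception(self):
        with self.assertRaises(ValueError):
            format_duration(-1)
```

When one of these tests fails, its name tells you which behavior broke, which is harder to see when many checks are packed into a single test.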
What tools do we need?
Performing unit testing is a valuable thing to be able to do; with it, we can gain a level of confidence in the quality of code we write that is harder to achieve without it. But how do we actually do it?
One way is to start a Python shell, load a module into it, and then start running our tests manually, by typing them in and looking at the output. One nice thing about Python is that the Python shell gives us a tool for this kind of thing; we don't need to write a full-fledged program to see the output of individual functions. However, this should nonetheless strike you as a poor choice. It's boring, tedious work — typing in some expressions, then verifying that the output is what we expected.
But the nice thing about boring, tedious work is that it tends to be the kind of work that is most amenable to automation. We should be able to write programs that test our programs for us! Then, any time we want to re-test everything, all we need to do is run our test program and see what happens.
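To make the idea concrete, here is a minimal sketch of a hand-written test program that uses no framework at all. Everything in it (the total_length function and the run_tests helper) is hypothetical; it only shows that the expressions we would otherwise type into the shell can be checked by code instead.

```python
def total_length(lengths):
    """Return the sum of a list of song lengths, in seconds (hypothetical)."""
    return sum(lengths)


def run_tests():
    """Evaluate every check and return the names of the ones that failed."""
    checks = [
        ('empty list totals zero', total_length([]) == 0),
        ('single length', total_length([200]) == 200),
        ('several lengths', total_length([200, 150, 10]) == 360),
    ]

    return [name for name, passed in checks if not passed]


if __name__ == '__main__':
    failures = run_tests()

    if failures:
        print(f'failed: {failures}')
    else:
        print('all tests passed')
```

Re-testing everything is now a matter of running one program. The bookkeeping here (collecting checks, reporting failures) is the sort of repetitive work that is easy to get wrong or to outgrow as the number of tests increases.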
A unit testing framework is a library that helps us to write programs like this. The Python standard library includes one, which is called unittest. It handles a few of the more repetitive chores for us:
A step-by-step example of test-driven development
In lecture, we worked through several iterations of a test-driven development process, where we wrote portions of two classes we called SongCollection and Song, starting with nothing and using tests to drive our decision-making. We used the unittest module in the Python standard library to write our tests. While it took us most of a lecture to get that code written and tested, that was mainly because I was describing a set of techniques that I expected to be new to you. In practice, each of those iterations would have likely taken no more than a few minutes; if it was me working on my own, I'd have finished the simplest of them in something more like 30-45 seconds, though they aren't usually that simple, of course.
As promised in lecture, I'm providing a step-by-step account of what we did and why we did it. While it's possible that this won't be identical to what we did in lecture — this example tends to turn out a little differently every time I do it — this will certainly capture the spirit of what we were doing, and the "why" is much more important here than the "what."
What if I still discover a bug?
We didn't talk in lecture about what should be done if you discover a bug in your program, even if you've faithfully adhered to a test-driven strategy. Naturally, using a test-driven development process does not guarantee that a program will work, for a variety of reasons, even if you have no failing unit tests. Following this process allows the tests to help you avoid many mistakes, but there are many other aspects of software development that this process doesn't do much to improve. First of all, your program only works as well as your tests say it will; if one of your tests expects behavior that is incorrect (e.g., the size of an empty collection is 1) and you write code that passes the test, that doesn't mean that the code makes sense in a broader context. Similarly, tests can't verify that the program's requirements are appropriate; if you are tasked with building software that won't meet the business needs of your customer, tests won't help you identify the issue. In short, testing helps verify that a program is correct, but the notion of "correct" often isn't a black-and-white one.
So, unfortunately, there will still be bugs. The question is what should be done when you discover one. The following steps can guide you through your bug-fixing:
Now you can have confidence that you've not only fixed the problem, but also haven't broken anything else that previously worked. You'll again reach stable ground quickly, and you'll have assurance that you'll know if this bug ever resurfaces; your new test would then start failing again.
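As a sketch of what such a bug-revealing test might look like, suppose a hypothetical average_length function crashed with a ZeroDivisionError when given an empty list. The function, the bug, and the decision that an empty list should average to zero are all assumptions invented for this example.

```python
import unittest


def average_length(lengths):
    """Return the average song length, in seconds (hypothetical function).

    The buggy version divided by len(lengths) unconditionally, so an
    empty list raised ZeroDivisionError.  The check below is the fix.
    """
    if not lengths:
        return 0

    return sum(lengths) / len(lengths)


class AverageLengthTest(unittest.TestCase):
    # This test existed before the bug was found, and it passed.
    def test_average_of_two_lengths(self):
        self.assertEqual(average_length([100, 200]), 150)

    # This test was added when the bug was discovered.  It failed
    # against the buggy version and passes against the fixed one.
    def test_average_of_no_lengths_is_zero(self):
        self.assertEqual(average_length([]), 0)
```

The new test stays in the test suite after the fix, so if a later change reintroduces the problem, that test fails and points directly at it.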
Testing side effects
Where test-driven development excels most is in testing functions that are pure. Pure functions are those that take inputs and give outputs that are calculated only from those inputs; they're like mathematical functions, in the sense that they always return the same outputs given the same inputs. As you might imagine, these are a lot easier to test than the alternative, because there's no need to think about doing things in a particular sequence, or to worry that the behavior of one function will have affected the outcome of another.
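Here is a small sketch of tests for a pure function. The longest_first function is hypothetical; the point is that each test needs nothing but inputs and an expected output, and that the tests could run in any order.

```python
import unittest


def longest_first(lengths):
    """Return a new list of song lengths, longest first.

    Hypothetical pure function: the result depends only on the
    argument, and the argument itself is left unchanged.
    """
    return sorted(lengths, reverse=True)


class LongestFirstTest(unittest.TestCase):
    def test_orders_lengths_from_longest_to_shortest(self):
        self.assertEqual(longest_first([100, 300, 200]), [300, 200, 100])

    def test_same_input_always_gives_same_output(self):
        self.assertEqual(longest_first([1, 2]), longest_first([1, 2]))

    def test_input_list_is_not_modified(self):
        lengths = [100, 300, 200]
        longest_first(lengths)
        self.assertEqual(lengths, [100, 300, 200])
```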
However, functions do quite often have side effects, so it's reasonable to wonder how you might test them. Side effects are anything other than calculating a result from the inputs, which can include printing output to the Python shell, reading input from the keyboard, drawing graphics, writing to files, playing sounds, or even just adding a value to a list. Even the add() method in our SongCollection class had a side effect, because it took the Song object we gave it and added it to a list, which affected the result of subsequently-called methods on that SongCollection.
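The side effect of add() can be observed only through the methods called afterward, so a test for it has to do things in sequence: add, then ask. The sketch below assumes a Song constructor that takes a title and a size() method on SongCollection; both are hypothetical details chosen for illustration and may differ from what we wrote in lecture.

```python
import unittest


class Song:
    # The constructor's parameter is an assumption for this sketch.
    def __init__(self, title):
        self.title = title


class SongCollection:
    def __init__(self):
        self._songs = []

    def add(self, song):
        # Side effect: changes the collection's internal list.
        self._songs.append(song)

    def size(self):
        # Hypothetical method used to observe the effect of add().
        return len(self._songs)


class SongCollectionAddTest(unittest.TestCase):
    def setUp(self):
        # A fresh collection before every test, so that one test's
        # calls to add() cannot affect the outcome of another.
        self._collection = SongCollection()

    def test_adding_one_song_makes_size_one(self):
        self._collection.add(Song('First'))
        self.assertEqual(self._collection.size(), 1)

    def test_adding_two_songs_makes_size_two(self):
        self._collection.add(Song('First'))
        self._collection.add(Song('Second'))
        self.assertEqual(self._collection.size(), 2)
```

The setUp method is called by unittest before each test method, which is what keeps the tests independent even though the method being tested changes state.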
So, suffice it to say, we can't avoid writing functions with side effects, which means we need to consider how we might write unit tests for them. How you do it requires a two-pronged approach.
Additional thoughts
Give this process a genuine try when you work on Project #4, even if it feels less productive — or just plain strange — when compared to your usual strategy for writing your programs. Trust me; for the kind of thing you're building in Project #4 (particularly the game mechanics), if you can get yourself into a rhythm, you will find yourself writing higher-quality code more quickly, with fewer mistakes early on and less debugging to do at the end. As we learned from our experience in lecture, test-driven development works very nicely with pair programming. I sometimes made mistakes in my haste to get code written while still explaining everything to you, but with you folks working collectively as my "partner," we ended up with virtually no mistakes that lasted longer than a few seconds.
You'll definitely find, though, that not all kinds of programs lend themselves to these techniques. For example, some of the graphical portions of Project #5 will probably not be easily testable this way; it's not so simple to write a unit test that demonstrates that the image drawn in a PyGame window is precisely the right image. But to the extent that you can separate this code a bit — the way we did in our PyGame examples in lecture, where we had most of the interesting decisions made in a "model" module (separate from our "view") — you'll find that substantial portions of it might be very testable, even if the outermost layer that talks to PyGame is not.
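As a rough sketch of that separation, the hypothetical "model" class below tracks a position on a grid and makes all of the decisions about movement, without importing PyGame at all. The class, its methods, and the clamping rule are invented for this illustration and are not taken from the projects or the lecture examples.

```python
import unittest


class GameModel:
    """Hypothetical model: tracks a position on a grid.

    It knows nothing about PyGame, so it can be tested directly.
    """
    def __init__(self, width, height):
        self._width = width
        self._height = height
        self._x = 0
        self._y = 0

    def position(self):
        return (self._x, self._y)

    def move(self, dx, dy):
        # Keep the position within the bounds of the grid.
        self._x = min(max(self._x + dx, 0), self._width - 1)
        self._y = min(max(self._y + dy, 0), self._height - 1)


class GameModelTest(unittest.TestCase):
    def setUp(self):
        self._model = GameModel(3, 3)

    def test_starts_at_origin(self):
        self.assertEqual(self._model.position(), (0, 0))

    def test_move_changes_position(self):
        self._model.move(1, 2)
        self.assertEqual(self._model.position(), (1, 2))

    def test_cannot_move_past_left_edge(self):
        self._model.move(-1, 0)
        self.assertEqual(self._model.position(), (0, 0))

    def test_cannot_move_past_far_corner(self):
        self._model.move(10, 10)
        self.assertEqual(self._model.position(), (2, 2))
```

A "view" would then ask the model for its position and draw accordingly; that drawing code stays thin, and the rules that are easiest to get wrong live where tests can reach them.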
Above all, have fun! Developing software should be an exciting, enjoyable, and stimulating experience. Test-driven development, when used appropriately, can take away a good deal of the frustration involved, allowing you to concentrate on understanding the problem and constructing a clean solution for it. It's not a silver bullet — nothing in software is — but it is nonetheless a wonderfully useful technique to have under your belt.
Finding more information about unittest
Like other modules in the Python standard library, unittest has a set of documentation that describes its behavior in detail. That document is linked below.