ICS 33 Spring 2024
Notes and Examples: Revisiting Recursion


Background

All prerequisite paths that lead to this course run through at least one course where the topic of recursion is discussed, so you've likely written at least one recursive function in the past and been exposed to a high-level overview of how to apply it to at least one problem, though there's a pretty good chance that you haven't studied recursive techniques in their full depth yet.

In this course, we've acquired some new analytical tools that will help us to deepen our understanding of recursion and better weigh its benefits against its costs, and we're in the process of learning new programming techniques that will help us to use it more effectively, both in Python and more generally in your future studies. Recursion is one of those topics that appears repeatedly throughout a computing curriculum, because it's a tool that beautifully solves certain kinds of problems that are much more difficult to solve without it, even if not every problem is particularly suited to it.

Because of its broad usefulness, it's worth getting our feet firmly on the ground underneath us before proceeding beyond first-year courses. So, it's time for us to take a fresh (and deeper) look at it than you might have done in the past, by considering some of the patterns that are common to recursive functions in Python, analyzing them to assess their practicality, and understanding the ways in which we might widen that practicality.


Single, direct recursion

Let's start with a short recursive function factorial that, given an integer n, calculates the factorial of n (i.e., the product of the integers from 1 through n, inclusive).


def factorial(n: int) -> int:
    if n == 0:
        return 1
    else:
        return n * factorial(n - 1)

There's some terminology we ought to agree on before we go any further, which describes the patterns in use here. A function is recursive when a call to it can lead to further calls to the same function. That recursion is direct when the function calls itself, as factorial does, rather than reaching itself indirectly through some other function. And it's single recursion when each call leads to at most one additional recursive call, as opposed to multiple recursion, where one call can be responsible for more than one. Our factorial function, then, exhibits single, direct recursion.

If we call this function in the Python shell, it dutifully returns a result, but offers us no details describing the mechanics used to determine it.


>>> factorial(5)
    120

But we ought to be sure we understand those mechanics — even if we think this function is simple enough that we don't need them — because those same mechanics underlie more complex recursive functions, as well, and because only an understanding of those mechanics will allow us to understand the performance characteristics of a function like this one.

The first thing to realize is that a recursive call to a function is no different from any other call to a function. Whether recursive or not, calling a function triggers the same sequence of events.

The only difference in a recursive call is the slightly mind-bending realization that more than one call to the same function can be active simultaneously, with one of them executing and the others waiting for results to be returned to them. Each active call to that function will have its own local scope, meaning its own local variables — separate from the others, even though they have the same names — allowing them to do their work independently, except to the extent that one of them passes arguments to the next one and returns its result to the previous one. (This is one of the hallmarks of recursion's mechanics: The various calls to the same function communicate with each other only through arguments and results.)

Early in the process of executing factorial(5), we'd reach this point.

factorial(5) = 5 * factorial(4)
factorial(4) = ???

At this point, factorial(5) has called factorial(4) — the same function with a different argument — and is waiting for an answer, which hasn't been determined yet. Until it gets that answer, it can't perform its multiplication, since that answer is one of the operands of the multiplication.

As the process unfolds further, we'll reach a point more like this.

factorial(5) = 5 * factorial(4)
factorial(4) = 4 * factorial(3)
factorial(3) = 3 * factorial(2)
factorial(2) = 2 * factorial(1)
factorial(1) = 1 * factorial(0)
factorial(0) = ???

At this point, there are six calls to factorial active, with five of them waiting for a result and factorial(0) being the one that's just beginning to execute its body. But this is where the tide turns, because factorial(0) has an answer that doesn't require recursion to calculate; we've decided that its answer is always 1, which will be returned back to factorial(1), which can now perform its multiplication and be ready to return a result back to its caller.

factorial(5) = 5 * factorial(4)
factorial(4) = 4 * factorial(3)
factorial(3) = 3 * factorial(2)
factorial(2) = 2 * factorial(1)
factorial(1) = 1 * 1

One by one, the pending multiplications are unblocked by one call to factorial returning an answer to another, until, finally, we reach the point where factorial(5) can perform its multiplication and return its overall result of 120.
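If we want to watch that tide for ourselves, one option is to instrument the function with print calls. Here's a small sketch of that idea; the name factorial_traced and the extra depth parameter are mine, added purely so the output is indented to match how deep each call is.


def factorial_traced(n: int, depth: int = 0) -> int:
    # Announce each call as the tide goes out.
    print('  ' * depth + f'calling factorial({n})')

    if n == 0:
        result = 1
    else:
        result = n * factorial_traced(n - 1, depth + 1)

    # Announce each result as the tide comes back in.
    print('  ' * depth + f'factorial({n}) returned {result}')
    return result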

Structurally, the function's body is made up of two distinct parts, only one of which is executed in each call, chosen based on the value of n passed to it. One part is the base case: when n is 0, the function returns its result, 1, without any recursion at all. The other is the recursive case: for any other n, the function solves a smaller version of the same problem, the factorial of n - 1, and then finishes with one more multiplication. Those two parts correspond to the two phases of the function's "tide" that we saw in action: the recursive case drives the tide outward, and the base case is where it turns and begins rolling back in.

The phrase "smaller version of the same problem" there is an important one: If we aren't making progress toward the base case at every recursive step, the recursion will never end. Given the way we wrote our factorial function, if we pass a negative argument to it to start with, n will continue to decrease but never reach zero, which suggests that factorial might run forever. Yet, when we try to evaluate factorial(-1), something else happens instead.


>>> factorial(-1)
    Traceback (most recent call last):
      ...
      File "<input>", line 5, in factorial
      File "<input>", line 5, in factorial
      File "<input>", line 5, in factorial
      [Previous line repeated 986 more times]
      File "<input>", line 2, in factorial
    RecursionError: maximum recursion depth exceeded in comparison

When a function f calls a function g, f is paused until g returns, which makes it necessary for Python to remember where f left off, what the values of its local variables are, and so on. If g calls h, g is also paused similarly. This information about each active function call is stored in a data structure that's sometimes called the run-time stack, so named because the information about each function is "stacked up" on top of the information about the function that called it, with information only ever being added or removed at the top of the stack. Like most programming languages, Python establishes a limit on how many calls to functions can be paused in this way simultaneously, i.e., it limits the size of the run-time stack; exceeding that limit is sometimes called stack overflow.

What we saw in the failure of factorial(-1) was essentially stack overflow: We reached Python's limit on how many function calls could be simultaneously active. While we could certainly clean this up in some way — for example, by validating that n is non-negative before proceeding with a recursive solution — we'll defer that issue for now, and leave with an understanding of the issue's cause. When n started out negative, subtracting one from n did not cause us to progress toward the base case of n being zero; it instead caused us to progress away from it.
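If you're curious what that limit is in your own Python, the sys module will tell you; it will also let you raise the limit, though doing so only moves the cliff rather than removing it. (The value shown here is the default in the implementation I'm using; yours may differ.)


>>> import sys
>>> sys.getrecursionlimit()
    1000
>>> sys.setrecursionlimit(5000)    # Legal, but the memory underlying the run-time stack hasn't grown.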

Analyzing the costs of single, direct recursion

Now that we have an understanding of the mechanics of our factorial function, we're armed with the knowledge we need to do an asymptotic analysis of it. As n grows, how does the time required to run it grow? How does the amount of memory needed grow? Let's take those questions in turn.

It's no accident that factorial(5) required a total of six calls to factorial and five multiplications. Had we assessed factorial(10) in detail instead, we would have seen eleven calls to factorial (one more than n) and ten multiplications (equal to n) instead. As n grows, the number of function calls and the number of multiplications — which, together, make up essentially all of the work being done by our function — grows in proportion to n. So, our closest-fit O-notation for the time spent on factorial(n) would be O(n).

What about memory? To answer this question, we need to understand what memory is being spent on. Each call to factorial needs one parameter, n, which we'll assume to be an integer of constant size (i.e., we'll continue to assume that all integers are the same size). But, since there might be as many as n + 1 calls active, we'll need enough memory to store n + 1 integers. Overall, our closest-fit O-notation for the memory required for factorial(n) is O(n).

So, from an asymptotic perspective, we need proportionally more time and proportionally more memory as we tackle progressively larger problems. The idea that we need linear time isn't so strange — unless we can derive a shortcut mathematical formula, we'll need to perform the multiplications, whether we're using recursion or not — but the linear amount of memory seems wasteful. Reducing the memory requirement to a constant one seems in the realm of possibility, but single, direct recursion leaves that possibility beyond our reach.

Additionally, we're confronted with the practical reality that there's a limit in Python on the number of function calls that can be "stacked up" before we run out of available space, which means we're bound by an artificially small limit on n that otherwise might not be there. There was vastly more memory available than would be consumed by roughly 1,000 calls to our factorial function, yet Python refused to let our recursion run any deeper than about 1,000 calls, which limits the kinds of problems for which recursion is suited in Python.

How might we do better? One answer — which, to be clear, is probably the best answer in Python — is to use a loop and a local variable to solve this problem, in lieu of using recursion, which would certainly allow us to solve it using O(n) time and O(1) memory. That's not because loops are definitively better than recursion for this kind of problem; it's because the design of Python makes loops a more suitable solution than recursion when solving the problem in Python. But not all programming languages make the same tradeoffs, so let's stop for a moment to understand how we might achieve a better result recursively.
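For comparison, here's a sketch of that loop-based approach (the name factorial_iterative is mine); it performs essentially the same multiplications, but only one call is ever active, so the memory required stays constant.


def factorial_iterative(n: int) -> int:
    # A loop and a local variable in place of recursion: still O(n) time,
    # but O(1) memory, since only one call is ever active.
    product = 1

    for value in range(2, n + 1):
        product *= value

    return product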


Tail recursion

Regardless of the implementation details of our programming language, the memory costs of our factorial function in the previous example were fundamental, arising directly from one aspect of how we designed it. Each call to factorial made another recursive call to it, but would then need to pick up where it left off afterward by performing a multiplication. In other words, each call to factorial had work left to be done after its recursive call was finished. Consequently, it was necessary to remember the state of every active call to factorial as the tide went out — all of the local variables, where execution left off, and so on — and only when the tide turned as the base case was reached could any of these details be forgotten, one by one. So, the enemy here wasn't entirely Python. In this case, it was also us.

A somewhat different technique holds the potential to mitigate this problem. We say that a tail call to a function is one in which the called function's result also becomes the calling function's result, which is to say that the calling function will have no work to do once the called function is finished. Consider these three nonsensical Python functions.


def f(a, b):
    return (a * a, b * 2)

def g(a, b):
    return f(a, -b)

def h(a, b):
    return list(f(a, b))

Both g and h call f. Which is a tail call and which isn't? To answer that question, we need to consider the order in which g and h will actually do their work — not the order in which their code is written, but the order in which it will be executed.

When g runs, it evaluates its argument expressions, calls f, and then returns whatever f returned, untouched. When h runs, it also calls f, but once f's result comes back, h still has work to do: it has to pass that result to list before it can return. So, we see that g's call to f is a tail call, but h's call to f is not, because g has no work left to do after its call to f is finished, while h does.
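To make that execution order easier to see, here's a rewrite of g and h (mine, purely for illustration) with temporary variables that expose what, if anything, remains to be done after f returns.


def g(a, b):
    negated = -b              # Evaluate the argument expressions first.
    result = f(a, negated)    # Then call the f defined above.
    return result             # Nothing left but to pass f's result along: a tail call.

def h(a, b):
    result = f(a, b)          # Call f first.
    return list(result)       # There's still work to do with f's result: not a tail call.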

This seemingly minor technicality can be more important than it might sound, though, because it addresses the root cause of the memory problem we had in our factorial function. We were spending memory on remembering where we left off in each call to factorial. But if a function has nothing left to do except return the result that's returned to it, there's nothing to remember. When g calls f, we can arrange for f to return its result directly to g's caller, rather than returning it to g only for g to return it to its own caller. This doesn't just save time, but it also saves us from having to remember g's state while f runs; there's no call to g left to return to, anyway. So, once f is called, the call to g can be immediately and harmlessly forgotten.

Programming languages that implement tail calls this way are said to be performing a technique called tail call elimination. This term is a little misleading, because we're still calling the function; we're just using simpler and faster mechanics to do so, because we have no need to restore the caller's state when the called function is finished and, therefore, we have no need to save it in the meantime.

Using tail calls in a recursive function

The payoff in tail call elimination is proportional to the number of tail calls it's being applied to, so we'd expect a recursive function to be a particularly useful place to employ this technique, since every call to it would derive that benefit. Tail recursion is the technique of writing a recursive function whose recursive calls are tail calls, meaning that each time the function calls itself, it's done, so that whatever the recursive call returns will be its result, too.

There's a tradeoff, though: We have to design our function to make this possible, which means that we can't have any leftover work to do after a recursive call returns. We can't skip the things that need to be done, but we also can't defer them until after the recursive call is finished; we instead have to figure out a way to do them ahead of time. How might we do that in our factorial function? The answer lies in being sure the multiplications happen while the tide is going out, rather than on its way back in. Since the multiplications are the only operations that need to be done, if they're done on the way out, that would leave nothing remaining to be done on the way back in; the recursive calls would become tail calls.

As we've seen, in a recursive function, the communication between calls happens in two ways: arguments carry information into each recursive call, and return values carry results back out of it.

So, if we want to perform the multiplications on the way out, we'll need to carry an additional parameter that carries the results of those multiplications along. We could do this with a parameter that tracks the product of all of the multiplications performed so far. Each recursive call will need a different value of n, as before, but will also need a different value for that running product. An extra parameter like this is sometimes called an accumulator, because it accumulates the ongoing result as the recursion proceeds, similar to how a local variable might do the same job if we implemented this function iteratively instead of recursively.

A version of factorial that uses an accumulator parameter could be written like this.


def factorial(n: int, product: int) -> int:
    if n == 0:
        return product
    else:
        return factorial(n - 1, product * n)

What are the mechanics of our new factorial function when it calculates the factorial of 5?

factorial(5, 1)   = factorial(4, 5)
factorial(4, 5)   = factorial(3, 20)
factorial(3, 20)  = factorial(2, 60)
factorial(2, 60)  = factorial(1, 120)
factorial(1, 120) = factorial(0, 120)
factorial(0, 120) = 120

Once the tide turns upon reaching the base case, 120 will be returned all the way back through all of the recursive calls and become the overall result.
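A quick check in the shell bears that out, as long as we remember to pass the accumulator's starting value ourselves.


>>> factorial(5, 1)
    120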

Unfortunately, we've made the function more difficult to use, because it now requires two arguments. What we could have written before as factorial(5) would now need to be written factorial(5, 1). We could partially mitigate that by giving the product parameter a default argument of 1, but that still leaves open the possibility of someone passing a value other than 1 — which, in this case, has no real utility, so would always be a mistake. So, a better solution would be to have two functions: A non-recursive one that can only be called with one argument, and a recursive one used only internally that accepts two. A natural way in Python for a function to be used only within another function is to nest it. Meanwhile, the non-recursive "outer" function would be a natural place to validate the arguments passed to it, so that validation would only have to be done once.
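For reference, the default-argument version mentioned above might look like this sketch; it restores the one-argument call, but nothing prevents a caller from passing a second argument that quietly produces a meaningless result (factorial(5, 3) would return 360, for example).


def factorial(n: int, product: int = 1) -> int:
    if n == 0:
        return product
    else:
        return factorial(n - 1, product * n)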

Putting all of those ideas together leads to this tail recursive version of factorial.


def factorial(n: int) -> int:
    def factorial_tail(n: int, product: int) -> int:
        if n == 2:
            return product * 2
        else:
            return factorial_tail(n - 1, product * n)

    if n < 0:
        raise ValueError(f'cannot calculate factorial of negative value: {n}')
    elif n == 0 or n == 1:
        return 1
    else:
        return factorial_tail(n, 1)

The technique of wrapping a recursive function with a non-recursive one is quite common, whether the recursive function is tail recursive or not. The parameters in a recursive function have the job of carrying necessary information into the recursion, but that information is not always available or obvious to the initial caller of the function; it's often best initialized within the function instead. The technique we see here offers a way to remove this burden from the initial caller, while still allowing the extra parameters to be available throughout the recursion as needed.

The effect of tail recursion

So, now that we've rewritten factorial to be tail recursive, we expect that it might be able to handle much larger problems than it could before. Let's see what happens.


>>> factorial(5)
    120
>>> factorial(6)
    720       # It seems to be working fine for small values of n, as before.
>>> factorial(3000)
    Traceback (most recent call last):
      ...
    RecursionError: maximum recursion depth exceeded in comparison

Unfortunately, as it turns out, Python does not perform tail call elimination, so this particular technique is of little benefit to us in Python. As before, factorial(3000) will require around 3,000 function calls to stack up — and for each of them to store information on the run-time stack — which exceeds the depth limit on the particular Python implementation I'm using here.

Python's refusal to perform tail call elimination is intentional, because this optimization involves a tradeoff. When we make a function call without remembering where we came from, it affects the traceback we receive when a subsequent call raises an exception. Once a tail call has been eliminated from the stack, it will appear as though it was never there — which is a good thing, from a performance perspective, but can be problematic when you're trying to debug a thorny problem and all you have is a misleading traceback that skips steps. So, ultimately, Python's designers made a reasoned choice, preferring clarity over performance. (Many programming language design choices are like this; there are few choices where we get something without giving something else up.) Since there are plenty of recursively shaped problems where the necessary depth of the recursion is well within Python's limits, we'll still be able to use recursion to solve those; for others, where loops are a natural fit, Python's design will require us to lean toward using loops instead.

But despite the choices that Python's designers made, it's worth noting that there are programming languages where tail calls are guaranteed to be eliminated, and others where it's done optionally as an optimization, which makes this a great technique to have in our repertoire when it can be used profitably. If Python performed tail call elimination, our tail-recursive factorial function would still have required O(n) time to run — it still performs the multiplications, as before — but only O(1) memory, since there would be no "paused" function call information to store on the run-time stack. (Even factorial's call to factorial_tail is a tail call.) Any time we can reduce something linear to something constant — whether it's time or memory — that's a big win, so we should be on the lookout for those opportunities wherever we can find them.


Multiple recursion

We say that a function exhibits multiple recursion if one call to that function is directly responsible for multiple calls to that function. Such a function makes one recursive call, directly or indirectly, pauses until it gets a result, and then, after obtaining that result, makes another recursive call. Our factorial function did not require multiple recursion, simply because the nature of the problem didn't require it, but some recursively shaped problems will certainly require it.

One of the most commonly demonstrated examples of this arises from the Fibonacci sequence, sometimes written as F(n), which is a sequence of integers in which each integer in the sequence is the sum of the two integers that precede it in the sequence. For the sequence to be entirely defined, we say that the 0th value in the sequence (i.e., F(0)) is 0, and that the 1st value in the sequence (i.e., F(1)) is 1. So, all in all, a straightforward mathematical definition of F(n) could be written this way.

       ⎧  0                   , when n = 0
F(n) = ⎨  1                   , when n = 1
       ⎩  F(n - 1) + F(n - 2) , otherwise

Given what we've been discussing here, we'll immediately recognize that this problem is described recursively, which means it accommodates a straightforward recursive solution in Python. If we have a definition of F(n) and all we want is a function that produces the correct answer for any given n, we can't get there much more simply than this.


def fibonacci(n):
    if n == 0:
        return 0
    elif n == 1:
        return 1
    else:
        return fibonacci(n - 1) + fibonacci(n - 2)
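A couple of quick checks in the shell suggest that it produces the values we expect.


>>> fibonacci(5)
    5
>>> fibonacci(10)
    55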

But, of course, there's a difference between believing that something can work, testing that it works, and understanding how it works; in this course, we're in the business of considering all three, so let's focus our attention on the mechanics of our new fibonacci function.

As our fibonacci function executes, we'll reach a point early on that looks like this.

fibonacci(5) = fibonacci(4) + fibonacci(3)
fibonacci(4) = fibonacci(3) + fibonacci(2)
fibonacci(3) = ???

At this point, fibonacci(5) is waiting for an answer to fibonacci(4), which is, in turn, waiting for an answer to fibonacci(3). The tide, at this point, is on its way out.

Eventually, that call to fibonacci(3) will be determined to have a result of 2, which will lead the tide to roll in a bit, and we'll end up here.

fibonacci(5) = fibonacci(4) + fibonacci(3)
fibonacci(4) = 2 + fibonacci(2)

What happens next in fibonacci(4)? It can't perform its addition yet, because it needs two operands, but only one of them has a value so far. So, its next move is to call fibonacci(2) and wait for its result.

fibonacci(5) = fibonacci(4) + fibonacci(3)
fibonacci(4) = 2 + fibonacci(2)
fibonacci(2) = ???

Think about how fibonacci(3) would have been calculated, though. Wouldn't part of that calculation have included determining fibonacci(2)? So, aren't we incurring an unnecessary cost by re-calculating it again? The answer to both of these questions is an emphatic "Yes!", and, in fact, this is the Achilles heel of this particular version of fibonacci. It's straightforward, but it's incredibly slow, with its execution time being O(F(n)) — proportional to its result, which is a function that grows exponentially — for reasons that aren't particularly important here. The cause of that slowness is the mountains of repeated work that it does, a problem that compounds massively as n grows.
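To get a feel for just how quickly that duplicated work piles up, here's a small sketch (mine, separate from the fibonacci above) that counts how many calls the naive approach makes for a given n. With it, we'd find that n = 20 already triggers 21,891 calls, and n = 30 nearly 2.7 million.


def count_fibonacci_calls(n):
    # Returns a tuple of (F(n), number of calls needed to compute it naively).
    calls = 0

    def fib(n):
        nonlocal calls
        calls += 1

        if n == 0:
            return 0
        elif n == 1:
            return 1
        else:
            return fib(n - 1) + fib(n - 2)

    return fib(n), calls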

It's important to note that the use of multiple recursion is not the problem here, per se, so we shouldn't carry away the lesson that we should avoid multiple recursion at all costs. The lesson here is that we should avoid repeatedly solving the same problem at all costs, an issue we'll examine in more detail shortly. But not all problems that can be solved using multiple recursion have this characteristic.

Summing the values in a recursively nested list

When I teach ICS 32 or ICS 32A, I'm fond of demonstrating how we can solve the problem of summing the values in an arbitrarily nested list of numbers (i.e., the list can have sublists, and those sublists can have sublists, with no depth limit, but all non-list values are numeric). There are three reasons why I chose that particular example.

One solution to that problem — a little different from the one I show in that course, since we know more Python than those students do, but in the same spirit — is this one.


def nested_sum(nested_list):
    return sum(nested_sum(element) if type(element) is list else element
               for element in nested_list)
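A quick check in the shell shows it handling a list nested a few levels deep.


>>> nested_sum([1, [2, 3], [[4], 5]])
    15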

While there's only one place in the body of nested_sum that calls nested_sum, take note of the fact that it occurs within a comprehension that's iterating across nested_list. That means that this function exhibits multiple recursion, by recursing into as many sublists as there are in nested_list, whether that's a few or a lot.

Note, too, that we aren't recursing over the list's length, but only over the list's depth. This is no accident. We reasonably expect the list's depth to be small enough that Python's recursion depth limit will not stand in our way, but we can't make the same assumption about the list's length.

So, what can we say about the performance of nested_sum? There are two aspects to its performance, time and memory, so let's take them in turn.

The time spent adding up the list's elements is going to be determined by the overall total of how many elements we encountered. Some of them will be numbers and others will be sublists, but each will be encountered exactly once. Aside from the time spent on any recursive call, each takes a constant amount of time to process — determining whether it's a list, then either obtaining its value or passing its value to a recursive call, and finally adding the result to a running sum. All in all, this will take O(n) time if the list contains a total of n elements — though, to be clear, n counts not only the numbers, but also the lists.

Memory usage is predominantly determined by the depth of the recursion. Within each recursive call, there is a running sum (maintained by the sum function) and a generator working its way across a pre-existing list. These each require a constant amount of memory, so the question boils down to how many there are, which is a question of how deep the recursion goes. We might say, then, that we need O(d) memory, where d is the list's depth. Worth noting is that d is a different variable than we used to describe the number of elements in the list, because they can vary independently: We can have 100 elements in a list of depth 1, or 75 elements in a list of depth 50. It's not enough to say that something takes "linear memory" without also being clear about the question of "linear with respect to what?"

Still, we wouldn't expect to be able to do any better than that by eliminating the recursion and implementing it by hand. We'd still need to keep track of where we were in multiple lists at once, so we'd know where to go when we reached the end of one list and we needed to go back to where we came from previously. So, ultimately, we would need to simulate what Python's function calling mechanism gives us automatically: the run-time stack. Better to just let Python solve the problem for us, since its solution is the same one we would be aiming for anyway.
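To make that concrete, here's a rough sketch (mine, with a made-up name) of what the hand-rolled alternative would look like. The explicit stack of iterators is doing exactly the job that the run-time stack does for us in the recursive version, so we haven't saved anything; we've only written more code.


def nested_sum_by_hand(nested_list):
    total = 0
    stack = [iter(nested_list)]    # Our own stand-in for the run-time stack.

    while stack:
        try:
            element = next(stack[-1])
        except StopIteration:
            # We've finished one list, so "return" to the one we came from.
            stack.pop()
            continue

        if type(element) is list:
            # "Recurse" into a sublist by pushing an iterator over it.
            stack.append(iter(element))
        else:
            total += element

    return total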


Memoization and dynamic programming

Before we leave the topic of recursion, there's one more idea worth exploring. When we discussed multiple recursion, one of the things we recognized is that the formulation of some recursive problems includes repeated instances of the same subproblems. Our fibonacci function, for example, required so many repetitive calculations for the same values of n that its execution time exploded to an extraordinary O(F(n)) to find the nth Fibonacci number. With F(35) being more than nine million and F(70) being more than 100 trillion, a short, simple function became totally impractical.

Our search for a better approach for our fibonacci function would be centered on finding ways to avoid solving the same subproblems repeatedly. In doing so, we might learn techniques that we could apply more generally.

Memoization

The reason why our fibonacci function degrades so badly as n grows is that it spends so much of its time re-calculating things that have already been calculated previously. For example, to calculate fibonacci(5), both fibonacci(4) and fibonacci(3) are calculated. But part of calculating fibonacci(4) is to calculate fibonacci(3) again, along with fibonacci(2), which will also be calculated by another call to fibonacci(3), and so on. This leads to an explosion of duplicated work that gets dramatically worse at each step.

But why do we need to re-calculate things we already know? If we've already obtained an answer for fibonacci(3), why doesn't Python just give us the answer it already calculated? The answer is that Python's function call mechanism doesn't provide for this situation. When we call a function, we pass arguments to it, its body is executed, a result is returned, and that's the end of it. If we want our program to remember that result for reuse later, it's up to our program to remember it; if we want our function to reuse what we remembered, it's up to our function to do so. (There's a good reason for this limitation: Python has no way to know definitively whether a function will always return the same result given the same arguments. So, Python does the safe thing automatically — re-calculating the result every time — but if we know it's safe to do something fancier, we can arrange for it specially.)

Memoization is the technique of remembering the previous result of a function call and then using that result later in place of re-calculating it. (Weirdly, this is not a misspelling, even though the word memorization describes a similar concept. The term we use in computer science for remembering the previous results of a function is "memoization," without the R.) For a function that always returns the same result given the same arguments, this technique is safe, but whether it's a useful one revolves around evaluating the tradeoff we're making, which is one of the classic tradeoffs in computer science: spending memory to store previously calculated results to save the time spent in re-calculating them.

How might we write a memoized version of fibonacci? One approach that works is this one.


def fibonacci(n):
    # Since we'll be using n as a list index, best to eliminate negative inputs,
    # since negative indices are legal in Python, but not what we want here.
    if n < 0:
        raise ValueError(f'argument cannot be negative, but was {n}')

    # Every value in results starts out as None.  In other words, we don't
    # yet know any results, but we have a place to store them as we determine them.
    # Since we need a place to put the base case values, we'll make sure that the
    # list has at least two elements in it.
    results = [None] * max(n + 1, 2)

    # Store the base case results ahead of time, since we already know what
    # they're going to be.
    results[0] = 0
    results[1] = 1

    # The recursive function is nested, so that the initialization of the results
    # only happens once.

    def fibonacci_memo(n):
        # If we have a result already, use it.  Otherwise, calculate it,
        # but store it before returning it.
        if results[n] is not None:
            return results[n]
        else:
            result = fibonacci_memo(n - 1) + fibonacci_memo(n - 2)
            results[n] = result
            return result

    # Call the memoized recursive function and return its result.  Note that we
    # don't need to pass the results list, since it's in fibonacci_memo's enclosing
    # scope and we always want to use the same one throughout.
    return fibonacci_memo(n)
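A quick check in the shell shows a result that the original version would have needed tens of billions of calls to produce, returned without any perceptible delay.


>>> fibonacci(50)
    12586269025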

When calling fibonacci(100) or fibonacci(300), we now find that we receive a nearly instantaneous result. To be clear, though, it's not actually instantaneous; it's still taking O(n) time and O(n) memory. But O(n) time is worlds better than the O(F(n)) time that we saw before, since F(n) is such a fast-growing function. We'll likely run into Python's recursion depth limit before any call to fibonacci will take a human-perceptible amount of time.

Again, we should remember that the reason this technique works for our fibonacci function is because of a characteristic of the problem: fibonacci(n) always returns the same result for any given n. Not all recursively shaped problems have this characteristic, so memoization is not universally applicable, but it certainly moved the needle in a big way here.
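As an aside that goes a little beyond these notes, Python's standard library can handle this bookkeeping for us: functools.lru_cache wraps a function so that results for previously seen arguments are served from a cache. A sketch of fibonacci written that way might look like this; it's still recursive, so the depth limit still applies, but the duplicated work disappears.


from functools import lru_cache

@lru_cache(maxsize=None)
def fibonacci(n):
    if n == 0:
        return 0
    elif n == 1:
        return 1
    else:
        return fibonacci(n - 1) + fibonacci(n - 2)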

Dynamic programming

In addition to the characteristic that makes memoization possible, our fibonacci problem has another characteristic that not all recursively shaped problems have. For any given n, the result of fibonacci(n) is entirely determined by n and the results of fibonacci for smaller values of n. We won't get a different result when calculating fibonacci(n) as a smaller part of calculating fibonacci(n + 1) or fibonacci(n + 2) or anything else. Wherever we do it, fibonacci(n) is fibonacci(n): Its result will be the same regardless, and we'll be able to determine that result by determining the result of smaller problems (i.e., fibonacci with a smaller value of n as an argument). The fancy term for this characteristic is optimal substructure; not all problems have it, but fibonacci does.

For recursively described problems with optimal substructure like fibonacci, there's another technique we can employ. Rather than calculating a result recursively, by having fibonacci(n) call fibonacci(n - 1) and fibonacci(n - 2), we can instead reverse the order of the calculations altogether, by first determining (and storing) fibonacci(0), then fibonacci(1), and so on. Each time we need to determine the next value in the sequence, we'll already have the results that allow us to calculate it. Consequently, there will be no need to make a recursive call at all, which means not only that we'll avoid the costs of making function calls altogether, but also that we'll avoid Python's recursion depth limit.


def fibonacci(n):
    if n == 0:
        return 0
    elif n == 1:
        return 1
    else:
        # Start by storing fibonacci(0) and fibonacci(1) in a list.
        results = [0, 1]

        # We calculate the subsequent values by adding the last two elements
        # of the list, appending each result to the list as we go.
        for i in range(2, n + 1):
            results.append(results[-2] + results[-1])

        # When we're done, the result we want will be at the end of the list.
        return results[-1]

The technique we've seen here is known by the strangely ambiguous name of dynamic programming, a term borrowed from mathematics, where the word programming has a different meaning and where the name is less confusing than it is to students learning about programming in a dynamically typed programming language like Python. We can opt for this technique when solving recursively described problems that have optimal substructure.

There's one further optimization we can make to our solution. Our new fibonacci function requires O(n) time to perform the additions and append them to a list, along with O(n) memory to store the ever-growing list. But isn't it true that we only ever need the most recent two values from that list? If so, then what purpose is there in storing the others? Instead of a list, we could simply use two variables — one to represent the current (i.e., most recent) result and another to represent the previous one — then shuffle the values amongst those variables where appropriate.


def fibonacci(n):
    if n == 0:
        return 0
    elif n == 1:
        return 1
    else:
        # Start by storing the base cases into the two variables.
        previous = 0
        current = 1

        # At each step, the current value becomes the previous one, while the
        # new current value becomes the sum of the previous and current ones.
        for i in range(2, n + 1):
            previous, current = current, (previous + current)

        # When we're done, current will be the result we want.
        return current

We've now reduced our memory footprint to O(1), because all we're storing are two integer variables, no matter how big n is. This, for fibonacci, is about as good as it gets for calculating it using a sequence of additions: linear time and constant memory. Note, too, that not all dynamic programming solutions can be optimized this far — our ability to do it here depended on the quirk that each step in fibonacci only needed the previous two values, so all the others could be safely forgotten.