ICS 33 Spring 2024
Notes and Examples: Functional Programming


Background

We might describe Python as a language based around object-oriented programming, because a Python program is organized around interactions between objects. Objects arise from classes, and objects of one class behave differently from objects of another; when we ask an object to do a job for us, what happens next is determined largely by its class. These are among the characteristics of what is commonly thought of as "object-oriented," though we'll defer a longer discussion about that for another day.

While you'll no doubt find similarities between one programming language and another if you study more than one of them, not all languages are designed around the same set of concepts, both because they're intended to be used to solve different kinds of problems, and because their designers have different ideas about what combinations of language features lead to their ideal image of a well-written program. Since those ideas evolve over time, you'll find that languages designed in bygone eras might look and feel quite different from languages designed more recently, but you might also be surprised how many ideas underlying today's languages date back to the 1950s, and how many of today's popular languages (Python included) have a history that stretches back decades even if they've evolved significantly in the years since.

Both historically and currently, object-oriented programming is not the only style of programming you'll encounter. Later in a computing curriculum, you'll likely embark on a comparative study of programming languages, after which you'll have seen a broader range of what's possible, will better understand which techniques might be useful in which circumstances, and will have gained at least some sense of how different language features are implemented behind the scenes. (A study of these topics can be lifelong and still bear fruit; for me, it's still my favorite area of computer science to read and learn about, yet there's still more material to study than I'll digest in my lifetime.)

But we can foreshadow that journey by discussing a somewhat different programming style than you've seen so far, one called functional programming. In its purest form, functional programming is characterized by a few ideas.

- Functions are values, just like integers, strings, and lists are, so they can be stored in variables, passed as arguments, and returned as results.
- A function's result is determined entirely by its arguments, so calling the same function with the same arguments always produces the same result.
- Functions have no side effects, which is to say that they calculate and return a result, but otherwise leave no trace of having been called.

Functional programming's purity leads to at least three potential benefits.

- Functions become easier to reason about and test in isolation, since nothing other than their arguments can affect their results.
- A function's results can safely be cached and reused, since calling it again with the same arguments is guaranteed to produce the same result.
- Separate function calls can safely be run in parallel, since calls have no side effects that could interfere with one another.

Of course, real-world practicality is not to be ignored, unless we have the luxury of working outside of real-world constraints. A pure form of functional programming turns out to be relatively impractical for entire programs, for the simple reason that parts of most programs need to have side effects. We need to be able to print output to a shell, write output to a file, create sockets and connect them to a server, interact with a database, or draw graphics within a window, but these are all examples of side effects.

Still, the lines dividing programming paradigms have become blurrier over time, as some predominantly object-oriented languages have adopted features once confined to functional languages, while some predominantly functional languages have allowed their users to implement side effects (with varying degrees of control around when and how it can be done) so they're available when needed. Python, for its part, has gradually improved its support for functional programming over its history. While this doesn't mean that you'll write Python programs purely functionally, we can borrow these ideas when they're useful to us, but use other techniques when they're a better fit.

To do that, though, we'll need to know some of those techniques. So, let's dive in.


Lambda expressions

A lambda expression is an expression whose result is what we sometimes call an anonymous function. A couple of the terms in that sentence are worth calling out in some detail.

A lambda expression begins with the keyword lambda, followed by the names of its parameters (separated by commas), then a colon, and finally a single expression that is the function's body. Calling the function causes its body expression to be evaluated, with the result of that expression returned as the function's result.


>>> lambda n: n * n
    <function <lambda> at 0x000001812D9FDCF0>
             # ^^^^^^ The function's name is not lambda.  It doesn't have a name.
>>> square = lambda n: n * n
                  # ^^^^^^^^ The function takes one argument, returning its square.
>>> type(square)
    <class 'function'>
               # The result of a lambda is a function, just like the functions we build
               # with a def statement.
>>> square(3)
    9          # If we call the result of a lambda, it's just like calling any other function.

It's worth being careful about the word "expression," which keeps appearing in this section. For example, in our lambda expression above, we didn't use a return statement to return its result, but could we?


>>> lambda n: return n * n
    Traceback (most recent call last):
      ...
      lambda n: return n * n
                ^^^^^^
    SyntaxError: invalid syntax

The answer is an emphatic "No!", because the body of a lambda must be a single expression, but return n * n is not an expression; it's a statement. This rules out many other things you might normally write in the body of a function.


>>> lambda n:
...     if n == 0:
...         print('Zero')
...     else:
...         print('Non-zero')
...
    Traceback (most recent call last):
      ...
      lambda n:
               ^
    SyntaxError: invalid syntax

So, generally, we need to be aware of what lambdas are and what they aren't. A lambda expression is a way to write a function whose body is a single expression and for which we don't need a name. Anything else is better written using a def statement, like the functions we've written previously.
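
One subtlety worth knowing: Python's conditional expression (the inline form written as x if condition else y) is an expression rather than a statement, so it can legally appear in a lambda's body, even though an if statement can't. A quick sketch, where describe is just an illustrative name:


>>> describe = lambda n: 'Zero' if n == 0 else 'Non-zero'
>>> describe(0)
    'Zero'
>>> describe(17)
    'Non-zero'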

Lambdas support the full richness of Python's parameter-accepting mechanism, though, which makes them able to express certain patterns very succinctly. Tuple- and dictionary-packing parameters, positional- and keyword-only parameters, default arguments, and so on, are all available using the same syntax you'd use in a function's parameter list.


>>> make_reversed_list = lambda *values: list(reversed(values))
>>> make_reversed_list(1, 2, 3, 4, 5)
    [5, 4, 3, 2, 1]
>>> make_dict = lambda **kwargs: dict(kwargs)
>>> make_dict(a = 1, b = 2, c = 3)
    {'a': 1, 'b': 2, 'c': 3}

Lambda expressions may seem like a bit of a curiosity, since writing short, anonymous functions like these may seem unnecessary. Where they become exceptionally useful is in conjunction with another feature: higher-order functions.


Higher-order functions

As we've seen, functions are objects in Python, just like integers, strings, lists, and ranges are. They have a different type — the class is called function — and they serve a particular purpose — they can be called — but they're otherwise objects like any other, which means they have attributes that can store values, and they can themselves be stored in attributes or local variables, passed to other functions as arguments, and returned from functions as results. The line between code and data may have felt bright and clear when you began with Python, but it turns out to be a somewhat blurry one: Functions and integers are both objects, and they share more characteristics than may have seemed initially evident.
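
Since that blurriness is worth seeing firsthand, here's a short shell session sketching it; the greet function and its description attribute are just illustrations:


>>> def greet():
...     return 'Hello Boo!'
...
>>> greet.description = 'Returns a friendly greeting'
                  # Functions have attributes, in which we can store values.
>>> greet.description
    'Returns a friendly greeting'
>>> another_greet = greet
                  # Functions can be stored in variables, like any other object.
>>> another_greet()
    'Hello Boo!'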

We say that a higher-order function is one that operates on other functions, by either accepting other functions as arguments, returning other functions as their results, or both. While this can be an odd concept when you first encounter it, it's actually an enormously powerful one, though we'll need to acquaint ourselves with some of the details before we'll see why. For good reason, higher-order functions are one of the hallmarks of functional programming.

Functions that accept other functions as arguments

If functions are objects, then it stands to reason that we should be able to pass them as arguments to other functions, with the mechanism being just like any other argument-passing we might like to do: a parameter receives the function, and we can then call it like any other.


>>> def square(n):
...     return n * n
...
>>> def transform_all(f, values):
...     for value in values:
...         yield f(value)
...             # ^^^ If f is a function, we should be able to call it,
...             #     as long as it can accept the one argument we passed it.
...
>>> list(transform_all(square, [1, 2, 3]))
    [1, 4, 9]
>>> list(transform_all(len, ['Boo', 'is', 'happy', 'today']))
    [3, 2, 5, 5]
>>> list(transform_all(int, ['1', '2', '3', '4']))
    [1, 2, 3, 4]
>>> list(transform_all(lambda n: -n, [1, 3, 5]))
    [-1, -3, -5]     # ^^^ A lambda builds a function, so we should be able to pass it
                     #     to transform_all.

That transform_all is a higher-order function makes it a lot more powerful than the usual functions we write. It doesn't solve a single problem; instead, it solves a kind of problem. By passing it different functions, we can configure it to suit entirely different purposes, in ways that we might not have needed or imagined when we first wrote transform_all. If we need to transform a collection of values in some way, transform_all can do it; the magic is in the function we pass to it, which specifies what transformation will be done.

Meanwhile, transform_all itself is indifferent not only to the transformation, but also to what data structure stores the original values, since it relies only on its values parameter being iterable. It's equally indifferent to how the transformed values will be used or stored, since it returns a generator that produces them one at a time, without regard to what will happen to them. That humble little function is a powerful one indeed.
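
We can see that indifference in action: any iterable works, and the results emerge lazily. (A brief demonstration, reusing square and transform_all from above.)


>>> list(transform_all(square, range(4)))
    [0, 1, 4, 9]       # Ranges are iterable, so they work just as well as lists.
>>> transformed = transform_all(square, [1, 2, 3])
>>> next(transformed)
    1                  # Results are produced one at a time, only as they're requested.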

It's also as performant — from the perspective of asymptotic analysis — as any function like this can be. To transform n values, it'll require O(n) time, but only O(1) memory, since the values are transformed one at a time, generated, and then are no longer considered by transform_all. (In our examples above, we passed the result to list, which would have stored the values in a list, but that's not a requirement of transform_all; that arose from how we used transform_all, so would not be part of its analysis.)

I should point out, too, that we could have implemented it even more simply without compromising its flexibility or asymptotic analysis, by replacing the loop in its body with a generator comprehension instead.


def transform_all(f, values):
    return (f(value) for value in values)

Functions that return other functions

Just as we can write functions that accept other functions as arguments, we can also write functions that build and return new functions. Let's first investigate what that might look like.


>>> def make_function():
...     def another_function():
...         return 'Hello Boo!'
...     return another_function
...
>>> make_function
    <function make_function at 0x000001F29D48DCF0>
>>> make_function()
    <function make_function.<locals>.another_function at 0x000001F29D48D000>
>>> make_function()()
    'Hello Boo!'

We know that the def statement builds a function, gives it a name, and stores it in a variable or attribute (of the same name) within the scope where the def statement appears. When we defined make_function, we did so in the Python shell, so it was stored in the make_function attribute of the __main__ module. But what happens when another_function is defined? When is it defined?


>>> another_function
    Traceback (most recent call last):
      ...
    NameError: name 'another_function' is not defined

We see that another_function has no value within the Python shell. That's no accident, though. If another_function is created within the body of make_function, then it's not created unless and until make_function is called, and is stored in a local variable within make_function (only while make_function is still executing). Since make_function returns another_function, it goes on living, but only to the extent that the return value is stored somewhere else.

We can confirm that another_function is built freshly each time make_function is called, too, by verifying that two calls to make_function return objects that have a different identity.


>>> f1 = make_function()
>>> f2 = make_function()
>>> f1 is f2
    False        # They're not the same object!

So, those are some of the mechanics behind functions that return other functions, but there's a much more important question to consider. What purpose does a technique like this serve? In fairness, it probably serves no purpose if the function built is equivalent every time. Since make_function returns an equivalent function that returns 'Hello Boo!' every time it's called, this doesn't accomplish anything that we couldn't have accomplished by simply writing another_function outside of make_function.

But there's no reason why the function we're building has to be equivalent every time. If the inner function is being built freshly each time the outer function is called, then why couldn't the inner function be different each time? If it was, then we'd have something a lot more powerful: a tool that can build many similar functions for us, so we don't have to write them ourselves.
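
To sketch that idea before we look at a fuller example, here's a hypothetical make_adder function, which builds a different addition function depending on the argument passed to it:


>>> def make_adder(n):
...     def add(m):
...         return n + m
...     return add
...
>>> add_three = make_adder(3)
>>> add_three(10)
    13           # This function adds 3 to its argument.
>>> make_adder(25)(10)
    35           # A different argument to make_adder builds a different function.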

We saw previously that Python provides built-in functions any, which takes an iterable argument and tells you whether any of its values are truthy, and all, which instead tells you whether all of an iterable's values are truthy.


>>> any(x > 0 for x in [-1, -2, -3, -4])
    False
>>> any(x > 0 for x in [-1, -2, -3, -4, 0, 1, 3])
    True
>>> all(x > 0 for x in [-1, -2, -3, -4, 0, 1, 3])
    False
>>> all(x > 0 for x in [1, 3, 5, 7])
    True

Missing from Python's collection of built-in functions is another related function that tells you whether none of an iterable's values are truthy. We might call such a function none, and we might expect it to behave this way. (One might also debate the wisdom of naming the function none when Python already has a built-in value named None and a type named NoneType, which might go a long way toward explaining why this function is missing in the first place.)


>>> none(x > 0 for x in [-1, -2, -3, -4])
    True
>>> none(x > 0 for x in [-1, -2, -3, -4, 0, 1, 3])
    False

While we could write our none function from first principles, using a for loop and iterating the elements by hand, none has an interesting relationship with any. Whenever any returns True, none would return False; whenever any returns False, none would return True. We might then describe none as the negation of any. That property leads to a simpler implementation: none can call any and negate its result.


>>> def none(iterable):
...     return not any(iterable)
...

But the idea that we'd like to take a boolean-returning function and build its negation is one that recurs. So, rather than writing these functions, it would be nice if we could build a tool that builds them for us, both to keep us from having to write them, and to give us a terse syntax for passing an existing function's negation to another function as an argument. (For similar reasons we want lambda expressions to allow us to write simple one-off functions, we might also want to be able to build simple one-off functions without having to write them at all.)


>>> def negate(f):
...     def execute(n):
...         return not f(n)
...     return execute
...
>>> none = negate(any)
>>> none(x > 0 for x in [-1, -2, -3, -4])
    True
>>> none(x > 0 for x in [-1, -2, -3, -4, 0, 1, 3])
    False

Our negate function is a little longer than our none function was, but that's partly because it has a superpower: It can be used to build the negation of many different one-argument functions, rather than just being the negation of one function. We can shorten our implementation, too, using a lambda expression, giving us a very terse notation that belies its power.


>>> def negate(f):
...     return lambda n: not f(n)
...

Mathematics offers an idea that gives us a more thorough example here. We say that the composition of two functions is a function that takes one input, applies one of the functions to it, then passes its output as the input to the other. This is sometimes written mathematically using the ∘ symbol, such as (f ∘ g)(n) = f(g(n)).

That idea turns out to be a valuable one in the programs we write, too. We'd like to be able to chain functions together, so that the output of one function becomes the input to another, because it's frequently useful as a way to combine simpler functions into more powerful ones. If we want to write a function that calculates twice the square of its input, one way to do that is to write a function that does just that: square its input, then multiply it by two, then return the result. But if we already had a function that squared its input and another that multiplied its input by two, then we could compose them together into our double-of-the-square function, without having to write a new function by hand.


>>> def compose(f, g):
...     def execute(n):
...         return f(g(n))
...     return execute
...
>>> double_square = compose(lambda n: n * 2, lambda n: n * n)
>>> double_square(3)
    18
>>> double_square(5)
    50
>>> formatted_length = compose(len, str)
>>> formatted_length(1)
    1          # When formatted, 1 becomes '1', which has length 1.
>>> formatted_length([1, 2])
    6          # When formatted, [1, 2] becomes '[1, 2]', which has length 6.

Like our transform_all function in the previous section or the negate function above, compose is a powerful function, because it can solve a kind of problem instead of a particular one. In this case, compose can glue any two functions together, as long as they can each accept one argument, and as long as they can be combined compatibly (i.e., the type of result returned by one can be accepted as an argument by the other).
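
When two functions can't be combined compatibly, the failure emerges when the composed function is called, rather than when it's built. (A hypothetical example of an incompatible pairing.)


>>> bad_idea = compose(len, int)
                  # Nothing fails yet, because neither function has been called.
>>> bad_idea('123')
    Traceback (most recent call last):
      ...
    TypeError: object of type 'int' has no len()
                  # int('123') returns 123, but integers have no length.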

We can take this idea one step further, as well, by instead writing a function that composes an arbitrary number of functions together — as long as there's at least one — instead of exactly two. Such a function might read a little more clearly if the functions are executed in the order specified, instead of the reverse of that order (as compose did), which would also allow the functions to be iterated inexpensively, even if they're coming from a generator. But, since we've now deviated from the mathematical notation of function composition, perhaps we should use a different name for our function, as well. So, let's say that a pipeline is a sequence of functions, in which the output of the first becomes the input to the second, the output of the second becomes the input to the third, and so on. Given that, we can name our function make_pipeline.


>>> def make_pipeline(first, *rest):
...     def execute_pipeline(n):
...         current = first(n)
...         for f in rest:
...             current = f(current)
...         return current
...     return execute_pipeline
...
>>> square_of_string_length = make_pipeline(str, len, square)
>>> square_of_string_length(10)
    4
>>> square_of_string_length(100)
    9

Partially called functions

When we call a function, we pass all of its necessary arguments to it. Those arguments, in turn, are bound to the function's parameters, then the body of the function is executed in a scope that includes those parameters. Whatever the function returns becomes the result of the function call. This technique is among the first ones you learn when studying Python, because it's such a fundamental part of Python's design.

But what happens if we don't pass all of its necessary arguments? Your prior study of Python means you probably know the answer to this question already: An exception is raised.


>>> def multiply(n, m):
...     return n * m
...
>>> multiply(3)
    Traceback (most recent call last):
      ...
    TypeError: multiply() missing 1 required positional argument: 'm'

Our multiply function presents us with a kind of contract: "If you give me two values, n and m, I'll multiply them and tell you their product." As long as we hold up our end of the bargain, by providing values for both n and m, then multiply can fulfill its promise and return its result. But if we leave one of the arguments out, we haven't met our obligation, so multiply won't be able to meet its own. This is similar to the contractual agreements we make in our day-to-day lives, such as the ones you make when you shop online, where a merchant says "If you pay me $30 today, I'll ship you five widgets that will arrive on Wednesday." As long as your credit card transaction is approved, they'll ship your goods; if not, the deal is off.

But there's another way to think about an expression like multiply(3), even though Python treats it as an out-and-out failure. We haven't held up the entirety of our bargain, but we've certainly held up some of it. To purchase a $30 item, I could pay someone $10 today, followed by $20 next week, at which point they would be willing to ship me the item. Why couldn't functions work the same way? Why couldn't I pass one of multiply's arguments now, then pass the other one later, and, only then, multiply would give me its answer? We'd need a mechanism that makes that work, but the concept isn't an unreasonable one.

When we call a function without passing all of its arguments, we say that we've partially called it. Some functional programming languages support this directly, meaning that you can call any function with fewer arguments than it requires, then pass the others later. Python doesn't provide this feature directly — multiply(3) will always raise an exception in Python if multiply requires two arguments — but it provides us a way to achieve it nonetheless. Whether the feature is supported directly in the language or not, what we need is a way to defer the execution of the function until the point in time where we've provided all of its arguments. Why couldn't it work this way?

If function calls worked this way in Python — they don't! — then we could see this play out in the Python shell like this.


>>> multiply_three_by = multiply(3)
                  # multiply(3) would return a partial function that multiplies 3
                  # by its argument.
>>> multiply_three_by(8)
    24            # multiply(3, 8) would be called behind the scenes, returning 24.

But we're not out of luck here, because Python does provide us the ability to write higher-order functions. This means we could write a function that partially calls another function, returning a partial function that completes the call for us later.


>>> def partially_call(f, n):
...     def complete(m):
...         return f(n, m)
...     return complete
...
>>> multiply_by_three = partially_call(multiply, 3)
>>> multiply_by_three(8)
    24

The partially_call function could be made more general than this, though, by allowing us to pass any number of positional arguments to it, then returning a partial function that also accepts any number of positional arguments. (That way, we could partially call a four-parameter function by passing it two arguments now and two more later.) Tuple-packing parameters and iterable-unpacking arguments combine nicely to fit that purpose.


>>> def partially_call(f, *args):
...     def complete(*remaining_args):
...         return f(*args, *remaining_args)
...     return complete
...
>>> multiply_by_three = partially_call(multiply, 3)
>>> multiply_by_three(8)
    24
>>> print_boo_description = partially_call(print, 'Boo', 'is')
>>> print_boo_description('happy', 'today')
    Boo is happy today
>>> print_boo_description('always', 'totally', 'perfect')
    Boo is always totally perfect

Partial functions are nonetheless functions, so we can partially call them, as well, with all of the arguments from all of the partial calls being combined as we'd expect.


>>> print_boo_happy_reason = partially_call(print_boo_description, 'happy', 'because')
>>> print_boo_happy_reason('of', 'cool', 'weather')
    Boo is happy because of cool weather

We could take this a couple of steps further — supporting keyword arguments, for example — but, fortunately, Python and its standard library provide a fair number of common functional programming tools, including one that solves this problem in its full depth, so we'll stop here for now. By now, though, you'll have seen the power of writing functions that build and return functions; they automate repetitive patterns that we might otherwise have to write by hand every time we need them.
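
If you're curious what one of those further steps might look like, here's a sketch of keyword argument support, using dictionary-packing parameters and dictionary-unpacking arguments; this is only an illustration of the idea, not how the standard library actually implements it.


>>> def partially_call(f, *args, **kwargs):
...     def complete(*remaining_args, **remaining_kwargs):
...         return f(*args, *remaining_args, **kwargs, **remaining_kwargs)
...     return complete
...
>>> print_boo = partially_call(print, 'Boo', sep = 'X')
>>> print_boo('is', 'happy')
    BooXisXhappy     # Both the positional and keyword arguments were remembered.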

Now that we've acquainted ourselves with the concepts underlying higher-order functions, let's take a look at what's built into Python already.


Common higher-order functions

Built-in functions: map and filter

In programming languages that are meant to support functional programming, there are three common higher-order functions that you'll always find in some form; the names will sometimes be different, but the ideas will always be there. Two of these three are built into Python, so they're available without importing any modules from its standard library. The first is called map, which takes a function and one or more iterables. When called with one iterable, which is how it's most commonly used, it returns an iterator that lazily calls that function on each element of the iterable, one at a time, producing each of the results.


>>> list(map(lambda n: n * n, range(5)))
    [0, 1, 4, 9, 16]

The built-in map function is similar in concept to the transform_all function we wrote earlier, but you can also pass multiple iterables to map, in which case it iterates all of them, passing one element from each iterable as a collection of arguments to the function instead. Naturally, the function will need to accept as many arguments as there are iterables.


>>> list(map(lambda a, b, c: a + b * c, [1, 3, 5], [2, 4, 8], [-1, 1, -1]))
    [-1, 7, -3]        # 1 + (2 * -1) = -1,  3 + (4 * 1) = 7,  5 + (8 * -1) = -3
>>> list(map(lambda a, b: a * b, [1, 2, 3], [4, 5]))
    [4, 10]            # 1 * 4 = 4,  2 * 5 = 10
                       # Once one iterable runs out of elements, that's the end of the output.

Another of the most common higher-order functions you'll find in functional programming languages is one that's often called filter, whose job is to process a sequence of values, keeping some while discarding others. Python's built-in filter function does something similar; you feed it a function and an iterable, and it'll yield all of the values from the iterable where the function returns a truthy value.


>>> def is_positive(n):
...     return n > 0
...
>>> list(filter(is_positive, [1, -2, 3, -4, 5, -6]))
    [1, 3, 5]          # When the function returns a truthy result, we keep the value.
                       # Otherwise, it's discarded.
>>> list(filter(negate(is_positive), [1, -2, 3, -4, 5, -6]))
    [-2, -4, -6]       # The negate function we wrote previously comes in handy here.

functools.reduce

The third of the most common higher-order functions is not built into Python, though you'll find it in the standard library in a module called functools, whose role is to provide tools for operating on functions. The name of this function varies from one programming language to another — reduce, fold, aggregate, and accumulate are common names you'll see — but Python calls its version functools.reduce.

Reduction, in this case, means to take a sequence of values and reduce it to a single value, by combining each value with a running result, then returning the final result after they've all been combined. The combining operation is a function that can accept two arguments: the running result obtained by combining all of the previous values, and the next value in the sequence, in that order. For example, if we reduce the values 1, 2, 3, and 4 using a function that adds a value to a running sum, we'd be calculating ((1 + 2) + 3) + 4, whose result is 10.


>>> import functools
>>> functools.reduce(lambda a, b: a + b, [1, 2, 3, 4])
    10                  # Adding the elements together yields their sum.
>>> functools.reduce(lambda a, b: a + b, [[1, 2], [3, 4], [5, 6]])
    [1, 2, 3, 4, 5, 6]  # Adding lists together flattens them.
>>> functools.reduce(max, [3, 11, 7, 2, 1, 9])
    11                  # If each combining operation returns the larger of the new value
                        # and the largest so far, we'll have determined their maximum.

So, as it turns out, reduction is a surprisingly powerful operation. A lot of loops you might write can be expressed this way instead, and in some functional programming languages that don't offer loops as a feature, you'll find yourself writing a lot of reductions instead.
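
To make that correspondence concrete, here's a hand-written loop that performs the same summation as the reduction above; sum_all is a hypothetical name, just for comparison.


>>> def sum_all(values):
...     total = 0
...     for value in values:
...         total = total + value
...     return total
...
>>> sum_all([1, 2, 3, 4])
    10           # The same result as functools.reduce gave us, because the loop
                 # combines a running result with each value, just like a reduction.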

There's one edge case worth thinking about, though. If reduction combines values together, what happens if there aren't any values at all?


>>> functools.reduce(lambda a, b: a + b, [])
    Traceback (most recent call last):
      ...
    TypeError: reduce() of empty iterable with no initial value

An exception was raised, and the failure is actually a sensible one: If we have no values, there's no reasonable default, since there's no way for functools.reduce to know what types of values we intended to pass. The addition function can add numbers, lists, strings, and many other types, so what should it return? 0? An empty list? Rather than guessing, functools.reduce fails instead.

But the error message suggests a way out: passing an initial value. Perhaps that's an additional (optional) argument we could pass.


>>> functools.reduce(lambda a, b: a + b, [], 0)
    0                    # When we pass an initial value, it's returned when there are no values.

Notice, though, that the error message called it an initial value rather than a default value — words have meanings that suggest purposes, which is why we want to think carefully about the identifiers we choose and the way we write our error messages — so it's reasonable to expect that the initial value has an impact even when the iterable isn't empty.


>>> functools.reduce(lambda a, b: a + b, [1, 2, 3], 12)
    18                   # Yep!  12 + 1 + 2 + 3 = 18.

So, functools.reduce is one function that can perform a dizzying array of tasks. But there's one more thing about it that's cumbersome. When we wanted to tell it to use addition as its combining operation, we had to pass a function that took two arguments and added them together, the shortest expression of which was lambda a, b: a + b. That expression is chock-full of punctuation, which makes it difficult to tell, at a glance, where the first argument ends and the second one begins. It would be better if we could just say "Use addition" in a shorter (and less punctuation-heavy) way. Why not just use +?


>>> functools.reduce(+, [1, 2, 3])

This, unfortunately, is not syntactically legal. + is an operator rather than a function, so there are limitations on where we can use it in an expression. But the idea that we'd like to have functions that mirror our operators is a sound one.

Operators as functions

The operator module in Python's standard library provides a collection of functions that mirror Python's operators, so that you can avoid writing functions manually in cases like this.


>>> import operator
>>> functools.reduce(operator.add, [1, 2, 3])
    6                # The sum of the reduced values.
>>> list(map(operator.mul, [1, 2, 3], [4, 5, 6]))
    [4, 10, 18]      # The product of each corresponding pair of values.
>>> list(filter(operator.truth, [-3, -2, -1, 0, 1, 2, 3]))
    [-3, -2, -1, 1, 2, 3]
                     # 0 is missing because it's the only value that isn't truthy.
>>> list(filter(negate(operator.truth), [-3, -2, -1, 0, 1, 2, 3]))
    [0]              # If we negate operator.truth with our negate function,
                     # we're asking to keep the values that are falsy instead.

Having a collection of operators-as-functions means that we can easily use operators alongside higher-order functions, so that we can express complex ideas tersely but effectively.

Partially calling functions with functools.partial

We discussed the notion of partially called functions previously, so it's worth pointing out that functools provides a tool for exactly this purpose, with some additional abilities — such as properly handling keyword arguments — that our version lacked.


>>> print_boo_description = functools.partial(print, 'Boo', 'is')
>>> print_boo_description('happy', 'today')
    Boo is happy today
>>> type(print_boo_description)
    <class 'functools.partial'>       # Strangely, this isn't a function, yet we can call it!
>>> print_boo_description.args
    ('Boo', 'is')                     # Arguments from the partial call are stored, to be used later.
>>> print_boo_description.func
    <built-in function print>         # The original function is stored, to be called later.
>>> print_boo_description.keywords
    {}                                # If we had passed keyword arguments, they would have been stored here.
>>> print_with_xs = functools.partial(print, sep = 'X')
>>> print_with_xs.keywords
    {'sep': 'X'}                      # Yep!  There's one.
>>> print_with_xs('Hello', 'there')
    HelloXthere                       # The keyword argument is included when I complete the call later.

The functools.partial function's design rides on the same ideas that we've seen here, but we saw one slightly surprising aspect of it. It returns an object whose type is functools.partial, yet we were able to call it as though it was a function. We've observed previously that functions are objects — so that we can pass them as arguments to other functions, store them in variables, and so on — but this is a variation we've yet to see. Even objects that aren't functions can be treated, at least in some cases, as though they are. By what mechanism does that work? Let's take a look.


Implementing the mechanics of calling

When we discussed the Python Data Model, we noticed a recurring theme in Python's design. When we want to customize some aspect of how Python behaves when applied to objects of a class of ours — such as their length, the meanings of the operators when applied to them, or the conditions under which they're considered to be truthy — we can do so by adding the appropriate dunder methods to the class. The key, ultimately, is to know which dunder methods are associated with each Python language feature, then to understand the details of how the use of that language feature becomes calls to those dunder methods.

What about the mechanism for calls themselves? We've seen that functions can be called, but we've now noticed that other objects can be called, too. (In truth, we actually knew this already, because we knew that classes can also be called.) We say that an object is callable if it can be called like a function, i.e., we can syntactically follow it with parentheses and pass arguments to it, in which case it will either return a result or raise an exception. We can check whether an object is callable using Python's built-in callable function.


>>> def square(n):
...     return n * n
...
>>> callable(square)
    True       # Functions are callable.
>>> callable(int)
    True       # Classes are callable.
>>> callable(square(3))
    False      # square(3) returns 9, which is an integer.  Integers are not callable.
>>> import functools
>>> callable(functools.partial(print, sep = 'X'))
    True       # The partial functions returned by functools.partial are callable.

So, as we'd expect, some kinds of objects are callable and others aren't. What makes an object callable is the presence of a dunder method named __call__, which is called when an object of its class is called, with any arguments passed becoming arguments to __call__. If an object has no __call__ method, it will not be possible to call it.


>>> class Square:
...     def __call__(self, n):
...         return n * n
...
>>> Square(3)
    Traceback (most recent call last):
      ...
    TypeError: Square() takes no arguments
                  # This is an attempt to construct a Square object, rather than
                  # a call to one.  Since Square has no __init__ method,
                  # arguments can't be passed when constructing one.
>>> s = Square()
                  # But we can construct a Square object and store it in a variable.
>>> callable(s)
    True          # Square objects are callable, because of the __call__ method.
>>> s(3)
    9             # Calling a Square object calls its __call__ method.

The parameters of a __call__ method can use all of the same features as any other function: keyword parameters, positional-only parameters, tuple- and dictionary-packing parameters, and so on. So, theoretically, any function could also be implemented by a class with a corresponding __call__ method. There would be little reason to do so, since that's just a longer-winded way of saying something that could be said more simply with a def statement or lambda expression, but that's the mechanism at work. Calls to objects turn into calls to their __call__ method.


>>> s.__call__(3)
    9           # We can call Square's __call__ method directly.
>>> square.__call__(3)
    9           # Functions also have a __call__ method, so we can call it directly, too.

While we won't find ourselves directly calling the __call__ method often, implementing a class with a __call__ method is more common than you might imagine. When we write code that builds code, as we did when we wrote higher-order functions that return functions, we'll sometimes need all of the tools at our disposal, so it's best for us to understand the whole mechanism that makes functions work.
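
To tie the mechanism back to where we started, here's a sketch that re-imagines our earlier negate function as a class, so that each Negation object stores a function and negates it when called. (The Negation class is hypothetical, just to demonstrate __call__ in action.)


>>> class Negation:
...     def __init__(self, f):
...         self._f = f          # Store the function whose negation we want.
...     def __call__(self, n):
...         return not self._f(n)
...
>>> none = Negation(any)
>>> callable(none)
    True          # Negation objects are callable, because of their __call__ method.
>>> none(x > 0 for x in [-1, -2, -3, -4])
    True          # Just like the none function we built with negate earlier.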