Writing Classes (continued)

In this lecture we will look at the scope of names: the places that the names
can be used and how Python determines the object associated with a name. We
will first look at scope in modules. Then we will look at class names referring
to value objects (not methods). Finally we will examine how to define one
function inside another and discuss the full generality of the LEGB rule
(Local, Enclosing, Global, and Builtins). We close with a discussion of the
different kinds of names in Python.

------------------------------------------------------------------------------

Scope in Modules: Global and Local Names for Functions

Modules (that do not define/import classes) typically define names that refer
to value objects and names that refer to functions, as is illustrated below.

  x = 1

  def factorial(n):
      answer = 1
      for i in range(2,n+1):
          answer *= i
      return answer

  g = factorial

  print(factorial(3),g(3))

In this code Python binds names (x, factorial, g) to objects (a value object
that is an instance of the class int, a function object). Note that after
defining g = factorial, the names g and factorial both refer to (share) the same
function object. So calling either factorial(3) or g(3) calls the same function
object with the argument 3 (matching its parameter n) and both return a result
of 6 (which is 3! = 1*2*3). So this script prints: 6 6

Names defined at the module level are called "global" names. So in this module,
x, factorial, and g are all examples of global names.
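As a small sketch of this sharing, the code below uses the is operator to check
that the two global names refer to the same function object. It assumes the
builtin range in the loop; the __name__ attribute shows the name that the
function object was originally defined with.

```python
def factorial(n):
    answer = 1
    for i in range(2, n + 1):
        answer *= i
    return answer

g = factorial              # g and factorial now share one function object

print(g is factorial)      # True: two names, one object
print(g.__name__)          # factorial: the object remembers its def name
print(factorial(3), g(3))  # 6 6
```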

Functions also define names: parameter names and local names (we might more
accurately define parameter names as a special kind of local name). When a
function is called, all its parameter names are bound to the arguments from the
call; when the function returns, these names disappear, only to be rebound when
the function is called again. The factorial function defines only the parameter
n.

When a function executes, it can define local names which are initialized and
used within the function. As with parameters, when the function returns, these
names disappear, only to be redefined again when the function is executed again.
The factorial function defines the local names answer and i.

    Contrast this with the instance/self names for objects; when an object is
    constructed in the __init__ method, it defines instance/self names which
    stay with the object through all its method calls: in fact most method
    calls examine, change the state of, or rebind these instance/self names.

The script outside a function cannot refer to any parameters or local names
defined inside the function: these names are for use only within that function
(inside the scope of the function). Also two functions that are both global
cannot refer to/examine any names defined in the other. At the end of this
lecture we will see how to define one function INSIDE another, in which case
the inner function can examine the names in the outer function (in the outer
function's scope).
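The following minimal sketch illustrates that the script cannot see a
function's local names. The function total_of and its local name running_sum
are hypothetical, chosen only for this illustration.

```python
def total_of(values):
    running_sum = 0          # local name: exists only while a call executes
    for v in values:
        running_sum += v
    return running_sum

print(total_of([1, 2, 3]))   # 6

# Outside the function, its parameter and local names are not defined.
try:
    print(running_sum)
    visible = True
except NameError:
    visible = False

print(visible)               # False
```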

There is one interesting interaction between global names and names used in
functions. A function may use (refer to, but not change the binding of) a global
name. For example, we can trivially use the global name x in the function f.

  x = 1

  def f(n):
      print(n,x)

  f(3)

This script prints: 3 1

In a more interesting way, we could define

  tracing = True

  def f(n):
      if tracing:
          print('f called with argument', n)
      ....

  f(3)

  tracing = False

  f(3)

When the first f(3) is called, the global name tracing is bound to True, so the
function's if test evaluates to True and the function prints

  f called with argument 3

But the next statement rebinds the name tracing to False, so when the second
f(3) is called, the global name tracing is bound to False, so the function's if
test evaluates to False and it skips the print.

Generally, when a function is called and it refers to a name that it does not
define locally, Python looks for a binding of that name in the global scope.

Of course, it is better style not to use a global name in a function, but
instead to pass its value as an argument to some parameter of the function. But in the
case of a global tracing name (which might affect many functions), sometimes it
is easier to just define it as a global name and use that global name in each
function. Beginners often use global names too frequently: they are sometimes
useful, but functions should mostly use only information passed to their
parameters.
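Here is one way the tracing example might look if the information were passed
to a parameter instead. This is only a sketch: the default value for tracing
and the body of f (which just doubles n) are assumptions made for this
illustration.

```python
def f(n, tracing=False):
    # tracing is a parameter, so each call decides whether to trace
    if tracing:
        print('f called with argument', n)
    return 2 * n             # placeholder body (an assumption)

f(3, tracing=True)           # prints: f called with argument 3
f(3)                         # prints nothing
```

Now the behavior of each call depends only on its arguments, not on whatever
value some global name happens to be bound to when the call occurs.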

Now we will come to something confusing, but we will learn the right way to
think about it. I'm cutting this example down to the bare bones. In previous
examples we examined functions that examined a global name; here it "looks like"
we are trying to change the binding of such a global name (from 1 to 2).

  x = 1

  def f(n):
      x = 2
      print(n,x)

  f(3)

  print(x)

But what actually happens is that x = 2 defines a new local name in the
function f; it DOES NOT change the binding of the global name (but we will see
how this can be done soon). So, the function prints the value associated with
this local name (2) and then the function terminates and the local name goes
away. The print in the script after the function call refers to the global name
x whose binding is unchanged, and prints its value (1).

If we want to change the binding for the global name x inside the function f, we
can do so, but it requires a special "global" declaration.

  x = 1

  def f(n):
      global x
      x = 2
      print(n,x)

  f(3)

  print(x)

Here, the global x declaration means that the function should not define x as
a local name, so x = 2 should rebind the global name x, not introduce a local
name x in the function; so it DOES change the binding of the global name. The
function prints the value associated with this updated global name (2) and then
the function terminates. The print in the script after the function call refers
to the global name x whose binding the function changed and prints its new
value (2).

In fact, if we didn't start by defining a global name x, but still wrote

  def f(n):
      global x
      x = 2
      print(n,x)

  f(3)

  print(x)

Then the definition x = 2 inside the function f defines a (new) global name and
binds it to 2, so this code prints what the code above prints.

To summarize what we know at this point

(1) We can refer to global names in a function without a special mechanism, but
       not rebind the values of global names (see 2-3)

(2) If we assign to a global name in a function, the function instead creates
       a local name and leaves the global name untouched (bound to the same
       value).

(3) If we declare a name global in a function then in contrast to (2), any
       assignment to that name binds (if the name is not already bound)
       or rebinds (if the name is already bound) the global name to a new value

Finally, these rules combine to create a confusing situation. Look at the code
below, which has no "global x" declaration. There are two natural ways to
interpret what is happening: the simple one is WRONG, the more complicated one
is RIGHT.

  x = 1

  def f(n):
      print(n,x)
      x = 2
      print(n,x)

  f(3)

  print(x)

WRONG: We might think that Python binds the global x to 1; then calls the
function f, which prints the value 1, bound to the global name (by 1 above) and
then defines and initializes to 2 a local name (by 2 above); and then prints 2,
the value bound to the local name. But this is not what Python does.

RIGHT: Python raises an exception on the first print(n,x) in f:

   UnboundLocalError: local variable 'x' referenced before assignment

Python's reasoning goes as follows. When f is declared (before it is called)
Python determines that it doesn't declare x to be global, so it knows that x
will be defined as a local name, and ALL USES of x will be to that local name.
But the first use of x occurs BEFORE that local name is defined and initialized
to 2, so Python raises an exception.
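The sketch below catches the exception so that the script can continue. It also
inspects f.__code__.co_varnames, a detail of the standard CPython
implementation, to show that x was classified as a local name when f was
defined, before any call. The exact wording of the error message differs
between Python versions, so the code checks only the kind of exception.

```python
x = 1

def f(n):
    print(n, x)              # x is local here, but not yet bound
    x = 2
    print(n, x)

# Parameter and local names of f, determined when f was defined
print(f.__code__.co_varnames)    # ('n', 'x')

try:
    f(3)
    raised = False
except UnboundLocalError:
    raised = True

print(raised, x)             # True 1: the global x is unchanged
```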

By this reasoning, examine the following code, which is the same but includes
a global declaration for x at the very end of the function.

  x = 1

  def f(n):
      print(n,x)
      x = 2
      print(n,x)
      global x

  f(3)

  print(x)

A global declaration applies to the ENTIRE body of the function in which it
appears, no matter where in the body it is written. Versions of Python before
3.6 accepted this code (with a warning): calling f first printed the value 1,
of the global name x; then rebound the global name x to the value 2; then
printed the new value 2, of the global name x, and terminated the function.
Then the script printed the value 2, bound to the global name x.

Python 3.6 and later versions refuse to run this code at all. When f is
defined (before it is called) Python determines that x is used before the
global declaration that mentions it, and reports

   SyntaxError: name 'x' is used prior to global declaration

So a global declaration must appear before the first use of the name in the
function; good programming practice places it first in the function. Once x is
declared global, ALL USES of x in the function refer to that global name.

Summary: Python has special rules for how functions use global and local names.
But these rules can be a source of confusion and errors. If possible, use NO
global names in functions; if a function must use information bound to a global
name, use that name as an argument to pass that information to the parameter of
a function.

Can you carefully use the rules above to predict (and explain) what is printed
by the following code? The problem is compounded by the fact that we are using
names of function objects, not names of value objects, but in Python objects
are objects.

  def f():
      print(1)

  def g():
      global f
      def f():
          print(2)
      f()
    
  g()
  f()


------------------------------------------------------------------------------

Class Names: Methods and Values

We have learned that we can define classes in modules: doing so defines the
class name, which refers to the class object (from which instances of the class
can be constructed, when called). When we define a class, primarily we define
all the methods in the class. Every object constructed from a class has its
state initialized by Python automatically calling the __init__ method (which
defines and initializes instance/self names in the object's namespace). And
then we can call class methods on the object in the form o.m(arguments).

But classes can also define other names besides methods: names that are bound
to values. In the methods of the class we refer to such names by using qualified
name syntax: writing them as the name of the class, followed by a period,
followed by the name. Outside the class methods, we can refer to such names via
the same qualified name, or via an object of the class. Again, Python translates
the access o.n as type(o).n (when o itself has no instance name n). A name
defined in this way is shared by all the objects constructed from the class
(unlike instance names, which are different for each object constructed from
the class).

Here is an example of a class C that defines and manipulates a name bound to a
value object. This class does nothing but use this name to count how many
objects are constructed from the class: how many times __init__ is called.
Sometimes this counting is useful, and this is the standard way to do it.
Note that this class uses no instance names (no self.name).

class C:
    objects_created = 0
    
    def __init__(self):
        C.objects_created += 1
        
    def how_many(self):
        return C.objects_created

Here objects_created is defined and initialized in the class, just as the
__init__ and how_many methods are defined. Notice how this name is referred to
in the methods as C.objects_created.

Here are 5 statements that use the class.

a = C()
b = C()

print(a.how_many()) 
print(C.objects_created)
print(a.objects_created)

The last three all print 2, but in different ways:

a.how_many()      calls a method that returns the value
C.objects_created refers to this name directly
a.objects_created uses the fundamental equation of object-oriented programming
                  to refer to
                  this name: Python translates a.objects_created  into
                  type(a).objects_created into C.objects_created
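The sketch below shows why the methods write C.objects_created instead of
self.objects_created. Reading through an object finds the shared class name,
but ASSIGNING through an object defines a new instance name in that one object,
which then hides the class name for that object only. The assignment of 10 is
just an illustration.

```python
class C:
    objects_created = 0

    def __init__(self):
        C.objects_created += 1

a = C()
b = C()
print(a.objects_created, b.objects_created, C.objects_created)  # 2 2 2

# Assigning via a defines an instance name in a; the class name is unchanged
a.objects_created = 10
print(a.objects_created, b.objects_created, C.objects_created)  # 10 2 2
```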

------------------------------------------------------------------------------

Defining Functions within Functions and the full LEGB Rule

Python allows us to define functions within functions. Often this is done to
write a helper function (and later we will see how we can use this technique 
to write functions that return functions). In this lecture we will examine one
function defined in a function to study the scope rules involved.

The following function takes two lists as parameters: a list of values and a
list of checks. It returns a list the same size as the checks list: for every
value in the second list, the result list shows the number of times that value
appears in the first list.

count_all([1, 5, 3, 5, 2, 6, 5, 3, 2, 7, 4], [3, 4, 5]) returns [2, 1, 3]
because 3 occurs 2 times in the first list, 4 occurs 1 time in the first list,
and 5 occurs 3 times in the first list.

def count_all(alist,checks):
    # computed result (by appending values)
    result = []

    # inner function
    def count_one(check):
        count = 0
        for v in alist:
            if v == check:
                count += 1
        result.append(count)

    # body of count_all, to compute the result
    # result changes in count_one (alist is examined there)
    for c in checks:
        count_one(c)
    return result

Here are the names the outer/count_all and inner/count_one functions define
  (1) count_all defines two parameter names: alist and checks
  (2) count_all defines three local names: result, c, and the function count_one

  (3) count_one defines one parameter name: check
  (4) count_one defines two local names: count and v

The interesting scoping issue here is that the inner/count_one function uses
the names alist and result, which refer to names defined in the outer/count_all
function (alist is defined there as a parameter, result is defined there as a
local variable).

Why can an inner function refer to names defined in an outer function?
Generally, Python uses the LEGB rule to determine what name to use (and what
names are legal to use) in a scope. These abbreviate Local, Enclosing, Global,
and Builtins. Whenever we use a name in Python, it looks at the following scopes
IN THE FOLLOWING ORDER:

(1) Local scope (of a module or function)
(2) Enclosing scope (count_all is the enclosing scope of count_one)
(3) Global scope (in the module; discussed at length above)
(4) Builtins (names defined in the builtins module, automatically imported)

So, when defining functions inside functions, the inner function (by rule 2) can
refer to names defined in the outer function. We have already discussed rules
1 and 3 earlier in this lecture. And rule 4 much earlier in the quarter. But we
can now put all these rules together under the LEGB name.
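Here is a small sketch in which each of the four scopes supplies a name. All
the function names are hypothetical; the same name (called name) is defined at
three levels, so we can see which definition each use finds.

```python
name = 'global'

def outer():
    name = 'enclosing'

    def inner_local():
        name = 'local'
        return name          # L: found in inner_local's own scope

    def inner_enclosing():
        return name          # E: not local, so found in outer's scope

    return inner_local(), inner_enclosing()

def uses_global():
    return name              # G: not local, no enclosing function

def uses_builtin():
    return len('abc')        # B: len is found only in builtins

print(outer())               # ('local', 'enclosing')
print(uses_global())         # global
print(uses_builtin())        # 3
```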

Note that although we referred to and mutated result in count_one, we never
changed its binding. If we wrote result = ['strange'] instead of
result.append(count), we might first imagine that count_all would return
['strange']. But it would return [] (the value initially assigned to result).

This is because, as with global names, names in an enclosing scope can be
examined, but if we assign to them Python creates a new local name in the
enclosed scope. So writing result = ['strange'] doesn't change the binding of
the name in the enclosing scope, it defines a new name in the enclosed scope.
As with the "global" declaration, there is a "nonlocal" declaration. If we
declare nonlocal result in count_one, then the assignment result = ['strange']
would change the binding of the result defined as a local name in the enclosing
scope, and the function would ultimately return that strange value.
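The difference can be sketched with two stripped-down functions (hypothetical
names, with the counting logic removed) that differ only in the nonlocal
declaration.

```python
def without_nonlocal():
    result = []
    def inner():
        result = ['strange']     # defines a new local name in inner
    inner()
    return result                # still bound to []

def with_nonlocal():
    result = []
    def inner():
        nonlocal result          # result means the enclosing scope's name
        result = ['strange']     # rebinds the name defined in with_nonlocal
    inner()
    return result

print(without_nonlocal())        # []
print(with_nonlocal())           # ['strange']
```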

Finally, now is a good time to review a classification of names:

(1) global names   : defined in a module (via =, def, or class)
(2) parameter names: defined in the header of functions/methods
(3) local names    : defined in the body of functions/methods
(4) instance names : defined in class methods (typically in __init__ via
                       self.name = value) and used in class methods
(5) class names    : defined in a class (see the discussion above)