Writing Classes

Every value object in Python is an instance of a class: for example, the value
object 'abc' is an instance of the str class. If o is a name that refers to a
value object, then type(o) refers to the class that value object is an instance
of (the class the value object was constructed from). Type('abc') is str, as is
type (s) if s = 'abc'.

In this lecture we will primarly learn about instance names, which are defined
in the namespace/dictionary of a value object; and learn about method names,
which are defined in the namespace of a class. Specifically, we will learn how
to initialize instance names in the __init__ method defined in a class, and how
to manipulate (both examine and update) instance names in other methods defined
in a class. In a large sense, these two features are what writing a class is
all about. In the process we will learn about special names preceded by one or
two underscore characters.

We will continue to use the Dice class as our primary example. All of the
knowledge we need to gain about classes can be observed and discussed in this
class, which uses its information in a relatively straightforward way. We have
seen that to construct a Dice object modeling two 6-sided dice, we can define

d = Dice([6,6])

When we construct a value object from a class, as above, we treat the name of
the class like a function and we call it, passing it arguments (in this example,
one argument: the list [6,6]). Python does three things to construct a value
object:

(1) It calls a special function that creates a value object (that is the
    instance of the class we are constructing. Note that this object
    automatically has an empty namespace associated with it: no names.

(2) It calls the special __init__ method for the class, passing the empty object
    created in (1) to the first parameter (always named self) of __init__, and
    following this argument with all the other arguments used in the call to
    construct this instance. This special method checks the arguments, and if
    they are OK, uses them to initialize the namespace of the value object.
    In summary, the __init__ method defines and initializes instance names
    in a value object's namespace.

(3) A reference to the object that was created in (1) and initialized in (2) is
    returned as the result of constructing the object. Typically such a
    reference is either bound to a variable or passed to a function call: e.g.,
    defining d = Dice([6,6]) or calling experiment(Dice([6,6]), 100000) are
    uses of .

------------------------------------------------------------------------------

The __init__ method

Generally any method lead and followed by two underscores is a special method
that Python will call automatically in certain circumstances (we can, but
generally don't call such methods directly). In classes, __init__ is the
critical method to understand, and we describe it now (and describe __str__ and
__repr__ later; ICS-33 describes dozens of such methods).

The dice class defined its __init__ method as follows. Let's follow how Python
executes the __init__ methods for Dice([6,6]).

def __init__(self : Dice, max_pips : [int]):
    assert len(max_pips) >= 1, 'Dice.__init__: max_pips is empty'
    for i in range(0,len(max_pips)):
        p = max_pips[i]
        assert p >= 1, 'Dice.__init__: max_pips['+str(i)+']='+str(p)+': must be an int >= 1'
    self._max_pips   = max_pips[:]       #Copy to avoid aliasing
    self._pips      = [0]*len(max_pips)
    self._roll_count = 0

By (1) Python creates a value object with a empty namespace

By (2) Python calls the __init__ method defined in the Dice class, passing the
new value object it created to the first parameter (always named self) and the
list [6,6] to the second parameter named max_pips. Recall len(max_pips)
determines how many dice (here, 2) and the values in the max_pips list
determines how many sides are on each die (here, 6 and 6). Now Python executes
the body of this method to ultimately define and initialize instance names in
this value object.

The first four statements in __init__ check that max_pips is a reasonable list:

  (a) The list must contain at least 1 value (at least one die)
  (b) Every value is the list must be at least 1 (at least one side on that die)

If any of these assertions are False, Python raises an AssertionError exception
and does not complete executing the __init__ method: it cannot successfully
create a Dice object with bad arguments. Note that I could have simplified the
loop and written it as follows (iterating over list values not list indexes)

    for p in max_pips:
        assert p >= 1, 'Dice.__init__: str(p)+': must be an int >= 1'

But with this code, the error message couldn't contain the index in the
max_pips list at which the first illegal value was found; it would still show
the value itself.

Many __init__ methods verify that a parameter has received a legal and
reasonable argument value, and raise an exception to indicate that the object
being constructed cannot be initialized properly. Sometimes it raises an
exception explicitly, using an if statement that tests for an illegal value.
Sometimes it uses an assert statement to check for an illegal value.

Once these assertions all succeed, there are three statements that define names
in the namespace of the this object and initialize them appropriately. This is
similar to defining/initializing a name in another module: e.g., 

  import m 		# access module m
  m.x = 1		# define a name x in module m and intialize it to 1

All these instance names start with an underscore: _max_pips, _pips, and
_roll_count. Generally starting an instance name with an underscore in Python
is a convention that indicates that only methods in the class should access
this name. Most (often all) instance should start with an undersscore. We will
examine how to access names in value object later, when we discuss other
methods defined in the class.

self._max_pips = max_pips[:]

binds the name _max_pips (in the self object) to a copy of the parameter list.
By coping the parameter list, we can ensure that a user cannot "mess up" this
name by mutating a shared object. Such a "mess up" is impossible when a literal
[6,6] is passed in, but avoids the following problem. If we wrote

  l = [6,6]
  d = Dice(l)
  l[0] = -10

then, if we did not copy the argument as shown above, and wrote just
self._max_pips = max_pips, the instance name would refer to the same
list object as l, whose first value is now -10, which is not allowed for Dice
objects.

self._pips = [0]*len(max_pips)

binds the name _pips (in the self object) to a list object filled with
len(max_pips) 0s. The roll method described below will set the _pips instance
name to random values, appropriate for max_pips. The _pips instance name will
be used by many other methods, but typically only if self._roll_count > 0;
when self._roll_count is 0, this list does not contain good values yet.

self._roll_count = 0

binds the name _roll_count (in the self object) to the value object 0. This
instance name gets incremented in roll and examined in othe methods.

By (3) a reference to this object, which is now initialized by defining and
initializing three instance names, is returned. for d = Dice([6,6]) that
reference is bound to the name d. So d now refers to a Dice object whose state
consists of three instance names bound to values (two lists, one int).

Note that we can put print statements inside __init__ to display relevant
information when __init__ is called (possibly helping us debug this special
method).

The top picture accompanying this lecture illustrate how we would think about
d = Dice([6,6]) using the name and object diagrams that we studied earlier.

Again, the purpose of __init__, whichi is called automatically called when we
write Dice([6,6]), is to create and initialize names the initially empty object
created by Python and passed to the self parameter.

Next, let's see how the other methods in the class, when called, can
examine/update these names (and return values based on them).

------------------------------------------------------------------------------

Dice methods: commands/mutators

The Dice class defines the method roll, which is its only mutator method: a
method that changes the state of a Dice object.

Recall if we call d.roll() then Python uses the fundamental equation of object-
oriented programming to translate this call to type(d).roll(d) or Dice.roll(d),
calling the roll method declared in the Dice class with the argument d matching
the self parameter. We can also directly write Dice.roll(d) but that is not
what programmers write. The roll method is defined in the Dice class as follows.

def roll(self : Dice) -> Dice:
    self._roll_count += 1
    self._pips = [ random.randint(1,max_pips) for max_pips in self._max_pips ]
    return self

When called, self refers to the same object d refers to; so any change to the
names in self actually change those names in d. So, the statement

  self._roll_count += 1

increments _roll_count by 1 (from 0 to 1 the first time it is called).

The statement

  self._pips = [ random.randint(1,max_pips) for max_pips in self._max_pips ]

binds ._pips to a comprehension that iterates through every value in 
self._max_pips and collects random integers from 1 to each value . For example
self._max_pips might refer to [5,2]: 5 pips showing on the first die and 2
pips showing on the second.

We can also write this code without a comprehension, mutating the values (0)
in the list created when __init__ was called:

  for i in range(len(self._pips)):
      self._pips[i] = random.randint(1,self._max_pips[i])

So, we have now advanced the roll counter and computed/stored the number of pips
showing on the side of each die.

The bottom picture accompanying this lecture illustrate how we would think about
d.roll() mutating the Dice object d refers to, updating the object diagram.

Finally Python executes the statement4

  return self

returns the mutated object (the object that d still refers to). If we write just

  d.roll()

we tell Python to do nothing with the returned result; but we can also chain
another call using this returned value: writing

 print(d.roll().pip_sum())

uses the returned result (a reference to d) to call the pip_sum method.

Alternatively, we could have written return None, in which case if we called
call d.roll().pip_sum() Python would raise an exception saying that we cannot
call pip_sum() on a NoneType object. If we wrote no return statement, Python
would even automatically return None.

Notice that we do NOT need to write d = d.roll() (although this code is
technically correct) because after calling just d.roll(), d refers to updated
instance of Dice, with an update _roll_count and _pips list.


------------------------------------------------------------------------------

Dice methods: queries/accessors

All the remaining methods defined in the Dice class are queries/accessors. They
examine the state of a Dice value object but do not change any of its instance
names. Here are the method and a brief description of how the body of each
computes its value correctly.

Typically in classes, each method implements some interesting operation, and
each is typically short and to the point.

  def number_of_dice(self : Dice) -> int:
      return len(self._pips)

Returns the number of dice, which is just the length of either self._pips or
self._max_pips lists. The __init__ method ensured these two lists have the same
number of vaues.

  def all_pip_maximums(self : Dice) -> [int]:
     return self._max_pips[:]

Returns a list of the maximum number of pips that can show on each side. Again
we return a copy, because we don't want whoever binds this result list to be
able to change/mutate any values in the list d._max_pips: that would affect the
results produced by roll.

  def rolls(self : Dice) -> int:
      return self._roll_count

Returns the number of times the dice have been rolled, which is counted and
stored in the _roll_count instance name.

  def pips_on(self : Dice, i : int) -> int:
      assert self._roll_count > 0, 'Dice.pips_on: dice not rolled' 
      assert 0<= i < len(self._pips), \
        'Dice.pips: die index i('+str(i)+') must be >= 0 and <'+str(len(self._pips))
      return self._pips[i]

Returns the number of pips showing on dice i, by returning self._pips[i]. Note
that if the dice have never been rolled, the first assertion fails and the
method raises an exception (because there are no random values in the ._pips
lists). If the second asserton fails, we have not specified a legal index for
the number of dice we have.

  def all_pips(self : Dice) -> [int]:
      return self._pips[:]

Returns a list of all the pips (even if the dice hasn't been rolled, in which
case it returns all 0s).  Again we return a copy, because we don't want whoever
binds this result list to be able to change/mutate any values in it.

  def pip_sum(self : Dice) -> int:
      assert self._roll_count > 0, 'Dice.pip_sum: dice not rolled' 
      return sum(self._pips)

Returns the sum of all the values in the pips list. Note that the sum function
adds up all the values in the list self._pips: e.g., sum([5,1,3,7]) returns 16.

We could write this sum explicitly as

      sum = 0
      for p in self._pips:
          sume += p
      return sum

Next,
       
  def pips_same(self : Dice) -> bool:
      return all( [self._pips[0] == p for p in self._pips] )

Returns whether or not all the pips are the same. The all function returns True
if all the values in the list (constructed here with a comprehension) are True,
and False if any are False. In the comprehension, it computes a bool expression
determining whether every value in the pip list is equal to the first value.
All pips have the same value if all pips have the same value as the first pip.
We could rewrite this code as without the comprehension as follows.

      for p in self._pips:
          if p != self._pips[0]:
              return False
      return True
  

------------------------------------------------------------------------------

Special Methods

First we discuss the __str__ method

  def __str__(self):
      return 'Dice('+str(self._pips)+')'

The __str__ method returns some useful information about the state of the
object. The method above returns a list showing the nuumber of pips on each die:
possibly 'Dice([4,2])'. That is, we can call d.__str__() and it might return the
string 'Dice([4,3])'. But there is a better way to make this call: str(d)

Calling str(d) works because Python's str function is designed to call __str__ 
on its argument. It is defined like

  def str(o : object)-> str:
      return o.__str__()

So like the __init__ method, we typically do not directly call the __str__
method, but call the str(...) function, which calls __str__.

Also recall that when we call a print function, Python automatically calls the
str function on all its arguments, printing the string value of each argument,
for easier reading. But if you concatenate string together, you must call str:
for example I can write either

print('d =',d)

where no call to str is needed; or write the following where calling str(...)
is required (otherwise Python would raise an exception about not being able
to catenate a str to a Dice object).

print('d = ' + str(d))

Now we discuss the repr function:

  def __repr__(self : object):
      return 'Dice('+str(self._max_pips)+')'

The __repr__ method should return a str which if eval'ed would return an
equivalent object to the one it was called on.  The method above returns the
string 'Dice([6,6])' which includes the Dice class and all the information
needed to define a new Dice object. eval('Dice([6,6])') returns another dice
object representing the same 2, 6-sided dice.

Again, although we can call d.__repr__() in Python we call repr(d); it works
because Python's repr function is designed to call __repr__ on its argument. It
is defined like

  def repr(o : object)-> str:
      return o.__repr__()

Finally, the method

   def standard_rolls_for_debugging(self):
      random.seed(12161949)

uses knowledge (which you don't have) about the random module to change the
seed of its random number generator. After this method is called, random.randint
(called in the roll method) generates the same random values, so calling the
roll method in the Dice class generates the same pips. This feature is useful
for debugging (so our program gets the same sequence of rolls; when we debug
some sequence of calls, it will always produce the same values).

Some classes (not his one) define special helper methods that are called by
other methods in the class to get their jobs done. Such methods should start
with a single underscore, to signal to anyone who reads a a class that those
methods should not be called explicitly (just as using instance names starting
with a single underscore indicates that only methods in the class should access
those instance names).

------------------------------------------------------------------------------

Final Words

When beginners write classes, they very often forget to use self everywhere it
is needed: prefacing any instance names (when examined or stored) or class
methods (when called by other methods in the class: not used in Dice). 

The debugger shows a disclosure +/- in front of every value object constructed
from a class (including builtins like lists, tuples, ...). By clicking the
disclosure, Eclipse shows/hides all the instance names defined in the value
object.