while Statements


In this lecture we will examine indefinite iteration with Python's while loop.
In the process we will learn about the break statement, which can be used in
both of Python's loops. And once we learn about break, we will see that the
keyword else is usable with both of Python's loops, and we will learn its
meaning (related to whether the loop finishes "normally" or via a break). Once
we cover Python's try/except statement, we can write a try/except/while that is
equivalent to (and further illustrates) Python's for loop

While loops are called indefinite loops: at the time they start they do not
know how many iterations they will perform (and if written incorrectly, they
might never terminate: what we call an "infinite" loop). While loops use a
boolean condition (really an expression, just like those used in if statements)
to determine when to continue or terminate the loop.

The EBNF for a while_statement is quite simple. Note that like all the control
structures we've seen, its first line ends with a : indicating that a block
(indented) follows. The word while is a keyword in Python.

  while_statement <= while expression:
                         block

As a syntax constraint, expression must result in a boolean value. We call this
expression the "continuation condition" of the loop. Semantically (in a
nutshell) the while loop will repeatedly execute block (0 or more times) while
this expression evaluates to True and terminate the loop the first time this
expression evaluates to False.

Semantically, Python executes a while_loop as follows

  (1) Evaluate the expression (True or False)
  (2) If it is False, terminate the loop
  (3) If it is True, execute block and redo (loop back) to step (1)

The loop's body is executed 0 times if the expression is initially False

Here is a simple while_loop that is infinite, because the expression (the
literal value True, which is by itself an expression) will never evalutate to
False.

while True:
    print('You are going to master progrmming in ICS-31!')

Note the block in a while statement is typically called the body of the while
loop. Instead to writing more interesting boolean expresssions in the
while_loop, we will first look at some examples where the expression is always
the literal True. To make such loops useful (and not infinite), we will need to
introduce the break statement, which will always be used inside an if statement.
I believe beginning programmers should first use only True boolean expressions
in while loops, and use if/break combinations to control their termination.

------------------------------------------------------------------------------

The break statement and while True loops

The EBNF rule for the break statement is very simple. note that break is another
keyword in Python

  break_statement <= break

Python imposes a syntax constraint that a break statement must appear inside
the body of some loop (either for or while). If Python executes a break
statement that is not in the context of some loop, it raises an exception,
because break is only meaningful in loops: SyntaxError: 'break' outside loop

In real scripts, break statements appear inside if statements (which themselves
are inside the bodies of loops), so a typical example is

if count_down == 0:
    break

Semantically, whenever a break statement is executed, Python terminates the
inner-most loop that it apears in (we will see examples of nested loops, one
inside another, when our programming tasks get more complicated); it "breaks"
out of that loop. Terminating a loop means Python next executes the statement
AFTER the loop's block (it does not mean that the program itself terminates).
By putting a break statement inside an if statement, the if can control (based
on its boolean test expression) whether or not the break statement is executed:
the test determines whether or not the loop terminates on this iteration. Here
is a typical combination of while/break statements

count_down = 3
while True:
    print(count_down + '.',end='')
    if count_down == 0:
        break
    count_down -= 1
print('Go');

Let's hand simulate this example and write a trace table for it.

Typically the body will contain code that changes values that appear in
expression, and executing the body enough times will eventually make the
expression True. Here is an example of a countdown while loop
                    count
Statement         | _down | Console    | Explanation
------------------+-------+------------+---------------------------
count_down = 3    |   3   |            | assignment statement
while True:       |       |            | Start loop: True: execute body
print(count_down..|       |3.          | print (stay on same line)
if count_down...  |       |            | False: skip next block (break)
count_down -= 1   |   2   |            | decrement count_down
...while True:    |       |            | Continue loop: True: execute body
print(count_down..|       |3.2.        | print (stay on same line)
if count_down...  |       |            | False: skip next block (break)
count_down -= 1   |   1   |            | decrement count_down
...while True:    |       |            | Continue loop: True: execute body
print(count_down..|       |3.2.1.      | print (stay on same line)
if count_down...  |       |            | False: skip next block (break)
count_down -= 1   |   0   |            | decrement count_down
...while True:    |       |            | Continue loop: True: execute body
print(count_down..|       |3.2.1.0.    | print (stay on same line)
if count_down...  |       |            | True: execute next block (break)
break             |       |            | Terminate loop
print('Go')       |       |3.2.1.0.Go  | Executes statement in block, after loop

Truth be told, this is really a definite loop, and it would be easier to write
as follows. But it does illustrate the meaning and use of while True: and break
statements.

for count_down in irange(3,0,-1):	# note, irange from my goody module
    print(count_down + '.',end='')
print('Go');

Now we will write a loop that prompts the user for an arbitrary  number of
scores, and computes their average. This is not a definite loop because the
user indicates when the data is finished being entered by typing the special
value -1, and the loop has no knowledge of when this will occur; this is called
a sentinel value, and the loop is called a sentinel loop.

count = 0
sum   = 0
while True:
    score = prompt.for_int('Enter a Score (-1 to Terminate)')
    if score == -1:
        break
    count += 1
    sum += score
if count != 0:				# Avoid /0 if sentinel entered first
    print('Average =', sum/count)
else:
    print('No scores to average')

Let's hand simulate this example and write a trace table for it, assuming the
user will enter 3, 6, 4, and then -1.
                
Statement      |count|sum|score| Console | Explanation
---------------+-----+---+-----+-----------+-----------------------------------
count = 0      |  0  |   |     |           | assignment statement
sum = 0        |     | 0 |     |           | assignment statement
while True:    |     |   |     |           | Start loop: True: execute body
score = ...    |     |   |  3  | Enter...3 | assign/prompt for a score
if score == -1:|     |   |     |           | False: skip next block (break)
count += 1     |  1  |   |     |           | assignment statement
sum += score   |     | 3 |     |           | assignment statement
...while True: |     |   |     |           | Continue loop: True: execute body
score = ...    |     |   |  6  | Enter...6 | assign/prompt for a score
if score == -1:|     |   |     |           | False: skip next block (break)
count += 1     |  2  |   |     |           | assignment statement
sum += score   |     | 9 |     |           | assignment statement
...while True: |     |   |     |           | Continue loop: True: execute body
score = ...    |     |   |  4  | Enter...6 | assign/prompt for a score
if score == -1:|     |   |     |           | False: skip next block (break)
count += 1     |  3  |   |     |           | assignment statement
sum += score   |     | 13|     |           | assignment statement
...while True: |     |   |     |           | Continue loop: True: execute body
score = ...    |     |   | -1  | Enter...-1| assign/prompt for a score
if score == -1:|     |   |     |           | True: execute next block (break)
break          |     |   |     |           | Terminate loop
if (count ...  |     |   |     |           | True: execute next block (print)
print('Ave...  |     |   |     |Average... | Executes statement after loop

Finally, let's look at a simplified version of the loop in the Collatz program.
This version just prompts for a value and prints how many iterations it took to
reduce that value to 1 by the Collatz rules. We cannot compute this value any
other way than by looping and changing "test".

import prompt,predicate
test  = prompt.for_int('Enter number to test',is_legal=predicate.is_positive)
count = 0
while True:
    if test == 1:
        break
    else:
        count += 1
        if test%2 == 0:
            test = test//2
        else:
            test = 3*test + 1
print('Reduction took',count,'iterations')

Here are a few observations about this code. First, as we discussed, no one
knows whether this loop even terminates for all inputs: this loop might be
infinite for a certain value of test because

  (1) test might get bigger and bigger during the calculation without limit;
      after all, it goes up by a factor of 3 and down by a factor of only 2.
      Sure it goes up and down, but maybe on average it goes up more than it
      goes down.

  (2) test might get stuck in a cycle, reaching some value X, then going up and
      down a bit and returning to the same value X, which would then repeat in
      exactly the same manner over and over forever forever

So, it makes sense that testing Collatz wouldn't use a for loop, because that
loop always terminates. Of course, no one has found a value for test that
doesn't eventually reach 1 by these rules. But there might be a number out there
that no one has tried that creates an infinite loop (find it and you'll become
famous for disproving the Collatz conjecture): no one has proved that the
process terminates for all positive numbers.

Second, we can simplify the code above as follows, while retaining equivalence
of execution. Let's just focus on the while loop and its body

while True:
    if test == 1:
        break
    count += 1
    if test%2 == 0:
        test = test//2
    else:
        test = 3*test + 1

It is not a tremendous simplification, but we turned the first if/else statement
into just a if, and avoided nesting the following count+=/if statements in
another block in an else deeper inside the while loop. This code is equivalent
because both ifs will execute break if test == 1; and both loops will execute
the count+=/if if test is not == 1: the first in an else: block, the second
by just continuing in the same block the first if is in. Draw a boxing diagram
to see the difference.

We will discuss equivalence and simplification in a later lecture, and see a
general rule that applies to this particular simplification.

------------------------------------------------------------------------------

while loops with real tests (not just True)

Finally, we will show how to change a while True into a while loop with a real
test and remove an if/break. To do this the if/break must be the first statement
inside the body of the while loop: sometimes, if it is not, we can transform the
loop body by other rules of equivalence to promote it to the first position, but
sometimes we cannot.

Generally, we can tranform code of the form

while True:
    if some-test:
        break
    statements

into

while not(some-test):
    statements

In the top case, each execution of the while loop first evaluates some-test and
breaks out of the loop if it is True. In the bottom case, before each execution
of the loop the while evalutes not(some-test) and again, if some-test is True
then not(some-test) is False, and if the boolean expression controlling the
while loop is False, the loop terminates (as with the if/test). Recall the
boolean expression in the while syntax is called the "continuation" condition
(True continues, False terminates) while the boolean expression in the if/break
is a termination condtion (True terminates, False continues) so it makes sense
that we must negate the test with not when changing it from a termination
condition to a continuation condition (or vice versa).

In the Collatz loop, some-test is test == 1 so we could write not(test == 1) to
control the while, but we should simplify that to test != 1 and write

while test != 1:
    count += 1
    if test%2 == 0:
        test = test//2
    else:
        test = 3*test + 1

Note that in the general pattern I put parentheses around some-test, because it
might include operators that are lower precedence than not: for example if the
test were a == 1 or b == 2, then writing not(a == 1 or b == 2) would be correct,
and different from writing not a == 1 or b == 2, which according to Python's
operator precedence rules would be interpreted as not(a == 1) or b == 2, first
performing the not on (a == 1) and then or'ing it with (b == 2). Actually the
simplest negation is a != 1 and b != 2.

We cannot perform this transformation on all while True with if/break loops,
because some do not have the if/break as the first statement in the loop
body (e.g.,the first two loops discussed in this lecture).

Some experienced programmers (and educators) like to avoid using while True with
if/break, but I am fine with this usage. In fact, I think it is easier for
beginners (and sometimes even experienced programmers) to first just write the
loop (not having to think: we always write it as while True) and then worry
about what the body of the loop should compute, and how/where to test for 
termination. If it so happens the if/break is first in the loop (it actually
is for a surprising number of loops) then we can perform the transformation.

Where this advice goes wrong in the hands of beginners is that they often write
multiple if/breaks inside their while True loops that make these loops
complicated and very hard to understand. The most imporant part of a loop is
how it terminates. It is an extraordinary while True loop that requires more
than one if/break, so you should be skeptical that your code is simple if it has
multiple if/break statements.

Finally, it is possible to have a while with a test and if/break statements in
its body (or for that matter, a for loop with an if/break in its body; an
example is shown in the next section). Syntactically and semantically Python
knows what to do, but according to the paragraph above, having tests in two or
more spots is excessive and you should try to simplify your thinking and your
code. I have written some loops in my life that have multiple tests for
termination, but not many.

------------------------------------------------------------------------------

The else: block-else option in for/while loops

Now we will extend the syntax of for/while loops (with optional else parts).
Understanding their meaning depends on knowing about break statements, which is
why I had to simplify the truth (also known as lying) earlier. The actual EBNF
of for_statement and while_statement are as follows (of course you can bet that
I'm still lyingXXXXX simplifying the truth a bit).

for_statement   <=   for index in iterable:
                         block-body
                     [else:
                         block-else]

while_statement <=   while expression:
                         block-body
                     [else:
                         block-else]

Both allow the else: block-else option (so if we ignore this option, the EBNF
I wrote before is correct; typically when I update EBNF it is just to "add"
stuff; never subtract). Here are the semantics of this option.

   If the else; block-else option appears, and the loop terminated normally,
   (not with a break statement) then execute block-else.

Here is an example that makes good use of the else: block-else option. This
code prints the first/lowest value (looking at 0 to 100 inclusive) for which
the function special_property returns True (and then breaks out of the loop);
otherwise it prints that no value in this range had this property: so it prints
exactly one of these two messages. Note you cannot run this code, because there
is no special_property function: I'm using it for illustration only.

for i in irange(100):
    if special_property(i):
        print(i,'is the first value with the special property')
        break
else:
    print('No value in the range had the special property')

Without the else: block-else option, the simplest code that I can write that has
equivalent meaning is as follows.

found_one = False
for i in irange(100):
    if special_property(i):
        print(i,'is the first with the special property')
        found_one = True
        break
if not found_one:
    print('No value in the range had the special property')

This solution requires an extra name (found_one), an assignment to reset the
name, and an if statement. Although I came up with the example above, I have
not used the else: block-else option much in Python (but I've only used Python
for a few months). Most programming languages that I have used don't have this
special feature, so I'm still exploring its usefulness. 

Can you predice what would happen if I removed the break statement in the bigger
code above?

I think the bottom line here is although Python allows else: in loops, it is
hardely every used there. The vast majority of uses of else: is in if
statements.

------------------------------------------------------------------------------

Compressed trace tables

When we hand simulate programs with complicated control structures, most of the
trace table is occupied by information relating to control structures deciding
which statements to execute next, as opposed to statements that actually rebind
a name or change the contents of the console window. Such trace tables are
cumbersome to create and tedious to read (but fully explain how the code is
executed). Compact trace tables remove all the information related to control
structures, and instead focus on rebinding names and the changes to the console
window.

To construct a compact trace table, we list all the names and Console in
separate columns (and omit the Statement and Explanation column). Only when the
code rebinds a name OR the console window changes do we update information in
the appropriate column; and we always do so right beneath the last entry for
this column. THERE ARE NEVER BLANK CELLS BETWEEN ONE CHANGE AND THE NEXT CHANGE.

Note that what we lose in a compact trace table (we gain conciseness) is an
indication of the order in which different names are rebound to new values:
because each column is shown as compactly as possible (no blank entries); we
lose the correlation among columns. Here are compact trace tables for the two
standard trace tables shown above. First, the count down loop.

count_down | Console
-----------+-------------
    3      | 3.
    2      | 3.2.
    1      | 3.2.1.
    0      | 3.2.2.0.
           | 3.2.2.0.Go

Here is the sentinel terminated loop.

count | sum | score | Console
------+-----+-------+-------------------------------------------
  0   |  0  |   3   | Enter a Score (-1 to Terminate): 3
  1   |  3  |   6   | Enter a Score (-1 to Terminate): 6
  2   |  9  |   4   | Enter a Score (-1 to Terminate): 4
  3   | 13  |   -1  | Enter a Score (-1 to Terminate): -1
      |     |       | Average = 4.333333
 	 	 	
So we see the count goes from 0 to 3, the sum goes from 0 to 13, and the
scores are entered as 3, 6, 4, and -1 but there is no correlation among columns
as to when each change happens.

Remember, in a compact trace table, a column entry is filled in only when all
the column entries on top of it have been filled in; there are never an blank
entries (except possibly at the bottom).

------------------------------------------------------------------------------

Short-circuit Logical Operators

The operators and/or are called short-ciruit operators, because unlike other
operators (which evaluate both of their operands before being applied), short-
circuit operators evaluate their left operand first, and sometimes don't need
to evaluate their right operand (and therefore don't, in those cases). For
example True or anything is True; False and anything is False. So if the or
operator's left operand is True, it doesn't need to evalate its right operand:
the result is True; and if the and operator's left operand is False, it doesn't
need to evalate its right operand: the result is False. This rule saves time,
but it is more important for another reason (discussed below). If the left
operands of or/and is False/True it must evaluate the right operand in order to
compute its result.

So, for example, if we wrote the following if/test (assume d is a dict)
  s = prompt.for_string('Enter string')

  if len(s) > 10 and s[10] == '*':
      ....

Python would first evaluate the expression: len(s) > 10. When False it would
determine the value of the and operator is already known: it is False. It
would not have to evaluate s[10] == '*' and that is a good thing, because that
would throw an IndexError exception, because if the length of a string is <= 10
then 10 is not an legal index (legal indexes are 0 to len(s)-1).

So, while short-circuit operators can save a little time, that is not their
most important purpose; avoiding raising exceptions is the primary reason that
and/or operators are short-circuit.

Even without short-circuit operators, we could write the equivalent code as
follows. Only if the test in the first if is True, does Python execute the
second if inside it.

  if len(s) > 10:
      if s[10] == '*':
          ....

But that requires nested if statements and is much more cumbersome. It is
better to spend some time learning about short-circuit operators now, and then
understand how to use them and save time later by using them appropriately.