ICS 33 Fall 2024
Project 3: Why Not Smile?

Due date and time: Monday, November 18, 11:59pm

Git repository: https://ics.uci.edu/~thornton/ics33/ProjectGuide/Project3/Project3.git


Introduction

When I was a young kid, one of my teachers introduced me to a computer for the first time; it was a state of the art (in those days) personal computer called a Radio Shack TRS-80 Model I. First, I played little math games and messed around with other new-fangled educational tools from 1980; the state of the art wasn't much then, but it was fun and new, and felt alive with possibility.

Booting up a TRS-80 took the user directly into the equivalent of a Python shell; you could load programs from external storage like floppy disks or cassette tapes, but the computer's default mode was an environment for writing programs. My teacher asked me if I wanted to learn how to write my own programs, which I thought sounded like a great idea, though I had no idea how to do it. So, I opened up a book of his about the TRS-80's primary programming language, BASIC, which was a good teaching and learning tool for its day: versatile and easy to start with, much like Python is today. I typed in a short program that asked a user for a number of hits and a number of at-bats and printed out a batting average (foreshadowing my later interest in baseball, though I didn't know what it meant at the time). I ran the program, tried it out, and I was mesmerized; the computer did exactly what I asked it to, exactly the way I asked it to. I was hooked. Over forty years later, I still am.

A natural progression of one's curiosity about programming revolves around the question of how to implement one's own programming language. Where do they come from? How are they built? While we won't be able to tackle these questions in too much depth — there are at least three different courses in our undergraduate curriculum that cover aspects of this — this project will ask you to begin exploring them. For that purpose, I've designed a considerably limited (and somewhat different) version of BASIC called Grin, which supports a small handful of statements. You'll be building a Grin interpreter, a program written in Python that takes a Grin program as its input, executes the Grin program, then shows its output. (This may sound a little mind-bending, but it's not as crazy as it sounds. The Python interpreter you've been using was most likely written in a language other than Python; the most popular one is written in a language called C.)

In the process of building your interpreter, you'll gain experience in a few areas that will stretch your abilities:


The Grin language

The precise requirements for your interpreter are discussed later in this write-up, but we'll first need to agree on the definition of the Grin language that your interpreter will implement. Grin is a programming language, though its design is quite different from Python's, so we'll first need to acquaint ourselves with how it works. Given a Grin program, you'll need to know, first and foremost, what its output is meant to be.

A Grin program is a sequence of statements, one per line. Here's an example of a Grin program:


LET MESSAGE "Hello Boo!"
PRINT MESSAGE
.

Each line contains exactly one statement (i.e., there can be no blank lines). Grin assigns a line number to each of the statements, where the first statement in the program is numbered 1, the second statement is numbered 2, and so on. There is no predefined limit on the number of statements in a Grin program. Execution of a Grin program always begins at line number 1. The last line contains only a dot (.) and nothing else, as a way to mark that the program has ended; it's not a statement, but any subsequent lines of text in the Grin program after that end-of-program marker are ignored.

The program above consists of two statements. The first one stores the text Hello Boo! into a variable named MESSAGE, then the second one prints the value of that same variable. The output of the program is what you'd expect, given that description.


Hello Boo!
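To make the line-numbering and end-of-program rules concrete, here is a minimal Python sketch. The function name number_statements is hypothetical and is not part of the provided code, which may already handle this for you; the sketch only pairs each statement with its line number and stops at the dot.

```python
def number_statements(lines):
    """Pair each statement with its 1-based line number, stopping at the '.' marker."""
    numbered = []
    for text in lines:
        if text.strip() == '.':
            break  # anything after the end-of-program marker is ignored
        numbered.append((len(numbered) + 1, text.strip()))
    return numbered


program = [
    'LET MESSAGE "Hello Boo!"',
    'PRINT MESSAGE',
    '.',
    'this text is never examined',
]

for line_number, statement in number_statements(program):
    print(line_number, statement)
```

The loop prints the two statements numbered 1 and 2; the text after the dot is never looked at.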

Lexical rules

Like most programming languages (including Python), a Grin program is made up of a sequence of lexemes, which is a fancy-sounding term for a sequence of characters that combine together with a single meaning and comprise one of the indivisible "atoms" in the language, similar to the role that words play in sentences written in natural languages like English. Programming languages that are written textually generally define a set of lexical rules that specify which lexemes are valid and how to derive a meaning for each of them; Grin is no different, in that respect, so we'll need to start our journey with Grin by acquainting ourselves with those rules.

Grin programs are made up of the following kinds of lexemes.

Some examples of Grin lexemes and their meanings follow.


0                   # Integer literal (zero)
13                  # Integer literal (positive)
-18                 # Integer literal (negative)
0.0                 # Floating-point literal (zero)
11.75               # Floating-point literal (positive)
-3.0                # Floating-point literal (negative)
""                  # String literal (an empty one)
"Boo!"              # String literal (containing four characters)
A                   # Identifier
BOO                 # Identifier
THIS1ISTHELAST1     # Identifier
IF                  # Keyword
GOTO                # Keyword
=                   # Comparison operator
>=                  # Comparison operator
:                   # Label marker
.                   # End-of-program marker
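The examples above can be checked mechanically. What follows is only a sketch of how a single lexeme might be classified, written to agree with the examples shown here; the classify function, the category names, and the exact patterns (particularly the identifier pattern) are my assumptions for illustration, not the rules used by the provided code.

```python
import re

# Keywords taken from the statements this write-up describes.
KEYWORDS = {'LET', 'PRINT', 'INNUM', 'INSTR', 'ADD', 'SUB', 'MULT', 'DIV',
            'GOTO', 'GOSUB', 'RETURN', 'END', 'IF'}
COMPARISONS = {'=', '<>', '<', '<=', '>', '>='}


def classify(lexeme):
    """Return a description of one lexeme, or raise ValueError if it fits no category."""
    if lexeme in KEYWORDS:
        return 'keyword'
    if lexeme in COMPARISONS:
        return 'comparison operator'
    if lexeme == ':':
        return 'label marker'
    if lexeme == '.':
        return 'end-of-program marker'
    if re.fullmatch(r'-?\d+', lexeme):
        return 'integer literal'
    if re.fullmatch(r'-?\d+\.\d+', lexeme):
        return 'floating-point literal'
    if re.fullmatch(r'"[^"]*"', lexeme):
        return 'string literal'
    # Assumption: an identifier is a letter followed by letters or digits.
    if re.fullmatch(r'[A-Za-z][A-Za-z0-9]*', lexeme):
        return 'identifier'
    raise ValueError(f'not a recognized lexeme: {lexeme!r}')


for lexeme in ['-18', '11.75', '"Boo!"', 'THIS1ISTHELAST1', 'GOTO', '>=']:
    print(lexeme, '->', classify(lexeme))
```

Keywords are checked before identifiers because a keyword would otherwise match the identifier pattern too.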

Labels

Any statement in a Grin program can begin with a label, which is a name that can be used to refer to that statement elsewhere in the program without having to rely on knowing its line number. Labels appear at the beginning of a line, and are made up of an identifier followed by a colon.


        LET A 3
        PRINT A
        GOSUB "CHUNK"
        PRINT A
        PRINT B
        GOTO "FINAL"
CHUNK:  LET A 4
        LET B 6
        RETURN
FINAL:  PRINT A
        .

In the program above, two statements have labels on them: LET A 4 is labeled as CHUNK and the last statement is labeled as FINAL.
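One way to think about labels is as a table mapping each label to the line number of the statement it marks. The sketch below builds such a table directly from text lines, using a hypothetical build_label_table function and an assumed identifier pattern; an interpreter built on the provided code would more likely work from already-parsed tokens, so treat this as an illustration of the idea only.

```python
import re

# Assumption: a label is an identifier (letter, then letters or digits) followed by a colon.
LABEL_PATTERN = re.compile(r'\s*([A-Za-z][A-Za-z0-9]*)\s*:\s*(.*)')


def build_label_table(lines):
    """Return (labels, statements): label -> 1-based line number, and the lines with labels removed."""
    labels = {}
    statements = []
    for line_number, text in enumerate(lines, start=1):
        match = LABEL_PATTERN.fullmatch(text)
        if match:
            labels[match.group(1)] = line_number
            text = match.group(2)
        statements.append(text.strip())
    return labels, statements


program = [
    '        LET A 3',
    '        PRINT A',
    '        GOSUB "CHUNK"',
    '        PRINT A',
    '        PRINT B',
    '        GOTO "FINAL"',
    'CHUNK:  LET A 4',
    '        LET B 6',
    '        RETURN',
    'FINAL:  PRINT A',
]

labels, statements = build_label_table(program)
print(labels)
```

For the program above, CHUNK maps to line 7 and FINAL maps to line 10, which is all a jump needs in order to find its destination.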

Spacing

One of the features of Python's syntax is that the way you space your program (indentation, empty lines, and so on) has an effect on your program's meaning. Grin, in that sense, is different. Grin programs cannot have blank lines in them, each statement must be on its own line, and at least one space is required to separate lexemes that would otherwise be combined, but the specific amount and placement of blank space between the lexemes on each line is otherwise irrelevant. So, the following program is legal Grin and equivalent to the previous one shown, though obviously there's a lot to be said for using spacing to make a program's meaning more obvious to a human reader.


            LET        A    3
   PRINT        A
      GOSUB    "CHUNK"
        PRINT    A
  PRINT   B
     GOTO         "FINAL"
         CHUNK   :  LET A 4
  LET   B                            6
              RETURN
  FINAL:     PRINT    A
    .

Variables

A Grin program can utilize variables to store values that can be accessed again (or modified) later. Each variable is named by an identifier. Variables do not need to have values assigned to them before they are used, and any variable that is used before it is assigned has the integer value 0.

The primary way to change the value of a variable is with a LET statement. A LET statement changes the value of one variable, by assigning it either a literal value or the value of another variable.

You can print a value to the output by using a PRINT statement. A PRINT statement prints one value, which is either a literal value or the value of a variable, followed by a newline.

So, consider the following short Grin program:


LET NAME "Boo"
LET AGE 13.015625
PRINT NAME
PRINT AGE
.

Its output would be:


Boo
13.015625
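The rule that an unassigned variable has the integer value 0 is easy to capture in Python. Here is a small sketch, assuming a hypothetical Variables class of my own invention; it models only storage and lookup, and says nothing about how Grin formats values when printing them.

```python
class Variables:
    """Stores Grin variable values; any name not yet assigned reads as the integer 0."""

    def __init__(self):
        self._values = {}

    def get(self, name):
        return self._values.get(name, 0)

    def set(self, name, value):
        self._values[name] = value


variables = Variables()
variables.set('NAME', 'Boo')
variables.set('AGE', 13.015625)

print(variables.get('NAME'))
print(variables.get('AGE'))
print(variables.get('NEVERASSIGNED'))  # falls back to the integer 0
```

Keeping this behind a small class, rather than passing a bare dictionary around, means the default-to-zero rule lives in exactly one place and can be unit tested on its own.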

The formatting rules used when printing the values of variables depend on their types.

Reading input

Grin includes two statements for reading input from the console: INNUM, which reads a number (either integer or floating-point), and INSTR, which reads a line of text as a string.

Either way, the syntax is mostly the same: You write INNUM or INSTR, followed by the name of the variable into which you want to read the input value. A short Grin program demonstrates the idea.


PRINT "Number:"
INNUM X
ADD X 7
PRINT X
.

This program prints output and also reads input, so let's imagine what that might look like when we execute it.


    Number:
    11
    18

First, the PRINT statement on line 1 will have printed Number:. Next, a line of input will have been read and treated, in this case, as the integer 11, which will be stored in the variable named X. We'd then add 7 to X, causing its value to become the integer 18. Finally, we'd print X's value, which causes 18 to be printed.

The precise rules for INNUM need to be specified, though, since not all inputs are valid.

Meanwhile, the precise rules for INSTR are much simpler, because not much can go wrong. We read a line of text, then store the contents of that line (without a trailing newline) into the given variable. Any line of text, including empty lines or very long lines, is permitted, so there are no error conditions to consider.

Control flow and how to alter it

A Grin program is executed one statement at a time, beginning at line number 1. Ordinarily, execution proceeds forward, so that line 1 will execute first, followed by line 2, followed by line 3, and so on. Execution continues until either an END statement is reached, or until execution proceeds beyond the last statement in the program.

Like most programming languages, Grin makes it possible to write programs that execute out of sequence, though the mechanisms are a bit more primitive than they are in a language like Python. A GOTO statement causes execution to "jump" immediately forward or backward by the given number of lines. For example, the statement GOTO 4 jumps execution to the line number that's 4 greater than the current one. Here's an example Grin program that uses GOTO:


LET A 1
GOTO 2
LET A 2
PRINT A
.

In this program, line 1 is executed first, setting the variable A's value to 1. Then the GOTO statement will immediately jump execution of the program to line 4 — the GOTO statement is on line 2, and two lines beyond that is line 4 — skipping the second LET. Line 4 prints the value of A, which is still 1. So, the output of the program is simply 1.
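The walkthrough above can be reproduced with a few lines of Python. This is a deliberately tiny sketch, not a design recommendation: the run function is hypothetical, it understands only LET and PRINT with integer values, END, and GOTO with a literal integer offset, and it performs no error checking at all.

```python
def run(statements):
    """Execute a tiny subset of Grin and return the printed values as a list."""
    variables = {}
    output = []
    line = 1
    # Execution stops when it moves beyond the last statement (or reaches END).
    while 1 <= line <= len(statements):
        parts = statements[line - 1].split()
        keyword = parts[0]
        if keyword == 'END':
            break
        if keyword == 'GOTO':
            line += int(parts[1])  # relative jump: forward if positive, backward if negative
            continue
        if keyword == 'LET':
            variables[parts[1]] = int(parts[2])
        elif keyword == 'PRINT':
            output.append(variables.get(parts[1], 0))
        line += 1  # ordinary statements fall through to the next line
    return output


print(run(['LET A 1', 'GOTO 2', 'LET A 2', 'PRINT A']))  # prints [1]
```

Notice that the only difference between a GOTO and any other statement is how the line number is updated afterward, which hints at where the seam between "what a statement does" and "where execution goes next" might lie in a fuller design.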

A GOTO statement may jump either forward or backward, meaning that the following program is a legal Grin program. See if you can figure out what its output would be. (Remember that the value of a variable that hasn't yet been assigned with a LET is 0.)


LET Z 5
GOTO 5
LET C 4
PRINT C
PRINT Z
END
PRINT C
PRINT Z
GOTO -6
.

Alternatively, GOTO statements can specify a string literal specifying a label instead of a line number, in which case execution jumps to the line that is marked with that label. A Grin program equivalent to the previous one, but that uses labels instead of line numbers, follows.


        LET Z 5
        GOTO "CZ"
CCZ:    LET C 4
        PRINT C
        PRINT Z
        END
CZ:     PRINT C
        PRINT Z
        GOTO "CCZ"
        .

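GOTO statements can cause the program to terminate with an error message in a few circumstances. A GOTO is not permitted to jump beyond the boundaries of the program or to a non-existent label, nor can it jump to the same line it came from.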

Finally, it should be noted that GOTO statements can use variables to specify their target, as long as the variable contains either an integer or a string value, in which case that value is treated the same as it would have been if it had been specified literally.


        LET Z 1
        LET C 11
        LET F 4
        LET B "ZC"
        GOTO F
ZC:     PRINT Z
        PRINT C
        END
CZ:     PRINT C
        PRINT Z
        GOTO B
        .

When the target of a GOTO is a variable containing something other than an integer or a string, that, too, terminates the interpreter with an error message.
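Since a jump target can be an integer or a string, whether written literally or held in a variable, it can help to funnel every target through one piece of code that turns it into a line number. The sketch below assumes a hypothetical resolve_target function and GrinRuntimeError exception, neither of which is part of the provided code; it takes the target's value after any variable has already been looked up, and it leaves out the check for jumps that land outside the program.

```python
class GrinRuntimeError(Exception):
    """Hypothetical exception type signaling that the Grin program must terminate."""


def resolve_target(value, current_line, labels):
    """Convert a jump target's value into the line number to jump to."""
    if isinstance(value, int):
        return current_line + value  # integers are relative offsets
    if isinstance(value, str):
        if value not in labels:
            raise GrinRuntimeError(f'no such label: {value}')
        return labels[value]  # strings name a label
    raise GrinRuntimeError(f'jump target must be an integer or a string, not {value!r}')


labels = {'ZC': 6, 'CZ': 9}
print(resolve_target(4, 5, labels))      # GOTO F on line 5, where F is 4
print(resolve_target('ZC', 11, labels))  # GOTO B on line 11, where B is "ZC"
```

For the program above, the first call yields line 9 and the second yields line 6, matching what you'd find by tracing it by hand.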

Arithmetic operations

Grin provides the typical arithmetic operations that can be performed on variables: addition, subtraction, multiplication, and division. Each operation is provided as a statement that updates the value of the given variable by combining it with another value, making it equivalent to operators like +=, -=, etc., in Python. The first operand must be the name of a variable; the second can either be a literal value or the name of a variable. Here are examples of their use on integers:


LET A 4
ADD A 3
PRINT A
LET B 5
SUB B 3
PRINT B
LET C 6
MULT C B
PRINT C
LET D 8
DIV D 2
PRINT D
.

In the example above, the ADD statement adds 3 to the value of A, storing the result in A. So, printing A will display 7 on the output. The output of the entire program above is as follows.


7
2
12
4

Like Python, arithmetic operations have a different meaning when operating on different types of values. Note, though, that some of the rules in Grin are different from the ones you learned in Python. (These are among the subtleties you'll find that change from one programming language to another.)

Statement  Type (in variable)  Type (in operand)  Result Type  Example
ADD        Integer             Integer            Integer      11 + 7 = 18
ADD        Float               Float              Float        11.5 + 7.0 = 18.5
ADD        Integer             Float              Float        11 + 7.5 = 18.5
ADD        Float               Integer            Float        11.5 + 7 = 18.5
ADD        String              String             String       "Boo" + "lean" = "Boolean"
SUB        Integer             Integer            Integer      18 - 7 = 11
SUB        Float               Float              Float        18.5 - 7.0 = 11.5
SUB        Integer             Float              Float        18 - 6.5 = 11.5
SUB        Float               Integer            Float        18.5 - 7 = 11.5
MULT       Integer             Integer            Integer      5 * 11 = 55
MULT       Float               Float              Float        3.5 * 12.0 = 42.0
MULT       Integer             Float              Float        3 * 12.5 = 37.5
MULT       Float               Integer            Float        3.5 * 12 = 42.0
MULT       String              Integer            String       "Boo" * 3 = "BooBooBoo"
MULT       Integer             String             String       3 * "Boo" = "BooBooBoo"
DIV        Integer             Integer            Integer      7 / 2 = 3
DIV        Float               Float              Float        7.5 / 3.0 = 2.5
DIV        Integer             Float              Float        7 / 2.0 = 3.5
DIV        Float               Integer            Float        7.0 / 2 = 3.5
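The table translates fairly directly into Python type checks. The following is a sketch under stated assumptions: the function names and the GrinRuntimeError exception are hypothetical, integer division is implemented with Python's // operator (which agrees with the 7 / 2 = 3 example, though the table doesn't say what happens with negative integers), and the sketch handles only the type combinations, not any additional error scenarios.

```python
class GrinRuntimeError(Exception):
    """Hypothetical exception type signaling that the Grin program must terminate."""


def _is_number(value):
    return isinstance(value, (int, float))


def add(left, right):
    if _is_number(left) and _is_number(right):
        return left + right  # Python already yields a float if either operand is one
    if isinstance(left, str) and isinstance(right, str):
        return left + right
    raise GrinRuntimeError('ADD: unsupported combination of types')


def subtract(left, right):
    if _is_number(left) and _is_number(right):
        return left - right
    raise GrinRuntimeError('SUB: unsupported combination of types')


def multiply(left, right):
    if _is_number(left) and _is_number(right):
        return left * right
    if isinstance(left, str) and isinstance(right, int):
        return left * right
    if isinstance(left, int) and isinstance(right, str):
        return left * right
    raise GrinRuntimeError('MULT: unsupported combination of types')


def divide(left, right):
    if isinstance(left, int) and isinstance(right, int):
        return left // right  # unlike Python's /, two integers give an integer
    if _is_number(left) and _is_number(right):
        return left / right
    raise GrinRuntimeError('DIV: unsupported combination of types')


print(add('Boo', 'lean'), multiply(3, 'Boo'), divide(7, 2), divide(7, 2.0))
```

The integer division row is the one most likely to trip you up, since Python's own / operator would give 3.5 for 7 / 2.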

Any combination of types not listed above (e.g., dividing a float by a string) is a runtime error, which means that the program terminates with an error message. Additionally, there are two scenarios that are runtime errors, even though the types are permissible.

Subroutines

There are no functions or methods in Grin, but there is a simplified mechanism called a subroutine. A subroutine is a sequence of Grin statements that can be "called" by executing a GOSUB statement. GOSUB is much like GOTO; it causes execution to jump either by a given number of lines or to a label. However, GOSUB also causes Grin to remember where it jumped from. Subsequently, when a RETURN statement is reached, execution continues at the line following the GOSUB statement that caused the jump. Here's an example:


LET A 1
GOSUB 4
PRINT A
PRINT B
END
LET A 2
LET B 3
RETURN
.

In the program above, line 1 is executed first, setting the value of A to 1. Next, a GOSUB statement is reached. Execution jumps to line 6 (4 greater than the line 2 it appears on), but Grin also remembers that when a RETURN statement is reached, execution should jump back to the line following the GOSUB — in this case, line 3. Line 6 is executed next, setting A to 2, then line 7 sets B to 3. Now, we reach a RETURN statement, causing execution to jump back to the line number that we're remembering — line 3. Line 3 prints the value of A (which is 2), then line 4 prints the value of B (which is 3). Next, we reach line 5, which is an END statement, so the program ends.

Subroutines can be used very similarly to Python functions or methods, except they do not take parameters or return a value. Consider the following example, which contains a subroutine that prints the values of A, B, and C each time it's called:


           LET A 3
           GOSUB "PRINTABC"
           LET B 4
           GOSUB "PRINTABC"
           LET C 5
           GOSUB "PRINTABC"
           LET A 1
           GOSUB "PRINTABC"
           END
PRINTABC:  PRINT A
           PRINT B
           PRINT C
           RETURN
           .

Subroutines can call other subroutines, meaning that two or more GOSUBs may be reached before a RETURN is reached. The rules for this are very similar to functions that call other functions in Python; for each GOSUB that is reached, Grin will remember the line to which it should return. When a RETURN is reached, execution will move to the line remembered from the most recent GOSUB. Here's an example.


LET A 1
GOSUB 5
PRINT A
END
LET A 3
RETURN
PRINT A
LET A 2
GOSUB -4
PRINT A
RETURN
.

In this example, execution begins at line 1 by setting the variable A to 1. Next, we jump to line 7 with a GOSUB, but remember that we should jump back to line 3 when we encounter a RETURN. Line 7 prints A (which is 1), then line 8 changes A's value to 2. Now we've reached line 9, which is another GOSUB statement. At this point, execution will jump to line 5, but we'll also need to remember to jump back to the line following this GOSUB — line 10 — when we reach a RETURN. But we also need to remember the line from the previous GOSUB — line 3.

Line 5 sets A to 3, then we encounter our first RETURN statement. We're remembering two lines — line 3 and line 10. But line 10 is the most recently remembered line, so execution jumps to line 10. Line 10 prints A (which is 3). Now, we encounter another RETURN statement on line 11. We're remembering the line 3 from the first GOSUB. So, execution jumps to line 3, printing A (which is still 3), then ending the program on line 4.

So, the output of this program is as follows.


1
3
3

Like GOTO statements, GOSUB statements are not permitted to jump beyond the boundaries of the program or to non-existent labels, nor can they jump to the same line they came from. If such a GOSUB statement is encountered while a program is executed, the program terminates with an error message.

It is also an error for a RETURN statement to be encountered when there has been no previous GOSUB. The Grin program will immediately terminate and print an error message in this case, as well.
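The "remembering" described in this section behaves like a stack: each GOSUB pushes the line to come back to, and each RETURN pops the most recent one. Here is a minimal sketch that traces the nested example above; the run function is hypothetical, it supports only LET and PRINT with integer values, END, RETURN, and GOSUB with a literal integer offset, and the only error it detects is a RETURN with no previous GOSUB.

```python
def run(statements):
    """Execute a tiny subset of Grin with subroutines; return the printed values as a list."""
    variables = {}
    return_stack = []  # lines to come back to, most recent GOSUB last
    output = []
    line = 1
    while 1 <= line <= len(statements):
        parts = statements[line - 1].split()
        keyword = parts[0]
        if keyword == 'END':
            break
        if keyword == 'GOSUB':
            return_stack.append(line + 1)  # remember the line following this GOSUB
            line += int(parts[1])
            continue
        if keyword == 'RETURN':
            if not return_stack:
                raise RuntimeError('RETURN reached with no previous GOSUB')
            line = return_stack.pop()  # the most recently remembered line wins
            continue
        if keyword == 'LET':
            variables[parts[1]] = int(parts[2])
        elif keyword == 'PRINT':
            output.append(variables.get(parts[1], 0))
        line += 1
    return output


nested = ['LET A 1', 'GOSUB 5', 'PRINT A', 'END', 'LET A 3', 'RETURN',
          'PRINT A', 'LET A 2', 'GOSUB -4', 'PRINT A', 'RETURN']
print(run(nested))  # prints [1, 3, 3]
```

A Python list works well here because append and pop both operate on the end of the list, which is exactly the "most recent first" order that RETURN needs.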

Conditionally altering control flow

Grin provides no precise equivalent of Python's if statement, but it does offer a form of conditionality, in the sense that both GOTO and GOSUB statements can operate conditionally. Optionally, after the target of a GOTO or GOSUB, we can write the word IF, followed by a comparison expression that compares two values — literal values or the values in variables — with the result of that comparison determining whether the statement should cause execution to jump.


LET A 3
LET B 5
GOTO 2 IF A < 4
PRINT A
PRINT B
.

In the program above, the variables A and B are given the values 3 and 5, respectively. A GOTO statement compares A to 4. Since A is less than 4, the GOTO statement jumps to line 5 — 2 lines beyond itself — and B is printed, but A is not. The output of the program is as follows.


5

Both GOTO and GOSUB can be executed conditionally in this way. The comparison can use one of these six relational operators: =, <>, <, <=, >, and >=.

The types of values being compared partly determine the result of their comparison.

Comparisons between any other pair of values (e.g., integers to strings) result in runtime errors.
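Python's operator module offers a tidy way to map Grin's relational operators onto comparisons. The sketch below is illustrative only: the compare function and GrinRuntimeError exception are hypothetical, and I've assumed that numbers may be compared with numbers (integer or float) and strings with strings using Python's own ordering, with every other pairing treated as a runtime error.

```python
import operator


class GrinRuntimeError(Exception):
    """Hypothetical exception type signaling that the Grin program must terminate."""


COMPARISONS = {
    '=': operator.eq,
    '<>': operator.ne,
    '<': operator.lt,
    '<=': operator.le,
    '>': operator.gt,
    '>=': operator.ge,
}


def compare(left, op, right):
    """Return True or False for 'left op right', or raise if the types can't be compared."""
    both_numbers = isinstance(left, (int, float)) and isinstance(right, (int, float))
    both_strings = isinstance(left, str) and isinstance(right, str)
    if not (both_numbers or both_strings):
        raise GrinRuntimeError(f'cannot compare {left!r} with {right!r}')
    return COMPARISONS[op](left, right)


print(compare(3, '<', 4))  # the condition from GOTO 2 IF A < 4, with A being 3
```

A conditional GOTO or GOSUB would then jump only when this returns True, and otherwise behave as though the statement had no effect.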


Getting started

Near the top of this project write-up, you'll find a link to a Git repository that provides the starting point for this project. Using the instructions you followed in Project 0 (in the section titled Starting a new project), create a new PyCharm project using that Git repository; you'll do your work within that PyCharm project.

Acquainting yourself with the provided code

There's one bit of good news right off the bat: You aren't implementing this project from scratch. Some of the details have been implemented already, and they've been provided to you in their entirety, including the unit tests used to test them. While you won't need to have an expert understanding of every line of that code, you'll need to gain some familiarity with the "public" parts of it (i.e., you'll need to understand the problems solved by the provided code and how to use it), like you would with the Python libraries you've likely used in your prior coursework.

Once you create your new PyCharm project, you'll notice that the provided files are organized similarly to the code in Project 2, with multiple packages used to separate different areas of the program's functionality.

Have a look around what's been provided. You'll find that each file contains documentation describing its purpose and in what ways (if any) you'll need to change it, though most of what you'll be writing will be in new files you'll be adding within the grin package and its corresponding test directory.


The program

To satisfy this project's requirements, you'll be building your own Grin interpreter, using the provided code as a starting point. The most basic requirements that your program will need to meet are the following ones.


Designing your interpreter

As the size of a program increases, one of the most difficult obstacles that inexperienced programmers face is keeping separate issues isolated from one another, so they can work on one problem, get all the way to the bottom of it, and then move on to another. This is sometimes referred to as separation of concerns, one of the primary strategies for which is to break a large program into a set of smaller pieces. The obvious mechanism for breaking up a program in Python is the use of classes and functions, though the finesse is in deciding where the seams between those classes and functions should be.

The temptation, especially for novices, is always to try to think about the complete picture, since this strategy works well for the short programs that you write when you're first starting out. As programs become larger, confusion naturally sets in, as the complete picture can be difficult to keep in your brain all at once. Even moderately small Python programs are typically built out of many interacting parts and encompass a great deal of complexity. My complete Grin interpreter has around a dozen modules and several hundred unit tests. (Yours may have fewer, because I implemented a couple of features that I haven't assigned, but this gives you a rough idea about size.) Now, before you freak out, bear in mind that many of those modules contain relatively short classes, a few relatively short functions, and so on. I opted to write more modules with less code in each, so that I could concentrate my efforts on implementing and testing each one largely in the absence of the others.

This project will encourage you to begin thinking about your programs the same way, which will give you the ability to write much larger programs than you could before, as well as enable you to write unit tests at a level of depth you weren't able to reach previously.

Design requirements

As you work on your interpreter, you'll want to keep the following requirements in mind.

We don't have specific requirements about how many modules you'll need to have, or precisely the way you break your program up, but we'll be evaluating whether you've approached the problem in a way that keeps separate concerns separate.

Within your modules, you'll need to look for opportunities to use classes and inheritance wherever appropriate, both because they're topics we've been exploring in some depth in lecture, and because this problem is one that's amenable to being solved with them, since there are behavioral similarities that can be implemented once and reused using inheritance.

The basic concepts underlying the interpreter

While it's certainly not the case that I start every design by thinking about it in its entirety, one way to start thinking about this particular problem is to brainstorm about the concepts that underlie it. You can then think about the way those concepts fit together, and the ways you might be able to keep them separate. In general, separate concepts should be kept separate until they can't be — which might be quite the opposite of the way you've approached design in your past, because it's not a technique that pays off until programs grow to sizes like this one.

Each of those concepts could potentially be implemented in a module, and by one or more classes. Don't feel as though you need to build an entire design in your head at once; focus instead on one problem that you can isolate from the others, get your head around that problem, implement (and test) something that you think solves it, and move forward from there. Allow yourself the freedom to be wrong sometimes — not every design idea will pan out, but when you have Git backing you up, you can give up on a bad idea by simply rolling back your changes to the most recent "stable ground" commit.


Unit testing

Along with your Grin interpreter, you will be required to write unit tests, implemented using the unittest module in the Python standard library, and covering as much of your interpreter as is practical.

Note that how you design aspects of your interpreter has a direct impact on whether you can unit test it, as well as how hard you might have to work to do it; that's one of the reasons why you're aiming to write multiple modules in the grin package, and to tackle separate issues separately. For example, the fact that lexing and parsing are handled separately in the provided code means they can be tested separately; the fact that the result of parsing is a sequence of opaque tokens (instead of strings that need to be parsed again later) makes the code that uses the results of parsing similarly easier to test, since the tests can be written in terms of those higher-level tokens.

There is not a strict requirement around code coverage measurement, nor a specific number of tests that must be written, but we'll be evaluating whether your design accommodates your ability to test it, and whether you've written unit tests that substantially cover the portions that can be tested. (Isolating code that has side effects, such as reading and writing text in the Python shell, can go a long way toward making your program more testable.) The provided tests are there partly to give you an idea of what a reasonably complete set of unit tests looks like; your goal is to do likewise.
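As one illustration of isolating side effects, consider code that needs to print. If the function that does the printing accepts the "writer" as a parameter, a unit test can substitute something that records what was written instead of displaying it. This is just a sketch of the technique; the execute_print function and its write parameter are hypothetical names, not something the provided code defines.

```python
import unittest


def execute_print(value, write=print):
    """Write a value's text using the given writer, which defaults to the real print."""
    write(str(value))


class TestExecutePrint(unittest.TestCase):
    def test_writes_text_of_integer_value(self):
        written = []
        execute_print(18, write=written.append)  # record output instead of printing it
        self.assertEqual(written, ['18'])

    def test_writes_string_value_unchanged(self):
        written = []
        execute_print('Boo', write=written.append)
        self.assertEqual(written, ['Boo'])
```

Saved in a test module, these tests would run under the usual python -m unittest command without anything appearing in the shell. The same idea applies to reading input: pass in a function that supplies lines, and a test can provide canned ones.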


Grin quick reference

Here is a list of all of the Grin statements (and their different variants) that should be supported by your interpreter, with a brief description of the effect of each.

Statement                         Description
LET var value                     Changes the value of the variable var to the given value, which will either be a literal value or the name of another variable.
PRINT value                       Prints the given value to the console, where value will be either a literal value or the name of a variable.
INNUM var                         Reads a number (either integer or floating-point) into the variable var, which must be the name of a variable.
INSTR var                         Reads a line of text and stores it (as a string) in the variable var, which must be the name of a variable.
ADD var value                     Adds the given value to the value of the variable var, where value will be either a literal value or the name of another variable.
SUB var value                     Subtracts the given value from the value of the variable var, where value will be either a literal value or the name of another variable.
MULT var value                    Multiplies the value of the variable var by the given value, where value will be either a literal value or the name of another variable.
DIV var value                     Divides the value of the variable var by the given value, where value will be either a literal value or the name of another variable.
GOTO target                       Jumps execution of the program to the given target, which will be an integer specifying a relative number of lines or a string containing a label.
GOTO target IF value1 op value2   Jumps execution of the program to the given target, but only if the values of value1 and value2 compare true using the relational operator op (=, <>, <, <=, >, >=). If the comparison is false, the statement has no effect.
GOSUB target                      Temporarily jumps execution of the program to the given target, which will be an integer specifying a relative number of lines or a string containing a label. A subsequent RETURN statement will cause execution to jump back to the line following the GOSUB.
GOSUB target IF value1 op value2  Temporarily jumps execution of the program to the given target, but only if the values of value1 and value2 compare true using the relational operator op (=, <>, <, <=, >, >=). If the comparison is false, the statement has no effect.
RETURN                            Jumps execution of the program back to the line following the most recently-executed GOSUB statement.
END                               Ends the program immediately.
.                                 Special marker that indicates the end of the program text. Behaves as an END statement when encountered.


Sanity-checking your output

We are providing a tool that you can use to sanity check whether you've followed the basic requirements above. It will only give you a "passing" result in these circumstances.

It should be noted that there are many additional tests you'll want to perform, and that there are many additional tests that we'll be using when we grade your project. The way to understand the sanity checker's output is to think of it this way: Just because the sanity checker says your program passes doesn't mean it's close to perfect, but if you cannot get the sanity checker to report that your program passes, it surely will not pass all of our automated tests (and may well fail all of them).

You'll find the sanity checker in your project directory, in a file named project3_sanitycheck.py. Run that program like you would any other, and it will report a result.


Limitations

You can use the Python standard library where appropriate in this project, but you will otherwise not be able to use code written by anyone other than you. Notably, this includes third-party libraries (i.e., those that are not part of Python's standard library); colloquially, if we have to install something other than Python, Git, and PyCharm in order for your program to work, it's considered off-limits.


Preparing your submission

When you're ready to submit your work, run the provided prepare_submission.py script, as you did in prior projects, which will create a Git bundle from the Git repository in your project directory; that Git bundle will be your submission.

Verifying your bundle before submission

If you're feeling unsure of whether your bundle is complete and correct, you can verify it by creating a new PyCharm project from it, as you did in Project 0. (You'll want to create this project in a different directory from your project directory, so it's separate and isolated.) Afterward, you should see the files in their final form, and the Git tab in PyCharm should show your entire commit history. If so, you're in business; go ahead and submit your work.


Deliverables

Submit your project3.bundle file (and no others) to Canvas. There are a few rules to be aware of.

Can I submit after the deadline?

Yes, it is possible, subject to the late work policy for this course, which is described in the section titled Late work at this link. Beyond the late work deadline described there, we will no longer accept submissions.

What do I do if Canvas adjusts my filename?

Canvas will sometimes modify your filenames when you submit them (e.g., by adding a numbering scheme like -1 or a long sequence of hexadecimal digits to its name). In general, this is fine; as long as the file you submitted has the correct name prior to submission, we'll be able to obtain it with that same name, even if Canvas adjusts it.