ICS 33 Spring 2024, Project 4: Still Looking for Something

Background

Our recent discussion of Functional Programming alluded to the fact that what makes programming languages different from one another isn't solely their syntax, though that's certainly part of it. Each programming language asks its users to think differently — sometimes dramatically so — about how best to organize our solution to a problem. What's considered normal (or even desirable) in an object-oriented language might be awkward (or even impossible) in a purely functional language, and vice versa. How we'd solve a problem in a data manipulation language like SQL would be radically different from how we'd solve the same problem in a language like Python. Naturally, some kinds of problems will be better solved with one set of tools than another, so we'd expect different programming languages to excel at different tasks; part of why we want some exposure to more than one programming language is so that we can start to develop our sensibilities about the ways that languages can differ, and how we might be able to recognize the kinds of problems that are a better fit for some than others. That way, even if we don't become experts in multiple languages at once, we'll at least have embraced the idea that no single language is the best solution to every problem; that'll open our minds to learning about alternatives when they show promise, rather than falling in love with our first language and never being able to let go of it, or simply riding the waves of hype and fashion wherever they lead, for better or worse.

Fortunately, we've already had a head start on that journey, because Project 2 asked you to build a single application that was written using more than one programming language. We used Python to implement our user interface and the "engine" underlying it, while we instead used SQL to describe our interactions with the database that stored and managed the program's data. The technique of writing systems made up of code written in multiple programming languages is sometimes called polyglot programming, which, like many choices we make in computing, represents a tradeoff: We give up the simplicity of writing everything in a single language, but we gain access to a set of abilities that approach the union of the abilities of all of the languages we're using. As long as we can figure out how to make code in one language work together with code in another — in Project 2, we relied on the sqlite3 library to smoothly communicate between them — and as long as we're careful to use the "best-fit" language for each part of our program, we can sometimes achieve things that are much more difficult to achieve when writing everything in one language. The more complex the system, the greater the chance it may benefit from polyglot techniques.

Among the differences between programming languages are the differences in their syntax, which is to say that different programming languages allow us to use different keywords and symbols in different orders. Where a SQL statement might begin with SELECT or CREATE TABLE, a Python statement might instead begin with class or def. There is some overlap between the words and phrases allowed across programming languages, but there are almost always differences somewhere. We can write a + b in both Python and SQL, for example, but the statements in which it can legally appear would need to be structured differently.

As you'll see in later coursework, the ability to describe the syntax of a programming language is a fairly universal need, so we would benefit from understanding a universal solution to it. A grammar is a well-known formalism that can do that job nicely. Grammars provide a formal way to describe syntax, allowing us to specify the valid orders in which words and symbols can appear. Grammars form the theoretical basis of parsers like the one provided in Project 3, whose main jobs are to decide whether a sequence of symbols is valid, by inferring the structure from which it derives its meaning. But we can use grammars in the opposite direction, too — generating sequences of symbols that we know are valid, rather than determining whether a sequence of symbols is valid — and that's our focus in this project.

To satisfy this project's requirements, you'll write a program that randomly generates text in accordance with a grammar that's given as the program's input. (Note that parsing and generating text are hardly the only tasks for which we can use grammars; they're recurrent in the study of computer science, so you're likely to see them again in your studies, probably more than once.) You will also gain practice implementing a mutually recursive algorithm in Python, which will strengthen your understanding of our recent conversation in which we were Revisiting Recursion.

Grammars

A grammar is a collection of substitution rules, each of which specifies how a symbol can be replaced with a sequence of other symbols. Collectively, the substitution rules that comprise a grammar describe a set of sentences that we say make up a language.

As a first example, consider the following grammar.

A → 0 A 1 A | B
B → #

There are two rules that make up our grammar: One specifying how the symbol A can be replaced, and another specifying a different replacement for B. We say that symbols that can be replaced in this way are variables, which I've denoted here with boldfaced, underlined text. Meanwhile, we say that symbols that cannot be replaced are terminals, and that the sentences that are part of a language described by a grammar are made up only of terminals. There are two variables in our grammar (A and B) and three terminals (0, 1, and #).

The vertical bar ('|') symbol in the rule for A indicates optionality, which is to say that we can replace an occurrence of A with one of two options: either with the symbols 0 A 1 A or with the symbol B. Lacking a vertical bar, the rule for B offers only one option: We can only replace B with the terminal #.

We consider one of the variables to be the start variable, which is meant to describe an entire sentence. Other variables describe fragments of sentences. For the purposes of this example, we'll say that A is the start variable.

Generating a sentence from a grammar

From a conceptual point of view, a grammar can be used to generate strings of terminals within its language in the following manner. (I should point out that this will not be precisely how your program will generate its output, but we'll start here, since it's a good way to understand the concepts underlying what we're doing.)

Begin with a sentence containing only one symbol: the start variable.
As long as there are still variables in the sentence, pick one of them, find the corresponding rule with that variable on its left-hand side, and choose one of its options. Replace the variable with the symbols in the option you chose.

A sequence of substitutions leading from the start variable to a string of terminals is called a derivation. When the leftmost variable is always replaced at each step, the derivation is called a leftmost derivation. The sentence 0 0 # 1 # 1 # is in the language described by the grammar above, a fact we can prove using the following leftmost derivation.

A ⇒ 0 A 1 A ⇒ 0 0 A 1 A 1 A ⇒ 0 0 B 1 A 1 A ⇒ 0 0 # 1 A 1 A ⇒ 0 0 # 1 B 1 A ⇒ 0 0 # 1 # 1 A ⇒ 0 0 # 1 # 1 B ⇒ 0 0 # 1 # 1 #

The algorithm described above would be able to produce this same sentence by making the same choices for each application of a rule that was made in this derivation.

We would say, generally, that the language of a grammar is the set of all strings of terminals for which such a derivation can be built. It's worth noting that there are two aspects of this problem where infiniteness comes into play.

The set of strings in a language may be infinite. For example, if a grammar contained the rule X → 1 X | 1, there would be no limit on how many times we could choose the 1 X option instead of the 1 option. Still, if we're generating strings at random, we'll always pick exactly one of these options, and we expect, sooner or later, to choose the 1 option, which would prevent the generated string from becoming any longer.
A grammar can be written in a way that it describes individual strings of infinite length. If the only choice for the symbol Y is the rule Y → 1 Y, a derivation in which a string contains Y will never end; any substitution based on that rule will still lead to a string containing Y. (This is a similar problem we encounter when we have a recursive function with no base case.) In practice, though, a properly written grammar will eventually lead only to sentences of finite length.

The program

The basic goal of your program is to use the description of a grammar to randomly generate sentences that are in the grammar's language. There are a number of details to consider, which are described below.

The format of a grammar file

The program will read a grammar file, which contains the description of a grammar to be used for generating random sentences. To include that feature in your program, though, we'll need to agree on a format for grammar files, which is specified in detail below.

Each rule starts with a line containing only a left curly brace {. We'll say that each of these lines is called a rule opener.
Each rule ends with a line containing only a right curly brace }. We'll say that each of these lines is called a rule closer.
Any line of text that is not between a rule opener and a subsequent rule closer is considered to be a comment (i.e., it's irrelevant from our perspective, but can be a useful way to write a grammar file that would be more understandable to a human reader).
After a rule opener, the next line specifies the name of the variable for which a rule is being described. This line will consist of only letters and digits, but no whitespace (or other) characters.
Subsequent lines of the rule are the options for substituting a sequence of symbols in place of the rule's variable. There will always be at least one of these lines, and each of them will be as follows.
- It will begin with a positive integer (i.e., an integer greater than zero) that specifies the option's weight, which determines how frequently we'll choose it, relative to the others. That weight will be followed by a space.
- After that will be zero or more symbols, each adjacent pair separated by a single space. When a symbol consists of letters and digits surrounded by brackets (i.e., [ and ]), it is a variable; otherwise, it is a terminal. (Note that the syntactic meaning of spaces means that symbols cannot contain spaces.)

As we'll see, a grammar file doesn't specify a start variable; that's specified subsequently as input to the program, so that the same grammar file can be used with different start variables in different runs.

Having seen a description of the format, let's take a look at an example grammar file, so we can fully understand the details of what it means.


{
HowIsBoo
1 Boo is [Adjective] today
}

{
Adjective
3 happy
3 perfect
1 relaxing
1 fulfilled
2 excited
}

Let's suppose that HowIsBoo is the start variable. If so, then the grammar describes sentences whose basic structure is always Boo is _____ today, with the _____ replaced with one of five adjectives:

There's a 3-in-10 (30%) chance of the adjective being happy.
There's a 3-in-10 (30%) chance of the adjective being perfect.
There's a 1-in-10 (10%) chance of the adjective being relaxing.
There's a 1-in-10 (10%) chance of the adjective being fulfilled.
There's a 2-in-10 (20%) chance of the adjective being excited.

Where did those probabilities come from? The sum of the weights for all of the options for the Adjective variable is 10. (3 + 3 + 1 + 1 + 2 = 10.) Each individual weight is a numerator, and that sum is the denominator; happy has a weight of 3, so its odds are 3-in-10 (30%), and so on.

One thing this example demonstrates is that weights have no meaning across rules, but only within a rule. For example, the sum of the weights in the rule for HowIsBoo is 1, while the sum for Adjective is 10, which means that "1 point" of weight means more in the HowisBoo rule than it does in the Adjective rule.

A more complete example grammar file

To provide you with a more complete example of a grammar file, check out the example linked below.

grin.txt

That's a grammar file that, when its start variable is GrinStatement, generates random statements written in the Grin language from Project 3. The generated statements will have no syntax errors in them, so it should be possible to run the lexer and parser on them; however, since the statements are generated individually and separately, it's unlikely that you'd be able to run them as a Grin program, because they may have run-time errors or other problems, such as infinite loops, division by zero, or jumping to non-existent labels. Generating semantically valid Grin programs (i.e., ones that you could successfully execute) is a problem that grammars are not equipped to solve, as it turns out.

The input

The program will begin by reading exactly three lines from the Python shell (i.e., using Python's built-in input function).

The path to an existing grammar file. (If only the name of the file is specified, it will need to be located in the program's current working directory, which, by default, is the same directory as your main module.)
A positive integer specifying the number of random sentences to be generated. (Note that, as always, zero is not a positive number.)
The name of the start variable. (A variable's name does not include the brackets; the brackets are a syntactic device within the grammar file to make clear when an option is referring to a variable.)

You can safely assume that the grammar file exists, that it will be valid (i.e., it will follow the grammar file format described above), and that the program input will be formatted according to the rules specified here; we won't be testing your program with inputs that don't meet those requirements, so your program can do anything (or even crash) if given such inputs.

We also will not be testing with a grammar file that describes infinite-length sentences, which means that your program can do anything (or even crash) if given such a grammar file.

The output

The output of your program is simple: If asked to generate n sentences, your program would print a total of n lines of output, each being one of those sentences, and each having a newline on the end of it. No more, no less.

Each sentence is a sequence of terminals, separated by spaces. That's it.

A complete example of the program's execution

Let's suppose that we had a grammar file named grammar.txt identical to the shorter example shown above. Given that, an example of the program's execution might look like this.


    grammar.txt
    10
    HowIsBoo
    Boo is happy today
    Boo is fulfilled today
    Boo is relaxing today
    Boo is excited today
    Boo is perfect today
    Boo is happy today
    Boo is perfect today
    Boo is perfect today
    Boo is excited today
    Boo is happy today

Don't forget that the output is generated randomly, which means that a subsequent run of the same program with the same grammar file and the same input might reasonably be expected to produce different output. Remember, too, that the grammar file specifies its options as weights that are probabilities rather than being absolute. Consequently, a subsequent run that generates 10 sentences may, for example, have a different number of occurrences Boo is happy today; just because there's a 3-in-10 chance that happy is chosen in each sentence doesn't mean that exactly three out of every ten sentences will contain happy. (You can flip a coin ten times in a row and it can come up heads all ten times, even though there's a 1-in-2 chance of it happening each time. It's not likely, but it's not impossible, either.)

Design requirements

There are a number of ways that this problem could be solved, but we'll focus on an approach that leads to a clean, mutually recursive algorithm for solving it, which you'll be required to implement.

Representing the grammar as objects

From the description of the grammar file, we can see that it's built up from the following concepts.

A grammar contains a collection of rules.
Each rule is made up of a variable and one or more options.
Each option has a weight and a sequence of symbols, each of which is a terminal or a variable.

These facts lead directly to an idea of how to design a combination of objects that can be used to represent a grammar.

A class representing a terminal symbol.
A class representing a variable symbol.
A class representing an option.
A class representing a rule.
A class representing a grammar.

This may seem like a heavy-handed approach, but it pays off if we take it a step further. What if all of these classes implemented the same protocol, which allows us to ask any of their objects to do the same job: "Given this grammar, generate a sentence fragment from yourself"?

Generating random sentences from a grammar

Once you've represented your grammar as a combination of objects as described in the previous section, it is possible to implement a relatively straightforward mutually recursive algorithm to generate random sentences from it. The algorithm revolves around the idea of generating sentence fragments, then putting the fragments together into a complete sentence.

Here is a sketch of such an algorithm.

To generate a sentence from a grammar, it will look up the rule corresponding to the start variable, then ask that rule to generate a sentence fragment.
To generate a sentence fragment from a rule, one of its options will be chosen at random (in accordance with their weights), which will then be asked to generate a sentence fragment.
To generate a sentence fragment from an option, iterate through its symbols, generating sentence fragments from each one.
To generate a sentence fragment from a variable symbol, ask the grammar for the rule corresponding to that variable, then ask that rule to generate a sentence fragment.
To generate a sentence fragment from a terminal symbol, return the value of that terminal; that's its sentence fragment.

This mutually recursive strategy provides a great deal of power with relatively little code; by relying on Python's duck typing mechanism, we can allow the "right thing" to happen quickly and easily. (Note that why we say it's a "mutually recursive" strategy is because a grammar might use a rule, which uses one of its options, which uses one of its symbols that is a variable, which would, in turn, use another rule.)

Furthermore, if we implement that algorithm using Python's generator functions — each of these methods yields a sequence of terminal symbols, rather than returning them — we can also do this job while using relatively little memory; our cost becomes a function of the depth of the grammar's rules (i.e., how deeply we recurse), rather than the length of the sentence we're generating, which is likely to be a significant improvement if we're building long sentences.

Your main module

You must have a Python module named project4.py, which provides a way to execute your program in whole; executing project4.py executes the program. Since you expect this module to be executed in this way, it would naturally need to have an if __name__ == '__main__': statement at the end of it, for reasons described in your prior coursework. Note that the provided Git repository will already contain this file (and the necessary if statement).

Modules other than the main module

Like previous projects, this is a project that is large enough that it will benefit from being divided into separate modules, each focusing on one kind of functionality, as opposed to jamming all of it into a single file or, worse yet, a single function. As before, wFe aren't requiring a particular organization, but we are expecting to see that you have "kept separate things separate."

Unlike in Project 2 and Project 3, we are not requiring the use of Python packages, though you are certainly welcome to use them if you'd like.

Working and testing incrementally

As you did in previous projects, you are required to do your work incrementally, to test it incrementally (i.e., as you write new functions, you'll be implementing unit tests for them), and to commit your work periodically into a Git repository, which you will be bundling and submitting to us.

As in those previous projects, we don't have a specific requirement about how many commits you make, or how big a "feature" is, but your general goal is to commit when you've reached stable ground — a new feature is working, and you've tested it (including with unit test). We'll expect to see a history of these kinds of incremental commits.

Testing requirements

Along with your program, you will be required to write unit tests, implemented using the unittest module in the Python standard library, and covering as much of your program as is practical. As before, write your unit tests in Python modules within a directory named tests.

As in previous projects, how you design aspects of your program has a positive impact on whether you can write unit tests for it, as well as how hard you might have to work to do it. Your goal is to cover as much of your program as is practical, though, as in recent projects, there is not a strict requirement around code coverage measurement, nor a specific number of tests that must be written, but we'll be evaluating whether your design accommodates your ability to test it, and whether you've written unit tests that substantially cover the portions that can be tested.

Using test doubles to improve testability

One of the problems you face in this project is testing code that does things that are not directly amenable to the kinds of unit testing techniques you've learned thus far.

Printing output to the Python shell with the built-in print function. (This requires being able to ask "What got printed?" afterward, which isn't something that print supports.)
Reading input from the Python shell with the built-in input function. (This requires a user to supply input, since Python offers no built-in support for changing input to return input from somewhere else.)
Reading a grammar's description from a file. (This requires a file already to be there, making any such unit test fragile, since any change to that file would break the corresponding test.)
Choosing numbers at random and using that to drive the output, so that the same input generates different outputs each time. (This one seems particularly insurmountable, since unit tests depend on having predictable output.)

Earlier in the quarter, we learned a workaround for the first of these: By using the contextlib.redirect_stdout function from Python's standard library in concert with Python's with statement, we can temporarily redirect anything printed to the Python shell to an intermediary object, which provides the ability to ask "What got printed?" afterward. That kind of intermediary object is sometimes called a test double, whose job is to replace the built-in behavior of print — or, to be clearer, the built-in behavior of the program's standard output, which print depends on — with something different, so that the code under test can do its normal job, while being unaware that its output is being rerouted elsewhere.

The other problems listed above don't have such a simple workaround available, yet nothing prevents us from implementing the same technique ourselves. If we use an intermediary object to do a job for us, we can substitute an object in place of that intermediary in a testing scenario. As long as both objects support the same protocol for being asked to do that job, the caller can remain blissfully unaware of the difference. Consequently, we'll write more than one class that supports the same protocol: one that does the "normal" thing and another whose objects can act as a test double instead.

Rather than reading input by explicitly calling the input function, read it by asking an intermediary object for that same input. The usual implementation will call input, but a test double might respond to the same method calls by returning a hard-coded result instead.
Rather than reading a grammar from a file directly, intermediaries might present us the option of reading them from either a file or hard-coded input.
Rather than generating random numbers by calling functions from the random module is Python's standard library, intermediaries could allow us to either generate random numbers normally or generate hard-coded results instead.

As an old joke in computer science circles says, "We can solve any problem by introducing an extra level of indirection," and that's essentially what's being suggested here. (Joking aside, sometimes it's useful advice!) Third-party libraries can add fancier support to smooth this out even further, but Python offers us enough flexibility for our purposes here.

How to implement a test double

The first thing to realize about test doubles is that the phrase "test double" may be new for you, but the concept is not: When objects of two different types share the same protocol, they're both capable of being asked to do some job, even if they do that job differently. This has been a recurring theme in your coursework, since it's one of the pillars that Python's design rests on.

So, let's suppose that the job is running a database query against a SQLite database, like you did in Project 2. The "real" version might look something like this.


class SqlitePersonFinder:
    def __init__(self, connection):
        self._connection = connection
    
    def find_all_pepole(self):
        cursor = self._connection.execute(
            '''
            SELECT person_id, name, age
            FROM person;
            ''')
        
        try:
            while (row := cursor.fetchone()) is not None:
                yield Person(row[0], row[1], row[2])
        finally:
            cursor.close()

The problem with doing this in a unit test is that we can't predict what the output will be, unless we first set up a database with precisely the right set of people in it, so that we'll know what the answer is. But if we're really testing something else — a function that calls find_all_people, but what's really interesting about it isn't the database part, but what we do with the result — then it would be better to have an answer we can rely on.

So, why not give ourselves an object that can do the same job differently? The simplest idea would be to give it a list of people and have it yield those instead. "If I ask you what people are in the database, just give me these."


class FakePersonFinder:
    def __init__(self, people):
        self._people = list(people)
    
    def find_all_people(self):
        return (p for p in self._people)

As long as a "person finder" object was, for example, a parameter to the function you're testing, you could test your function by sending it a FakePersonFinder, while your actual program would pass it a SqlitePersonFinder instead.

That's all a test double is: an object that takes the place of something unpredictable (i.e., something that would cause tests to behave differently depending on how other things outside of them are set up) and replaces it with something predictable instead.

Sanity-checking your output

We are providing a tool that you can use to sanity check whether you've followed the basic requirements above. It will only give you a "passing" result in these circmustances.

It's possible to run your program by executing a correctly named module (project4.py), spelled and capitalized correctly.
Executing that module is enough to execute your program.
Your program reads its input and generates character-for-character correct input for one simple test scenario. (Notably, since this program's output is random, the scenario tested uses no optionality in any of its rules.)

It should be noted that there are many additional tests you'll be want to perform, and that there are many additional tests that we'll be using when we grade your project. The way to understand the sanity checker's output is to think of it this way: Just because the sanity checker says your program passes doesn't mean it's close to perfect, but if you cannot get the sanity checker to report that your program passes, it surely will not pass all of our automated tests (and may well fail all of them).

You'll find the sanity checker in your project directory, in a file named project4_sanitycheck.py. Run that program like you would any other, and it will report a result.

Limitations

You can use the Python standard library where appropriate in this project, but you will otherwise not be able to use code written by anyone else other than you. Notably, this includes third-party libraries (i.e., those that are not part of Python's standard library); colloquially, if we have to install something other than Python, Git, and PyCharm in order for your program to work, it's considered off-limits.

Preparing your submission

When you're ready to submit your work, run the provided prepare_submission.py script, as you did in prior projects, which will create a Git bundle from the Git repository in your project directory; that Git bundle will be your submission.

Verifying your bundle before submission

If you're feeling unsure of whether your bundle is complete and correct, you can verify it by creating a new PyCharm project from it, as you did in Project 0. (You'll want to create this project in a different directory from your project directory, so it's separate and isolated.) Afterward, you should see the files in their final form, and the Git tab in PyCharm should show your entire commit history. If so, you're in business; go ahead and submit your work.

Deliverables

Submit your project4.bundle file (and no others) to Canvas. There are a few rules to be aware of.

When grading your program, we'll grade only the most recent submission. We will not negotiate about which submission will be graded, or, for example, grade multiple of your submissions and "take the highest score."
When grading your program, we'll grade only the most recent commit on the main branch, except to the extent that we'll examine prior commits when evaluating your overall process. We will not negotiate about which commit will be graded, or, for example, grade multiple of your commits and "take the highest score."
You're responsible for submitting the version of your project that you want graded prior to the deadline. Contacting us afterward and telling us that you accidentally submitted the wrong version will not be grounds for a resubmission under any circumstances.
You're responsible for making a submission in order to receive credit, which means you'll want to be sure that you've remembered to submit your work and verified in Canvas that it's been received. A later claim of having forgotten to submit your work or having misremembered the due date will not be grounds for a resubmission under any circumstances.
The determination of whether your work has been submitted before the deadline is the time it was submitted to Canvas. Neither timestamps on local copies of your files nor timestamps on commits in your Git repository or in other places (e.g., emails or online storage) are considered evidence of completion prior to the deadline under any circumstances.

Can I submit after the deadline?

Yes, it is possible, subject to the late work policy for this course, which is described in the section titled Late work at this link. Beyond the late work deadline described there, we will no longer accept submissions.

What do I do if Canvas adjusts my filename?

Canvas will sometimes modify your filenames when you submit them (e.g., by adding a numbering scheme like -1 or a long sequence of hexadecimal digits to its name). In general, this is fine; as long as the file you submitted has the correct name prior to submission, we'll be able to obtain it with that same name, even if Canvas adjusts it.