Setup

Register for the lab using grinch: https://grinch.caltech.edu/register. For this lab, you will need to clone the repository for some starter code. The button at the bottom of this page will automatically submit your answers, and gitlab will check them for you.

Manhattan Walks

Consider the following problem:

A Manhattan Walk from \((0, 0)\) to \((n, m)\) is a series of moves starting at \((n, m)\) with \(m\) Norths and \(n\) Easts that eventually reaches \((0, 0)\). There are many such Manhattan Walks (differentiated by a different order of Norths and Easts). The following programs both compute how many Manhattan Walks there are from \((n, m)\) to \((0, 0)\).

def walks(n, m):
   if n < 0 or m < 0:
      return 0
   if n == 0 and m == 0:
      return 1
   return walks(n - 1, m) + walks(n, m - 1)

walks_answers = {}
def walks_memo(n, m):
   global walks_answers
   if n < 0 or m < 0:
      return 0
   if n == 0 and m == 0:
       return 1
   if (n, m) in walks_answers:
      return walks_answers[(n, m)]
   walks_answers[(n, m)] = walks_memo(n - 1, m) + walks_memo(n, m - 1)
   return walks_answers[(n, m)]

The program on the left naively uses recursion to calculate the answer. Unfortunately, it repeats many of the same calls over and over.

For example, walks(2, 2) calls walks(1, 1) twice. (Once from walks(2, 2) \(\mapsto\) walks(1, 2) \(\mapsto\) walks(1, 1) and again from walks(2, 2) \(\mapsto\) walks(2, 1) \(\mapsto\) walks(1, 1).)

The program on the right “memoizes” the results, which is a fancy way of saying it saves answers to calls its already seen. First, it initializes a global map from arguments to answers. Then, before computing the recursive calls from scratch, it checks if they were already computed and in the map. Finally, before returning a new answer, it saves the result into that map.

`static` in Java

As we’ve discussed previously, static in Java is a way of indicating that a field “transcends” any particular instance. To put it another way, the field belongs to the blueprint of the class itself rather than to any particular instance created with the new keyword. This allows us to make “functions” and global variables that we can use over multiple iterations of calls or instantiations of classes. For today’s purposes, we will use static variables to count and collect things across multiple calls to the same “function”.

Counting the Number of Method Calls

We would like to begin by figuring out just how many repeated calls are made to walks when calculating a single input. To do this, we’ll write some Java code to automate tracing through the code and finding the right numbers.

Task 0.

Modify Walks.java in the following three ways:

Add a new static field to the class to represent the number of recursive calls made by walks.
Modify the walks method to modify the existing count every time a call to walks is made (including the original, first call).
Write a main method that, for each argument pair in the following table, resets your static field, calls the method, and checks the resulting number.

Question.

Use the results of your modified program from task 0 to fill out the following table.

Method Call	# of Actual Calls
`walks(0, 0)`	1
`walks(1, 1)`
`walks(2, 2)`
`walks(3, 3)`
`walks(4, 4)`
`walks(5, 5)`
`walks(6, 6)`

Check Answer

Counting the Number of Unique Method Calls

This time, we will try to count the number of unique method calls made when walks(n, n) is called for various n’s.

Java has a (relatively) recent feature called record classes which allows you to write a whole class (including reasonable toString, hashCode, compareTo, etc. methods) without explicitly writing them yourself. In this lab, we will need to make use of a simple record class called Pair which represents a tuple of two items. You can declare this record class with the following line in your code:

public record Pair<K, V>(K key, V value) {}

Task 1.

Modify Walks.java in the following three ways:

Add another new static field to keep track of which arguments have previously been given to a call to walks. Since the recursive calls might have different \(n\) and \(m\) values, you should use the Pair class here, as described above.
Modify the walks method to modify the set of prior arguments every time a call to walks is made (including the original, first call).
Write a main method that, for each argument pair in the following table, resets your static field, calls the method, and checks the size of the resulting set.

Question.

Use the results of your modified program from task 1 to fill out the following table.

Method Call	# of Unique Calls
`walks(0, 0)`	1
`walks(1, 1)`
`walks(2, 2)`
`walks(3, 3)`
`walks(4, 4)`
`walks(5, 5)`
`walks(6, 6)`

Check Answer

Calculating the Results (without memoization)

Finally, we will actually calculate the results.

Task 2.

Use the existing Walks.java (which does not use memoization) to fill in the values in the following table, You should notice that, as the argument values increase, the speed of the function slows down significantly.

Question.

Fill in these values as a baseline.

Method Call	Return Value
`walks(0, 0)`	1
`walks(1, 1)`
`walks(2, 2)`
`walks(3, 3)`
`walks(4, 4)`
`walks(5, 5)`
`walks(6, 6)`

Check Answer

Calculating the Results (with memoization)

Now, attempt to calculate walks(20, 20) with your existing code. By our calculations, this call would take around 2.5 hours to finish on Adam’s new 2024 MacBook Pro. Let’s see if we can speed this up using memoization. That is, using a hash table to save the results of any previous calculations to avoid spending time re-calculating the same thing over and over.

Task 3.

Modify Walks.java in the following three ways:

Add another new static field to keep track of the answers for each Pair provided as argments to a call to walks. This field must be static. Think about why.
Modify the walks method to check the set of prior arguments every time a call to walks is made before doing any further calculations. If the pair is in your field, just use whatever the previous result was instead of calculating it again.
Modify the walks method to add to your result to the static field after calculating it (i.e., right before returning to the user).
Write a main method that, for each argument pair in the following table, calls your method, and prints out the resulting answer.

Question.

Fill in these values by using your results from the memoized version of walks as specified above. Notice how long walks(17, 17) takes now, with memoization.

Method Call	Return Value
`walks(17, 17)`
`walks(20, 20)`
`walks(160, 160)`
`walks(195, 195)`

Check Answer

Memoization and Recurrence Relations

So when is memoization useful? As we saw in the earlier parts, there is a recurring pattern of recursive calls. At each step, we make calls by adjusting one or more of the arguments in a step-wise function, approaching the base case through various paths. This pattern is called a recurrence relation, defined in terms of a few variables. The recurrence relation in the above problems can be phrased as: f(n,m) = f(n-1, m) + f(n, m-1).

In the next portion, you will work on defining a recurrence relation.

Longest Common Subsequence

Now that you’ve seen how memoization works, let’s move onto a more difficult problem:

The longest common subsequence (LCS) of two strings is the longest sequence of characters common to both sequences. The characters in the subsequence must be in order, but do not have to be consecutive in either string (i.e., there can be skipped characters in the middle).

Let lcs(n, m) be the length of the longest common subsequence of the first \(n\) characters of \(s_1\) and the first \(m\) characters of \(s_2\).

Consider the following “greedy” algorithm that attempts to compute lcs(n, m):

def badlcs(s1, s2):
    # If there's no string left...
    if len(s1) == 0 or len(s2) == 0:
        return 0
    # If the first characters are a match...
    elif s1[0] == s2[0]:
        return 1 + lcs(s1[1:], s2[1:])
    # Reduce the problem since we didn't find a match...
    else:
        return lcs(s1[1:], s2)

The reason badlcs doesn’t work is that it attempts to always remove from the left. There is, however, a correct algorithm that always chooses the best path rather than assuming which one it will be. Think through what that algorithm might look like and fill in the blanks below to complete it.

Setup

Manhattan Walks

static in Java

Counting the Number of Method Calls

Counting the Number of Unique Method Calls

Calculating the Results (without memoization)

Calculating the Results (with memoization)

Memoization and Recurrence Relations

Longest Common Subsequence

`static` in Java