CS 164/266, Fall 2025, Practice Problem Set 7

Draw a kD-tree for the 15 points with coordinates \((i,i)\) for \(i=1,2,\dots,15)\)
Consider an approximate range query for a range with diameter \(d\), and approximation parameter \(\varepsilon\). The “query analysis lemma” slide of the lecture notes explains why a quadtree square that crosses the range (according to the definition on the slide has a lower bound of \(d/\sqrt 2\) on its side length. Give an argument proving that there is also an upper bound on the side length of a quadtree square that crosses the range.

(It is ok for your answer to produce an inaccurate bound, as long as the bound is valid. If it helps you may assume that the range is a circle, but it should be possible to answer this question without making that assumption. Hint: assume you have a large square that crosses the range, find one of its child squares that contains a point of the inner boundary of the range, and use the assumption that this child does not cross the range to find a limit on how large it can be.)
There are 10 different non-empty intervals that have the numbers 1, 2, 3, 4, or 5 as their endpoints.
1. Draw an interval tree for range reporting with these intervals, based on a binary tree whose root node stores the number 3 and whose two children store the numbers 1 and 4. For each node of the interval tree write the two sorted orders of intervals stored at that node. (If some intervals are tied in this sorted order you may break the ties arbitrarily.)
2. Draw a segment tree for range counting with the same 10 intervals, based on a binary tree whose root node stores the number 3 and whose two children store the numbers 1 and 4.
Find a formula for the number of nodes in a segment tree, as a function of the number of distinct endpoints in a given system of intervals. (For instance, for problem 1, your formula should give the number of nodes for five endpoints.) Explain why there cannot be an exact formula like this for interval trees.
The graduate reading describes experiments using an implementation of the onion layers of a point set (an \(n\times n\) grid) that repeatedly uses Graham scan to find and remove one layer at a time. (They used this method because it was fast enough, and easier to implement than faster algorithms.)
1. What would be the running time of this repeated Graham scan algorithm, in the worst case, on an input that has the same size (\(n^2\) points) but is not necessarily a grid?
2. The paper instead states (near the start of section 3) that, for their experiments on a grid, the algorithm takes the smaller time bound \(O(n^{7/3})\). This improved runtime is smaller than the answer to part (a) for three reasons. One reason is that they apply Graham scan to subsets of only \(O(n)\) points, formed by including only two extreme points in each row of the grid. A second reason (the reason there is no log in the time bound) is that these subsets of points are already sorted, and do not need the sorting step of Graham scan. What is the third reason the algorithm is faster for grids?