# Theoretical Properties of GDD

Having introduced Graph Diffusion Distance in Chapter 2, we proceed to prove several properties of various instances of our graph dissimilarity score, including triangle inequalities for some specific versions and an upper bound on the distance between two graph products. Throughout, we rely heavily on various properties of the Kronecker sum and product of matrices, which may be found in (Hogben 2006), Section 11.4.

## Optimization over P is equivalent to an eigenvalue matching problem

For the purpose of the calculations in this section, we restrict ourselves to the “diffusion” term of our objective function [eqn:objfunction] (the term which coerces the two diffusion processes to agree), which we will write as \begin{aligned} \label{eqn:distancedefn} D_{P,\alpha} \left( G_1, G_2 \right) &= \left| \left| \frac{1}{\sqrt{\alpha}} P L_1 - \sqrt{\alpha} L_2 P \right| \right|_F .\end{aligned} Because $L_1$ and $L_2$ are each real and symmetric, they may both be diagonalized as $L_i = U_i \Lambda_i U_i^T$, where $U_i$ is a rotation matrix and $\Lambda_i$ is a diagonal matrix with the eigenvalues of $L_i$ on the diagonal. Substituting into [eqn:distancedefn], and letting $\tilde{P} = U_2^T P U_1$, we have \begin{aligned} D_{P,\alpha} \left( G_1, G_2 \right) &= \left| \left| \frac{1}{\sqrt{\alpha}} P L_1 - \sqrt{\alpha} L_2 P \right| \right|_F \nonumber \\ &= \left| \left| \frac{1}{\sqrt{\alpha}} P U_1 \Lambda_1 U_1^T - \sqrt{\alpha} U_2 \Lambda_2 U_2^T P \right| \right|_F \nonumber \\ &= \left| \left| \frac{1}{\sqrt{\alpha}} \left( U_2^T P U_1 \right) \Lambda_1 - \sqrt{\alpha} \Lambda_2 \left( U_2^T P U_1 \right) \right| \right|_F \nonumber \\ &= \left| \left| \frac{1}{\sqrt{\alpha}} \tilde{P} \Lambda_1 - \sqrt{\alpha} \Lambda_2 \tilde{P} \right| \right|_F \label{eqn:spectral_form}\end{aligned} where $\tilde{P}$ is an orthogonal matrix ($\tilde{P}^T \tilde{P} = I$) if and only if $P$ is as well. Since the Frobenius norm is invariant under multiplication by rotation matrices, [eqn:spectral_form] is a re-formulation of our original Laplacian-matrix objective function in terms of the spectra of the two graphs. Optimization of this modified form of the objective function subject to orthogonality constraints on $P$ is upper-bounded by optimization over matchings of eigenvalues: for any fixed $\alpha$ the eigenvalue-matching problem has the same objective function, but it is minimized over the stricter set of matching matrices, whereas our optimization is over all real-valued orthogonal $P$.
The orthogonality constraint is a relaxed version of the constraints on matching problems (Equation [eqn:matchingconstraints]) discussed in subsection [subsub:definitions], since matching matrices $M$ are also orthogonal ($M^T M = I$). Many algorithms exist for solving the inner partial and 0-1 constrained minimum-cost assignment problems, such as the Munkres algorithm (Munkres 1957) (also discussed in subsection [subsub:definitions]).
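To make the correspondence concrete, the following sketch (using `numpy` and `scipy.optimize.linear_sum_assignment`, a Munkres-style assignment solver; the graph sizes, random-graph helper, and the choice $\alpha = 1$ are illustrative assumptions) solves the eigenvalue-matching problem for two equal-size graphs and checks that mapping the matching matrix $M$ back to graph-space via $P = U_2 M U_1^T$ reproduces the same objective value:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(0)

def laplacian(A):
    # Graph Laplacian; the sign convention does not affect this check.
    return np.diag(A.sum(axis=1)) - A

def random_graph(n, p=0.5):
    # Symmetric 0/1 adjacency matrix of an Erdos-Renyi-style graph.
    A = (rng.random((n, n)) < p).astype(float)
    A = np.triu(A, 1)
    return A + A.T

n = 6
L1, L2 = laplacian(random_graph(n)), laplacian(random_graph(n))
lam1, U1 = np.linalg.eigh(L1)
lam2, U2 = np.linalg.eigh(L2)

# Assignment problem on squared eigenvalue mismatches (alpha = 1).
cost = (lam1[None, :] - lam2[:, None]) ** 2   # cost[i, j] = (lam1_j - lam2_i)^2
rows, cols = linear_sum_assignment(cost)
match_cost = np.sqrt(cost[rows, cols].sum())

# Map the matching back to graph space: P = U2 M U1^T.
M = np.zeros((n, n))
M[rows, cols] = 1.0
P = U2 @ M @ U1.T
graph_cost = np.linalg.norm(P @ L1 - L2 @ P, 'fro')
```

By Frobenius invariance under rotations, `match_cost` and `graph_cost` agree to numerical precision, and `P` is orthogonal because `M` is.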

We note three corollaries of the above argument. Namely, because the Frobenius norm is invariant under the mapping to and from eigenspace:

1. Optimal or near-optimal $\tilde{P}$ in eigenvalue-space maintain their optimality through the mapping $\tilde{P} \mapsto U_2 \tilde{P} U_1^T$ back to graph-space.

2. Solutions which are within $\epsilon$ of the optimum in $\tilde{P}$-space are also within $\epsilon$ of the optimum in $P$-space; and

3. In particular, zero-cost eigenvalue matchings, if they exist, correspond exactly to zero-cost $P$.

A natural next question is why it might be worthwhile to work in the original graph-space, rather than always optimizing this simpler eigenvalue-matching problem instead. In many cases (path graphs, cycle graphs) the spectrum of a member $G_l$ of a graph lineage is a subset of that of $G_{l+1}$, guaranteeing that zero-cost eigenvalue matchings (and thus, by the argument above, prolongations with zero diffusion cost) exist. However, when this is not the case, the above argument only upper-bounds the true distance, since the matching-problem constraints are more strict. Thus, numerical optimization over $P$, with orthogonality constraints only, may find a better bound on $D_{P,\alpha} \left( G_{l}, G_{l+1} \right)$.
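The spectrum-containment property for path graphs can be spot-checked numerically. This sketch (sizes are arbitrary) verifies that every Laplacian eigenvalue of $\text{Pa}_n$ reappears, up to numerical precision, in the spectrum of $\text{Pa}_{2n}$:

```python
import numpy as np

def path_laplacian(n):
    # Laplacian of the path graph Pa_n (standard D - A convention;
    # the sign convention does not affect the containment check).
    A = np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)
    return np.diag(A.sum(axis=1)) - A

n = 8
lam_small = np.linalg.eigvalsh(path_laplacian(n))
lam_large = np.linalg.eigvalsh(path_laplacian(2 * n))

# lam_k(Pa_n) = 2 - 2 cos(k*pi/n) = 2 - 2 cos((2k)*pi/(2n)), which is
# an eigenvalue of Pa_2n; so each small-graph eigenvalue has a match.
gaps = np.min(np.abs(lam_small[:, None] - lam_large[None, :]), axis=1)
max_gap = gaps.max()
```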

## Triangle Inequality for $\alpha = 1$

In this section, we show that both the linear and exponential versions of diffusion distance satisfy the triangle inequality when $\alpha = 1$.

[lem:p_lem_1] For any matrices $M$ and $P$, with $P$ satisfying $P^T P = I$,
${\left| \left| P M\right| \right|}^2_F \leq {\left| \left| M \right| \right|}^2_F$ and ${\left| \left| M P \right| \right|}^2_F \leq {\left| \left| M \right| \right|}^2_F$ .

Suppose without loss of generality that $P^T P = I$. Then:

1. ${\left| \left| P M\right| \right|}^2_F = \Tr [ M^T P^T P M ] = \Tr[M^T M] = {\left| \left|M\right| \right|}^2_F$

2. If $P^T P = I$, then letting $P P^T = \Pi$, $\Pi$ is a projection operator satisfying $\Pi^T = \Pi = \Pi^2$. Then,

$$\begin{split} {\left| \left| M \right| \right|}^2_F = \Tr [M^T M ] &= \Tr[M^T M (\Pi + (I - \Pi))] \\ &= \Tr[M^T M \Pi] + \Tr[M^T M (I - \Pi)] \\ &= \Tr[M^T M P P^T] + \Tr[M^T M {(I - \Pi)}^2] \\ &= {\left| \left| M P \right| \right|}_F^2 + {\left| \left| M (I - \Pi) \right| \right|}_F^2 \\ &\geq {\left| \left| M P \right| \right|}_F^2 \end{split}$$
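Lemma [lem:p_lem_1] is easy to sanity-check numerically. In this sketch (dimensions arbitrary), $P$ is a random matrix with orthonormal columns obtained from a reduced QR factorization:

```python
import numpy as np

rng = np.random.default_rng(1)

# P with orthonormal columns (P^T P = I), e.g. a 7x4 prolongation matrix.
P, _ = np.linalg.qr(rng.standard_normal((7, 4)))

M1 = rng.standard_normal((4, 5))     # P @ M1 is defined
M2 = rng.standard_normal((5, 7))     # M2 @ P is defined

# Left multiplication preserves the norm; right multiplication can only
# shrink it (projection onto the column space of P).
left_equal = np.isclose(np.linalg.norm(P @ M1, 'fro'),
                        np.linalg.norm(M1, 'fro'))
right_leq = np.linalg.norm(M2 @ P, 'fro') <= np.linalg.norm(M2, 'fro') + 1e-12
```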

[thm:constant_alpha_thm] $\tilde{D}^2$ satisfies the triangle inequality for $\alpha = 1$.

Let $G_1, G_2, G_3$ be simple graphs, with Laplacians $L_1, L_2, L_3$. Let \begin{aligned} P_{31} = \arg\inf_{P | \mathcal{C}(P)} {\left| \left| P L_1 - L_3 P \right| \right|}_F^2 .\end{aligned} $P_{31}$ is guaranteed to exist for constraints $\mathcal{C}$ which form a compact space of matrices; orthogonality constraints are an example, since the space of orthogonal matrices is closed and bounded. Then $\begin{split} \tilde{D}^2(G_1, G_3 \left| \alpha= 1\right.) &= {\left| \left| P_{31} L_1 - L_3 P_{31} \right| \right|}_F^2 = \inf_{P | \mathcal{C}(P)} {\left| \left| P L_1 - L_3 P \right| \right|}_F^2 \label{defn:alpha_1_linear} \\ &\leq \inf_{P_{32}, P_{21} | \mathcal{C}(P_{32} P_{21})} {\left| \left| P_{32} P_{21} L_1 - L_3 P_{32} P_{21} \right| \right|}_F^2 \\ \end{split}$ where we write $\mathcal{C}(P_{32} P_{21})$ to signify that the product $P_{32} P_{21}$ satisfies the original transitive constraints on $P$, e.g. orthogonality and/or sparsity. Since the constraint predicate $\mathcal{C}$ satisfies the transitivity condition of Equation [eqn:tran_constraint_defn], any pair of feasible $P_{32}$ and $P_{21}$ yields a feasible product $P_{32} P_{21}$, so we may write $\begin{split} \tilde{D}(G_1, G_3 \left| \alpha= 1\right.) &\leq \inf_{P_{32} | \mathcal{C}(P_{32})} \inf_{P_{21} | \mathcal{C}(P_{21})} {\left| \left| P_{32} P_{21} L_1 - L_3 P_{32} P_{21} \right| \right|}_F \\ &= \inf_{P_{32} | \mathcal{C}(P_{32})} \inf_{P_{21} | \mathcal{C}(P_{21})} {\left| \left| P_{32} P_{21} L_1 - P_{32} L_2 P_{21} \right. \right.} \\ & {\left. \left. \quad \quad \qquad + \quad P_{32} L_2 P_{21} - L_3 P_{32} P_{21} \right| \right|}_F \\ &\leq \inf_{P_{32} | \mathcal{C}(P_{32})} \inf_{P_{21} | \mathcal{C}(P_{21})} \left( {\left| \left| P_{32} P_{21} L_1 - P_{32} L_2 P_{21} \right| \right|}_F \right. \\ & \quad \quad \qquad + \quad \left. 
{\left| \left| P_{32} L_2 P_{21} - L_3 P_{32} P_{21} \right| \right|}_F \right) \\ &=\inf_{P_{32} | \mathcal{C}(P_{32})} \inf_{P_{21} | \mathcal{C}(P_{21})} \left( {\left| \left| P_{32} \left( P_{21} L_1 - L_2 P_{21} \right) \right| \right|}_F \right. \\ &\quad \quad \qquad + \quad \left. {\left| \left| \left( P_{32} L_2 - L_3 P_{32} \right) P_{21} \right| \right|}_F \right)\\ \end{split}$ By Lemma [lem:p_lem_1], $\begin{split} \tilde{D}(G_1, G_3 \left| \alpha= 1\right.) &\leq \inf_{P_{32} | \mathcal{C}(P_{32})} \inf_{P_{21} | \mathcal{C}(P_{21})} \left( {\left| \left| P_{21} L_1 - L_2 P_{21} \right| \right|}_F \right. \\ & \quad \quad \qquad + \quad \left. {\left| \left| P_{32} L_2 - L_3 P_{32} \right| \right|}_F \right) \\ &= \inf_{P_{21} | \mathcal{C}(P_{21})} {\left| \left| P_{21} L_1 - L_2 P_{21} \right| \right|}_F \\ & \quad \quad \qquad + \quad \inf_{P_{32} | \mathcal{C}(P_{32})} {\left| \left| P_{32} L_2 - L_3 P_{32} \right| \right|}_F \\ &= \tilde{D}(G_1, G_2 \left| \alpha= 1\right.) + \tilde{D}(G_2, G_3 \left| \alpha= 1\right.) \end{split}$
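For equal-size graphs, the squared entries of an orthogonal $\tilde{P}$ form a doubly stochastic matrix, so the infimum in [eqn:spectral_form] is attained at a permutation and the $\alpha = 1$ distance reduces to an assignment problem on the spectra. Under that assumption, the following sketch (random graphs, arbitrary sizes) checks the triangle inequality directly:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(2)

def laplacian_spectrum(n, p=0.5):
    # Spectrum of the Laplacian of a random symmetric 0/1 graph.
    A = (rng.random((n, n)) < p).astype(float)
    A = np.triu(A, 1)
    A = A + A.T
    return np.linalg.eigvalsh(np.diag(A.sum(axis=1)) - A)

def dist(lam_a, lam_b):
    # alpha = 1 linear distance via optimal eigenvalue matching.
    cost = (lam_a[None, :] - lam_b[:, None]) ** 2
    r, c = linear_sum_assignment(cost)
    return np.sqrt(cost[r, c].sum())

s1, s2, s3 = (laplacian_spectrum(8) for _ in range(3))
d13, d12, d23 = dist(s1, s3), dist(s1, s2), dist(s2, s3)
```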

We note that in this proof we use $L_1, L_2,$ and $L_3$ (making this the small-$t$ or linear version of the objective function), but the same argument holds when all three are replaced with $e^{t L_i}$, so we also have

[cor:constant_alpha_exp] $D$ satisfies the triangle inequality for $\alpha = 1$.

By the same calculation as in Theorem [thm:constant_alpha_thm], with all $L_i$ replaced by $e^{t_c L_i}$, we have \begin{aligned} D\left( \left. G_1, G_3 \right| t_c, \alpha = 1 \right) &\leq D(G_1, G_2\left| t_c, \alpha= 1 \right) + D(G_2, G_3\left| t_c, \alpha= 1 \right)\\ \nonumber\end{aligned} for any constant $t_c$. Then, letting \begin{aligned} t_{13} = \arg \sup_{t_c} D\left( \left. G_1, G_3 \right| t_c, \alpha = 1 \right)\end{aligned} we have: $\begin{split} D\left( \left. G_1, G_3 \right| \alpha = 1 \right) &= \sup_{t_c} D\left( \left. G_1, G_3 \right| t_c, \alpha = 1 \right) \\ &= D\left( \left. G_1, G_3 \right| t_{13}, \alpha = 1 \right) \\ &\leq D(G_1, G_2\left| t_{13}, \alpha = 1 \right) + D(G_2, G_3\left| t_{13}, \alpha = 1 \right) \\ &\leq \sup_{t_c} D\left( \left. G_1, G_2 \right| t_c, \alpha = 1 \right) \\ & \quad \quad \qquad + \quad \sup_{t_c} D\left( \left. G_2, G_3 \right| t_c, \alpha = 1 \right) \\ &= D\left( \left. G_1, G_2 \right| \alpha = 1 \right) + D\left( \left. G_2, G_3 \right| \alpha = 1 \right) \end{split}$

Note that in the proofs of Theorem [thm:constant_alpha_thm], Theorem [thm:tsgdd_triangle], and Corollary [cor:constant_alpha_exp], we assume that the constraint predicate $\mathcal{C}(P)$ includes at least orthogonality (so that we may apply Lemma [lem:p_lem_1]). However, this constraint predicate could be more strict, e.g. include both orthogonality and sparsity. Hence these statements also apply to the $s > 0$ cases in Table [tab:dist_versions], which we do not otherwise consider in this work: in our numerical experiments we (for reasons of computational simplicity) only require our optimization over $P$ be orthogonally constrained.

## Time-Scaled Graph Diffusion Distance

For any graphs $G_1$ and $G_2$, and some real number $r$, we define the Time-Scaled Graph Diffusion Distance (TSGDD) as a scaled version of the linear distance, with $\alpha$ fixed. Namely, let \begin{aligned} \tilde{D}^2_r(G_1, G_2) &= {\left( n_1 n_2 \right)}^{-2r} \tilde{D}^2\left(G_1, G_2 \left| \alpha = {\left(\frac{n_1}{n_2}\right)}^r \right. \right) \label{eqn:TSGDD_defn}\\ &= \inf_{P | \mathcal{C}(P)} {\left( n_1 n_2 \right)}^{-2r} {\left| \left| {\left(\frac{n_1}{n_2}\right)}^{-r} P L_1 - {\left(\frac{n_1}{n_2}\right)}^r L_2 P \right| \right|}^2_F \nonumber \end{aligned} The intuition for this version of the distance measure is that we are constraining the time dilation, $\alpha$, between $G_1$ and $G_2$ to be a power of the ratio of the two graph sizes. The factor ${\left(n_1 n_2 \right)}^{-2r}$ is needed to ensure that this version of the distance satisfies the triangle inequality, as seen in Theorem [thm:tsgdd_triangle].

[thm:tsgdd_triangle] The TSGDD, as defined above, satisfies the triangle inequality.

As above, let $G_1, G_2, G_3$ be three graphs with $n_i = \left| G_i \right|$ and $n_1 \leq n_2 \leq n_3$, and let $L_i$ be the Laplacian of $G_i$. Let $\mathcal{C}(P)$ represent a transitive constraint predicate, also as described previously. Then, for a constant $r \in \mathbb{R}$, we have: \begin{aligned} &\tilde{D}_r(G_1, G_3) = \\ & \qquad \inf_{P|\mathcal{C}(P)} {\left( n_1 n_3\right)}^{-r} {\left|\left| {\left( \frac{n_1}{n_3} \right)}^{-r} P L_1 - {\left( \frac{n_1}{n_3} \right)}^{r} L_3 P \right|\right|}_F \nonumber \\ & \qquad \leq \inf_{P_{32}, P_{21} | \mathcal{C}(P_{32} P_{21})} {\left( n_1 n_3\right)}^{-r} {\left|\left| {\left( \frac{n_1}{n_3} \right)}^{-r} P_{32} P_{21} L_1 - {\left( \frac{n_1}{n_3} \right)}^{r} L_3 P_{32} P_{21} \right|\right|}_F \nonumber \\ \intertext{under the assumption, as in Equation \eqref{eqn:tran_constraint_defn}, that  \mathcal{C}(P_{32}) \wedge \mathcal{C}(P_{21}) \implies \mathcal{C}(P_{32} P_{21}) , } \end{aligned} \begin{aligned} % &\tilde{D}_r(G_1, G_3) \leq \\ &\inf_{P_{32}, P_{21} | \mathcal{C}(P_{32}) \wedge \mathcal{C}(P_{21})} {\left( n_1 n_3\right)}^{-r} {\left|\left| {\left( \frac{n_1}{n_3} \right)}^{-r} P_{32} P_{21} L_1 - {\left( \frac{n_1}{n_3} \right)}^{r} L_3 P_{32} P_{21} \right|\right|}_F \nonumber \\ % &= \inf_{P_{32}, P_{21} | \mathcal{C}(P_{32}) \wedge \mathcal{C}(P_{21})} {\left( n_1 n_3\right)}^{-r} {\left|\left| {\left( \frac{n_1}{n_3} \right)}^{-r} P_{32} P_{21} L_1 - {\left( \frac{n_1 n_3}{n_2^2} \right)}^{r} P_{32} L_2 P_{21} \right. \right. } \nonumber \\* &\null \qquad + \qquad {\left. \left. 
{\left( \frac{n_1 n_3}{n_2^2} \right)}^{r} P_{32} L_2 P_{21} - {\left( \frac{n_1}{n_3} \right)}^{r} L_3 P_{32} P_{21} \right|\right|}_F \\ % &\leq \inf_{P_{32}, P_{21} | \mathcal{C}(P_{32}) \wedge \mathcal{C}(P_{21})} {\left( n_1 n_3\right)}^{-r} {\left|\left| {\left( \frac{n_1}{n_3} \right)}^{-r} P_{32} P_{21} L_1 - {\left( \frac{n_1 n_3}{n_2^2} \right)}^{r} P_{32} L_2 P_{21} \right|\right|}_F \\ &\null \qquad + \qquad {\left( n_1 n_3\right)}^{-r} {\left|\left| {\left( \frac{n_1 n_3}{n_2^2} \right)}^{r} P_{32} L_2 P_{21} - {\left( \frac{n_1}{n_3} \right)}^{r} L_3 P_{32} P_{21} \right|\right|}_F \\ % &= \inf_{P_{32}, P_{21} | \mathcal{C}(P_{32}) \wedge \mathcal{C}(P_{21})} {\left( n_1 n_3\right)}^{-r} {\left( \frac{n_3}{n_2} \right)}^{r} {\left|\left| {\left( \frac{n_1}{n_2} \right)}^{-r} P_{32} P_{21} L_1 - {\left( \frac{n_1}{n_2} \right)}^{r} P_{32} L_2 P_{21} \right|\right|}_F \\ &\null \qquad + \qquad {\left( n_1 n_3\right)}^{-r} {\left( \frac{n_1}{n_2} \right)}^{r} {\left|\left| {\left( \frac{n_2}{n_3} \right)}^{-r} P_{32} L_2 P_{21} - {\left( \frac{n_2}{n_3} \right)}^{r} L_3 P_{32} P_{21} \right|\right|}_F \\ % &= \inf_{P_{32}, P_{21} | \mathcal{C}(P_{32}) \wedge \mathcal{C}(P_{21})} {\left( n_1 n_2\right)}^{-r} {\left|\left| {\left( \frac{n_1}{n_2} \right)}^{-r} P_{32} P_{21} L_1 - {\left( \frac{n_1}{n_2} \right)}^{r} P_{32} L_2 P_{21} \right|\right|}_F \\ &\null \qquad + \qquad {\left( n_2 n_3\right)}^{-r} {\left|\left| {\left( \frac{n_2}{n_3} \right)}^{-r} P_{32} L_2 P_{21} - {\left( \frac{n_2}{n_3} \right)}^{r} L_3 P_{32} P_{21} \right|\right|}_F \intertext{By Lemma \ref{lem:p_lem_1},} \end{aligned} \begin{aligned} \tilde{D}_r(G_1, G_3) &\leq \inf_{P_{32}, P_{21} | \mathcal{C}(P_{32}) \wedge \mathcal{C}(P_{21})} {\left( n_1 n_2\right)}^{-r} {\left|\left| {\left( \frac{n_1}{n_2} \right)}^{-r} P_{21} L_1 - {\left( \frac{n_1}{n_2} \right)}^{r} L_2 P_{21} \right|\right|}_F \\* &\null \qquad + \qquad {\left( n_2 n_3\right)}^{-r} {\left|\left| {\left( 
\frac{n_2}{n_3} \right)}^{-r} P_{32} L_2 - {\left( \frac{n_2}{n_3} \right)}^{r} L_3 P_{32} \right|\right|}_F \\ % &= \inf_{P_{21} | \mathcal{C}(P_{21})} {\left( n_1 n_2\right)}^{-r} {\left|\left| {\left( \frac{n_1}{n_2} \right)}^{-r} P_{21} L_1 - {\left( \frac{n_1}{n_2} \right)}^{r} L_2 P_{21} \right|\right|}_F \\* &\null \qquad + \qquad \inf_{P_{32} | \mathcal{C}(P_{32})} {\left( n_2 n_3\right)}^{-r} {\left|\left| {\left( \frac{n_2}{n_3} \right)}^{-r} P_{32} L_2 - {\left( \frac{n_2}{n_3} \right)}^{r} L_3 P_{32} \right|\right|}_F \\ &= \tilde{D}_r(G_1, G_2) + \tilde{D}_r(G_2, G_3) \nonumber \intertext{and so} &\tilde{D}_r(G_1, G_3) \leq \tilde{D}_r(G_1, G_2) + \tilde{D}_r(G_2, G_3)\end{aligned} for any fixed $r \in \mathbb{R}$.

## Sparse-Diffusion Distance

Recall that we use the notation $\mathcal{C}(P)$ for a constraint predicate that must be satisfied by a prolongation matrix $P$, and which is transitive in the sense that: \begin{aligned} \label{eqn:tran_constraint_defn} \mathcal{C}(P_{32}) \wedge \mathcal{C}(P_{21}) \implies \mathcal{C}(P_{32} P_{21}).\end{aligned} The simplest example is $\mathcal{C}(P) = \mathcal{C}_{\text{orthog}}(P) \equiv (P^T P = I)$. Let $\text{degree}_{i,j}(M)$ be the total number of nonzero entries in row $i$ or column $j$ of $M$. Sparsity can be introduced in transitive form by $\mathcal{C}(P) = \mathcal{C}_{\text{orthog}}(P) \wedge \mathcal{C}_{\text{sparsity}}(P)$, where $\mathcal{C}_{\text{sparsity}}(P) \equiv \Big( \max_{i,j} \text{degree}_{i,j}(P) \leq (n_{P \text{coarse}}/n_{P \text{fine}})^{-s} \Big)$ for some real number $s\geq 0$. Here, $n_{P \text{fine}}$ and $n_{P \text{coarse}}$ are the dimensions of $P$. This predicate is transitive since $\max_{i,j} \text{degree}_{i,j}(P_{3 2} P_{2 1}) \leq \left( \max_{i,j} \text{degree}_{i,j}(P_{3 2} ) \right) \left( \max_{i,j} \text{degree}_{i,j}(P_{2 1})\right) ,$ and since $n_2$ cancels out from the numerator and denominator of the product of the fanout bounds. This transitive sparsity constraint depends on a power-law parameter $s \geq 0$. When $s=0$, there is no sparsity constraint.
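The fanout sub-multiplicativity used above can be checked numerically. For 0/1 matrices no cancellation can occur, so the nonzero pattern of the product is exactly the boolean product pattern (sizes and densities below are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)

def max_degree(M):
    # Max, over all rows and columns, of the number of nonzero entries.
    nz = M != 0
    return max(nz.sum(axis=0).max(), nz.sum(axis=1).max())

# Nonnegative 0/1 matrices: products cannot cancel to zero.
P32 = (rng.random((12, 8)) < 0.3).astype(float)
P21 = (rng.random((8, 5)) < 0.3).astype(float)

lhs = max_degree(P32 @ P21)
rhs = max_degree(P32) * max_degree(P21)
```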

Another form of sparsity constraint specifies a pattern of matrix entries which are allowed to be nonzero. Two simple examples (which are also transitive) are matrices constrained to be upper triangular, and matrices constrained to be of the form $A \otimes B$ where $A$ and $B$ are themselves both constrained to be sparse. More complicated are $n_1 \times n_2$ matrices which are constrained to be banded for some specified pattern of bands: more specifically, that there is a reordering of the rows and columns such that the number of diagonal bands (of width 1, slope $\frac{n_1}{n_2}$) with nonzero entries is less than ${\left(\frac{n_1}{n_2}\right)}^q$ for some $0 \leq q < 1$. For example, linear interpolation matrices between $d$-dimensional grids, with non-overlapping source regions, follow this constraint.

As a final note on sparsity, we observe that any of the optimizations detailed in this work could also be performed with a sparsity-promoting term in the objective, rather than explicit sparsity constraints: for example, the $|\cdot|_1$-norm of the matrix $P$, calculated as $\sum_i \sum_j |p_{i j}|$, or terms which penalize values of $t$ or $\alpha$ far from 1. A potential method of performing this optimization would be to start by optimizing the non-sparse version of the objective function (as detailed in Section 4.1) and then slowly increase the strength of the regularization term.

## Upper Bounds for Graph Products (Linear Version)

We next consider the problem of finding optimal prolongations between two graphs $\mathbf{G}_\Box^{(1)} = G_1^{(1)} \Box G_2^{(1)}$ and $\mathbf{G}_\Box^{(2)} = G_1^{(2)} \Box G_2^{(2)}$ when optimal prolongations are known between $G_1^{(1)}$ and $G_1^{(2)}$, and between $G_2^{(1)}$ and $G_2^{(2)}$. We show that under some reasonable assumptions, these two prolongation optimizations decouple: we may solve them separately and combine the solutions to bound the objective value of a prolongation between the two product graphs.

From the definition of the graph box product, we have \begin{aligned} L_\Box^{(j)} &= L(G_1^{(j)} \Box G_2^{(j)}) \\ &= A(G_1^{(j)} \Box G_2^{(j)}) - D(G_1^{(j)} \Box G_2^{(j)}) \\ &= \left( A(G_1^{(j)}) \otimes I_2^{(j)} + I_1^{(j)} \otimes A(G_2^{(j)}) \right) - \left( D(G_1^{(j)}) \otimes I_2^{(j)} + I_1^{(j)} \otimes D(G_2^{(j)}) \right) \\ &= \left( A(G_1^{(j)}) \otimes I_2^{(j)} - D(G_1^{(j)}) \otimes I_2^{(j)} \right) + \left(I_1^{(j)} \otimes A(G_2^{(j)}) - I_1^{(j)} \otimes D(G_2^{(j)}) \right) \\ &= (L_1^{(j)} \otimes I_2^{(j)}) + (I_1^{(j)} \otimes L_2^{(j)}) \\ &= L(G_1^{(j)}) \oplus L(G_2^{(j)})\end{aligned} where $\oplus$ is the Kronecker sum of matrices as previously defined. See (Fiedler 1973), Item 3.4 for more details on Laplacians of graph products. We calculate

\begin{aligned} D^{P,\alpha} \left( G_\Box^{(1)}, G_\Box^{(2)} \right) &= \left| \left| \frac{1}{\sqrt{\alpha}} P L_\Box^{(1)} - \sqrt{\alpha} L_\Box^{(2)} P \right| \right|_F\\ &= \left| \left| \frac{1}{\sqrt{\alpha}} P \left( \left( L_1^{(1)} \otimes I_2^{(1)} \right) + \left(I_1^{(1)} \otimes L_2^{(1)}\right) \right) \right. \right. \\ & \left. \left. \qquad - \sqrt{\alpha} \left( \left( L_1^{(2)} \otimes I_2^{(2)}\right) + \left(I_1^{(2)} \otimes L_2^{(2)} \right) \right) P \right| \right|_F \\ &= \left| \left| \left( \frac{1}{\sqrt{\alpha}} P \left(L_1^{(1)} \otimes I_2^{(1)}\right) - \sqrt{\alpha}\left(L_1^{(2)} \otimes I_2^{(2)}\right) P \right) \right. \right. \\ & \left. \left. \qquad + \left( \frac{1}{\sqrt{\alpha}} P \left( I_1^{(1)} \otimes L_2^{(1)} \right) - \sqrt{\alpha} \left( I_1^{(2)} \otimes L_2^{(2)} \right) P\right) \right| \right|_F \\ \intertext{Now we try out the assumption that P = P_1 \otimes P_2, which restricts the search space over P and may increase the objective function:} D^{P,\alpha} \left( G_\Box^{(1)}, G_\Box^{(2)} \right) &= \left| \left| \left[ \frac{1}{\sqrt{\alpha}} \left( P_1 \otimes P_2 \right)\left(L_1^{(1)} \otimes I_2^{(1)}\right) \right. \right. \right. \\ & \qquad \qquad - \left. \sqrt{\alpha}\left(L_1^{(2)} \otimes I_2^{(2)}\right) \left( P_1 \otimes P_2 \right) \right] \\ & \qquad + \left[ \frac{1}{\sqrt{\alpha}} \left( P_1 \otimes P_2 \right) \left( I_1^{(1)} \otimes L_2^{(1)} \right) \right. \\ & \left. \left. \left. \qquad \qquad - \sqrt{\alpha} \left( I_1^{(2)} \otimes L_2^{(2)} \right) \left( P_1 \otimes P_2 \right) \right] \right| \right|_F \\ &= \left| \left| \left( \frac{1}{\sqrt{\alpha}} \left( P_1 L_1^{(1)} \otimes P_2 \right) - \sqrt{\alpha}\left(L_1^{(2)} P_1 \otimes P_2 \right) \right) \right. \right. \\ & \left. \left. 
\qquad + \left( \frac{1}{\sqrt{\alpha}} \left( P_1 \otimes P_2 L_2^{(1)} \right) - \sqrt{\alpha} \left( P_1 \otimes L_2^{(2)} P_2 \right) \right) \right| \right|_F \\ &= \left| \left| \left( \left( \frac{1}{\sqrt{\alpha}} P_1 L_1^{(1)} - \sqrt{\alpha} L_1^{(2)} P_1 \right) \otimes P_2 \right) \right. \right. \\ & \left. \left. \qquad + \left( P_1 \otimes \left( \frac{1}{\sqrt{\alpha}} P_2 L_2^{(1)} - \sqrt{\alpha} L_2^{(2)} P_2 \right) \right) \right| \right|_F \intertext{Since || A + B ||_F \leq ||A||_F + ||B||_F,} &\leq \left| \left| \left( \left( \frac{1}{\sqrt{\alpha}} P_1 L_1^{(1)} - \sqrt{\alpha} L_1^{(2)} P_1 \right) \otimes P_2 \right) \right| \right|_F \\ & \qquad + \left| \left| \left( P_1 \otimes \left( \frac{1}{\sqrt{\alpha}} P_2 L_2^{(1)} - \sqrt{\alpha} L_2^{(2)} P_2 \right) \right) \right| \right|_F \\ &= \left| \left| \frac{1}{\sqrt{\alpha}} P_1 L_1^{(1)} - \sqrt{\alpha} L_1^{(2)} P_1 \right| \right|_F \left| \left| P_2 \right| \right|_F \\ & \qquad + \left| \left| P_1 \right| \right|_F \left| \left| \frac{1}{\sqrt{\alpha}} P_2 L_2^{(1)} - \sqrt{\alpha} L_2^{(2)} P_2 \right| \right|_F , \\ \intertext{Thus assuming P = P_1 \otimes P_2} D^{P,\alpha} \left( G_\Box^{(1)}, G_\Box^{(2)} \right) &\leq \left| \left| P_2 \right| \right|_F D_{\alpha, P_1} \left(G_1^{(1)},G_1^{(2)} \right) \\ & \qquad + \left| \left| P_1 \right| \right|_F D_{\alpha, P_2} \left( G_2^{(1)},G_2^{(2)} \right), \end{aligned} which is a weighted sum of the objectives of the optimizations for prolongation from $G_1^{(1)}$ to $G_1^{(2)}$ and from $G_2^{(1)}$ to $G_2^{(2)}$. 
Recall that our original constraint on $P$ was that $P^T P = I$; since $P = P_1 \otimes P_2$, this is equivalent (by a property of the Kronecker product; see Corollary 13.8 in (Laub 2005)) to the coupled constraints on $P_1$ and $P_2$: \begin{aligned} \label{eqn:p1p2const} \left( {P_1}^T P_1 = \frac{1}{\eta} I_1^{(1)} \right) \qquad \wedge \qquad \left( {P_2}^T P_2 = \eta I_2^{(1)} \right)\end{aligned} for some $\eta \in \mathbb{R}$. For any $P_1, P_2$ which obey [eqn:p1p2const], we may rescale $P_1 \to \sqrt{\eta} P_1$ and $P_2 \to P_2 / \sqrt{\eta}$ to make them orthogonal, without changing the product $P_1 \otimes P_2$ or the value of the objective, so we take $\eta = 1$ in subsequent calculations. Noting that $||A||_F = \sqrt{\text{Tr}(A^T A)}$, we see that \begin{aligned} \left| \left| P_1 \right| \right|_F = \sqrt{\text{Tr}( I_1^{(1)})} = \sqrt{n_1^{(1)}} \quad \text{and similarly} \quad \left| \left| P_2 \right| \right|_F = \sqrt{ n_2^{(1)}}.\end{aligned}
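The Kronecker-sum identity $L(G_1 \Box G_2) = L(G_1) \oplus L(G_2)$ used at the start of this derivation can be verified directly on small random graphs. This sketch follows the chapter's sign convention $L = A - D$ (sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(4)

def random_adjacency(n, p=0.5):
    # Symmetric 0/1 adjacency matrix with empty diagonal.
    A = (rng.random((n, n)) < p).astype(float)
    A = np.triu(A, 1)
    return A + A.T

def laplacian(A):
    # The chapter's sign convention: L = A - D.
    return A - np.diag(A.sum(axis=1))

A1, A2 = random_adjacency(4), random_adjacency(3)
I1, I2 = np.eye(4), np.eye(3)

# Adjacency of the box product: A_box = A1 (x) I2 + I1 (x) A2.
A_box = np.kron(A1, I2) + np.kron(I1, A2)

L_box = laplacian(A_box)
L_sum = np.kron(laplacian(A1), I2) + np.kron(I1, laplacian(A2))  # Kronecker sum
kron_sum_ok = np.allclose(L_box, L_sum)
```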

Thus, we have proven the following:

[thm:decompthm] Assuming that $P$ decomposes as $P = P_1 \otimes P_2$, the diffusion distance $D_{P, \alpha} \left(G_\Box^{(1)}, G_\Box^{(2)} \right)$ between $G_\Box^{(1)}$ and $G_\Box^{(2)}$ is bounded above by the strictly monotonically increasing function of the two distances $D_{P_1,\alpha}$ and $D_{P_2,\alpha}$: \begin{aligned} \mathcal{F}(D_{P_1,\alpha}, D_{P_2,\alpha}) &= \sqrt{n_2^{(1)}} D_{P_1,\alpha} + \sqrt{n_1^{(1)}} D_{P_2,\alpha} , \intertext{Namely,} D_{P, \alpha} \left(G_\Box^{(1)}, G_\Box^{(2)} \right) &\leq \mathcal{F} \left( D_{P_1,\alpha} \left( G_1^{(1)}, G_1^{(2)} \right), D_{P_2,\alpha} \left( G_2^{(1)}, G_2^{(2)} \right) \right)\end{aligned}

Thus, the original optimization over the product graphs decouples into separate optimizations over the two sets of factors, constrained to have the same value of $\alpha$. Additionally, since the requirement that $P = P_1 \otimes P_2$ is an additional constraint,

[thm:decompcorollary] If $(\alpha_1, P_1)$ and $(\alpha_2, P_2)$, subject to orthogonality constraints, are optima of $D_{\alpha, P} \left( G_1^{(1)}, G_1^{(2)} \right)$ and $D_{\alpha, P} \left( G_2^{(1)}, G_2^{(2)} \right)$, and furthermore if $\alpha_1 = \alpha_2$, then the value of $D_{P, \alpha} (G_1^{(1)} \Box G_2^{(1)}, G_1^{(2)} \Box G_2^{(2)})$ for an optimal $P$ is bounded above by $D_{P_1 \otimes P_2, \alpha_1} (G_1^{(1)} \Box G_2^{(1)}, G_1^{(2)} \Box G_2^{(2)})$.

This upper bound on the original objective function is a monotonically increasing function of the objectives for the two smaller problems. A consequence of this upper bound is that if ${D_{P_1,\alpha} \left( G_1^{(1)}, G_1^{(2)} \right) \leq \epsilon_1}$ and ${D_{P_2,\alpha} \left( G_2^{(1)}, G_2^{(2)} \right) \leq \epsilon_2}$, then the composite solution $P_1 \otimes P_2$ must have ${D_{P_1 \otimes P_2,\alpha} \left( G_\Box^{(1)}, G_\Box^{(2)} \right) \leq \epsilon = \left( \sqrt{n_1^{(1)}} + \sqrt{n_2^{(1)}} \right) \max(\epsilon_1, \epsilon_2)}$. Thus if both of these distances are arbitrarily small then the composite distance must also be small. Furthermore, if only one of these is small, so that ${D_{P_1,\alpha} \left( G_1^{(1)}, G_1^{(2)} \right) \approx 0}$ or ${D_{P_2,\alpha} \left( G_2^{(1)}, G_2^{(2)} \right) \approx 0}$, then, up to the constant factors above, ${D_{P_1 \otimes P_2,\alpha} \lesssim D_{P_2, \alpha}}$ or ${D_{P_1 \otimes P_2,\alpha} \lesssim D_{P_1, \alpha}}$, respectively.
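The bound of Theorem [thm:decompthm] can be exercised numerically at $\alpha = 1$, with random graphs and random square orthogonal $P_1, P_2$ (all sizes and the choice $\alpha = 1$ are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(5)

def random_laplacian(n, p=0.5):
    # Standard D - A Laplacian; the sign convention does not affect norms
    # of differences of the form P L - L' P.
    A = (rng.random((n, n)) < p).astype(float)
    A = np.triu(A, 1)
    A = A + A.T
    return np.diag(A.sum(axis=1)) - A

n1, n2 = 4, 3                                         # factor sizes
La1, La2 = random_laplacian(n1), random_laplacian(n2)  # factors of G_box^(1)
Lb1, Lb2 = random_laplacian(n1), random_laplacian(n2)  # factors of G_box^(2)

# Random square orthogonal P1, P2 via QR.
P1, _ = np.linalg.qr(rng.standard_normal((n1, n1)))
P2, _ = np.linalg.qr(rng.standard_normal((n2, n2)))

d1 = np.linalg.norm(P1 @ La1 - Lb1 @ P1, 'fro')
d2 = np.linalg.norm(P2 @ La2 - Lb2 @ P2, 'fro')

# Box-product Laplacians via the Kronecker sum, and P = P1 (x) P2.
L_box_a = np.kron(La1, np.eye(n2)) + np.kron(np.eye(n1), La2)
L_box_b = np.kron(Lb1, np.eye(n2)) + np.kron(np.eye(n1), Lb2)
P = np.kron(P1, P2)
d_box = np.linalg.norm(P @ L_box_a - L_box_b @ P, 'fro')

# Theorem bound: sqrt(n2) * d1 + sqrt(n1) * d2.
bound = np.sqrt(n2) * d1 + np.sqrt(n1) * d2
```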

We have experimentally found that many families of graphs do not require scaling between the two diffusion processes: the optimal $(\alpha, P)$ pair has $\alpha = 1$. In particular, prolongations between path (cycle) graphs of size $n$ and size $2n$ always have $\alpha_\text{optimal} = 1$, since the spectrum of the smaller graph is a subset of that of the larger; therefore, there is a matching solution of cost 0, which by the argument above can be mapped to a graph-space $P$ with objective function value 0 (we prove this in Section 3.7). In this case, the two terms of the upper bound are totally decoupled and may each be optimized separately (whereas in the form given above, they both depend on $\alpha$).

## Upper Bounds for Graph Products (Exponential Version)

We now consider the problem of computing the distance between two graph box products, i.e. $D \left(\mathbf{G}_1, \mathbf{G}_2 \right)$ where \begin{aligned} \mathbf{G}_1 = G_1^{(1)} \Box G_1^{(2)} \quad \text{and} \quad \mathbf{G}_2 = G_2^{(1)} \Box G_2^{(2)}\end{aligned} and $\begin{split} P^{(1)} = \arg \inf_{P_c \left| \mathcal{C} (P_c) \right. } D \left( G_1^{(1)}, G_2^{(1)} | t_c, \alpha_c, P_c \right) \\ P^{(2)}= \arg \inf_{P_c \left| \mathcal{C} (P_c) \right. } D \left( G_1^{(2)}, G_2^{(2)} | t_c, \alpha_c, P_c \right) \end{split}$

are known for some $t_c, \alpha_c$.

[thm:sq_grid_lim_lem] Let $\mathbf{G}_1$ and $\mathbf{G}_2$ be graph box products as described above, and for a graph $G$ let $L(G)$ be its Laplacian. For fixed $t=t_c$, $\alpha = \alpha_c$, $P^{(i)}$ as given above, for any $\lambda \in [0,1]$, we have \begin{aligned} &\inf_{P_c \left| \mathcal{C}(P_c) \right. } D \left(\mathbf{G}_1, \mathbf{G}_2 \right) \quad \leq \quad \\ &\qquad \lambda \left({\left| \left| e^{\frac{t_c}{\alpha_c} L(G^{(2)}_1)} \right| \right|}_F + {\left| \left| e^{t_c \alpha_c L(G^{(2)}_2)} \right| \right|}_F \right) D \left( G_1^{(1)}, G_2^{(1)} | P^{(1)} \right) \\ &\qquad \qquad + \left( 1 - \lambda \right) \left({\left| \left| e^{\frac{t_c}{\alpha_c} L(G^{(1)}_1)} \right| \right|}_F + {\left| \left| e^{t_c \alpha_c L(G^{(1)}_2)} \right| \right|}_F \right) D \left( G_1^{(2)}, G_2^{(2)} | P^{(2)} \right) \end{aligned} where all distances are evaluated at $t=t_c$, $\alpha = \alpha_c$, but we have omitted that notation for simplicity.

For graph products $\mathbf{G}_i$, we have \begin{aligned} L(\mathbf{G}_i) &= L(G^{(1)}_i) \oplus L(G^{(2)}_i) \\ &= \left( L(G^{(1)}_i) \otimes I_{\left| L(G^{(2)}_i) \right|} \right) + \left( I_{\left| L(G^{(1)}_i) \right|} \otimes L(G^{(2)}_i) \right) \end{aligned} (this fact can be easily verified from the formula for the adjacency matrix of a graph box product, given in the definition in Section 1.4.2), and so \begin{aligned} \exp \left[{c L(\mathbf{G}_i)}\right] &= \exp \left[c {\left( L(G^{(1)}_i) \otimes I_{\left| L(G^{(2)}_i) \right|} \right) + \left( I_{\left| L(G^{(1)}_i) \right|} \otimes L(G^{(2)}_i) \right)}\right].\\ \end{aligned} Because $A \otimes I_{|B|}$ and $I_{|A|} \otimes B$ commute for any $A$ and $B$, \begin{aligned} \exp \left[{c L(\mathbf{G}_i)}\right] &= \exp \left[ c {\left( L(G^{(1)}_i) \otimes I_{\left| L(G^{(2)}_i) \right|} \right)}\right] \exp \left[ c {\left( I_{\left| L(G^{(1)}_i) \right|} \otimes L(G^{(2)}_i) \right)}\right] \\ &= \left( \exp \left[ c { L(G^{(1)}_i) } \right] \otimes I_{\left| L(G^{(2)}_i) \right|} \right) \left(I_{\left| L(G^{(1)}_i) \right|} \otimes \exp \left[ c { L(G^{(2)}_i) }\right]\right) \\ &= \exp \left[ c { L(G^{(1)}_i) } \right] \otimes \exp \left[ c { L(G^{(2)}_i) }\right]\ \end{aligned} We will make the following abbreviations: $\begin{array}{ccc} \mathbf{E}_1 = e^{\frac{t_c}{\alpha_c} L(\mathbf{G}_1)} & E^{(1)}_1 = e^{\frac{t_c}{\alpha_c} L(G^{(1)}_1)} & E^{(2)}_1 = e^{\frac{t_c}{\alpha_c} L(G^{(2)}_1)} \\ \mathbf{E}_2 = e^{t_c \alpha_c L(\mathbf{G}_2)} & E^{(1)}_2 = e^{t_c \alpha_c L(G^{(1)}_2)} & E^{(2)}_2 = e^{t_c \alpha_c L(G^{(2)}_2)} \end{array}$ Then, \begin{aligned} \inf_{P | \mathcal{C}(P)} D \left(\mathbf{G}_1, \mathbf{G}_2 \right) &\leq D \left(\mathbf{G}_1, \mathbf{G}_2 | P^{(1)} \otimes P^{(2)} \right) \\ &= {\left| \left| \left( P^{(1)} \otimes P^{(2)} \right) \mathbf{E}_1 - \mathbf{E}_2 \left( P^{(1)} \otimes P^{(2)} \right) \right| \right|}_F \nonumber\end{aligned} \begin{aligned} &= 
{\left| \left| \left( P^{(1)} \otimes P^{(2)} \right) \left(E^{(1)}_1 \otimes E^{(2)}_1 \right) - \left(E^{(1)}_2 \otimes E^{(2)}_2 \right) \left( P^{(1)} \otimes P^{(2)} \right) \right| \right|}_F \nonumber \\ &= {\left| \left| \left( P^{(1)} E^{(1)}_1 \otimes P^{(2)} E^{(2)}_1 \right) - \left(E^{(1)}_2 P^{(1)} \otimes E^{(2)}_2 P^{(2)} \right) \right| \right|}_F \nonumber \\ &= {\left| \left| \left( P^{(1)} E^{(1)}_1 \otimes P^{(2)} E^{(2)}_1 \right) - \left( P^{(1)} E^{(1)}_1 \otimes E^{(2)}_2 P^{(2)} \right) \right. \right.} \label{eqn:cross_term_eqn} \\* & \quad + {\left. \left. \left( P^{(1)} E^{(1)}_1 \otimes E^{(2)}_2 P^{(2)} \right) - \left(E^{(1)}_2 P^{(1)} \otimes E^{(2)}_2 P^{(2)} \right) \right| \right|}_F \nonumber \\ & \leq {\left| \left| \left( P^{(1)} E^{(1)}_1 \otimes P^{(2)} E^{(2)}_1 \right) - \left( P^{(1)} E^{(1)}_1 \otimes E^{(2)}_2 P^{(2)} \right) \right| \right|}_F \nonumber \\ & \quad + {\left| \left| \left( P^{(1)} E^{(1)}_1 \otimes E^{(2)}_2 P^{(2)} \right) - \left(E^{(1)}_2 P^{(1)} \otimes E^{(2)}_2 P^{(2)} \right) \right| \right|}_F \nonumber \\ &= {\left| \left| P^{(1)} E^{(1)}_1 \otimes \left( P^{(2)} E^{(2)}_1 - E^{(2)}_2 P^{(2)} \right) \right| \right|}_F \\ & \quad \quad + \quad {\left| \left| \left( P^{(1)} E^{(1)}_1 - E^{(1)}_2 P^{(1)} \right) \otimes E^{(2)}_2 P^{(2)} \right| \right|}_F \nonumber \\ &= {\left| \left| P^{(1)} E^{(1)}_1 \right| \right|}_F {\left| \left| P^{(2)} E^{(2)}_1 - E^{(2)}_2 P^{(2)} \right| \right|}_F \\ & \quad \quad + \quad {\left| \left| P^{(1)} E^{(1)}_1 - E^{(1)}_2 P^{(1)} \right| \right|}_F {\left| \left| E^{(2)}_2 P^{(2)} \right| \right|}_F . 
\nonumber \end{aligned} By Lemma [lem:p_lem_1], $\begin{split} \inf_{P | \mathcal{C}(P)} D \left(\mathbf{G}_1, \mathbf{G}_2 \right) &\leq {\left| \left| E^{(1)}_1 \right| \right|}_F {\left| \left| P^{(2)} E^{(2)}_1 - E^{(2)}_2 P^{(2)} \right| \right|}_F \\ & \quad \quad + \quad {\left| \left| P^{(1)} E^{(1)}_1 - E^{(1)}_2 P^{(1)} \right| \right|}_F {\left| \left| E^{(2)}_2 \right| \right|}_F . \end{split}$ If we instead use $E^{(1)}_2 P^{(1)} \otimes P^{(2)} E^{(2)}_1$ as the cross term in Equation \eqref{eqn:cross_term_eqn}, we have \begin{aligned} \inf_{P | \mathcal{C}(P)} D \left(\mathbf{G}_1, \mathbf{G}_2 \right) \leq {\left| \left| E^{(1)}_2 \right| \right|}_F {\left| \left| P^{(2)} E^{(2)}_1 - E^{(2)}_2 P^{(2)} \right| \right|}_F \\ \quad \quad + \quad {\left| \left| P^{(1)} E^{(1)}_1 - E^{(1)}_2 P^{(1)} \right| \right|}_F {\left| \left| E^{(2)}_1 \right| \right|}_F . \nonumber\end{aligned} A linear combination of these two bounds gives the desired bound.

This has the additional consequence that \begin{aligned} &\inf_{P_c \left| \mathcal{C}(P_c) \right.} D \left(\mathbf{G}_1, \mathbf{G}_2 \right) \quad \leq \\ &\min \left[ \left({\left| \left| e^{\frac{t_c}{\alpha_c} L(G^{(2)}_1)} \right| \right|}_F + {\left| \left| e^{t_c \alpha_c L(G^{(2)}_2)} \right| \right|}_F \right) D \left( G_1^{(1)}, G_2^{(1)} | P^{(1)} \right), \right. \\ & \quad \quad \left. \left({\left| \left| e^{\frac{t_c}{\alpha_c} L(G^{(1)}_1)} \right| \right|}_F + {\left| \left| e^{t_c \alpha_c L(G^{(1)}_2)} \right| \right|}_F \right) D \left( G_1^{(2)}, G_2^{(2)} | P^{(2)} \right) \right] . \end{aligned} Additionally, if \begin{aligned} E^{(1)}_i = E^{(2)}_i \text{ for } i \in \{1,2\} \quad \text{and} \quad P^{(1)} = P^{(2)},\end{aligned} this reduces further to \begin{aligned} D \left(\mathbf{G}_1, \mathbf{G}_2 | P^{(1)} \otimes P^{(1)} \right) &\leq \min \left( {\left| \left| E^{(1)}_1 \right| \right|}_F , {\left| \left| E^{(1)}_2 \right| \right|}_F \right) {\left| \left| P^{(1)} E^{(1)}_1 - E^{(1)}_2 P^{(1)} \right| \right|}_F \end{aligned} and so \begin{aligned} & D \left( \left. G_1^{(1)} \Box G_1^{(1)}, G_2^{(1)} \Box G_2^{(1)} \right| t_c, \alpha_c \right) \label{eqn:graph_product_special_case} \\ &\null \quad \quad \quad \leq \min \left( {\left| \left| e^{\frac{t_c}{\alpha_c} L(G^{(1)}_1)} \right| \right|}_F , {\left| \left| e^{t_c \alpha_c L(G^{(1)}_2)} \right| \right|}_F \right) D \left( G_1^{(1)}, G_2^{(1)} \left. \right| t_c, \alpha_c \right) . \nonumber \end{aligned} An example of such a graph sequence is the sequence of two-dimensional square grids, each of which is the box product of two identical one-dimensional grids, i.e., path graphs: $\text{Sq}_n = \text{Pa}_n \Box \text{Pa}_n$.
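The Kronecker-sum factorization underlying this bound is easy to check numerically. The following sketch (a minimal check with NumPy/SciPy, assuming the combinatorial Laplacian; the helper functions `path_laplacian` and `cycle_laplacian` are ours, not from the text) verifies that $e^{c L(G_1 \Box G_2)} = e^{c L(G_1)} \otimes e^{c L(G_2)}$ for a small box product:

```python
import numpy as np
from scipy.linalg import expm

def path_laplacian(n):
    """Combinatorial Laplacian of the path graph Pa_n."""
    A = np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)
    return np.diag(A.sum(axis=1)) - A

def cycle_laplacian(n):
    """Combinatorial Laplacian of the cycle graph C_n."""
    A = np.zeros((n, n))
    for i in range(n):
        A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0
    return np.diag(A.sum(axis=1)) - A

L1, L2 = path_laplacian(4), cycle_laplacian(5)
c = 0.3

# Laplacian of the box product is the Kronecker sum of the factor Laplacians.
L_box = np.kron(L1, np.eye(5)) + np.kron(np.eye(4), L2)

# The two Kronecker-sum terms commute, so the matrix exponential factorizes.
lhs = expm(c * L_box)
rhs = np.kron(expm(c * L1), expm(c * L2))
assert np.allclose(lhs, rhs)
```

The check succeeds precisely because $L_1 \otimes I$ and $I \otimes L_2$ commute; for non-commuting summands the exponential would not factorize this way.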

## Existence of Zero-Error $P$ for Cycle Graphs

[theorem:spectralzero] In this section, we give an example of a closed-form prolongation matrix which achieves zero error when prolonging between cycle graphs. Let $G^{(1)}$ and $G^{(2)}$ be graphs with spectra $\lambda^{(1)}_i$ and $\lambda^{(2)}_j$, respectively, with $n_1 = \left| G^{(1)} \right| \leq n_2 = \left| G^{(2)} \right|$. Suppose that every eigenvalue $r$ of $G^{(1)}$ with multiplicity $k$ is also an eigenvalue of $G^{(2)}$ with multiplicity $\geq k$. Then there is a zero-cost eigenvalue matching $M$ between $G^{(1)}$ and $G^{(2)}$.

Let $(i_1, j_1), (i_2, j_2), \ldots, (i_{n_1}, j_{n_1})$ be a list of pairs of indices such that the following hold:

• All of the $i_k$ are unique.

• All of the $j_k$ are unique.

• For any pair $(i_k, j_k)$, $\lambda^{(2)}_{i_k} = \lambda^{(1)}_{j_k}$.

Define $P$ as follows: $P_{ij} = \begin{cases} 1 & (i,j) \text{ appears in the above list.} \\ 0 & \text{else.} \\ \end{cases}$ $P$ is clearly orthogonal, in the sense that $P^T P = I$, since it has exactly one $1$ in each column, at most one $1$ in each row, and zeros elsewhere ($P$ is a permutation matrix for $n_1 = n_2$ and a subpermutation matrix otherwise). Furthermore, every entry $P_{ij} \left( \lambda^{(1)}_j - \lambda^{(2)}_i \right)$ of $P \Lambda^{(1)} - \Lambda^{(2)} P$ vanishes, since $P_{ij} = 1$ only when $\lambda^{(1)}_j = \lambda^{(2)}_i$, and therefore \begin{aligned} \label{eqn:p_zero_cost} \left| \left| P \Lambda^{(1)} - \Lambda^{(2)} P \right| \right|_F = 0 \end{aligned}
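This construction is straightforward to implement. In the sketch below (a minimal illustration; the helper `matching_to_P` and the small example spectra are ours, not from the text), we build $P$ from a hypothetical zero-cost matching list and confirm Equation [eqn:p_zero_cost]:

```python
import numpy as np

def matching_to_P(pairs, n1, n2):
    """Build the (n2 x n1) subpermutation matrix P with P[i, j] = 1
    exactly when the pair (i, j) appears in the matching list."""
    P = np.zeros((n2, n1))
    for i, j in pairs:
        P[i, j] = 1.0
    return P

# Hypothetical spectra sharing the eigenvalues 0 and 2:
lam1 = np.array([0.0, 2.0])            # spectrum of the smaller graph
lam2 = np.array([0.0, 1.0, 2.0, 2.0])  # spectrum of the larger graph
pairs = [(0, 0), (2, 1)]               # unique rows, unique columns, equal eigenvalues

P = matching_to_P(pairs, n1=2, n2=4)
assert np.allclose(P.T @ P, np.eye(2))  # subpermutation: P^T P = I

# Every entry of P @ diag(lam1) - diag(lam2) @ P vanishes, so the cost is 0.
cost = np.linalg.norm(P @ np.diag(lam1) - np.diag(lam2) @ P, "fro")
assert np.isclose(cost, 0.0)
```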

For any $n$, there exist zero-error matchings between cycle graphs $C_n$ and $C_{2n}$.

The spectrum of $C_n$ is given by the formula (see (Hogben 2006), Section 39.3): $\lambda_j(C_n) = 2 \cos \left(\frac{2 \pi j}{n}\right) \qquad \text{for} \qquad j = 0, 1, \ldots, n-1.$ Thus $\lambda(C_n)$ and $\lambda(C_{2n})$ clearly satisfy the conditions of Theorem [theorem:spectralzero] above. In particular, the matrix $P_{ij} = \begin{cases} 1 & \text{if} \quad i = 2j \\ 0 & \text{else.} \\ \end{cases}$ has zero cost, as in Equation [eqn:p_zero_cost].
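To see this concretely, the following sketch (a minimal numerical check using NumPy; the helper name `cycle_spectrum` is ours) builds the stated $P$ for $n = 6$ and confirms that the matching cost vanishes:

```python
import numpy as np

def cycle_spectrum(n):
    """Eigenvalues 2 cos(2 pi j / n), j = 0, ..., n - 1."""
    return 2.0 * np.cos(2.0 * np.pi * np.arange(n) / n)

n = 6
lam1 = cycle_spectrum(n)        # spectrum of C_n
lam2 = cycle_spectrum(2 * n)    # spectrum of C_{2n}

# P[i, j] = 1 iff i = 2j: eigenvalue j of C_n is matched to eigenvalue 2j
# of C_{2n}, and 2 cos(2 pi (2j) / (2n)) = 2 cos(2 pi j / n).
P = np.zeros((2 * n, n))
P[2 * np.arange(n), np.arange(n)] = 1.0

cost = np.linalg.norm(P @ np.diag(lam1) - np.diag(lam2) @ P, "fro")
assert np.isclose(cost, 0.0)
```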

## Spectral Version of Decoupling for the Diffusion Term of Graph Product Prolongations

In this section we derive an eigenspace version of the bound derived in Section 3.5.

The eigenspace version of the diffusion term of the objective function of a graph product prolongation also decouples into two smaller prolongation problems.

As in Section 3.1, we adopt the following notation: \begin{aligned} \null && L_i^{(j)} = L(G_i^{(j)}) = {U_i^{(j)}}^T \Lambda_i^{(j)} U_i^{(j)} && \text{for } i \in \{1,2\} \\ \null && L_\Box^{(j)} = L(G_\Box^{(j)}) = {U_\Box^{(j)}}^T \Lambda_\Box^{(j)} U_\Box^{(j)} && \null\end{aligned}

For the graph box product, we have \begin{aligned} L_\Box^{(j)} &= L(G_1^{(j)} \Box G_2^{(j)}) \\ &= L(G_1^{(j)}) \oplus L(G_2^{(j)}) \\ &= (L_1^{(j)} \otimes I_2^{(j)}) + (I_1^{(j)} \otimes L_2^{(j)}) \\\end{aligned} where $\oplus$ is the Kronecker sum of matrices, as previously defined. Additionally, if $U_1^{(j)}$ diagonalizes $L_1^{(j)}$ and $U_2^{(j)}$ diagonalizes $L_2^{(j)}$, then $U_\Box^{(j)} \equiv U_1^{(j)} \otimes U_2^{(j)}$ diagonalizes $L_\Box^{(j)}$, since \begin{aligned} U_\Box^{(j)} L_\Box^{(j)} {U_\Box^{(j)}}^T &= (U_1^{(j)} \otimes U_2^{(j)})(L_1^{(j)} \otimes I_2^{(j)} + I_1^{(j)} \otimes L_2^{(j)}){(U_1^{(j)} \otimes U_2^{(j)})}^T \\ &= (U_1^{(j)} \otimes U_2^{(j)})(L_1^{(j)} \otimes I_2^{(j)} + I_1^{(j)} \otimes L_2^{(j)})({U_1^{(j)}}^T \otimes {U_2^{(j)}}^T) \\ &= (U_1^{(j)} \otimes U_2^{(j)})(L_1^{(j)} \otimes I_2^{(j)})({U_1^{(j)}}^T \otimes {U_2^{(j)}}^T) \\ & \qquad + (U_1^{(j)} \otimes U_2^{(j)})(I_1^{(j)} \otimes L_2^{(j)})({U_1^{(j)}}^T \otimes {U_2^{(j)}}^T) \\ &= (U_1^{(j)} L_1^{(j)} {U_1^{(j)}}^T \otimes U_2^{(j)} I_2^{(j)} {U_2^{(j)}}^T) + (U_1^{(j)} I_1^{(j)} {U_1^{(j)}}^T \otimes U_2^{(j)} L_2^{(j)} {U_2^{(j)}}^T) \\ &= (\Lambda_1^{(j)} \otimes I_2^{(j)}) + (I_1^{(j)} \otimes \Lambda_2^{(j)})\end{aligned} The Kronecker product of two diagonal matrices is itself diagonal, and so this is a sum of two diagonal matrices, which is again diagonal. Furthermore, the Kronecker product of two rotation matrices is itself a rotation matrix, and therefore $U_\Box^{(j)}$ is a rotation matrix.
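This diagonalization can be verified numerically. The sketch below (a minimal check with NumPy; the helper `random_laplacian` is ours, and we use `eigh`, which returns $L = V \Lambda V^T$, so the section's $U$ corresponds to $V^T$) confirms that $U_1 \otimes U_2$ diagonalizes the Kronecker sum:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_laplacian(n):
    """Laplacian of a random weighted graph (symmetric, zero row sums)."""
    W = np.triu(rng.random((n, n)), 1)
    W = W + W.T
    return np.diag(W.sum(axis=1)) - W

L1, L2 = random_laplacian(3), random_laplacian(4)
L_box = np.kron(L1, np.eye(4)) + np.kron(np.eye(3), L2)

# eigh returns L = V diag(w) V^T; in the section's convention U = V^T,
# so that U L U^T is diagonal.
w1, V1 = np.linalg.eigh(L1)
w2, V2 = np.linalg.eigh(L2)
U_box = np.kron(V1.T, V2.T)  # equals (V1 kron V2)^T

D = U_box @ L_box @ U_box.T
# The result is diagonal, with the Kronecker sum of the two spectra on
# the diagonal (in the eigh orderings).
expected = np.diag(np.kron(w1, np.ones(4)) + np.kron(np.ones(3), w2))
assert np.allclose(D, expected)
```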

Applying this diagonalization to map to eigenspace in the same manner as Subsection 3.1, we have \begin{aligned} D_{P,\alpha} \left( G_\Box^{(1)}, G_\Box^{(2)} \right) &= \left| \left| \frac{1}{\sqrt{\alpha}} P L_\Box^{(1)} - \sqrt{\alpha} L_\Box^{(2)} P \right| \right|_F\\ &= \left| \left| \frac{1}{\sqrt{\alpha}} \tilde{P} \Lambda_\Box^{(1)} - \sqrt{\alpha} \Lambda_\Box^{(2)} \tilde{P} \right| \right|_F \\ &= \left| \left| \frac{1}{\sqrt{\alpha}} \tilde{P} \left( \left( \Lambda_1^{(1)} \otimes I_2^{(1)} \right) + \left(I_1^{(1)} \otimes \Lambda_2^{(1)}\right) \right) \right. \right. \\ & \left. \left. \qquad - \sqrt{\alpha} \left( \left( \Lambda_1^{(2)} \otimes I_2^{(2)}\right) + \left(I_1^{(2)} \otimes \Lambda_2^{(2)} \right) \right) \tilde{P} \right| \right|_F \\ &= \left| \left| \left( \frac{1}{\sqrt{\alpha}} \tilde{P} \left(\Lambda_1^{(1)} \otimes I_2^{(1)}\right) - \sqrt{\alpha}\left(\Lambda_1^{(2)} \otimes I_2^{(2)}\right) \tilde{P} \right) \right. \right. \\ & \left. \left. \qquad + \left( \frac{1}{\sqrt{\alpha}} \tilde{P} \left( I_1^{(1)} \otimes \Lambda_2^{(1)} \right) - \sqrt{\alpha} \left( I_1^{(2)} \otimes \Lambda_2^{(2)} \right) \tilde{P} \right) \right| \right|_F \\ \intertext{Now we make the assumption that $P = P_1 \otimes P_2$ and therefore $\tilde{P} = \tilde{P}_1 \otimes \tilde{P}_2$ for some $\tilde{P}_1$ and $\tilde{P}_2$:} &= \left| \left| \left( \frac{1}{\sqrt{\alpha}} \left( \tilde{P}_1 \otimes \tilde{P}_2 \right)\left(\Lambda_1^{(1)} \otimes I_2^{(1)}\right) - \sqrt{\alpha}\left(\Lambda_1^{(2)} \otimes I_2^{(2)}\right) \left( \tilde{P}_1 \otimes \tilde{P}_2 \right) \right) \right. \right. \\ & \left. \left.
\qquad + \left( \frac{1}{\sqrt{\alpha}} \left( \tilde{P}_1 \otimes \tilde{P}_2 \right) \left( I_1^{(1)} \otimes \Lambda_2^{(1)} \right) - \sqrt{\alpha} \left( I_1^{(2)} \otimes \Lambda_2^{(2)} \right) \left( \tilde{P}_1 \otimes \tilde{P}_2 \right) \right) \right| \right|_F \\ &= \left| \left| \left( \frac{1}{\sqrt{\alpha}} \left( \tilde{P}_1 \Lambda_1^{(1)} \otimes \tilde{P}_2 \right) - \sqrt{\alpha}\left(\Lambda_1^{(2)} \tilde{P}_1 \otimes \tilde{P}_2 \right) \right) \right. \right. \\ & \left. \left. \qquad + \left( \frac{1}{\sqrt{\alpha}} \left( \tilde{P}_1 \otimes \tilde{P}_2 \Lambda_2^{(1)} \right) - \sqrt{\alpha} \left( \tilde{P}_1 \otimes \Lambda_2^{(2)} \tilde{P}_2 \right) \right) \right| \right|_F \\ &= \left| \left| \left( \left( \frac{1}{\sqrt{\alpha}} \tilde{P}_1 \Lambda_1^{(1)} - \sqrt{\alpha} \Lambda_1^{(2)} \tilde{P}_1 \right) \otimes \tilde{P}_2 \right) \right. \right. \\ & \left. \left. \qquad + \left( \tilde{P}_1 \otimes \left( \frac{1}{\sqrt{\alpha}} \tilde{P}_2 \Lambda_2^{(1)} - \sqrt{\alpha} \Lambda_2^{(2)} \tilde{P}_2 \right) \right) \right| \right|_F \\ \intertext{Since $\left| \left| A + B \right| \right|_F \leq \left| \left| A \right| \right|_F + \left| \left| B \right| \right|_F$,} &\leq \left| \left| \left( \left( \frac{1}{\sqrt{\alpha}} \tilde{P}_1 \Lambda_1^{(1)} - \sqrt{\alpha} \Lambda_1^{(2)} \tilde{P}_1 \right) \otimes \tilde{P}_2 \right) \right| \right|_F \\ & \qquad + \left| \left| \left( \tilde{P}_1 \otimes \left( \frac{1}{\sqrt{\alpha}} \tilde{P}_2 \Lambda_2^{(1)} - \sqrt{\alpha} \Lambda_2^{(2)} \tilde{P}_2 \right) \right) \right| \right|_F \\ &= \left| \left| \frac{1}{\sqrt{\alpha}} \tilde{P}_1 \Lambda_1^{(1)} - \sqrt{\alpha} \Lambda_1^{(2)} \tilde{P}_1 \right| \right|_F \left| \left| \tilde{P}_2 \right| \right|_F \\ & \qquad + \left| \left| \tilde{P}_1 \right| \right|_F \left| \left| \frac{1}{\sqrt{\alpha}} \tilde{P}_2 \Lambda_2^{(1)} - \sqrt{\alpha} \Lambda_2^{(2)} \tilde{P}_2 \right| \right|_F \\\end{aligned} From Theorem [thm:decompthm] of the main manuscript, we know that for \begin{aligned}
\mathbf{G}_\Box^{(1)} = G_1^{(1)} \Box G_2^{(1)} \qquad & \text{and} \qquad \mathbf{G}_\Box^{(2)} = G_1^{(2)} \Box G_2^{(2)} ,\end{aligned} and assuming $P = P_1 \otimes P_2$, \begin{aligned} D_{P,\alpha} \left( G_\Box^{(1)}, G_\Box^{(2)} \right) &= \left| \left| \frac{1}{\sqrt{\alpha}} P L_\Box^{(1)} - \sqrt{\alpha} L_\Box^{(2)} P \right| \right|_F\\ &\leq \sqrt{n_2^{(1)}} D_{P_1,\alpha} \left( G_1^{(1)}, G_1^{(2)} \right) + \sqrt{n_1^{(1)}} D_{P_2,\alpha} \left( G_2^{(1)}, G_2^{(2)} \right) \\ &= \sqrt{n_2^{(1)}} \left| \left| \frac{1}{\sqrt{\alpha}} P_1 L_1^{(1)} - \sqrt{\alpha} L_1^{(2)} P_1 \right| \right|_F \\ & \qquad + \sqrt{n_1^{(1)}} \left| \left| \frac{1}{\sqrt{\alpha}} P_2 L_2^{(1)} - \sqrt{\alpha} L_2^{(2)} P_2 \right| \right|_F . \\\end{aligned} Trivially, we can rewrite each of these Frobenius norms as its spectral version, as in Equation [eqn:spectral_form]. Thus, \begin{aligned} \left| \left| \frac{1}{\sqrt{\alpha}} P L_\Box^{(1)} - \sqrt{\alpha} L_\Box^{(2)} P \right| \right|_F &= \left| \left| \frac{1}{\sqrt{\alpha}} \tilde{P} \Lambda_\Box^{(1)} - \sqrt{\alpha} \Lambda_\Box^{(2)} \tilde{P} \right| \right|_F \\ &\leq \sqrt{n_2^{(1)}} \left| \left| \frac{1}{\sqrt{\alpha}} \tilde{P}_1 \Lambda_1^{(1)} - \sqrt{\alpha} \Lambda_1^{(2)} \tilde{P}_1 \right| \right|_F \\ & \qquad + \sqrt{n_1^{(1)}} \left| \left| \frac{1}{\sqrt{\alpha}} \tilde{P}_2 \Lambda_2^{(1)} - \sqrt{\alpha} \Lambda_2^{(2)} \tilde{P}_2 \right| \right|_F , \\\end{aligned} which is a weighted sum of the objectives of the two spectral prolongation problems for the two factor lineages. We have thus also decoupled this eigenvalue-matching version of the objective function into two separate prolongation problems.
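The decoupled bound can be spot-checked numerically. The sketch below (a minimal check with NumPy on small random graphs; the helpers `random_laplacian`, `random_orthogonal`, and `D_obj` are ours) verifies that the product-graph objective at $P = P_1 \otimes P_2$ is bounded by the weighted sum of the factor objectives, using $||\tilde{P}_1||_F = \sqrt{n_1}$ and $||\tilde{P}_2||_F = \sqrt{n_2}$ for orthogonal factors:

```python
import numpy as np

rng = np.random.default_rng(1)

def random_laplacian(n):
    """Laplacian of a random weighted graph (symmetric, zero row sums)."""
    W = np.triu(rng.random((n, n)), 1)
    W = W + W.T
    return np.diag(W.sum(axis=1)) - W

def random_orthogonal(n):
    return np.linalg.qr(rng.standard_normal((n, n)))[0]

def D_obj(P, LA, LB, alpha):
    """Diffusion-term objective || P L_A / sqrt(a) - sqrt(a) L_B P ||_F."""
    return np.linalg.norm(P @ LA / np.sqrt(alpha) - np.sqrt(alpha) * LB @ P, "fro")

n1, n2 = 3, 4          # factor sizes (kept equal across the two lineages)
alpha = 1.7
La1, La2 = random_laplacian(n1), random_laplacian(n1)  # factor lineage 1
Lb1, Lb2 = random_laplacian(n2), random_laplacian(n2)  # factor lineage 2

L_box1 = np.kron(La1, np.eye(n2)) + np.kron(np.eye(n1), Lb1)
L_box2 = np.kron(La2, np.eye(n2)) + np.kron(np.eye(n1), Lb2)

P1, P2 = random_orthogonal(n1), random_orthogonal(n2)
lhs = D_obj(np.kron(P1, P2), L_box1, L_box2, alpha)
# Weighted sum of factor objectives, as in the decoupling bound.
rhs = np.sqrt(n2) * D_obj(P1, La1, La2, alpha) + np.sqrt(n1) * D_obj(P2, Lb1, Lb2, alpha)
assert lhs <= rhs + 1e-9
```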

Finally, we show that if $\tilde{P}_1$ and $\tilde{P}_2$ are solutions to the eigenvalue matching problems $m^*(L_1^{(1)}, L_1^{(2)})$ and $m^*(L_2^{(1)}, L_2^{(2)})$, respectively, then $\tilde{P} = \tilde{P}_1 \otimes \tilde{P}_2$ is a valid, but not necessarily optimal, solution to the eigenvalue matching problem $m^*(L_\Box^{(1)}, L_\Box^{(2)})$. By valid we mean that $\tilde{P}$ satisfies the constraints given in the definition of matching problems in Section [subsub:definitions].

This fact follows directly from the constraints on $\tilde{P}_1$ and $\tilde{P}_2$. A matrix $M$ is a valid matching matrix iff its entries are in $\{ 0, 1\}$ and it is orthogonal (this is an equivalent expression of the constraints given in Section [subsub:definitions]). If $\tilde{P}_1$ and $\tilde{P}_2$ are both orthogonal and $\{0,1\}$-valued, then we observe the following facts about their Kronecker product $\tilde{P}_1 \otimes \tilde{P}_2$:

• it is also $\{0,1\}$-valued, since any of its entries is the product of an entry of $\tilde{P}_1$ and one of $\tilde{P}_2$.

• it is orthogonal, since $A \otimes B$ is orthogonal iff $A^T A = \zeta I$ and $B^T B = \frac{1}{\zeta} I$ for some $\zeta > 0$ (Laub 2005). $\tilde{P}_1$ and $\tilde{P}_2$ satisfy these conditions with $\zeta = 1$.

So $\tilde{P}_1 \otimes \tilde{P}_2$ satisfies the constraints given for the eigenvalue matching problem $m^*(L_\Box^{(1)}, L_\Box^{(2)})$.
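Both closure properties are immediate to check in code. The sketch below (a minimal illustration; the two small matching matrices are hypothetical examples, not from the text) confirms that the Kronecker product of two matching matrices is again $\{0,1\}$-valued and orthogonal:

```python
import numpy as np

# Hypothetical matching matrices: a 3x2 subpermutation and a 2x2 permutation.
P1 = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
P2 = np.array([[0.0, 1.0], [1.0, 0.0]])

P = np.kron(P1, P2)

# Entries of the Kronecker product are products of entries, so stay in {0, 1} ...
assert set(np.unique(P)) <= {0.0, 1.0}
# ... and (P1 kron P2)^T (P1 kron P2) = (P1^T P1) kron (P2^T P2) = I.
assert np.allclose(P.T @ P, np.eye(P.shape[1]))
```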

### Distortion-penalized Distance

We can add a term to the graph diffusion distance which penalizes large distortions induced by $\alpha$, as follows: define \begin{aligned} D_\text{reg}(G_1, G_2) &= \sup_t \inf_{P|\mathcal{C}(P)} \inf_{\alpha > 0} \left\{ {\left| \left| P e^{\frac{t}{\alpha} L_1} - e^{t \alpha L_2 } P \right| \right|}_F + {\left| \left| e^{\frac{t}{\alpha} L_1} - e^{t L_1 } \right| \right|}_F \right. \nonumber \\ &\quad \quad \quad + \left. {\left| \left| e^{t L_2} P - e^{t \alpha L_2 } P \right| \right|}_F \right\} \nonumber\end{aligned} We can show analytically that this distance satisfies the triangle inequality:

$D_\text{reg}$ satisfies the triangle inequality.

For graphs $G_1, G_2, G_3$ with Laplacians $L_1, L_2, L_3$, and for any fixed $t \geq 0$, we have: \begin{aligned} D_\text{reg}(G_1, G_3 | t) &= \inf_{P|\mathcal{C}(P)} \inf_{\alpha > 0} \left\{ {\left| \left| P e^{\frac{t}{\alpha} L_1} - e^{t \alpha L_3 } P \right| \right|}_F + {\left| \left| e^{\frac{t}{\alpha} L_1} - e^{t L_1 } \right| \right|}_F \right. \nonumber \\* &\quad \quad \quad + \left. {\left| \left| e^{t L_3} P - e^{t \alpha L_3 } P \right| \right|}_F \right\} \nonumber \\ &\leq \inf_{P|\mathcal{C}(P)} \left\{ {\left| \left| P e^{t L_1} - e^{t L_3 } P \right| \right|}_F + {\left| \left| e^{t L_1} - e^{t L_1 } \right| \right|}_F \right. \nonumber \\ &\quad \quad \quad + \left. {\left| \left| e^{t L_3} P - e^{t L_3 } P \right| \right|}_F \right\} \nonumber \\ &= \inf_{P|\mathcal{C}(P)} {\left| \left| P e^{t L_1} - e^{t L_3 } P \right| \right|}_F , \nonumber \end{aligned} where the inequality follows by restricting to $\alpha = 1$ (the second and third terms then vanish). Suppose that \begin{aligned} \alpha_{32}, P_{32} &= \arg \inf_{\alpha > 0} \inf_{P | \mathcal{C}(P)} \left\{ {\left| \left| P e^{\frac{t}{\alpha} L_2} - e^{t \alpha L_3 } P \right| \right|}_F + {\left| \left| e^{\frac{t}{\alpha} L_2} - e^{t L_2 } \right| \right|}_F \right. \nonumber \\ &\quad \quad \quad + \left. {\left| \left| e^{t L_3} P - e^{t \alpha L_3 } P \right| \right|}_F \right\} \nonumber \\ \alpha_{21}, P_{21} &= \arg \inf_{\alpha > 0} \inf_{P | \mathcal{C}(P)} \left\{ {\left| \left| P e^{\frac{t}{\alpha} L_1} - e^{t \alpha L_2 } P \right| \right|}_F + {\left| \left| e^{\frac{t}{\alpha} L_1} - e^{t L_1 } \right| \right|}_F \right. \nonumber \\ &\quad \quad \quad + \left.
{\left| \left| e^{t L_2} P - e^{t \alpha L_2 } P \right| \right|}_F \right\} \nonumber \end{aligned} Then, \begin{aligned} \inf_{P|\mathcal{C}(P)} {\left| \left| P e^{t L_1} - e^{t L_3 } P \right| \right|}_F &\leq {\left| \left| P_{32} P_{21} e^{t L_1} - e^{t L_3 } P_{32} P_{21} \right| \right|}_F \\ \inf_{P|\mathcal{C}(P)} {\left| \left| P e^{t L_1} - e^{t L_3 } P \right| \right|}_F &\leq {\left| \left| P_{32} P_{21} e^{t L_1} - P_{32} P_{21} e^{\frac{t}{\alpha_{21}} L_1} \right. \right. } + P_{32} P_{21} e^{\frac{t}{\alpha_{21}} L_1} \\* & \quad - P_{32} e^{t \alpha_{21} L_2 } P_{21} + P_{32} e^{t \alpha_{21} L_2 } P_{21} - P_{32} e^{t L_2 } P_{21} \\* & \quad + P_{32} e^{t L_2 } P_{21} - P_{32} e^{\frac{t}{\alpha_{32}} L_2 } P_{21} + P_{32} e^{\frac{t}{\alpha_{32}} L_2 } P_{21} \\* & \quad - e^{t \alpha_{32} L_3 } P_{32} P_{21} {\left. \left. + e^{t \alpha_{32} L_3 } P_{32} P_{21} - e^{t L_3 } P_{32} P_{21} \right| \right|}_F \\ &\leq {\left| \left| P_{32} P_{21} e^{t L_1} - P_{32} P_{21} e^{\frac{t}{\alpha_{21}} L_1}\right| \right|}_F \\* & \quad + {\left| \left| P_{32} P_{21} e^{\frac{t}{\alpha_{21}} L_1} - P_{32} e^{t \alpha_{21} L_2 } P_{21} \right| \right|}_F \\* & \quad + {\left| \left| P_{32} e^{t \alpha_{21} L_2 } P_{21} -P_{32} e^{t L_2 } P_{21} \right| \right|}_F \\* & \quad + {\left| \left| P_{32} e^{t L_2 } P_{21} - P_{32} e^{\frac{t}{\alpha_{32}} L_2 } P_{21} \right| \right|}_F \\* & \quad + {\left| \left| P_{32} e^{\frac{t}{\alpha_{32}} L_2 } P_{21} - e^{t \alpha_{32} L_3 } P_{32} P_{21} \right| \right|}_F \\* & \quad + {\left| \left| e^{t \alpha_{32} L_3 } P_{32} P_{21} - e^{t L_3 } P_{32} P_{21} \right| \right|}_F \\ \intertext{by Lemma \ref{lem:p_lem_1},} \inf_{P|\mathcal{C}(P)} {\left| \left| P e^{t L_1} - e^{t L_3 } P \right| \right|}_F &\leq {\left| \left| e^{t L_1} - e^{\frac{t}{\alpha_{21}} L_1}\right| \right|}_F + {\left| \left| P_{21} e^{\frac{t}{\alpha_{21}} L_1} - e^{t \alpha_{21} L_2 } P_{21} \right| \right|}_F \\ & \quad + {\left| 
\left| e^{t \alpha_{21} L_2 } P_{21} - e^{t L_2 } P_{21} \right| \right|}_F \\ &+ {\left| \left| e^{t L_2 } - e^{\frac{t}{\alpha_{32}} L_2 } \right| \right|}_F + {\left| \left| P_{32} e^{\frac{t}{\alpha_{32}} L_2 } - e^{t \alpha_{32} L_3 } P_{32} \right| \right|}_F \\ & \quad + {\left| \left| e^{t \alpha_{32} L_3 } P_{32} - e^{t L_3 } P_{32} \right| \right|}_F \\ &= D_\text{reg}(G_1, G_2 | t) + D_\text{reg}(G_2, G_3 | t)\end{aligned} Since this is true for any fixed $t$, let \begin{aligned} t^* = \arg \sup_t D_\text{reg}(G_1, G_3 | t).\end{aligned} Then \begin{aligned} D_\text{reg}(G_1, G_3) &= \sup_t D_\text{reg}(G_1, G_3 | t) \\ &= D_\text{reg}(G_1, G_3 | t^*) \\ & \leq D_\text{reg}(G_1, G_2 | t^*) + D_\text{reg}(G_2, G_3 | t^*) \\ & \leq \sup_{t_{21}} D_\text{reg}(G_1, G_2 | t_{21}) + \sup_{t_{32}} D_\text{reg}(G_2, G_3 | t_{32}) \\ &= D_\text{reg}(G_1, G_2) + D_\text{reg}(G_2, G_3)\end{aligned}

We can construct a similar regularized version of the linear objective function: \begin{aligned} \tilde{D}_\text{reg}(G_1, G_2) &= {\left| \left| \frac{1}{\alpha} P L_1 - \alpha L_2 P \right| \right|}_F + {\left| \left| \frac{1}{\alpha} L_1 - L_1 \right| \right|}_F + {\left| \left| L_2 P - \alpha L_2 P \right| \right|}_F \end{aligned}
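As a quick numerical illustration of this linear regularized objective at fixed $P$ and $\alpha$ (a minimal sketch; the helper name `D_reg_linear`, the example graphs, and the reading of the third term as $||L_2 P - \alpha L_2 P||_F$ are ours), note that both penalty terms vanish at $\alpha = 1$:

```python
import numpy as np

def D_reg_linear(P, L1, L2, alpha):
    """Hypothetical helper: prolongation mismatch plus two terms penalizing
    the distortion that alpha induces on each Laplacian."""
    fro = lambda M: np.linalg.norm(M, "fro")
    return (fro(P @ L1 / alpha - alpha * L2 @ P)
            + fro(L1 / alpha - L1)
            + fro(L2 @ P - alpha * L2 @ P))

L1 = np.array([[1.0, -1.0], [-1.0, 1.0]])                                 # path Pa_2
L2 = np.array([[2.0, -1.0, -1.0], [-1.0, 2.0, -1.0], [-1.0, -1.0, 2.0]])  # cycle C_3
P = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])

# At alpha = 1 the two penalty terms are zero, leaving only the mismatch.
base = D_reg_linear(P, L1, L2, 1.0)
assert np.isclose(base, np.linalg.norm(P @ L1 - L2 @ P, "fro"))
```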

The additional terms included in $D_\text{reg}$ and $\tilde{D}_\text{reg}$ penalize $\alpha$ distorting the respective Laplacians far from their original values. In practice, many of the theoretical guarantees provided earlier in this manuscript may not apply to optimization of the augmented objective function. Hence, a major area of future work will be modification of our optimization procedure to compute this form of distance.

## Theory Summary

Triangle inequalities are proven for several members of the proposed family of graph distortion or “distance” measures, including versions with infinitesimal or finite diffusion time, a power law for sparsity, and a power law for the time scaling factor between coarse and fine scales. However, the case of an optimal (rather than power-law) time conversion factor $\alpha$ must be investigated by numerical experiment, and that requires the new algorithms introduced in Section 4.3. We also show that, in the case of distances between graph box products, optimization over $P$ for the product graphs is bounded above by a monotonic function of the optima over the component graphs.