Conditional Graph Factorization October 23, 2019 Harshad Deo 3 minutes read Twitter Facebook Reddit LinkedIn Email Conditional Graph Factorization is a submodule of the SimianQuant code generator that generalizes the concept of calibration. Given a sequence of partitions of the domain, it factorizes the full computational graph into a sequence of subgraphs conditional on the respective partition and the inferred codomain of the previous subgraph. This article illustrates the principle and some applications. Concretely, given a computational graph $\Bbb G(\Bbb X)$ with its input partitioned into a sequence of $n$ subsets $x_i, i\in {1 \dots n}$, conditional graph factorization generates a sequence of subgraphs $g_i$ such that: $\Bbb G(\Bbb X) = g_n\Big(x_n | g _ {n-1} \big(x _ {n-1} | \cdots g_1(x_1) \big) \Big)$ The arguments corresponding to partition $x_k$ don’t need to be known while evaluating subgraph $g _ {k-1}$, and multiple instances of those arguments can be used with a computed value of $g_ {k-1}$. The subgraph $g_k$ treats the outputs of $g _ {k-1}$ as preset constants. This construction therefore elegantly models cases where some of the arguments to a function change more frequently than others, or where the influence of varying one or more variables while keeping others constant needs to be studied. As an illustration, consider a toy example: linear interpolation with flat extrapolation over two data points. The system has five inputs: the four data points $x_1$, $y_1$, $x_2$, $y_2$ and the test point $x$, and one output $y$. A stylized representation of the full computational graph is: In practice, the data points are fixed (usually to match an observed system) and the test point is varied. Therefore the natural choice for partitioning the domain is into the data points ($x_1$, $y_1$, $x_2$, $y_2$) and the test point, $x$. Conditional graph factorization will split the full graph into two subgraphs: $g_1(x_1, x_2, y_1, y_2)$ with outputs $x_1$, $x_2$, $y_1$, $y_2$ and $e_4$ $g_2(x|x_1, x_2, y_1, y_2, e_4)$ with output $y$ A stylized representation of the split computational graph is: It is important to emphasize that this is not the only choice for the partition. For example, it is equally valid (though probably not as useful), to partition the domain as ($x_1$, $x_2$, $y_1$ ) and ($y_2$, $x$). As expected, conditional graph factorization can and does significantly improve performance. As an example, consider 20 point cubic spline interpolation with and without conditional graph factorization: In the non-factorized case, the function solves the equations for the spline coefficients each time it is evaluated. It should be noted that the non-factorized version is still 7x faster than Strata’s calibrated implementation, as presented here Harshad If a matrix is used in physics, it is invertible