SimianQuant

SimianQuant

Bringing Back Moore's Law

Harshad Deo

7 minutes read

The SimianQuant library provides a Fluent API on top of its symbolic engine to model and solve Initial Value Problems. This article demonstrates how it can be used to solve a reference problem in systems biology, the repressilator, and how the SimianQuant approach compares with alternatives like NumPy, SymPy and Octave.

In summary,

  1. For less effort on part of the programmer, the SimianQuant implementations are ~ 200x faster than NumPy, ~ 250x faster than SymPy and ~ 1800x faster than Octave
  2. Advanced numerical techniques like sensitivity analysis using algorithmic differentiation, which are tricky and tedious at best in Python, are trivial using the library

The Model

Following Elowitz’s classic paper, the system consists of six coupled non-linear first order differential equations, three modeling mRNA concentrations, $m_i$, and three modeling repressor protein concentrations $p_i$. Time, $t$, is the independent variable, and the system has four parameters: $\alpha_0$, $\alpha$, $n$ and $\beta$.

$\dot m_i = -m_i + \alpha_0 + \frac{\alpha}{1 + p_j^n}$, $i$ = {1, 2, 3}, $j$ = {2, 3, 1}

$\dot p_i = -\beta (p_i - m_i)$

This system will be solved numerically using a 4(5) order Runge-Kutta-Felhberg scheme.

Implementing the Model

SimianQuant

The actual function is modelled in the withFuction closure, and all of the expressions there are symbolic. In the general case, this is useful because symbolic expressions are significantly less verbose than non-symbolic ones. In this particular case, it is even more useful because the equations are symmetric and can be written from a base case using symbol substitution. This makes the code clearer and prevents silly mistakes due to wrongly typed indices.

import clientsupport.ClientSupport._
import clientsupport.StdDoubleContext._

InitialValueProblem
  .newInstance(IVPStep.RKF45)
  .withParameters("a", "a0", "n", "b")
  .withIndependent("t")
  .withDependent("m1", "m2", "m3", "p1", "p2", "p3")
  .withFunction { bundle =>
    	val dm1 = -bundle.m1 + bundle.a0 + bundle.a / (bundle.p2 ** bundle.n + 1)
    	bundle.m1 = dm1
    	bundle.m2 = dm1.weakSymSub("m1", bundle.m2).weakSymSub("p2", bundle.p3)
    	bundle.m3 = dm1.weakSymSub("m1", bundle.m3).weakSymSub("p2", bundle.p1)                
    	bundle.p1 = -bundle.b * (bundle.p1 - bundle.m1)
    	bundle.p2 = -bundle.b * (bundle.p2 - bundle.m2)
    	bundle.p3 = -bundle.b * (bundle.p3 - bundle.m3)
  }

Given the description, an executable implementation can either be materialized to be used in-situ (i.e. within the web-client), or be exported to be used in an external codebase. This article will present benchmarks for the exported Scala implementation because Scala is nearly as readable as Python, as fast as Java and has as much type safety as one wants. To illustrate, the invocation of the generated executable would look like:

val builder = Vector.newBuilder[RepressilatorResult]
val repressilator = RepressilatorImpl.newInstance(a, a0, 0.2, 2, 0, 0.2, 0.1, 0.3, 0.1, 0.4, 0.5, 1, 1e-3)
while (repressilator.t < 1000) {
  repressilator.step()
  builder += repressilator.result
}
val res = builder.result()

NumPy

The canonical (and fastest) way of implementing an Initial Value Problem in Python is to use NumPy. Since the implementation is not symbolic, programmers have to careful about getting the indices right.

import numpy as np
from scipy.integrate import RK45

def repressilator(alpha, alpha0, n, beta, t0, t_end, m10, m20, m30, p10, p20, p30):
    def repressilator_specialized(t, V):
        dm1_dt = -V[0] + alpha0 + alpha / (1 + V[4] ** n)
        dm2_dt = -V[1] + alpha0 + alpha / (1 + V[5] ** n)
        dm3_dt = -V[2] + alpha0 + alpha / (1 + V[3] ** n)
        dp1_dt = -beta * (V[3] - V[0])
        dp2_dt = -beta * (V[4] - V[1])
        dp3_dt = -beta * (V[5] - V[2])
        f = [dm1_dt, dm2_dt, dm3_dt, dp1_dt, dp2_dt, dp3_dt]
        return f        
    y0 = [m10, m20, m30, p10, p20, p30]
    solver = RK45(repressilator_specialized, t0, y0, t_end)
    res_t = []
    res_y = []
    while solver.t < t_end:
        solver.step()
        res_t.append(solver.t)
        res_y.append(solver.y)
    return (np.asarray(res_t), np.asarray(res_y))

(t, y) = repressilator(298.2, 0.03, 2, 0.2, 0, 1000, 0.2, 0.1, 0.3, 0.1, 0.4, 0.5)

SymPy

An alternative way to write the gradient function in Python is to use SymPy to specify the equations symbolically and then use its lambdifyfunction to generate NumPy implementations. Comparing a pure SymPy gradient function would just be stupid.

import numpy as np
import sympy as sp
from scipy.integrate import RK45

alpha, alpha0, n, beta, t, m1, m2, m3, p1, p2, p3 = sp.var('alpha, alpha0, n, beta, t, m1, m2, m3, p1, p2, p3')

dm1_dt = -m1 + alpha0 + alpha / (1 + p2 ** n)
dm2_dt = dm1_dt.subs('m1', m2).subs('p2', p3)
dm3_dt = dm1_dt.subs('m1', m3).subs('p2', p1)
dp1_dt = -beta * (p1 - m1)
dp2_dt = -beta * (p2 - m2)
dp3_dt = -beta * (p3 - m3)

symbols = [alpha, alpha0, n, beta, t, m1, m2, m3, p1, p2, p3]
exprs = [dm1_dt, dm2_dt, dm3_dt, dp1_dt, dp2_dt, dp3_dt]
funs = [sp.lambdify(symbols, expr, "numpy") for expr in exprs]

def repressilator(alpha, alpha0, n, beta, t0, t_end, m10, m20, m30, p10, p20, p30):
    def repressilator_specialized(t, V):
        return [fun(alpha, alpha0, n, beta, t, V[0], V[1], V[2], V[3], V[4], V[5]) for fun in funs]
    y0 = [m10, m20, m30, p10, p20, p30]
    solver = RK45(repressilator_specialized, t0, y0, t_end)
    res_t = []
    res_y = []
    while solver.t < t_end:
        solver.step()
        res_t.append(solver.t)
        res_y.append(solver.y)
    return (np.asarray(res_t), np.asarray(res_y))

(t, y) = repressilator(298.2, 0.03, 2, 0.2, 0, 1000, 0.2, 0.1, 0.3, 0.1, 0.4, 0.5)

Octave

This is by far the ugliest approach, but is probably the best known and understood due to the popularity of MATLAB in college campuses (at least in my time in my country). The ugliness is unsurprising given that indices in the language begin at 1.

function dydt = cost_func(t, y, alpha, alpha0, n, beta)
    dm1dt = -y(1) + alpha / (1 + y(5) ^ n) + alpha0;
    dm2dt = -y(2) + alpha / (1 + y(6) ^ n) + alpha0;
    dm3dt = -y(3) + alpha / (1 + y(4) ^ n) + alpha0;
    dp1dt = -beta * (y(4) - y(1));
    dp2dt = -beta * (y(5) - y(2));
    dp3dt = -beta * (y(6) - y(3));
    dydt = [dm1dt; dm2dt; dm3dt; dp1dt; dp2dt; dp3dt];
end

function repressilator = repressilator_parametric(alpha, alpha0, n, beta, t0, t_end, m10, m20, m30, p10, p20, p30)
  tspan = [t0 t_end];
  x0 = [m10; m20; m30; p10; p20; p30];
  [t_res, y_res] = ode45(@(t, y) cost_func(t, y, alpha, alpha0, n, beta), tspan, x0, odeset('Refine', 6));
  repressilator = [t_res, y_res];
end

res = repressilator(298.2, 0.03, 2, 0.2, 0, 1000, 0.2, 0.1, 0.3, 0.1, 0.4, 0.5);

Benchmarks

The performance for all four implementations was measured for multiple evaluations on two machines, one with a clock speed of 2Ghz and the other with a clock speed of 3GHz. The operating system, microarchitecture, library versions and runtime configurations were held constant. The median runtimes are:


This, however, is not entirely a fair comparison. Implementing an ODE solver is more of a dark art than a science, and owing to its raw performance, the SimianQuant implementation uses a more conservative step update function. The SimianQuant solver took 1427 steps to cover the domain, compared with 980 by Octave and 787 by NumPy/SymPy. A more fair metric would be to compare the time taken per step. The results in that case are:

Beyond Simple Function Evaluation

An analysis that simply solves the function is incomplete, it also needs to evaluate sensitivity with respect to the parameters. This sensitivity is of two kinds - infinitesimal sensitivity, equivalent to $\frac{dy}{dx}$, and finite sensitivity, equivalent to $\frac{\Delta y}{\Delta x}$.

In Python and Octave, the only way to reasonably compute either for the system in consideration is to use bump and recompute, i.e. finite differences, that too by evaluating the function sequentially for bumped parameters. This approach is slow and inaccurate.

  1. To compute infinitesimal sensitivity, forward mode algorithmic differentiation should be used. This allows the value and all first order sensitivities to be computed in one pass
  2. To compute finite sensitivity, SIMD vectorization (either CPU or GPU based) should be used.

Doing either in Python is tedious and tricky at best. For example, while Python has a module for algorithmic differentiation, it doesn’t work very well with NumPy arrays, and the RK45 solver expects scalar inputs. Doing both with SimianQuant is trivial. For example, to use forward mode algorithmic differentiation using Spire/Ceres, all you have to do is change StdDoubleContext in the second line to SpireDoubleJetContext.


Harshad

Students can learn less if, to improve graduation rates, you make classes easier

Recent posts

See more

Categories