SimianQuant

SimianQuant

Bringing Back Moore's Law

A useful way to understand the context is to read the section titled Bring Back Moore’s Law from a 2012 essay by Paul Graham. His thesis is that the only way to make programs faster is to rewrite them to take advantage of the parallellization available in modern hardware, which is a hard problem. While correct in essence, his characterization of the problem is deficient along three axes:

  1. The behavior of the building blocks of the solution will be constrained by a set of rules, often called an algebra. Using this algebra to rewrite a supplied program can improve the performance by an order of magnitude, as shown here.
  2. There are several non-interchangeable approaches to parallellize a program (for example CPU vs. GPU), with different approaches suitable for different use cases of the solution, as shown here.
  3. Using a more modern algorithm can improve performance and accuracy by an order of magnitude, for example finite differences vs algorithmic differentiation. However, learning and applying these algorithms requires users to learn skills that are orthogonal to their business domain.

Resolving these deficiencies is a problem that is solvable in the simplest case. However, these solutions are not used (or not commonly used) in practice because scaling up these techniques to solve realistic cases is a hard problem. The algorithms of the SimianQuant library solve the hard problem by disintegrating the real case into a ‘combination’ of the simplest cases, and then reintegrating the solution. The interfaces exposed by the library abstract away the real world complexities and allow the programmer to code as if she was implementing the simplest case.

Precursors

The motivating objective with SimianQuant was to provide an interface that would be at least as easy to use as symbolic languages and generate code that was at least as fast as parallelized imperative code; partly inspired by John Backus’ Turing lecture. Heuristically, the library may be interpreted as ‘middleware’ that rewrites symbolic input to generate imperative output and encapsulates solutions to the three challenges outlined above in its configuration. A configurable middleware is obviously not a new idea and it is useful to understand what came before to understand what makes SimianQuant different:

  1. Historically, compilers have been the middleware used by programmers to make their code easier to write and faster to run. However, as Graham noted in his essay and later encoded in a YCombinator RFS, they are not ‘sufficiently smart’ to solve the outlined problems.
  2. Typeclasses are programming constructs developed to decouple the description of an algorithm from an implementation. In principle, they can allow parallel code to look like sequential code. However, implementing typeclasses is unfeasible for anything beyond elementary operations.
  3. Facebook’s React uses similar principles to solve an orthogonal problem - it is hard to track the dynamic evolution of an interactive user interface. As a rough analogy, React is to a browser’s rendering engine what SimianQuant is to a standard compiler (like gcc).
  4. Lightweight Modular Staging is a software library that uses code generation (like SimianQuant) to reduce the overhead associated with abstraction. However, its optimization is largely syntactic, i.e. based on the structure of the input, rather than semantic, i.e. based on the meaning of the input.

The common deficiency of the precursors, as well as the reason for the lack of existence of a ‘sufficiently smart compiler’, is that they parse code syntactically rather than semantically. Therefore they require programmers to use their intelligence and conceptual knowledge of mathematics and computing to reexpress real problems in a way that they can solve. This is the most tedious and mind-numbing aspect of a programmers job.

SimianQuant’s encoding algorithms, which in aggregate are externally similar to an expert system, can replicate this aspect of programming. This module, along with a synthesis of existing tools, allows the implementation process to be fully automated. Programmers now only need to specify a description of the problem, which can be done symbolically.

This expert system is called the semantic engine.

Semantic Engine

The expressiveness, richness and structure of algebraic expressions comes from the combinations of a few simple primitives and operators. The hard problems in being able to reason about algebraic expressions are managing the combinatorial explosion and defining a clear objective function for what is a computationally better but numerically equivalent expression. Both usually depend on the global properties of the expressions; for example, in some cases it might be better to factorize a subexpression and in some cases it might be better to expand it. These problems are qualitatively different from those of ‘traditional’ artificial intelligence/machine learning systems, where the hard problem is discovering and interpreting the latent structure.

SimianQuant manages the compression and interpretation of the combinatorial space by using a synthesis of three ideas from cognitive science and generative social sciences:

  1. Metacognition, or reasoning about reasoning, and how that affects reasoning
  2. Self Organization, or the study of how simple interacting units form complex structures without a central organizing entity
  3. Emergence, or the study of how macroscopic entities can have properties that are not possessed by their constituent elements

Using the synthesis, the library is able to express the optimization objective in terms of invariants over the flow of a simple tensor, which can be efficiently computed from the local properties of subexpressions. The global optimum is therefore a search problem over the potential directions of this tensor’s flow.

Writing a working implementation of this synthesis and integrating it with the syntactic modules described in the previous section results in a sufficiently smart compiler that solves the hard problem in quantitative development - simultaneously maximizing programmer productivity and runtime performance, while eliminating maintenance.

This brings back Moore’s Law.

Schedule A Demo

Join The Team