back

CACL

Computer Algebra Contrastive Learning

Problem

Computer algebra systems (CAS) are rewrite rule engines.

Example: x*1 = x and a*(b+c) = a*b + a*c applied to a*(x+1) yields a*x + a.

Issue: Equalities are two-way. The rule a^b {b ∈ ℕ, b ≥ 2} = a * a^(b - 1) applied to (x+1)^3 yields (x+1)*(x+1)^2 — valid, but moving away from the simpler form. Including both directions loops infinitely. The usual fix (separate Factor/Expand) offloads directionality to users.

Knuth–Bendix can sometimes produce confluent rewrite systems, but not always, and results may be impractical. There's no natural heuristic for guiding rewrite search. We propose learning one.

Approach

Contrastive RL for a distance heuristic

Learn d(current_expr, goal_expr) → estimated rewrite distance. Contrastive learning fits: positive pairs are close expressions, negatives are distant. LSH mines hard negatives (syntactically similar, semantically distant).

Engine philosophy

RIES brute-forces short RPN programs with obsessive optimization (tens of millions of evals/sec). Lean is slow—AlphaProof needs large nets to balance inference vs validation. A RIES-style engine shifts the optimum: more search, smaller nets, cheaper compute (cf. Stockfish vs LC0).

Bootstrapping

  1. Train on short rewrite paths (random walks)
  2. Solve harder problems with the learned heuristic
  3. Add solutions to training data
  4. Repeat

Architecture

  • Symbolic engine: fast pattern matcher + rewrite applicator; generates legal rewrites.
  • Distance net: small encoder + contrastive head; scores (expr, goal).
  • Search: Beam/A* guided by distance estimates.

Target: laptop-runnable inference. Training on a single rented GPU.

MVP

Symbolic integration. Outperform Mathematica on the Lample & Charton integration benchmark using:

  • polynomial rules (commutativity, associativity, distribution)
  • trig identities (pythagorean, angle addition, derivatives)
  • integration by parts / u-substitution as rewrite rules

Success criteria: solve problems that require non-obvious rule sequencing; show the learned heuristic beating hand-coded ordering.