Chris Swierczewski's Website

Transitioning to Native Sage Datatypes

Jun 24, 2016

Nils Bruin at Simon Fraser University brought up the idea / request to compute period matrices of plane curves to high precision. Sage has built-in capabilities to work with arbitrary-precision complex numbers via the CC ring. It also implements complex doubles in the the CDF ring.

In the original version of Abelfunctions I decided to use Numpy + Cython for numerical performance. Now I’m wondering if it’s worth investigating using native Sage datatypes instead. Below, I perform some timings between different fast_callable declarations, where the domain is varied, and different input datatypes: numpy.complex, CDF, and CC.

Setup

from sage.all import fast_callable, QQ, CDF, CC, timeit
import numpy
import time

R = QQ['x,y']
x,y = R.gens()

# functions:
f = y**3 + 2*x**3*y - x**7
f_complex = fast_callable(f, vars=[x,y], domain=complex)
f_CDF = fast_callable(f, vars=[x,y], domain=CDF)
f_CC = fast_callable(f, vars=[x,y], domain=CC)

# inputs
a = numpy.complex(1.000000000007 + 1.0000000000009j)
b = numpy.complex(1.00000000005 - 2.0000000000008j)
a_CDF = CDF(a)
b_CDF = CDF(b)
a_CC = CC(a)
b_CC = CC(b)

N = 100000

Timings

timeit('for _ in xrange(N): f(a,b)')
timeit('for _ in xrange(N): f(a_CDF,b_CDF)')
timeit('for _ in xrange(N): f(a_CC,b_CC)')

5 loops, best of 3: 2.66 s per loop
5 loops, best of 3: 2.49 s per loop
5 loops, best of 3: 3.47 s per loop

timeit('for _ in xrange(N): f_complex(a,b)')
timeit('for _ in xrange(N): f_complex(a_CDF,b_CDF)')
timeit('for _ in xrange(N): f_complex(a_CC,b_CC)')

5 loops, best of 3: 234 ms per loop
5 loops, best of 3: 271 ms per loop
5 loops, best of 3: 275 ms per loop

timeit('for _ in xrange(N): f_CDF(a,b)')
timeit('for _ in xrange(N): f_CDF(a_CDF,b_CDF)')
timeit('for _ in xrange(N): f_CDF(a_CC,b_CC)')

5 loops, best of 3: 1.18 s per loop
25 loops, best of 3: 20 ms per loop
5 loops, best of 3: 88.3 ms per loop

timeit('for _ in xrange(N): f_CC(a,b)')
timeit('for _ in xrange(N): f_CC(a_CDF,b_CDF)')
timeit('for _ in xrange(N): f_CC(a_CC,b_CC)')

5 loops, best of 3: 2.4 s per loop
5 loops, best of 3: 1.73 s per loop
5 loops, best of 3: 1.18 s per loop

The fastest result using only native Sage datatypes is f_CDF(a_CDF, b_CDF). This is only about x2 slower than directly during f_CDF.call_f(...) with native C complex datatypes. (Here, f_CDF is of type Wrapper_cdf which provides low-level access to the underlying JIT-compiled C function.) That is, by using native C get a x2 performance increase from the fastest native Sage version. Given that the pure-CDF version is already x100 faster than a non-CDF implementation I don’t think it’s worth the last bit of performance if it means easier to read code. (Not to mention, the ease of transitioning to using CC.)

It will be challenging refactoring the numerics code for Sage. I will most likely not do anything about it until I’m done with my Ph.D. but it’s worth investigating in the future.