Chris Swierczewski's Website

Scientific Software Design (Part I)

Dec 16, 2013

I’m happy to say that abelfunctions can now correctly compute period matrices. It took about a year and a half, perhaps closer to two years, of study and development to get it right.

In [1]:

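from abelfunctions import RiemannSurface
from numpy import dot
from numpy.linalg import inv
from sympy.abc import x, y

# plane algebraic curve f(x, y) = 0 defining the Riemann surface
f = y**3 + 2*x**3*y - x**7
X = RiemannSurface(f, x, y)

# A and B are the two square blocks of the period matrix
A, B = X.period_matrix()

# normalize to obtain the Riemann matrix Omega = A^{-1} B
Omega = dot(inv(A), B)
print(Omega)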

Out[1]:

array([[-1.30901699+0.95105652j, -0.80901699+0.58778525j],
       [-0.80901699+0.58778525j, -1.00000000+1.1755705j ]])
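A quick sanity check on a result like this: a Riemann matrix should be symmetric with a positive definite imaginary part. The sketch below tests both properties using only NumPy, with the values copied by hand from the output above. The variable names are mine and are not part of abelfunctions.

```python
import numpy as np

# Values copied from the printed output above (not recomputed here).
Omega = np.array([[-1.30901699 + 0.95105652j, -0.80901699 + 0.58778525j],
                  [-0.80901699 + 0.58778525j, -1.00000000 + 1.1755705j]])

# A Riemann matrix equals its own transpose (up to numerical error).
is_symmetric = np.allclose(Omega, Omega.T)

# Its imaginary part is a real symmetric matrix; positive definiteness
# means that all of its eigenvalues are strictly positive.
imag_eigenvalues = np.linalg.eigvalsh(Omega.imag)
is_positive_definite = bool(np.all(imag_eigenvalues > 0))

print("symmetric:", is_symmetric)
print("Im(Omega) positive definite:", is_positive_definite)
```

Passing both checks does not prove the period matrix is correct, but failing either one would point to a bug.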

William Stein taught me a very important lesson while I was his undergraduate student: “First, write code that works. Then make it work better.” This is essentially a paraphrase of Donald Knuth’s mantra, “Premature optimization is the root of all evil.” Now that my code works, it’s time to optimize it.

There are, of course, various performance-related optimizations that I could do. For example, I’ve already Cythonized most of the analytic continuation portion of the code, reducing the total time taken for the above calculation from forty seconds to about five on my machine: a lowly first-generation MacBook Air. There are other pockets here and there where some careful use of C types could shave off precious seconds. I’m convinced that I’ll be beating Maple’s implementation in no time…if I haven’t done so already.
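One way to find such pockets before reaching for Cython is to profile first. Below is a minimal sketch using the standard library’s cProfile. The function `slow_sum_of_squares` is a hypothetical stand-in for an expensive routine and the helper `profile_call` is my own; neither comes from abelfunctions.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n):
    """Hypothetical stand-in for an expensive pure-Python loop."""
    total = 0.0
    for k in range(n):
        total += k * k
    return total


def profile_call(func, *args):
    """Run func(*args) under cProfile; return its result and a text report."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Show the five entries with the largest cumulative time.
    stats.sort_stats("cumulative").print_stats(5)
    return result, stream.getvalue()


result, report = profile_call(slow_sum_of_squares, 100_000)
print(report)
```

The functions at the top of the report are the natural candidates for C types. Everything else can stay in plain Python.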

However, what I’m mostly interested in right now is proper object-oriented design of my code. I’m investigating this world of software development for several reasons:

to reduce bugs through better-organized code,
to make it easier for others to read my code,
to make it easier for others to add to my code.

This is the first in a series of posts recording my quest to understand proper object-oriented design and to apply this new knowledge to my abelfunctions code.

Reproducible Research and Open-Source Software

Before diving into the beautiful field of software design patterns, I want to take a moment to look at the last two bullet points mentioned above. They are in line with a recent movement in applied mathematics and the computational sciences towards emphasizing “reproducible research”. My main source for all things related to this field has been Randy LeVeque, who has written several interesting articles on the subject [1] [2].

Around the department I see a good amount of script-quality code. By no means is this a criticism of the programming abilities of my peers; a large number are experts in the scientific computing techniques of their field. However, I feel that much of the reason behind the lack of organization and openness is that the majority of the code written by my peers has a one-time purpose in their career. A student needs to simulate X for a paper or produce results Y so as to determine the next course of action in their research. I don’t expect everyone in the department to want to turn their scripts into libraries. Their code is rarely an end in itself.

But who knows when a lowly script will become ever so useful in the future!

But things change when this code is tied to a paper. Consider my advisor, Bernard. He is an excellent person to write a paper with. He is scrupulous about paper organization, grammar, formatting, etc. and every time I discuss a paper or presentation with him I come out a better writer. So my question is, why not apply these same principles to the code used to produce the results in your paper?

Hence my dive into object-oriented design. My code has become long enough (more than 10,000 lines) and complicated enough to necessitate this change. In the following related articles I’ll share some thoughts and insights into how I’ve applied object-oriented design to my code.

References

1) ICERM “Setting the Default to Reproducible: Reproducibility in Computational and Experimental Mathematics”, ICERM Workshop report, with D. Bailey, J. Borwein, R. LeVeque, W. Rider, and W. Stein.

2) “Top Ten Reasons to Not Share Your Code (and why you should anyway)”, (based on a talk given at the SIAM CSE minisymposium on “Verifiable, Reproducible Research and Computational Science”), R. LeVeque
