In April 2008 there was a
thread on the scipy-dev list regarding the inclusion of Cython code in SciPy. In that thread, I mentioned a particular use case of interest to me -- creating and manipulating an array of objects (rather than arrays of elementary data types) and being able to do that with Cython easily and efficiently.
The problem I was considering is a simple one. I create a list of "Vortex" objects and compute (naively) the velocity of a collection of these particles on one another. This is an O(N^2) computation since every particle influences every other. The idea was to create simple OO code to be able to perform these computations. Here is an outline of the Python code for doing this:
class Vortex(object):
def __init__(self, pos=0.0, strength=1.0):
# ...
def eval_velocity(self, pos):
return -1j*self.strength/(2*pi*(pos - self.position))
class VortexManager(object):
def __init__(self, vortices=None):
# vortices is a list of vortex objects.
self.vortices = vortices
def set(self, pos, str):
# ...
def velocity(self):
for va in self.vortices:
vel = complex(0, 0)
for vb in self.vortices:
if va is vb:
continue
else:
vel += vb.eval_velocity(va.position)
va.velocity = vel
Very straightforward code. Now, back in April I implemented this in pure Python, C++ and wrapped the C++ code to Python using
SWIG. I also implemented it in
D and wrapped that using
PyD. I found that D was about 1.7 times slower than C++. C++ was about 300-400 times faster than the pure Python version.
I attended Robert Bradshaw's
Cython tutorial at SciPy08 and really liked it. About 10 days ago I finally found the time to create a Cython version and the winner is ...
Cython!
I've put up all of the code
here. To use the code, untar the tarball and do the following:
$ cd cpython_d_cpp
$ python setup.py build_ext --inplace
This requires SWIG, numpy and Cython to build. If you have PyD installed it will build the PyD extension also. To test the performance do this:
$ python test_perf.py
This produces the following output for me on a P4 (3 Ghz):
dee(4000): 1.87730193138
(1411.53285812+1411.53285812j) (945.091286479+945.091286479j)
swig(4000): 1.10782289505
(1411.53285812+1411.53285812j) (945.091286479+945.091286479j)
cython(4000): 1.15034103394
(1411.53285812+1411.53285812j) (945.091286479+945.091286479j)
Pure Python(200): 1.14771318436
# N SWIG Cython Ratio
1000 0.071 0.069 0.967
2000 0.283 0.274 0.968
3000 0.638 0.619 0.970
4000 1.135 1.100 0.970
5000 1.767 1.720 0.973
6000 2.517 2.473 0.983
7000 3.474 3.370 0.970
8000 4.541 4.403 0.970
9000 5.698 5.575 0.978
10000 7.000 6.879 0.983
The first few numbers just test one single case of 4000 particles. D is slower than both C++ and Cython. Python is dog slow (or donkey slow as I like to say it)! For some reason I was getting segfaults when I tried to test the D wrapper for more than 3000 particles. On my Macbook the Cython version is actually 30% faster than the C++ version and on a Sempron 2800 Cython is about 25% slower. So different machines produce different numbers. However, C++ and Cython are both in the same ballpark.
What I loved about the Cython code is that I use a Python list to manage the Vortex objects. This shows that we can use the normal Python containers to manage objects. This is extremely convenient. This isn't very surprising either since Python containers are also heavily optimized. "cython -a" was a huge help when getting things done. For more details on the Cython code look at the tarball.
Clearly, if you are building code from scratch and need speed, Cython is an excellent option. For this I really must congratulate the Cython and Pyrex developers.