The Fast Multipole method (FMM) was proclaimed as one of the top 10 algorithms of the 20th century. It reduces the O(N^2) computation of certain N-body problems to O(N) using hierarchical summation techniques.
I wrote a 3 dimensional FMM for summing the effect of a cloud of point electrical charges (the fundamental solution to Laplace's equation), which was a critical component of the paper Linear Time Smoke Animation with Vortex Sheet Meshes and was also used in Ocean Waves Animation Using Boundary Integral Equations and Explicit Mesh Tracking
My FMM implementation was done in templated C++ on linux. The FMM is a complicated algorithm to implement, with both geometric and numerical complexities. To facilitate testing during development, I wrapped key pieces of the FMM operators in Python using the Numpy headers to allow dynamic debugging and testing.
I'm currently working on a GPU computing version in OpenCL