Large-scale internet-enabled quantum information science and a poll

Several algorithms have been discovered that cannot be run efficiently on classical computing systems. These quantum algorithms require operations that are allowed by quantum mechanics but not by classical physics. In order to run any one of these algorithms, hardware is required that is capable of supporting all of the operations that particular algorithm needs.

One important class of quantum algorithms is the adiabatic quantum algorithms. Initially discovered in 2000 by scientists at MIT, these algorithms solve formally intractable computational problems in a novel way. These algorithms are of intense interest for two main reasons. The first is that special-purpose hardware built specifically to run them is the most mature enablement of any quantum computing algorithm to date. The second is that early analysis of these algorithms shows exponential speed-ups over classical approaches on certain typical-case NP-hard problems. If the polynomial scaling seen for small problems continues into the low thousands of variables, this would be of extreme practical importance, as NP-hard optimization problems of this size are ubiquitous in science and business.

Recently, techniques have been introduced for calculating the run time of the quantum adiabatic algorithms on problems of up to about 128 variables; see here and here. While this is still too small to ultimately answer questions about the asymptotic scaling of these approaches, it is sufficient to predict the expected performance of adiabatic quantum computers of up to 128 qubits, which perhaps not coincidentally is the number of qubits in the Rainier design.
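To make concrete what "run time" means here: by the adiabatic theorem, the time required scales roughly as the inverse square of the minimum spectral gap along the interpolation from the driver Hamiltonian to the problem Hamiltonian. For a handful of qubits that gap can be found by brute-force diagonalization, as in this illustrative sketch (the couplings, fields, and linear schedule are made up for illustration; QMC techniques are needed precisely because this exact approach scales exponentially in the number of qubits):

```python
import numpy as np
from functools import reduce

I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])  # Pauli X

def kron_at(op, i, n):
    """Tensor product placing `op` on qubit i of an n-qubit register."""
    return reduce(np.kron, [op if j == i else I2 for j in range(n)])

n = 3  # tiny instance; the state space is 2^n = 8

# Driver Hamiltonian H_B = -sum_i X_i (ground state: uniform superposition)
H_B = -sum(kron_at(X, i, n) for i in range(n))

# Problem Hamiltonian: diagonal Ising energies with random couplings and fields
rng = np.random.default_rng(0)
J = np.triu(rng.choice([-1.0, 1.0], size=(n, n)), k=1)
hf = rng.choice([-1.0, 1.0], size=n)
spins = np.array([[1 - 2 * ((k >> i) & 1) for i in range(n)] for k in range(2**n)])
H_P = np.diag([s @ J @ s + hf @ s for s in spins])

# Scan H(s) = (1-s) H_B + s H_P and record the gap between the two lowest levels
gaps = [np.diff(np.linalg.eigvalsh((1 - s) * H_B + s * H_P)[:2])[0]
        for s in np.linspace(0.0, 1.0, 101)]
print(f"minimum gap along the path: {min(gaps):.4f}")
```

The smaller that minimum gap, the slower the adiabatic evolution must be run to stay in the ground state, which is why gap (and hence run-time) estimates for larger instances are so valuable.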

The approach outlined in the papers referenced above is based on Quantum Monte Carlo (QMC) techniques, which allow the run time of a quantum adiabatic algorithm to be estimated. The approach is embarrassingly parallel and can efficiently make use of large numbers of independent processors. It is also extremely computationally expensive. These two points make the calculation of the run time of the adiabatic quantum algorithms using QMC techniques ideally suited to a large-scale Internet-based distributed computation.
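The embarrassingly parallel structure can be illustrated with a toy sketch (this is not AQUA's actual code): each work unit is an independent Monte Carlo run with its own seed, and the results are simply combined at the end, so any number of machines can contribute. Here a trivial pi estimate stands in for a full QMC simulation:

```python
import multiprocessing as mp
import random

def one_run(seed):
    """One independent Monte Carlo run with its own seed. In a distributed
    QMC calculation each work unit would be a full simulation; here a toy
    dart-throwing estimate of pi stands in."""
    rng = random.Random(seed)
    n = 100_000
    hits = sum(rng.random() ** 2 + rng.random() ** 2 <= 1.0 for _ in range(n))
    return 4.0 * hits / n

if __name__ == "__main__":
    # Work units share no state, so they can be farmed out to any number of
    # processors (or volunteer machines) and averaged when they come back.
    with mp.Pool() as pool:
        estimates = pool.map(one_run, range(16))
    print(f"combined estimate: {sum(estimates) / len(estimates):.4f}")
```

The statistical error shrinks as more independent runs are combined, which is exactly why volunteer cycles are so directly useful here.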

D-Wave and Dr. Peter Young (a co-author on one of the QMC papers referenced previously) have built a distributed QMC platform to calculate the run time of a quantum adiabatic algorithm relevant to the experimental hardware at D-Wave. The project is called AQUA (Adiabatic QUantum Algorithms), with the distributed version called AQUA@home. The distributed computing technology is based on BOINC, the same technology that enables SETI@home and a host of other large-scale distributed scientific computations.

AQUA is currently running internally at D-Wave. We have set up an external server that will shortly go live. This server is set up to accept volunteer cycles from individuals who wish to contribute computer time to the AQUA project and make a direct contribution to the advancement of scientific understanding of the quantum adiabatic algorithms. The AQUA@home program runs at low priority in the background of any internet-connected computer. All of the data acquired from AQUA will be published, and everyone who contributes cycles to the project will receive copies of all the publications arising from this work.

The specific project we have been working on internally to test the AQUA system calculates the runtimes for a particular type of problem ideally suited to our hardware. These problems are spin glass problems and are known to be NP-hard. This type of problem will be the first thing we run on AQUA@home to ensure that everything is working properly and will be the basis of the first publication.
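For readers unfamiliar with spin glasses, here is a minimal illustration (with made-up random couplings, not our actual instances): random ±1 couplings among n spins define an Ising energy, and the task is to find the spin configuration that minimizes it. Brute force works for tiny n, but the state space doubles with every added spin:

```python
import itertools
import numpy as np

def random_spin_glass(n, seed=0):
    """Random +/-1 couplings on a complete graph of n spins. Finding the
    ground state of such Ising spin glasses is NP-hard in general."""
    rng = np.random.default_rng(seed)
    return np.triu(rng.choice([-1.0, 1.0], size=(n, n)), k=1)

def energy(J, s):
    """Ising energy E(s) = sum_{i<j} J_ij s_i s_j for spins s_i in {-1,+1}."""
    return float(s @ J @ s)

def ground_state(J):
    """Exhaustive search: feasible only for small n, since there are 2^n states."""
    n = J.shape[0]
    return min((np.array(bits) for bits in itertools.product([-1, 1], repeat=n)),
               key=lambda s: energy(J, s))

J = random_spin_glass(8)
s_best = ground_state(J)
print("ground-state energy:", energy(J, s_best))
```

At 128 variables the brute-force search above would need 2^128 evaluations, which is where both the quantum hardware and the QMC run-time estimates come in.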

After this, we plan to run AQUA@home to compute the expected run time of our 128-qubit superconducting adiabatic quantum computing system, running the quantum adiabatic algorithm it enables, on problems generated by the binary classification machine learning application we co-developed with Vasil Dentchev and Hartmut Neven at Google.
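To give a flavor of how a machine learning problem becomes something this kind of hardware can attack (a hypothetical sketch, not the actual formulation used in the Google collaboration): selecting a sparse subset of weak classifiers to vote on binary labels can be written as a QUBO (quadratic unconstrained binary optimization), the form the hardware minimizes natively. All names and numbers below are illustrative:

```python
import itertools
import numpy as np

# Toy stand-in for a boosting-style classifier-selection problem.
# y[i] in {-1,+1} is the label; h[i, j] is weak classifier j's vote on example i.
rng = np.random.default_rng(1)
n_examples, n_weak = 30, 6
y = rng.choice([-1.0, 1.0], size=n_examples)
h = rng.choice([-1.0, 1.0], size=(n_examples, n_weak))
lam = 0.5  # sparsity penalty on the number of selected weak classifiers

# Expand ||h w - y||^2 + lam * sum(w) into QUBO form w' Q w for binary w;
# since w_j^2 = w_j, the linear terms fold onto the diagonal of Q.
Q = h.T @ h
Q[np.diag_indices_from(Q)] += -2.0 * (h.T @ y) + lam

def qubo_value(w):
    return float(w @ Q @ w)

# Brute force over the 2^6 binary vectors; the adiabatic hardware would
# minimize the same kind of objective over up to 128 variables instead.
w_best = min((np.array(bits) for bits in itertools.product([0, 1], repeat=n_weak)),
             key=qubo_value)
print("selected weak classifiers:", np.flatnonzero(w_best).tolist())
```

The regularization term lam trades training accuracy against the number of classifiers kept, which is what makes the selected committee sparse.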

I believe that the outcome of this particular project has significant implications for the field of quantum computation, as currently there are no widely accepted meaningful commercial applications of quantum computers. If quantum adiabatic algorithms can solve machine learning problems better than the best known classical approaches, that would be a game changer for quantum computation. The field currently relies on insufficient amounts of government funding, awarded primarily for its potential role in breaking certain asymmetric cryptosystems, which of course is of limited interest to partners and investors in the commercial world.

To conclude, I am including here a poll; this is the first time I’ve tried to do this… it should be easier than implementing a distributed continuous imaginary time Quantum Monte Carlo program.

21 thoughts on “Large-scale internet-enabled quantum information science and a poll”

  1. Correct me if I am speaking nonsense, but it sounds like you could get rather close to demonstrating runtimes for calculating (small) protein folding trajectories over certain lattice models (like Dill’s hydrophobic packing model, as you proposed earlier).

  2. Hi Rob, 128 variables, maybe a bit more. Yes you could run very small lattice protein folding problems through aqua. With enough volunteers we could do this fairly quickly. I’m hoping we can get a lot of folks participating so we can start doing many different types of problems.

  3. Sounds terrific – the protein folding example would be particularly interesting… well, at least in my opinion. It seems like proof-of-principle ‘typical’ runtimes on small (~50 amino acid) peptides would scale well for much larger structures. That said, I’m not very familiar with how well folding results (calculated using classical Monte Carlo) match crystal/NMR data.

  4. Dr Suz: If you could lend some cycles from any evil supercomputers you have built in your basement that would be cool. By the way on a related note I have been attempting to create a spontaneous collective life form out of a collection of odd trinkets and idols that I assumed would reach critical mass on a table in my office but no luck yet. Maybe I will post a picture.


  5. Geordie, build a supercomputer out of programmable graphics cards!

    It wouldn’t be that hard, I think… You should have a CUDA implementation anyway, so that people with graphics cards could get an n-times speed-up… some algorithms are more than 50x…

    How much time does the algorithm spend in “embarrassingly parallel” mode? That is the question for a CUDA implementation.

    But maybe the algorithm is too serial. Who knows. Try to find some subject matter experts… like the mathematician that made PyCUDA. Worth checking out…
    Also check out the YouTube vid. AGI shoutout at the end.

  6. Awesome idea! Plus, it looks like I should be able to put together a 6-node system with 24 cores and 24GB of RAM for $3,600 including tax (or a 4-node system for $2,400). I might even be able to get a deal for buying 6 of everything. :D I’ve got plans…

  7. Neil D,

    6 4-core processors? 24x speed-up??? Phht, man, hey, *you* can do better than that by just applying your awesome assembler skills (and a bit of a hand from a profiler or two regarding cache misses, have you looked into all the options of oprofile yet? :) )!

    (Not speaking for The Company),

    Paul B.

  8. Oh believe me, I intend to do both, and much more, hehe. :) My honours project and directed study next term will hopefully lead to cool stuff along these lines. I’m done exams on December 13th (and so is the rest of the team, as far as I know), and then it’s CODE TIME! I don’t want to bore people with the non-quantum details here, so I’ll just email Geordie about it and some other stuff once I have time (e.g. December 14th or 15th).

    If I do decide to spend all that money on a 6-node system, I’d probably consider 100x speedup, relative to the particular base I’ll be comparing against, a complete failure. 200x might be acceptable. It needs to be worth the $3,600 plus electricity plus time, after all, so I’ll try not to get my hopes up too much.

  9. Each node is just an Intel Core 2 Quad CPU (over half the price of each node), 4GB of RAM, an 80GB SATA hard drive, stock motherboard, power supply, and case (really optional, and I’ll probably cut them up eventually, but I’d like something to hold the nodes together). No display, no keyboard, no mouse, no optical drive, integrated graphics, no sound, integrated network (not good, but fine for only 6 nodes). A stock network switch also isn’t good for large systems, but probably plenty for 6 nodes that won’t be using the network much. Getting parts from NCIX and looking for lower prices, you can put such a system together for roughly $3,600 including tax.

    From just the parallelization over 24 CPUs, one can really only get up to (and not including) 24x speedup, of course, but I’m hoping that the system, algorithmic, and function-level changes will make a big difference too.

  10. Neil D.,

    Drop in 3 Ethernet boards/node, connect them in a hypercube (well, for 8 nodes it will be a regular cube) topology, and set up proper routing tables — no switch required, and pretty sweet bandwidth with low latency between nodes. Of course you would not need such a thing for an embarrassingly parallel problem, but that interconnect (with a single 4-port card/node) was what held together the very first Beowulf cluster (the story was actually quite entertaining).

    Paul B.

  11. no one here into CUDA?

    If the algorithm is highly parallel for most of its execution then CUDA, from nvidia is something to look at.

    google: pycuda, nvidia supercomputer

  12. Geordie: CUDA programs run on most new NVIDIA chips, and the Python bindings for CUDA make programming the GPU simpler…

    I don’t know anything about distributed processing or how AQUA is implemented, but linear algebra and graph algorithms can be sped up crazily with CUDA… I don’t know the technical complications of throwing CUDA onto a distributed computing project (maybe a different flavor of AQUA?) but the possibilities that arise out of general-purpose GPU programming are extreme…

    On a side note, QUBO seems like it would require education for programmers to use it effectively. CUDA is close to the PRAM (parallel random access machine) model. Out in the larger world it has been an “uphill” battle to get the average programmer to write multithreaded programs. The world will be greatly improved once all these new computer architectures can be leveraged to their fullest… but kids will have to start learning Python, Haskell, and QUBO in kindergarten, which would also be cool.

  13. Hey Geordie, any chance that you guys could compile a 64-bit build of the AQUA@Home Windows core? There’s a huge performance hit running 32-bit code on a 64-bit OS. It’s actually one of the two big reasons that I stopped running Folding@Home. The other was that the GROMACS core (what Folding@Home uses most) is single-threaded.

    Also, would you like a visualization for the screensaver? For my Computational Geometry project, I’m making a visualization for hypercubes with up to 12 dimensions and colours associated with each vertex (e.g. 12 qubits and the relative amplitudes of each classical state). I have no idea how to integrate with the BOINC screensaver, but I could send over the code if someone there is interested.

  14. Neil: Yes there is a 64-bit windows version, I am checking up on it.

    Also if you could send me that visualization that would be great, I also don’t know how to attach but I’ll forward it to someone who does!

  15. Pingback: AQUA@home is live
