“Inside the chip” – new video showing Rainier 128 processor

Here is a video showing how some of the parts of a D-Wave Rainier processor go together to create the fabric of the quantum computer.

The animation shows how the processor is made up of 128 qubits, 352 couplers and nearly 24,000 Josephson junctions. The qubits are arranged in a tiling pattern to allow them to connect to one another.

Enjoy!

Vesuvius: A closer look – 512 qubit processor gallery

The next generation of D-Wave’s technology is called Vesuvius, and it’s going to be a very interesting processor. The testing and development of this new generation of quantum processor is going well. In the meantime, here are some beautiful images of Vesuvius!

quantum computer quantum computing D-Wave Systems Vesuvius

Above: An entire wafer of Vesuvius processors after the full fabrication process has completed.

.

quantum computer quantum computing D-Wave Systems Vesuvius

Above: Photographing the wafer from a different angle allows more of the structure to be seen. Exercise for the reader: Estimate the number of qubits in this image :)

.

quantum computer quantum computing D-Wave Systems Vesuvius

Above: A slightly closer view of part of the wafer. The small scale of the structures (<1um) produces a diffraction grating effect (like you see on the underside of a CD) resulting in a beautiful spectrum of colours reflecting from the wafer surface.

.

quantum computer quantum computing D-Wave Systems Vesuvius

Above: A different angle of shot produces different colours and allows different areas of the circuitry to become visible.

.

quantum computer quantum computing D-Wave Systems Vesuvius

Above: A close-up image of a single Vesuvius processor on the wafer. The white square seen to the right of the image contains the main ‘fabric’ of 512 connected qubits.

.

quantum computer quantum computing D-Wave Systems Vesuvius

Above: An image of a processor wire-bonded to the chip carrier, ready to be installed into the computer system. The wires carry the signals to the quantum components and associated circuitry on the chip.

.

quantum computer quantum computing D-Wave Systems Vesuvius

Above: A larger view of the bonded Vesuvius processor. More of the chip packaging is now also visible in the image.

.

quantum computer quantum computing D-Wave Systems Vesuvius

Above: The full chip packaging is visible, complete with wafer.

.

Implementation of a Quantum Annealing Algorithm Using a Superconducting Circuit

Implementation of a Quantum Annealing Algorithm Using a Superconducting Circuit

A circuit consisting of a network of coupled compound Josephson junction rf-SQUID flux qubits has been used to implement an adiabatic quantum optimization algorithm. It is shown that detailed knowledge of the magnitude of the persistent current as a function of annealing parameters is key to implementation of the algorithm on this particular type of hardware. Experimental results contrasting two annealing protocols, one with and one without active compensation for the growth of the qubit persistent current during annealing, are presented in order to illustrate this point.

[arxiv:0903.3906]

Multi-qubit synchronization results

Synchronization of Multiple Coupled rf-SQUID Flux Qubits

A practical strategy for synchronizing the properties of compound Josephson junction rf-SQUID qubits on a multiqubit chip has been demonstrated. The impacts of small (~ 1 %) fabrication variations in qubit inductance and critical current can be minimized by the application of a custom tuned flux offset to the CJJ structure of each qubit. This strategy allows for simultaneous synchronization of the qubit persistent current and tunnel splitting over a range of external bias parameters that is relevant for the implementation of an adiabatic quantum processor.

arXiv:0903.1884v1

Rainier chips are cooling down

Three Rainier 1st silicon chips are on their way to 10mK. On two of them we are doing  device-level testing, the third has a full 8-qubit unit cell with all the programmable control circuitry bells and whistles… here is the wirebonded chip with the whole Rainier unit cell on it.

rainier_chip-qs_2592x1944

For all you many worlds QM types: Kind of cool that this chip may be shared by 2^8 other universes all doing the same experiments… I have to admit, opening doors between parallel universes is a pretty fun job.

WIRA Rainier 0-silicon pic

Here is a picture taken at WIRA of a 0-silicon Rainier wafer. The left-most chiplet in the middle is an 8-qubit unit cell break-out. The chiplet to the right of this is a 2×2 array of unit cells (32 qubits). The chiplet to the right of that is a 4×4 array of unit cells (128 qubits). Each wafer contains several hundred of each. There are several more layers of metal that are added after this step that cover up the underlying circuitry, making an optical photo at this stage more interesting than at later stages where it’s like taking a picture of a mirror. At WIRA you can see a lot of cool stuff. Most of what you’re looking at here is the digital SFQ circuitry used to program a problem instance.

WIRA wafer with Rainer processors.

Rainier V: Instance Classes

Several algorithms have been discovered that cannot be run on classical computing systems. These quantum algorithms require operations that are allowed by quantum mechanics but not by classical physics. In order to run any one of these algorithms, hardware is required that is capable of supporting all of the operations required by that particular algorithm.

One important class of quantum algorithms are the adiabatic quantum algorithms. Initially discovered in 2000 by a group at MIT, these algorithms solve formally intractable computational problems in a novel way. These algorithms are of intense interest for two main reasons. The first is that special-purpose hardware build specifically to run them is by far the most advanced enablement of quantum computation in the world today. The second is that early analysis of these algorithms show exponential speed-ups over classical approaches on certain typical case NP-hard problems. If the polynomial scaling seen for small problems turns out to be the correct asymptotic scaling, this would be of extreme practical importance, as NP-hard optimization problems are ubiquitous in science and business.

Unfortunately, the asymptotic run time for the adiabatic quantum algorithms is not known and appears surprisingly difficult to calculate. Therefore there is controversy about how well this class of algorithms is expected to perform.

In this post I will discuss one reason why it’s so difficult to generically predict the performance of adiabatic quantum algorithms. This reason is usually called instance dependence.

Recall that a problem of the sort we are interested in is specified by a vector \vec{h} and a matrix \hat{J}, where the elements in both are selected from a finite set whose elements are set by the allowed precision in hardware (see here for an introduction to this concept). For specificity let’s say we have 4 bits of precision in hardware. Then the elements of \vec{h} and \hat{J} must be selected from the 15-element set -7, -6, ..., 0, ..., +6, +7. In addition the allowed non-zero elements of \hat{J} are fixed by the hardware graph as introduced in post 1 of this series.

So for the 128-qubit Rainier chip, which has 128 qubits and 352 couplers, each of which can take on one of 15 settings, there are 15^{480} possible settings of the values of \vec{h} and \hat{J} corresponding to problems we can ask the hardware to solve. Of course many of these are the same problem, but even eliminating these symmetries the resultant number of possible instances is large.

We know that not all of these problems take the same time to solve. Some of them are trivially easy, and some of them are exceptionally hard. So how do you answer the question of how an algorithm performs on the vast numbers of problems the hardware can accept?

One way you can make progress is to restrict the type of problem you will consider by choosing a prescription for selecting the \vec{h} and \hat{J} numbers that makes all of the problems generated by that prescription somehow “similar”. We call this problem generating prescription an instance class generator. For example, a valid instance class generator would be the following prescription: assign to all \vec{h} and \hat{J} values randomly selected from the allowed element set drawn with equal probability. We call the instances generated by an instance class generator an instance class.

Defining instance classes allows much more specificity in answering the question “how well does your algorithm perform?”. This is because if the problems within an instance class are sufficiently similar, the median run times for the algorithm on the instances in that class is a meaningful measure of how you’d expect the algorithm to perform “in real life” on applications generating instances of that type.

The instance class that I am going to focus on in the next post is similar to the one I mentioned above, except that the probabilities of assigning the 15 allowed elements is gaussian–the probability of assigning zero being highest, and then tailing off gaussianly for the larger magnitude elements. The variance of the gaussian I will use is set such that the probability of getting the largest magnitude elements +7 and -7 is 1\%. The instance class thus generated is a type of random spin glass on the hardware graph defined by the hardware, which is hard (in a way I’ll more specifically define in the next post) for classical solvers.

Rainier IV: Precision

I mentioned in my last Rainier post that I was going to write next about how to predict runtime performance of the quantum adiabatic algorithms we are trying to enable in the hardware. However there is one other topic that I need to cover before getting to that.

Recall that the 128-qubit Rainier chip is designed to embody a quantum algorithm whose objective is to find the global minimum of

{\cal E}(s_1, ... , s_{128}) = \sum_{j=1}^{128} h_j s_j + \sum_{i,j \in E_p} J_{ij} s_i s_j

where s_j=\pm 1 and E_p is the edge set I defined in post #1 on this topic.

The parameters h_j and J_{ij} arise from currents that flow on the chip, which are set by a user. For a variety of reasons, including most importantly the presence of low frequency magnetic flux noise in the chip itself, the actual values of h_j and J_{ij} that are implemented in the chip are not the desired values, but the desired values \pm some offset. The typical spread between actual and desired values defines what we refer to as the precision of the setting of these parameters.

More specifically, the h_j and J_{ij} values are physically restricted to some interval between maximum positive and negative values. In our implementation the maximum range is on the order of +3.5 GHz to -3.5 GHz for both h_j and J_{ij}. We can normalize these intervals to the range (-1..+1). We then ask the following question: if we try to set a parameter to zero, what does the distribution of actual parameter values look like? Let’s assume that it is a zero mean gaussian with variance \sigma. Then we know if we try to set that parameter to a value less than \sigma the system does not know for certain whether we’re asking for zero or the small number.

In order to get around this “smearing” problem, we only allow parameters that are (almost always) distinguishable. This means that in the whole allowed parameter value interval (-1,+1) we must define a set of k allowed values. We require that k=2^n-1 and define n to be the number of bits of precision allowed by hardware. Here is a picture showing how this works for n=2 bits of precision, for allowed values of -1, 0, +1. The y-axis here is the probability of the value of the parameter being the indicated value for a large number of attempts to set to the three allowed values.

precision

In this case the spreads in trying to set to a parameter value of zero and +1 do not appreciably overlap. In practice we have to be careful about how to define the required spread but this is the basic idea.

If we abstract away from the physical origin of these offsets and think about how to model this finite precision, the first order approximation is to only allow the parameters h_j and J_{ij} in the objective function take on the k discrete values -2^{n-1}+1, -2^{n-1}+2, ..., 0, ..., +2^{n-1}-1. For example, if we specify 4 bits of precision in hardware, this means that the  h_j and J_{ij} values in the objective function can only take on values in the 15-element set \{ -7, -6, -5, -4, -3, -2, -1 , 0, +1, +2, +3, +4, +5, +6, +7 \}.