In the previous post I introduced the concept of an allowed edge set in hardware, and presented the edge set we chose for our current processor designs. I mentioned that a reduction in edges reduces the capability of the processor. In this post I will explain why this is and how it affects the performance of the system.
To begin, note that in the design we’re building, it is useful to refer to the physical qubits as vertices and physical couplers as edges in a graph . Call the hardware graph. The hardware graphs for 8, 32 and 128-qubit systems were explicitly shown in the previous post, although in principle many other types of hardware graph (say for example the one we used for the 28-qubit Leda or 16-qubit Cerberus chips) are feasible. Internally here we call graphs formed of tiling the plane with blocks Chimera graphs.
Assume a hardware graph has been defined. For the following example, we’ll use the 32-qubit version for simplicity. Here it is for reference:
We now want to use this hardware graph for solving problems. Assume that we wish to use the chip to run a quantum adiabatic algorithm to solve for the global minimum of
ie, a fully connected (all non-zero) graph on 32 variables. Can we do this? The answer is no. To see why, assign to this problem a graph , with vertices identified with variables and edges drawn whenever a non-zero is required. For this example, this graph is the fully connected (all non-zero) graph on 32 vertices. Call the problem graph.
A problem graph can be solved natively in the hardware if can be embedded in the hardware graph . A graph embedding is in general a one to many map between vertices and edges in to vertices and edges in . In the example above, cannot be embedded in , as they have the same number of vertices, while is fully connected and is not.
Let’s try another problem graph. Let’s say we want to solve for the global minimum of
where and the problem edge set are arbitrary. What can we say about whether or not we can embed the problem graph into a hardware graph? Here are some things that we know:
- You can always embed a fully connected graph on variables into a Chimera graph with variables.
- You can never embed a problem graph that has either more vertices or more edges than a hardware graph.
- Any instance that is in the middle needs to be determined by solving what is in general a hard problem in its own right, that is does embed in . In practice if you wanted to operate in this regime, you’d probably have a fast heuristic embedder to see if a problem natively embeds.
Based on these observations, there are three obvious ways these chips can be operated.
- Use the chip as a solver for complete graphs, where the algorithm calling the chip has a hard-coded embedding scheme for any problem edge set. This option has the advantage that it is the most flexible for inclusion in the master algorithm (no restrictions on problem edge set), and there is no runtime cost for embedding. Its key disadvantage is that the number of variables of problems you can solve in this mode is upper bounded by roughly the square root of the number of qubits in hardware, which is a significant cost with the current chips & their low qubit count.
- Use the chip as a solver for problem graphs that exactly match the hardware graph by making this a hard constraint in the master algorithm. In other words, as an algorithm designer you are constrained to only pose problems to the hardware that have problem graphs that exactly match the hardware graph. This has the advantage of having trivial hard-coded embedding and maximally using the resources of the hardware. Its key disadvantage is a severe loss of flexibility in algorithms design possibilities.
- Use the chip for arbitrary problem sizes by running an embedding heuristic each time a possibly embeddable graph is generated by the master algorithm. This has the advantage of a significant gain in flexibility for algorithm designers. The main disadvantage is the increased runtime burden of having to compute embeddings on the fly.
My own strong preference is for #2 above. The loss of flexibility in algorithms design is a big problem, but we’ve been able to find ways to build algorithms respecting fixed hardware graphs already so I think that the advantages of #2 carry the day, at least in the short term.
So how does embedding work in practice? I will defer delving too deeply into the technical details here, leaving this to this publication. Here is a picture from that paper. The problem graph is on the left, and a simple hardware graph (not the one we use) is on the right. This is an example of a successful embedding.
One very interesting subject that Vicky’s paper discusses is that the embedding details actually affect the runtime of the quantum adiabatic algorithm for solving the problem, not just whether the problem can be natively solved. In other words, two different embeddings which are both completely valid can have vastly differing adiabatic runtimes. The reason for this is that the runtime is a function of the minimum gap between ground and first excited states, and the matrix element between them. Both of these are functions of the actual embedding and embedding parameters used. In fact, it is likely that different embedding strategies could change the asymptotic scaling of quantum adiabatic algorithms. As far as I know, no-one is currently investigating this aspect of the scaling of these algorithms–this is a feature of operation in a regime where the AQC does not natively support full inter-qubit connection (ie. any real machine).
To finish this introduction, note that the 128-qubit Rainier chip can embed two complete graphs of size and as shown here:
Here the blue and red circles are couplers, the black lines are qubits (a 4×4 array of 8-qubit unit cells), the yellow circles are individual qubit biases (the ). The dashed lines show regions where fully connected graphs can be embedded. If an algorithm designer wanted to use this chip in mode #1 above (a complete graph solver), you’d architect the system to feed this chip 16 and/or 12 variable problems. This illustrates the significant hit using the chips as complete solvers gives. You go from having 128 variables to 16. This is the primary reason why I think using this scale of system as a fully connected graph solver is not the best idea.