<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>Hack The Multiverse</title>
	<atom:link href="http://dwave.wordpress.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://dwave.wordpress.com</link>
	<description>Programming quantum computers for fun and profit</description>
	<lastBuildDate>Tue, 18 Jun 2013 04:38:54 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='dwave.wordpress.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://s2.wp.com/i/buttonw-com.png</url>
		<title>Hack The Multiverse</title>
		<link>http://dwave.wordpress.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://dwave.wordpress.com/osd.xml" title="Hack The Multiverse" />
	<atom:link rel='hub' href='http://dwave.wordpress.com/?pushpress=hub'/>
		<item>
		<title>Sparse coding on D-Wave hardware: finding an optimal structured dictionary</title>
		<link>http://dwave.wordpress.com/2013/06/06/sparse-coding-on-d-wave-hardware-finding-an-optimal-structured-dictionary/</link>
		<comments>http://dwave.wordpress.com/2013/06/06/sparse-coding-on-d-wave-hardware-finding-an-optimal-structured-dictionary/#comments</comments>
		<pubDate>Fri, 07 Jun 2013 00:00:52 +0000</pubDate>
		<dc:creator>Geordie</dc:creator>
				<category><![CDATA[D-Wave Science & Technology]]></category>
		<category><![CDATA[Quantum computer programming]]></category>
		<category><![CDATA[Sparse Coding]]></category>

		<guid isPermaLink="false">http://dwave.wordpress.com/?p=2574</guid>
		<description><![CDATA[I spend most of my time thinking about machine intelligence. I would like to build machines that can think and act like we do. There are many hard problems to solve before we get there. A thing I&#8217;ve been thinking &#8230; <a href="http://dwave.wordpress.com/2013/06/06/sparse-coding-on-d-wave-hardware-finding-an-optimal-structured-dictionary/">Continue reading <span class="meta-nav">&#8594;</span></a><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=dwave.wordpress.com&#038;blog=336042&#038;post=2574&#038;subd=dwave&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>I spend most of my time thinking about machine intelligence. I would like to build machines that can think and act like we do. There are many hard problems to solve before we get there.</p>
<div id="attachment_2584" class="wp-caption alignright" style="width: 229px"><a href="http://dwave.files.wordpress.com/2013/06/crawlingcelegans.gif"><img class="size-full wp-image-2584" alt="C. elegans is unsegmented, vermiform, and bilaterally symmetrical, with a cuticle integument, four main epidermal cords and a fluid-filled pseudocoelomate cavity... pfn'glui!!!!" src="http://dwave.files.wordpress.com/2013/06/crawlingcelegans.gif?w=584"   /></a><p class="wp-caption-text">C. elegans is vermiform, with a cuticle integument and a fluid-filled pseudocoelomate cavity. There are two sexes: male and hermaphrodite.</p></div>
<p>A thing I&#8217;ve been thinking about recently is how the limits of cognition matter in understanding cognition itself. Clearly you can have examples where the machinery is not sophisticated enough. For example, a C. elegans roundworm is not going to reverse engineer its own neural system any time soon, even though it is very simple (<a href="http://www.openworm.org/">although the descendants of openworm might</a>&#8230;).</p>
<p>Openworm is a really interesting idea. I hope it does well and succeeds. Although the idea of the first life forms on the internet being worms, that OBVIOUSLY will grow super intelligent and take over the universe and consume all its atoms <a href="http://www.youtube.com/watch?v=4U4N9MQXcgc">making bigger and bigger Harlem Shake videos</a>, is a little off-putting from a human perspective. <a href="http://en.wikipedia.org/wiki/De_Vermis_Mysteriis">De Vermis Mysteriis</a>&#8230;</p>
<p><strong>We are so smart, s-m-r-t</strong></p>
<div id="attachment_2586" class="wp-caption alignright" style="width: 222px"><a href="http://dwave.files.wordpress.com/2013/06/homer.jpg"><img class="size-medium wp-image-2586" alt="Hoomans are smrt." src="http://dwave.files.wordpress.com/2013/06/homer.jpg?w=212&#038;h=300" width="212" height="300" /></a><p class="wp-caption-text">Hoomans are smrt.</p></div>
<p>Imagine the most intelligent entity possible. Would that thing be able to understand its own cognition? As you crank up a cognitive system&#8217;s ability to model its environment, presumably the cognition system itself gets more difficult to understand.</p>
<p class="size-medium wp-image-2586">Is the human cognitive system both smart enough and simple enough to self reverse engineer? It&#8217;s probably in the right zone. We seem to be smart enough to understand enough of the issues to take a good run at the problem, because our cognition system is simple enough to not be beyond the ability of our cognition system itself to understand. <a href="http://en.wikipedia.org/wiki/I_Am_a_Strange_Loop">How&#8217;s that for some Hofstadter style recursion</a>.</p>
<p>Anyway enough with the <a href="http://www.deepthoughtsbyjackhandey.com/">Deep Thoughts</a>. Let&#8217;s do some math! Math is fun. Not as fun as universe eating worms. But solving this problem well is important. At least to me and my unborn future vermiform army. Maybe you can help solve it.</p>
<p><strong>A short review of L0-norm sparse coding with structured dictionaries</strong></p>
<p>Last time we discussed sparse coding on the hardware, I introduced an idea for getting around a problem in using D-Wave style processor architectures effectively &#8211; the <a href="http://dwave.wordpress.com/2013/04/17/sparse-coding-on-d-wave-hardware-things-that-dont-work/">mismatch between the connectivity of the problem we want to solve and the connectivity of the hardware</a>.</p>
<p>Let&#8217;s begin by first reviewing the idea. If you&#8217;d like a more in-depth overview, <a href="http://dwave.wordpress.com/2013/04/29/sparse-coding-on-d-wave-hardware-structured-dictionaries/">here is the original post I wrote about it</a>. Here is the condensed version.</p>
<p>Given</p>
<ol>
<li>A set of <img src='http://s0.wp.com/latex.php?latex=S&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S' title='S' class='latex' /> data objects <img src='http://s0.wp.com/latex.php?latex=%5Cvec%7Bz%7D_s&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;vec{z}_s' title='&#92;vec{z}_s' class='latex' />, where each <img src='http://s0.wp.com/latex.php?latex=%5Cvec%7Bz%7D_s&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;vec{z}_s' title='&#92;vec{z}_s' class='latex' /> is a real valued vector with <img src='http://s0.wp.com/latex.php?latex=N&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='N' title='N' class='latex' /> components;</li>
<li>An <img src='http://s0.wp.com/latex.php?latex=N+x+K&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='N x K' title='N x K' class='latex' /> real valued matrix <img src='http://s0.wp.com/latex.php?latex=%5Chat%7BD%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;hat{D}' title='&#92;hat{D}' class='latex' />, where <img src='http://s0.wp.com/latex.php?latex=K&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='K' title='K' class='latex' /> is the number of dictionary atoms we choose, and we define its <img src='http://s0.wp.com/latex.php?latex=k%5E%7Bth%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k^{th}' title='k^{th}' class='latex' /> column to be the vector <img src='http://s0.wp.com/latex.php?latex=%5Cvec%7Bd%7D_k&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;vec{d}_k' title='&#92;vec{d}_k' class='latex' />;</li>
<li>A <img src='http://s0.wp.com/latex.php?latex=K+x+S&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='K x S' title='K x S' class='latex' /> binary valued matrix <img src='http://s0.wp.com/latex.php?latex=%5Chat%7BW%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;hat{W}' title='&#92;hat{W}' class='latex' />, whose matrix elements are <img src='http://s0.wp.com/latex.php?latex=w_%7Bks%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='w_{ks}' title='w_{ks}' class='latex' />;</li>
<li>And a real number <img src='http://s0.wp.com/latex.php?latex=%5Clambda&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;lambda' title='&#92;lambda' class='latex' />, which is called the regularization parameter,</li>
</ol>
<p>Find <img src='http://s0.wp.com/latex.php?latex=%5Chat%7BW%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;hat{W}' title='&#92;hat{W}' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=%5Chat%7BD%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;hat{D}' title='&#92;hat{D}' class='latex' /> that minimize</p>
<p><img src='http://s0.wp.com/latex.php?latex=G%28%5Chat%7BW%7D%2C+%5Chat%7BD%7D+%3B+%5Clambda%29+%3D+%5Csum_%7Bs%3D1%7D%5ES+%7C%7C+%5Cvec%7Bz%7D_%7Bs%7D+-+%5Csum_%7Bk%3D1%7D%5E%7BK%7D+w_%7Bks%7D+%5Cvec%7Bd%7D_k+%7C%7C%5E2+%2B+%5Clambda+%5Csum_%7Bs%3D1%7D%5ES+%5Csum_%7Bk%3D1%7D%5E%7BK%7D+w_%7Bks%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G(&#92;hat{W}, &#92;hat{D} ; &#92;lambda) = &#92;sum_{s=1}^S || &#92;vec{z}_{s} - &#92;sum_{k=1}^{K} w_{ks} &#92;vec{d}_k ||^2 + &#92;lambda &#92;sum_{s=1}^S &#92;sum_{k=1}^{K} w_{ks}' title='G(&#92;hat{W}, &#92;hat{D} ; &#92;lambda) = &#92;sum_{s=1}^S || &#92;vec{z}_{s} - &#92;sum_{k=1}^{K} w_{ks} &#92;vec{d}_k ||^2 + &#92;lambda &#92;sum_{s=1}^S &#92;sum_{k=1}^{K} w_{ks}' class='latex' /></p>
<p>subject to the constraints that <img src='http://s0.wp.com/latex.php?latex=%5Cvec%7Bd%7D_j+%5Ccdot+%5Cvec%7Bd%7D_m+%3D+0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;vec{d}_j &#92;cdot &#92;vec{d}_m = 0' title='&#92;vec{d}_j &#92;cdot &#92;vec{d}_m = 0' class='latex' /> for all pairs <img src='http://s0.wp.com/latex.php?latex=j%2Cm&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='j,m' title='j,m' class='latex' /> that are not connected in the quantum chip being used.</p>
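<p>To make the definitions concrete, here is a minimal sketch in Python with NumPy that evaluates G and checks the orthogonality constraints. The function names, the toy 4-cycle connectivity graph and the numbers are all assumptions made up for illustration (a real chip would use its own Chimera edge list), and indices start at 0 rather than 1.</p>

```python
import numpy as np

def objective(Z, D, W, lam):
    """G(W, D; lambda): squared reconstruction error plus lambda times the number of active weights."""
    residual = Z - D @ W  # column s is z_s - sum_k w_ks d_k
    return float(np.sum(residual ** 2) + lam * np.sum(W))

def max_constraint_violation(D, edges):
    """Largest |d_j . d_m| over pairs (j, m) that are not connected in the hardware graph."""
    K = D.shape[1]
    gram = D.T @ D
    connected = {frozenset(edge) for edge in edges}
    worst = 0.0
    for j in range(K):
        for m in range(j + 1, K):
            if frozenset((j, m)) not in connected:
                worst = max(worst, abs(float(gram[j, m])))
    return worst

# Hypothetical connectivity: a 4-cycle, so pairs (0, 2) and (1, 3) are unconnected.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
D = np.array([[1.0, 1.0, 0.0, 0.0],
              [0.0, 1.0, 1.0, 0.0],
              [0.0, 0.0, 1.0, 1.0],
              [0.0, 0.0, 0.0, 1.0]])  # columns are the atoms d_0 .. d_3
W = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.0, 0.0, 1.0]])       # K x S binary weights
Z = D @ W                             # data this dictionary reconstructs exactly

print(objective(Z, D, W, lam=0.5))
print(max_constraint_violation(D, edges))
```

<p>Because Z is built to be exactly reconstructible, the only contribution to G here is the regularization term, and the toy dictionary satisfies both constraints.</p>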
<p>To solve this problem, we use block coordinate descent, which works like this:</p>
<ol>
<li>First, we generate a random dictionary <img src='http://s0.wp.com/latex.php?latex=%5Chat%7BD%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;hat{D}' title='&#92;hat{D}' class='latex' />, subject to meeting the orthogonality constraints we&#8217;ve imposed on the dictionary atoms.</li>
<li>Holding this dictionary fixed, we solve the optimization problem for the weights <img src='http://s0.wp.com/latex.php?latex=%5Chat%7BW%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;hat{W}' title='&#92;hat{W}' class='latex' />. This splits into one independent problem per data object, and each of these is a Chimera-structured QUBO that fits exactly onto the hardware by construction.</li>
<li>Now we fix the weights to these values, and find the optimal dictionary <img src='http://s0.wp.com/latex.php?latex=%5Chat%7BD%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;hat{D}' title='&#92;hat{D}' class='latex' />, again subject to our constraints.</li>
</ol>
<p>We then iterate steps 2 and 3 until <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> converges to a minimum, keeping in mind that this problem is jointly non-convex and the minimum you get will be a local minimum. Each restart of the whole algorithm from a new starting point will generally lead to a different local minimum, so a better answer can be had by running this procedure several times and keeping the best result.</p>
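<p>As an illustration of step 2, here is a hedged sketch of how the problem for a single data object turns into a QUBO once the dictionary is fixed. Expanding the square and using the fact that a binary weight equals its own square gives a linear coefficient (d_k . d_k - 2 z . d_k + lambda) on each weight and a coupling 2 d_j . d_m on each pair, so the couplings vanish exactly where the atoms are orthogonal. The function names and the toy 4-atom dictionary are assumptions for illustration, and the exhaustive search only stands in for the hardware.</p>

```python
import itertools
import numpy as np

def build_qubo(z, D, lam):
    """Return (Q, offset) with ||z - D w||^2 + lam * sum(w) = w^T Q w + offset for binary w."""
    gram = D.T @ D
    K = D.shape[1]
    Q = 2.0 * np.triu(gram, k=1)  # couplings 2 * d_j . d_m on the upper triangle
    # Linear terms go on the diagonal because w_k^2 = w_k for binary w_k.
    Q[np.diag_indices(K)] = np.diag(gram) - 2.0 * (D.T @ z) + lam
    return Q, float(z @ z)

def brute_force_minimum(Q):
    """Exhaustive search over all 2^K binary vectors; only sensible for tiny K."""
    K = Q.shape[0]
    candidates = (np.array(bits, dtype=float)
                  for bits in itertools.product((0, 1), repeat=K))
    return min(candidates, key=lambda w: float(w @ Q @ w))

# Toy dictionary whose unconnected pairs (0, 2) and (1, 3) are orthogonal.
D = np.array([[1.0, 1.0, 0.0, 0.0],
              [0.0, 1.0, 1.0, 0.0],
              [0.0, 0.0, 1.0, 1.0],
              [0.0, 0.0, 0.0, 1.0]])
z = D[:, 2] + D[:, 3]  # a data object built from atoms 2 and 3

Q, offset = build_qubo(z, D, lam=0.5)
w_best = brute_force_minimum(Q)
print(Q)
print(w_best)
```

<p>The entries of Q for the unconnected pairs come out as zero, which is the sense in which the QUBO inherits the structure of the hardware graph.</p>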
<p><strong>Step 3: Finding an optimal structured dictionary given fixed weights</strong></p>
<p>The hard problem is Step 3 above. Here the weights <img src='http://s0.wp.com/latex.php?latex=%5Chat%7BW%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;hat{W}' title='&#92;hat{W}' class='latex' /> are fixed, and we want to find an optimal structured dictionary. Here is the formal statement of the problem.</p>
<p>Given</p>
<ol>
<li>An <img src='http://s0.wp.com/latex.php?latex=N+x+S&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='N x S' title='N x S' class='latex' /> real valued matrix <img src='http://s0.wp.com/latex.php?latex=%5Chat%7BZ%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;hat{Z}' title='&#92;hat{Z}' class='latex' />, where <img src='http://s0.wp.com/latex.php?latex=S&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S' title='S' class='latex' /> is the number of data objects, and we define the <img src='http://s0.wp.com/latex.php?latex=s%5E%7Bth%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='s^{th}' title='s^{th}' class='latex' /> column to be the <img src='http://s0.wp.com/latex.php?latex=s%5E%7Bth%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='s^{th}' title='s^{th}' class='latex' /> data object <img src='http://s0.wp.com/latex.php?latex=%5Cvec%7Bz%7D_s&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;vec{z}_s' title='&#92;vec{z}_s' class='latex' />, where each <img src='http://s0.wp.com/latex.php?latex=%5Cvec%7Bz%7D_s&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;vec{z}_s' title='&#92;vec{z}_s' class='latex' /> is a real valued vector with <img src='http://s0.wp.com/latex.php?latex=N&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='N' title='N' class='latex' /> components, and the matrix elements of <img src='http://s0.wp.com/latex.php?latex=%5Chat%7BZ%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;hat{Z}' title='&#92;hat{Z}' class='latex' /> are <img src='http://s0.wp.com/latex.php?latex=z_%7Bns%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='z_{ns}' title='z_{ns}' class='latex' />;</li>
<li>An <img src='http://s0.wp.com/latex.php?latex=N+x+K&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='N x K' title='N x K' class='latex' /> real valued matrix <img src='http://s0.wp.com/latex.php?latex=%5Chat%7BD%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;hat{D}' title='&#92;hat{D}' class='latex' />, where <img src='http://s0.wp.com/latex.php?latex=K&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='K' title='K' class='latex' /> is the number of dictionary atoms we choose, and we define its <img src='http://s0.wp.com/latex.php?latex=k%5E%7Bth%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k^{th}' title='k^{th}' class='latex' /> column to be the vector <img src='http://s0.wp.com/latex.php?latex=%5Cvec%7Bd%7D_k&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;vec{d}_k' title='&#92;vec{d}_k' class='latex' />, and the matrix elements of <img src='http://s0.wp.com/latex.php?latex=%5Chat%7BD%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;hat{D}' title='&#92;hat{D}' class='latex' /> are <img src='http://s0.wp.com/latex.php?latex=d_%7Bnk%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='d_{nk}' title='d_{nk}' class='latex' />;</li>
<li>And a <img src='http://s0.wp.com/latex.php?latex=K+x+S&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='K x S' title='K x S' class='latex' /> binary valued matrix <img src='http://s0.wp.com/latex.php?latex=%5Chat%7BW%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;hat{W}' title='&#92;hat{W}' class='latex' /> with matrix elements <img src='http://s0.wp.com/latex.php?latex=w_%7Bks%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='w_{ks}' title='w_{ks}' class='latex' />;</li>
</ol>
<p>Find <img src='http://s0.wp.com/latex.php?latex=%5Chat%7BD%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;hat{D}' title='&#92;hat{D}' class='latex' /> that minimizes</p>
<p><img src='http://s0.wp.com/latex.php?latex=G%5E%7B%2A%7D%28%5Chat%7BD%7D%29+%3D+%5Csum_%7Bs%3D1%7D%5ES+%7C%7C+%5Cvec%7Bz%7D_%7Bs%7D+-+%5Csum_%7Bk%3D1%7D%5E%7BK%7D+w_%7Bks%7D+%5Cvec%7Bd%7D_k+%7C%7C%5E2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G^{*}(&#92;hat{D}) = &#92;sum_{s=1}^S || &#92;vec{z}_{s} - &#92;sum_{k=1}^{K} w_{ks} &#92;vec{d}_k ||^2' title='G^{*}(&#92;hat{D}) = &#92;sum_{s=1}^S || &#92;vec{z}_{s} - &#92;sum_{k=1}^{K} w_{ks} &#92;vec{d}_k ||^2' class='latex' /></p>
<p><img src='http://s0.wp.com/latex.php?latex=%7B%3D%5Csum_%7Bs%3D1%7D%5ES+%5Csum_%7Bn%3D1%7D%5EN+%28z_%7Bns%7D-%5Csum_%7Bk%3D1%7D%5E%7BK%7D+w_%7Bks%7D+d_%7Bnk%7D%29%5E2%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='{=&#92;sum_{s=1}^S &#92;sum_{n=1}^N (z_{ns}-&#92;sum_{k=1}^{K} w_{ks} d_{nk})^2}' title='{=&#92;sum_{s=1}^S &#92;sum_{n=1}^N (z_{ns}-&#92;sum_{k=1}^{K} w_{ks} d_{nk})^2}' class='latex' /></p>
<p><img src='http://s0.wp.com/latex.php?latex=%7B%3D%7C%7C%5Chat%7BZ%7D-%5Chat%7BD%7D+%5Chat%7BW%7D%7C%7C%5E2%3DTr%28%5Chat%7BA%7D%5ET+%5Chat%7BA%7D%29%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='{=||&#92;hat{Z}-&#92;hat{D} &#92;hat{W}||^2=Tr(&#92;hat{A}^T &#92;hat{A})}' title='{=||&#92;hat{Z}-&#92;hat{D} &#92;hat{W}||^2=Tr(&#92;hat{A}^T &#92;hat{A})}' class='latex' /></p>
<p>where <img src='http://s0.wp.com/latex.php?latex=%5Chat%7BA%7D%3D%5Chat%7BZ%7D-%5Chat%7BD%7D+%5Chat%7BW%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;hat{A}=&#92;hat{Z}-&#92;hat{D} &#92;hat{W}' title='&#92;hat{A}=&#92;hat{Z}-&#92;hat{D} &#92;hat{W}' class='latex' />, subject to the constraints that <img src='http://s0.wp.com/latex.php?latex=%5Cvec%7Bd%7D_j+%5Ccdot+%5Cvec%7Bd%7D_m+%3D+0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;vec{d}_j &#92;cdot &#92;vec{d}_m = 0' title='&#92;vec{d}_j &#92;cdot &#92;vec{d}_m = 0' class='latex' /> for all pairs <img src='http://s0.wp.com/latex.php?latex=j%2Cm&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='j,m' title='j,m' class='latex' /> that are not connected in the quantum chip being used.</p>
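<p>The three ways of writing the objective above are the same number: the norm in the last line is the Frobenius norm, which is just the sum of squared matrix elements. Here is a quick numerical check on random data. The sizes and the random seed are arbitrary choices, and the random dictionary is not required to satisfy the orthogonality constraints, since the identity holds either way.</p>

```python
import numpy as np

rng = np.random.default_rng(0)
N, K, S = 5, 4, 6
Z = rng.normal(size=(N, S))
D = rng.normal(size=(N, K))
W = rng.integers(0, 2, size=(K, S)).astype(float)

# Form 1: sum over data objects of squared vector norms.
g_vectors = sum(float(np.sum((Z[:, s] - D @ W[:, s]) ** 2)) for s in range(S))

# Form 2: explicit double sum over data objects and components.
g_elements = sum(
    (Z[n, s] - sum(W[k, s] * D[n, k] for k in range(K))) ** 2
    for s in range(S)
    for n in range(N)
)

# Form 3: trace of A^T A with A = Z - D W.
A = Z - D @ W
g_trace = float(np.trace(A.T @ A))

print(g_vectors, g_elements, g_trace)
```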
<p>What makes this problem hard is that the constraints on the dictionary atoms are non-linear, and there are a lot of them (one for each pair of variables not connected in hardware).</p>
<p><strong>Ideas for attacking this problem</strong></p>
<p>I&#8217;m not sure what the best approach is for trying to solve this problem. Here are some observations:</p>
<ul>
<li>We want to be operating in the regime where <img src='http://s0.wp.com/latex.php?latex=%5Chat%7BW%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;hat{W}' title='&#92;hat{W}' class='latex' /> is sparse. In this limit most of the <img src='http://s0.wp.com/latex.php?latex=w_%7Bks%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='w_{ks}' title='w_{ks}' class='latex' /> will be zero. Because the coupling term is quadratic in <img src='http://s0.wp.com/latex.php?latex=%5Chat%7BW%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;hat{W}' title='&#92;hat{W}' class='latex' />&#8217;s matrix elements, for all L0-norm sparse coding problems most of the coupling terms are going to be zero. This suggests a possible strategy where we could first solve for <img src='http://s0.wp.com/latex.php?latex=%5Chat%7BD%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;hat{D}' title='&#92;hat{D}' class='latex' /> assuming that the quadratic term was zero, and then whatever we do next could use this as a starting point.</li>
<li>There are some types of matrix operations that would not mess up the structure of the dictionary but would allow parametrization of changes within the allowed space. If we could then optimize over those parameters we could take care of the constraints without having to do any work to enforce them.</li>
<li>There is a local search heuristic where you can optimize each dictionary atom <img src='http://s0.wp.com/latex.php?latex=%5Cvec%7Bd%7D_k&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;vec{d}_k' title='&#92;vec{d}_k' class='latex' /> moving from <img src='http://s0.wp.com/latex.php?latex=k%3D1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k=1' title='k=1' class='latex' /> to <img src='http://s0.wp.com/latex.php?latex=k%3DK&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k=K' title='k=K' class='latex' /> in order while keeping the other columns fixed, and just iterating until convergence (you have to do some rearranging to ensure the orthogonality is maintained throughout using the null space idea in the previous post). This will probably by itself not be a great strategy and will probably get you stuck in local optima but maybe it would work OK.</li>
</ul>
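<p>The third idea in the list above can be sketched as follows. This is one possible reading of it, not a tested solution to the problem: with every other column fixed, the objective is an isotropic quadratic in the single atom d_k, so the constrained optimum is the unconstrained least-squares solution projected onto the subspace orthogonal to the atoms that d_k is not connected to. The function names, the toy 4-cycle connectivity and the random data are assumptions for illustration.</p>

```python
import numpy as np

def update_atom(Z, D, W, k, non_neighbors):
    """Best d_k with all other columns fixed, subject to d_k . d_m = 0 for m in non_neighbors."""
    w_k = W[k, :]
    norm_sq = float(w_k @ w_k)
    if norm_sq == 0.0:
        return D[:, k]  # atom k is unused by every data object; leave it alone
    # Residual with atom k's own contribution added back in.
    R = Z - D @ W + np.outer(D[:, k], w_k)
    d_new = R @ w_k / norm_sq  # unconstrained least-squares optimum
    if non_neighbors:
        B = D[:, non_neighbors]  # atoms that d_k must stay orthogonal to
        # Subtract the component of d_new that lies in span(B).
        coeffs, *_ = np.linalg.lstsq(B, d_new, rcond=None)
        d_new = d_new - B @ coeffs
    return d_new

def sweep(Z, D, W, non_neighbors_of, n_sweeps=10):
    """Cycle through the atoms in order, updating one at a time."""
    D = D.copy()
    for _ in range(n_sweeps):
        for k in range(D.shape[1]):
            D[:, k] = update_atom(Z, D, W, k, non_neighbors_of[k])
    return D

# Hypothetical 4-cycle connectivity: atom 0 is unconnected to 2, atom 1 to 3.
non_neighbors_of = {0: [2], 1: [3], 2: [0], 3: [1]}
D0 = np.array([[1.0, 1.0, 0.0, 0.0],
               [0.0, 1.0, 1.0, 0.0],
               [0.0, 0.0, 1.0, 1.0],
               [0.0, 0.0, 0.0, 1.0]])  # feasible starting dictionary
rng = np.random.default_rng(0)
W = rng.integers(0, 2, size=(4, 8)).astype(float)
Z = rng.normal(size=(4, 8))

D1 = sweep(Z, D0, W, non_neighbors_of)
print(np.sum((Z - D0 @ W) ** 2), np.sum((Z - D1 @ W) ** 2))
```

<p>Each single-atom update cannot increase the objective, because the previous atom is always a feasible candidate, and a feasible starting dictionary stays feasible. Nothing here guards against getting stuck in a poor local optimum, which is the weakness already noted in the list.</p>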
<p>What do you think? I might be able to get you time on a machine if you can come up with an interesting way to solve this problem effectively&#8230; <img src='http://s0.wp.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
]]></content:encoded>
			<wfw:commentRss>http://dwave.wordpress.com/2013/06/06/sparse-coding-on-d-wave-hardware-finding-an-optimal-structured-dictionary/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/373bb2536951daebfdc760c9099af9af?s=96&#38;d=http%3A%2F%2F0.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D96" medium="image">
			<media:title type="html">Geordie</media:title>
		</media:content>

		<media:content url="http://dwave.files.wordpress.com/2013/06/crawlingcelegans.gif" medium="image">
			<media:title type="html">C. elegans is unsegmented, vermiform, and bilaterally symmetrical, with a cuticle integument, four main epidermal cords and a fluid-filled pseudocoelomate cavity... pfn&#039;glui!!!!</media:title>
		</media:content>

		<media:content url="http://dwave.files.wordpress.com/2013/06/homer.jpg?w=212" medium="image">
			<media:title type="html">Hoomans are smrt.</media:title>
		</media:content>
	</item>
		<item>
		<title>Moonshot thinking</title>
		<link>http://dwave.wordpress.com/2013/05/28/moonshot-thinking/</link>
		<comments>http://dwave.wordpress.com/2013/05/28/moonshot-thinking/#comments</comments>
		<pubDate>Tue, 28 May 2013 14:37:48 +0000</pubDate>
		<dc:creator>Geordie</dc:creator>
				<category><![CDATA[Media]]></category>

		<guid isPermaLink="false">http://dwave.wordpress.com/?p=2572</guid>
		<description><![CDATA[Here is a very cool video from friends at Google&#8230; couldn&#8217;t agree more. The world needs more big dreams. You don&#8217;t spend your time being bothered that you can&#8217;t teleport from here to Japan. Because there&#8217;s a part of you &#8230; <a href="http://dwave.wordpress.com/2013/05/28/moonshot-thinking/">Continue reading <span class="meta-nav">&#8594;</span></a><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=dwave.wordpress.com&#038;blog=336042&#038;post=2572&#038;subd=dwave&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>Here is a very cool video from friends at Google&#8230; couldn&#8217;t agree more. The world needs more big dreams.</p>
<blockquote><p>You don&#8217;t spend your time being bothered that you can&#8217;t teleport from here to Japan. Because there&#8217;s a part of you that thinks it&#8217;s impossible. Moonshot thinking is choosing to be bothered by that&#8230; our ambitions are a glass ceiling on what we can accomplish. &#8212; Astro Teller</p></blockquote>
<p><span class='embed-youtube' style='text-align:center; display: block;'><iframe class='youtube-player' type='text/html' width='584' height='359' src='http://www.youtube.com/embed/0uaquGZKx_0?version=3&#038;rel=1&#038;fs=1&#038;showsearch=0&#038;showinfo=1&#038;iv_load_policy=1&#038;wmode=transparent' frameborder='0'></iframe></span></p>
]]></content:encoded>
			<wfw:commentRss>http://dwave.wordpress.com/2013/05/28/moonshot-thinking/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/373bb2536951daebfdc760c9099af9af?s=96&#38;d=http%3A%2F%2F0.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D96" medium="image">
			<media:title type="html">Geordie</media:title>
		</media:content>
	</item>
		<item>
		<title>New Nature Communications paper and a bonus NPR interview</title>
		<link>http://dwave.wordpress.com/2013/05/22/new-nature-communications-paper-and-a-bonus-npr-interview/</link>
		<comments>http://dwave.wordpress.com/2013/05/22/new-nature-communications-paper-and-a-bonus-npr-interview/#comments</comments>
		<pubDate>Wed, 22 May 2013 15:16:25 +0000</pubDate>
		<dc:creator>Geordie</dc:creator>
				<category><![CDATA[D-Wave Science & Technology]]></category>
		<category><![CDATA[Media]]></category>
		<category><![CDATA[Musings about stuff]]></category>
		<category><![CDATA[World Domination]]></category>

		<guid isPermaLink="false">http://dwave.wordpress.com/?p=2565</guid>
		<description><![CDATA[One of the most important things to try to understand about real quantum computers is how they behave in the presence of environments. Sometimes these environments are called &#8216;baths&#8217; by physicists. I like this term because it&#8217;s really evocative of &#8230; <a href="http://dwave.wordpress.com/2013/05/22/new-nature-communications-paper-and-a-bonus-npr-interview/">Continue reading <span class="meta-nav">&#8594;</span></a><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=dwave.wordpress.com&#038;blog=336042&#038;post=2565&#038;subd=dwave&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<div id="attachment_2566" class="wp-caption alignright" style="width: 160px"><a href="http://dwave.files.wordpress.com/2013/05/qubits_love_the_bath.jpg"><img class="size-thumbnail wp-image-2566" alt="Woman Resting in Bath" src="http://dwave.files.wordpress.com/2013/05/qubits_love_the_bath.jpg?w=150&#038;h=120" width="150" height="120" /></a><p class="wp-caption-text">Baths can be beneficial, even for qubits.</p></div>
<p>One of the most important things to try to understand about real quantum computers is how they behave in the presence of environments. Sometimes these environments are called &#8216;baths&#8217; by physicists. I like this term because it&#8217;s really evocative of what&#8217;s physically going on. You can imagine any quantum system you&#8217;re building as always being &#8216;bathed&#8217; in the glow of these environments.</p>
<p>It&#8217;s a very interesting fact that you can never get away from these baths, even in principle. No object in our universe &#8212; as far as we know &#8212; can be completely isolated from the rest of the universe. <a href="http://www.youtube.com/watch?v=1OLz6uUuMp8">As Lawrence Krauss so eloquently describes, even &#8216;nothing&#8217; is something.</a></p>
<p>Even if we were to build a quantum computer in the depths of interstellar space, and cool it to zero temperature, it would still be bathed in a bath formed of the virtual particles that boil and seethe in the fabric of space-time itself. There is no escape from our connections to the physical universe.</p>
<p><strong>A Lovecraftian aside that has nothing to do with the paper or the NPR interview</strong></p>
<p>By the way you Lovecraft fans out there &#8212; here is a famous bit from <a href="http://en.wikipedia.org/wiki/The_Dream-Quest_of_Unknown_Kadath">The Dream-Quest of Unknown Kadath</a>:</p>
<blockquote><p>[O]utside the ordered universe [is] that amorphous blight of nethermost confusion which blasphemes and bubbles at the center of all infinity—the boundless daemon sultan Azathoth, whose name no lips dare speak aloud, and who gnaws hungrily in inconceivable, unlighted chambers beyond time and space amidst the muffled, maddening beating of vile drums and the thin monotonous whine of accursed flutes.</p></blockquote>
<div id="attachment_2567" class="wp-caption alignleft" style="width: 223px"><a href="http://dwave.files.wordpress.com/2013/05/nuclear_chaos.jpg"><img class=" wp-image-2567 " alt="&quot;Azathoth has existed since the universe began. He dwells outside normal time and space. He is blind, idiotic, and indifferent.&quot; Now go watch Krauss describe &quot;Something from Nothing.&quot;" src="http://dwave.files.wordpress.com/2013/05/nuclear_chaos.jpg?w=213&#038;h=300" width="213" height="300" /></a><p class="wp-caption-text">&#8220;Azathoth has existed since the universe began. He dwells outside normal time and space. He is blind, idiotic, and indifferent.&#8221; Now go watch Krauss describe &#8220;Something from Nothing&#8221; and tell me they&#8217;re not talking about the same thing!</p></div>
<p>Lovecraft had an uncanny ability to grok modern concepts from physics and weave them into his stories. His descriptions of Azathoth, and the physics underlying Krauss&#8217; explanations of what seems to be physically occurring deep inside the fabric of spacetime, are just too close to not point out. Of course they use different language. But think carefully about the context in which these ideas are being delivered. (Am I stretching making a connection between Krauss&#8217; something that lives in nothing and Lovecraft&#8217;s description of Azathoth? Definitely. But I think it&#8217;s an interesting thing to think about how these two descriptions might not be incompatible.)</p>
<p><strong>Back to qubits and baths</strong></p>
<p>Anyway back to qubits and baths. This is not just fascinating science (although it is that). It is also a fundamentally important issue in constructing computing machines that harness quantum mechanics. Because all quantum systems MUST live in baths, it&#8217;s extremely important to understand in detail how these baths affect their behavior.</p>
<p>Not so long ago, it was suspected that these baths would always destroy the curious properties of quantum mechanics for large objects. <a href="http://www.nature.com/nature/journal/v406/n6791/abs/406043a0.html">But then this turned out to not be true.</a> The first large objects where quantum behavior remained even in the presence of really big and hot baths were loops of superconducting metal, the great-great-great-grandparents of our qubits.</p>
<p>Now the question of what effect these baths really have on large collections of large objects is being debated, and goes to the heart of many of the technical issues in building useful quantum computers.</p>
<p><strong>The paper that was just published</strong></p>
<p>The paper is called <a href="http://www.nature.com/ncomms/journal/v4/n5/full/ncomms2920.html">Thermally assisted quantum annealing of a 16-qubit problem</a>.</p>
<p>It describes what I believe to be a key result in advancing this understanding. It looks very carefully at what happens to a quantum system in the presence of a bath, where both the quantum system and the bath have been exquisitely characterized. As was the case when macroscopic quantum coherence was first observed, the results are counter-intuitive.</p>
<p>Here is the abstract from the paper.</p>
<blockquote><p>Efforts to develop useful quantum computers have been blocked primarily by environmental noise. Quantum annealing is a scheme of quantum computation that is predicted to be more robust against noise, because despite the thermal environment mixing the system’s state in the energy basis, the system partially retains coherence in the computational basis, and hence is able to establish well-defined eigenstates. Here we examine the environment’s effect on quantum annealing using 16 qubits of a superconducting quantum processor. For a problem instance with an isolated small-gap anticrossing between the lowest two energy levels, we experimentally demonstrate that, even with annealing times eight orders of magnitude longer than the predicted single-qubit decoherence time, the probabilities of performing a successful computation are similar to those expected for a fully coherent system. Moreover, for the problem studied, we show that quantum annealing can take advantage of a thermal environment to achieve a speedup factor of up to 1,000 over a closed system.</p></blockquote>
<p>The key result is that for the specific type of bath acting on a real processor, the quantum effects required for quantum computation can successfully be tapped by protecting them in a specific way. Specifically &#8212; and this is a point that has caused much confusion &#8212; the decoherence time of the individual qubits, which is the time to decohere in the energy basis, does not set the timescale for losing quantum coherence in the measurement basis. Quantum coherence in the measurement basis (which is the resource tapped in this approach) is an equilibrium property of the system, as long as the bath is not so big and hot that well defined energy eigenstates disappear.</p>
<p>While the paper is primarily an experimental paper, the theory underlying all of this is very satisfactory in my view. Mohammad and his collaborators have developed a very good theoretical understanding of what really happens in real open quantum systems, and the agreement between these models and what is seen in the lab is striking.</p>
<p>So congratulations to all on this result.</p>
<p><strong>The NPR interview and my proudness at working &#8216;meatiest&#8217; into a national radio program</strong></p>
<p><a href="http://www.wbur.org/npr/185532608/quantum-or-not-new-supercomputer-is-certainly-something-else">On a mostly unrelated note, here is a radio piece that Geoff Brumfiel of NPR did recently.</a> It is of note because I managed to work the word &#8216;meatiest&#8217; into the discussion, of which I am understandably quite proud.</p>
]]></content:encoded>
			<wfw:commentRss>http://dwave.wordpress.com/2013/05/22/new-nature-communications-paper-and-a-bonus-npr-interview/feed/</wfw:commentRss>
		<slash:comments>13</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/373bb2536951daebfdc760c9099af9af?s=96&#38;d=http%3A%2F%2F0.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D96" medium="image">
			<media:title type="html">Geordie</media:title>
		</media:content>

		<media:content url="http://dwave.files.wordpress.com/2013/05/qubits_love_the_bath.jpg?w=150" medium="image">
			<media:title type="html">Woman Resting in Bath</media:title>
		</media:content>

		<media:content url="http://dwave.files.wordpress.com/2013/05/nuclear_chaos.jpg?w=213" medium="image">
			<media:title type="html">&#34;Azathoth has existed since the universe began. He dwells outside normal time and space. He is blind, idiotic, and indifferent.&#34; Now go watch Krauss describe &#34;Something from Nothing.&#34;</media:title>
		</media:content>
	</item>
		<item>
		<title>The Google / NASA Quantum Artificial Intelligence Lab</title>
		<link>http://dwave.wordpress.com/2013/05/16/the-quantum-artificial-intelligence-lab/</link>
		<comments>http://dwave.wordpress.com/2013/05/16/the-quantum-artificial-intelligence-lab/#comments</comments>
		<pubDate>Fri, 17 May 2013 04:48:49 +0000</pubDate>
		<dc:creator>Geordie</dc:creator>
				<category><![CDATA[Applications]]></category>
		<category><![CDATA[D-Wave Science & Technology]]></category>
		<category><![CDATA[World Domination]]></category>

		<guid isPermaLink="false">http://dwave.wordpress.com/?p=2549</guid>
		<description><![CDATA[Update 20/05/2013: Here is how you can apply for time on the system. Exciting! Applying for time on the D-Wave Two at the Quantum Artificial Intelligence Lab Update 16/05/2013: Here is some press coverage of the announcement. Google Buys a &#8230; <a href="http://dwave.wordpress.com/2013/05/16/the-quantum-artificial-intelligence-lab/">Continue reading <span class="meta-nav">&#8594;</span></a><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=dwave.wordpress.com&#038;blog=336042&#038;post=2549&#038;subd=dwave&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p><strong>Update 20/05/2013:</strong> Here is how you can apply for time on the system. Exciting!</p>
<ul>
<li><a href="http://www.usra.edu/quantum/">Applying for time on the D-Wave Two at the Quantum Artificial Intelligence Lab</a></li>
</ul>
<p><strong>Update 16/05/2013:</strong> Here is some press coverage of the announcement.</p>
<ul>
<li><a href="http://bits.blogs.nytimes.com/2013/05/16/google-buys-a-quantum-computer/#postComment">Google Buys a Quantum Computer; Quentin Hardy, New York Times</a></li>
<li><a href="http://www.forbes.com/sites/alexknapp/2013/05/16/nasa-and-google-partner-to-purchase-a-d-wave-quantum-computer/">NASA and Google Purchase a D-Wave Quantum Computer; Alex Knapp, Forbes</a></li>
<li><a href="http://www.nature.com/news/google-and-nasa-snap-up-quantum-computer-1.12999">Google and NASA Snap up Quantum Computer; Nicola Jones, Nature</a></li>
<li><a href="http://www.technologyreview.com/news/514846/google-and-nasa-launch-quantum-computing-ai-lab/">Google and NASA Launch Quantum AI Lab; Charles Choi, MIT Technology Review</a></li>
<li><a href="http://online.wsj.com/public/page/news-tech-technology.html">Google Joins Supercomputing Project; Wall Street Journal (subscription required)</a></li>
</ul>
<p>When D-Wave was founded in 1999, our objective was to build the world&#8217;s first useful quantum computer.</p>
<p>The way I thought about it was that we&#8217;d have succeeded if: (a) <a href="http://www.youtube.com/watch?v=Fls523cBD7E">someone bought one for more than $10M</a>; (b) <a href="http://arxiv.org/abs/1304.4595">it was clearly using quantum mechanics to do its thing</a>; and (c) <a href="https://www.amherst.edu/aboutamherst/news/faculty/node/466477">it was better at something than any other option available</a>. Now all of these have been accomplished, and the original objectives that we&#8217;d set for ourselves have all been met.</p>
<div id="attachment_2555" class="wp-caption alignright" style="width: 250px"><a href="http://dwave.files.wordpress.com/2013/05/picture-52.jpg"><img class=" wp-image-2555 " alt="Me, Suzanne Gildert, Hartmut and Eddie Farhi at QIP-2010." src="http://dwave.files.wordpress.com/2013/05/picture-52.jpg?w=240&#038;h=135" width="240" height="135" /></a><p class="wp-caption-text">A historic shot? Hartmut and friends at QIP-2010.</p></div>
<p>As the hardware matured, we began exploring ways to use its special capabilities. One of the first people I met who was also interested in this problem was <a href="http://en.wikipedia.org/wiki/Hartmut_Neven">Dr. Hartmut Neven</a>, who works at Google. Hartmut is a world leading expert in computer vision, and believed that there might be a role for our technology in computer vision and more generally machine learning.</p>
<p>Machine learning is an important subfield of artificial intelligence. While it is very difficult to even define what intelligence is (<a href="http://en.wikipedia.org/wiki/Intelligence#Definitions">there are even more definitions than for quantum computers</a>), one thing that is pretty much universally recognized is that anything we&#8217;d call intelligent must be able to learn. Trying to understand how learning from experience works has driven a lot of progress in understanding how human perception and cognition might work.</p>
<p>The Quantum Artificial Intelligence Lab&#8217;s mandate is to bring the world&#8217;s best machine learning experts together with the world&#8217;s most advanced quantum computers, and perform thousands of experiments to explore to what extent machine intelligence and cognition can be advanced by using these new types of computers.</p>
<p>The quest to understand intelligence is one of the most interesting and important challenges that humanity has ever faced. It is a daunting problem. But so was building quantum computers, or even conventional computers for that matter. I believe we can apply the same principles we used to solve the quantum computing problem to the (much harder) problem of understanding how intelligence works.</p>
]]></content:encoded>
			<wfw:commentRss>http://dwave.wordpress.com/2013/05/16/the-quantum-artificial-intelligence-lab/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/373bb2536951daebfdc760c9099af9af?s=96&#38;d=http%3A%2F%2F0.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D96" medium="image">
			<media:title type="html">Geordie</media:title>
		</media:content>

		<media:content url="http://dwave.files.wordpress.com/2013/05/picture-52.jpg?w=300" medium="image">
			<media:title type="html">Me, Suzanne Gildert, Hartmut and Eddie Farhi at QIP-2010.</media:title>
		</media:content>
	</item>
		<item>
		<title>First ever head to head win in speed for a quantum computer</title>
		<link>http://dwave.wordpress.com/2013/05/08/first-ever-head-to-head-win-in-speed-for-a-quantum-computer/</link>
		<comments>http://dwave.wordpress.com/2013/05/08/first-ever-head-to-head-win-in-speed-for-a-quantum-computer/#comments</comments>
		<pubDate>Wed, 08 May 2013 15:32:44 +0000</pubDate>
		<dc:creator>Geordie</dc:creator>
				<category><![CDATA[D-Wave Science & Technology]]></category>

		<guid isPermaLink="false">http://dwave.wordpress.com/?p=2540</guid>
		<description><![CDATA[Update 5/9/2013: Here is an article about the result from Quentin Hardy at the New York Times. Here&#8217;s one from Tom Simonite at MIT Technology Review. One from Nicola Jones at Nature. Jacob Aron at New Scientist. An interesting blog &#8230; <a href="http://dwave.wordpress.com/2013/05/08/first-ever-head-to-head-win-in-speed-for-a-quantum-computer/">Continue reading <span class="meta-nav">&#8594;</span></a><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=dwave.wordpress.com&#038;blog=336042&#038;post=2540&#038;subd=dwave&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p><strong>Update 5/9/2013:</strong></p>
<ul>
<li><a href="http://bits.blogs.nytimes.com/2013/05/08/a-quantum-computer-aces-its-test/?smid=fb-share">Here is an article about the result from Quentin Hardy at the New York Times.</a></li>
<li><a href="http://www.technologyreview.com/view/514686/d-waves-quantum-computer-goes-to-the-races-wins/">Here&#8217;s one from Tom Simonite at MIT Technology Review.</a></li>
<li><a href="http://blogs.nature.com/news/2013/05/quantum-computer-passes-speed-test.html">One from Nicola Jones at Nature.</a></li>
<li><a href="http://www.newscientist.com/article/dn23519-commercial-quantum-computer-leaves-pc-in-the-dust.html">Jacob Aron at New Scientist.</a></li>
<li><a href="http://ajitjadhav.wordpress.com/2013/05/12/the-qc-pulls-ahead-of-the-cc/">An interesting blog post from Ajit Jadhav.</a></li>
</ul>
<p><strong>Update 5/15/2013:</strong></p>
<ul>
<li>An update from the conference from Cathy: &#8220;Computing Frontiers 2013 Best Paper Award: Experimental Evaluation of an Adiabatic Quantum Computation System for Combinatorial Optimization, by McGeoch and Wang.&#8221; Congratulations to Cathy and Carrie!</li>
</ul>
<p><strong>AMHERST, Mass.</strong>—<strong>A computer science professor at Amherst College</strong> who recently devised and conducted experiments to test the speed of a quantum computing system against conventional computing methods will soon be presenting a paper with her verdict: quantum computing is, “in some cases, really, really fast.”</p>
<p>“Ours is the first paper to my knowledge that compares the quantum approach to conventional methods using the same set of problems,” says Catherine McGeoch, the Beitzel Professor in Technology and Society (Computer Science) at Amherst. “I’m not claiming that this is the last word, but it’s a first word, a start in trying to sort out what it can do and can’t do.”</p>
<p>The quantum computer system she was testing, produced by <a href="http://www.dwavesys.com/en/dw_homepage.html">D-Wave</a> just outside Vancouver, BC, has a thumbnail-sized chip that is stored in a dilution refrigerator within a shielded cabinet at near absolute zero, or .02 degrees Kelvin in order to perform its calculations. Whereas conventional computing is binary, 1s and 0s get mashed up in quantum computing, and within that super-cooled (and non-observable) state of flux, a lightning-quick logic takes place, capable of solving problems thousands of times faster than conventional computing methods can, according to her findings.</p>
<p><a href="https://www.amherst.edu/aboutamherst/news/faculty/node/466477">Read the whole article here!</a></p>
]]></content:encoded>
			<wfw:commentRss>http://dwave.wordpress.com/2013/05/08/first-ever-head-to-head-win-in-speed-for-a-quantum-computer/feed/</wfw:commentRss>
		<slash:comments>56</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/373bb2536951daebfdc760c9099af9af?s=96&#38;d=http%3A%2F%2F0.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D96" medium="image">
			<media:title type="html">Geordie</media:title>
		</media:content>
	</item>
		<item>
		<title>Bo Ewald joins D-Wave</title>
		<link>http://dwave.wordpress.com/2013/05/02/bo-ewald-joins-d-wave/</link>
		<comments>http://dwave.wordpress.com/2013/05/02/bo-ewald-joins-d-wave/#comments</comments>
		<pubDate>Thu, 02 May 2013 14:45:36 +0000</pubDate>
		<dc:creator>Geordie</dc:creator>
				<category><![CDATA[World Domination]]></category>

		<guid isPermaLink="false">http://dwave.wordpress.com/?p=2536</guid>
		<description><![CDATA[I&#8217;m very pleased to welcome Bo Ewald to D-Wave! Bo will lead our newly formed US business as President. Bo has a storied history in high performance computing with leadership roles at Los Alamos, Cray, Linux Networks, and Silicon Graphics.<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=dwave.wordpress.com&#038;blog=336042&#038;post=2536&#038;subd=dwave&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p><a href="http://dwave.files.wordpress.com/2013/05/20130502091132enprnprn-d-wave-systems-robert-bo-ewald-90-1-1367485892mr.jpg"><img class="alignright size-thumbnail wp-image-2537" alt="D-WAVE SYSTEMS, INC. ROBERT &quot;BO&quot; EWALD" src="http://dwave.files.wordpress.com/2013/05/20130502091132enprnprn-d-wave-systems-robert-bo-ewald-90-1-1367485892mr.jpg?w=150&#038;h=150" width="150" height="150" /></a>I&#8217;m very pleased to welcome <a href="http://www.linkedin.com/pub/bo-ewald/29/72b/50a">Bo Ewald</a> to D-Wave! <a href="http://www.itnewsonline.com/showprnstory.php?storyid=268897">Bo will lead our newly formed US business as President.</a></p>
<p><a href="http://www.cisl.ucar.edu/dig/cuglog/summer97/text/1.cowboy.html">Bo has a storied history in high performance computing</a> with leadership roles at Los Alamos, Cray, Linux Networks, and Silicon Graphics.</p>
]]></content:encoded>
			<wfw:commentRss>http://dwave.wordpress.com/2013/05/02/bo-ewald-joins-d-wave/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/373bb2536951daebfdc760c9099af9af?s=96&#38;d=http%3A%2F%2F0.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D96" medium="image">
			<media:title type="html">Geordie</media:title>
		</media:content>

		<media:content url="http://dwave.files.wordpress.com/2013/05/20130502091132enprnprn-d-wave-systems-robert-bo-ewald-90-1-1367485892mr.jpg?w=150" medium="image">
			<media:title type="html">D-WAVE SYSTEMS, INC. ROBERT &#34;BO&#34; EWALD</media:title>
		</media:content>
	</item>
		<item>
		<title>Sparse coding on D-Wave hardware: structured dictionaries</title>
		<link>http://dwave.wordpress.com/2013/04/29/sparse-coding-on-d-wave-hardware-structured-dictionaries/</link>
		<comments>http://dwave.wordpress.com/2013/04/29/sparse-coding-on-d-wave-hardware-structured-dictionaries/#comments</comments>
		<pubDate>Mon, 29 Apr 2013 15:25:45 +0000</pubDate>
		<dc:creator>Geordie</dc:creator>
				<category><![CDATA[Applications]]></category>
		<category><![CDATA[D-Wave Science & Technology]]></category>
		<category><![CDATA[Learning to program the D-Wave One]]></category>
		<category><![CDATA[Quantum computer programming]]></category>
		<category><![CDATA[Sparse Coding]]></category>

		<guid isPermaLink="false">http://dwave.wordpress.com/?p=2508</guid>
		<description><![CDATA[The underlying problem we saw last time, that prevented us from using the hardware to compete with tabu on the cloud, was the mismatch of the connectivity of the problems sparse coding generates (which are fully connected) and the connectivity &#8230; <a href="http://dwave.wordpress.com/2013/04/29/sparse-coding-on-d-wave-hardware-structured-dictionaries/">Continue reading <span class="meta-nav">&#8594;</span></a><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=dwave.wordpress.com&#038;blog=336042&#038;post=2508&#038;subd=dwave&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>The underlying problem we saw last time, that prevented us from using the hardware to compete with tabu on the cloud, was the mismatch of the connectivity of the problems sparse coding generates (which are fully connected) and the connectivity of the hardware.</p>
<p>The source of this mismatch is the quadratic term in the objective function, which for the <img src='http://s0.wp.com/latex.php?latex=j%5E%7Bth%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='j^{th}' title='j^{th}' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=m%5E%7Bth%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m^{th}' title='m^{th}' class='latex' /> variables is proportional to <img src='http://s0.wp.com/latex.php?latex=%5Cvec%7Bd%7D_j+%5Ccdot+%5Cvec%7Bd%7D_m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;vec{d}_j &#92;cdot &#92;vec{d}_m' title='&#92;vec{d}_j &#92;cdot &#92;vec{d}_m' class='latex' />. The coupling terms are proportional to the dot product of the dictionary atoms.</p>
<p>Here&#8217;s an idea. What if we demand that <img src='http://s0.wp.com/latex.php?latex=%5Cvec%7Bd%7D_j+%5Ccdot+%5Cvec%7Bd%7D_m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;vec{d}_j &#92;cdot &#92;vec{d}_m' title='&#92;vec{d}_j &#92;cdot &#92;vec{d}_m' class='latex' /> has to be zero for all pairs of variables <img src='http://s0.wp.com/latex.php?latex=j&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='j' title='j' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' /> that are not connected in hardware? If we can achieve this structure in the dictionary, we get a very interesting result. Instead of being fully connected, the QUBOs with this restriction can be engineered to exactly match the underlying problem the hardware solves. If we can do this, we get closer to using the full power of the hardware.</p>
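<p>To make the source of the couplings concrete, here is a minimal Python sketch, written for illustration and not the code that runs against the hardware. It expands the sparse coding objective for a single data object, ||z - sum_k w_k d_k||^2 + lambda sum_k w_k, using the fact that w_k squared equals w_k for binary weights. The linear terms land on the diagonal of the QUBO matrix and the coupling between variables j and m is 2 d_j . d_m, so orthogonal atoms give a zero coupler. The function name <code>build_qubo</code> and the small example dictionary are assumptions made for this sketch.</p>

```python
import numpy as np

def build_qubo(D, z, lam):
    """Upper-triangular QUBO matrix Q for one data object z (hypothetical helper).

    For binary w, w @ Q @ w + z @ z equals ||z - D @ w||^2 + lam * sum(w).
    """
    gram = D.T @ D                          # gram[j, m] = d_j . d_m
    Q = 2.0 * np.triu(gram, k=1)            # couplers: 2 * d_j . d_m for j before m
    # Linear terms on the diagonal: d_k . d_k - 2 d_k . z + lam
    Q[np.diag_indices_from(Q)] = np.diag(gram) - 2.0 * (D.T @ z) + lam
    return Q

# Toy dictionary with three atoms (columns); atoms 0 and 2 are orthogonal.
D = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])
z = np.array([1.0, 2.0, 1.0])
lam = 0.1
Q = build_qubo(D, z, lam)
print(Q)   # the (0, 2) entry is zero because d_0 . d_2 = 0
```

<p>If every pair of atoms that is not connected in hardware is orthogonal, the nonzero off-diagonal entries of Q fall only on pairs the hardware can couple.</p>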
<p><strong>L0-norm sparse coding with structured dictionaries</strong></p>
<p>Here is the idea.</p>
<p>Given</p>
<ol>
<li>A set of <img src='http://s0.wp.com/latex.php?latex=S&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S' title='S' class='latex' /> data objects <img src='http://s0.wp.com/latex.php?latex=%5Cvec%7Bz%7D_s&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;vec{z}_s' title='&#92;vec{z}_s' class='latex' />, where each <img src='http://s0.wp.com/latex.php?latex=%5Cvec%7Bz%7D_s&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;vec{z}_s' title='&#92;vec{z}_s' class='latex' /> is a real valued vector with <img src='http://s0.wp.com/latex.php?latex=N&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='N' title='N' class='latex' /> components;</li>
<li>An <img src='http://s0.wp.com/latex.php?latex=N+x+K&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='N x K' title='N x K' class='latex' /> real valued matrix <img src='http://s0.wp.com/latex.php?latex=%5Chat%7BD%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;hat{D}' title='&#92;hat{D}' class='latex' />, where <img src='http://s0.wp.com/latex.php?latex=K&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='K' title='K' class='latex' /> is the number of dictionary atoms we choose, and we define its <img src='http://s0.wp.com/latex.php?latex=k%5E%7Bth%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='k^{th}' title='k^{th}' class='latex' /> column to be the vector <img src='http://s0.wp.com/latex.php?latex=%5Cvec%7Bd%7D_k&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;vec{d}_k' title='&#92;vec{d}_k' class='latex' />;</li>
<li>A <img src='http://s0.wp.com/latex.php?latex=K+x+S&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='K x S' title='K x S' class='latex' /> binary valued matrix <img src='http://s0.wp.com/latex.php?latex=%5Chat%7BW%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;hat{W}' title='&#92;hat{W}' class='latex' />;</li>
<li>And a real number <img src='http://s0.wp.com/latex.php?latex=%5Clambda&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;lambda' title='&#92;lambda' class='latex' />, which is called the regularization parameter,</li>
</ol>
<p>Find <img src='http://s0.wp.com/latex.php?latex=%5Chat%7BW%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;hat{W}' title='&#92;hat{W}' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=%5Chat%7BD%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;hat{D}' title='&#92;hat{D}' class='latex' /> that minimize</p>
<p><img src='http://s0.wp.com/latex.php?latex=G%28%5Chat%7BW%7D%2C+%5Chat%7BD%7D+%3B+%5Clambda%29+%3D+%5Csum_%7Bs%3D1%7D%5ES+%7C%7C+%5Cvec%7Bz%7D_%7Bs%7D+-+%5Csum_%7Bk%3D1%7D%5E%7BK%7D+w_%7Bks%7D+%5Cvec%7Bd%7D_k+%7C%7C%5E2+%2B+%5Clambda+%5Csum_%7Bs%3D1%7D%5ES+%5Csum_%7Bk%3D1%7D%5E%7BK%7D+w_%7Bks%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G(&#92;hat{W}, &#92;hat{D} ; &#92;lambda) = &#92;sum_{s=1}^S || &#92;vec{z}_{s} - &#92;sum_{k=1}^{K} w_{ks} &#92;vec{d}_k ||^2 + &#92;lambda &#92;sum_{s=1}^S &#92;sum_{k=1}^{K} w_{ks}' title='G(&#92;hat{W}, &#92;hat{D} ; &#92;lambda) = &#92;sum_{s=1}^S || &#92;vec{z}_{s} - &#92;sum_{k=1}^{K} w_{ks} &#92;vec{d}_k ||^2 + &#92;lambda &#92;sum_{s=1}^S &#92;sum_{k=1}^{K} w_{ks}' class='latex' /></p>
<p>subject to the constraints that <img src='http://s0.wp.com/latex.php?latex=%5Cvec%7Bd%7D_j+%5Ccdot+%5Cvec%7Bd%7D_m+%3D+0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;vec{d}_j &#92;cdot &#92;vec{d}_m = 0' title='&#92;vec{d}_j &#92;cdot &#92;vec{d}_m = 0' class='latex' /> for all pairs <img src='http://s0.wp.com/latex.php?latex=j%2Cm&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='j,m' title='j,m' class='latex' /> that are not connected in the quantum chip being used.</p>
<p>The only difference here from what we did before is the last sentence, where we add a set of constraints on the dictionary atoms.</p>
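<p>As a rough sketch of what is being minimized, the objective G can be evaluated in a few lines of numpy. This assumes the data objects are stacked as the columns of an N x S matrix <code>Z</code>; the name <code>sparse_coding_objective</code> and the toy numbers are made up for illustration, and the orthogonality constraints are not checked here.</p>

```python
import numpy as np

def sparse_coding_objective(Z, D, W, lam):
    """G(W, D; lam): squared reconstruction error plus lam times the number of active weights.

    Z is N x S (one data object per column), D is N x K, W is K x S and binary.
    """
    residual = Z - D @ W
    return float(np.sum(residual ** 2) + lam * np.sum(W))

# Toy example: two atoms in three dimensions, two data objects built exactly from them.
D = np.eye(3)[:, :2]
W = np.array([[1, 0],
              [1, 1]])
Z = D @ W
G = sparse_coding_objective(Z, D, W, lam=0.5)
print(G)   # reconstruction is exact, so only the sparsity penalty remains
```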
<p><strong>Solving the sparse coding problem using block coordinate descent</strong></p>
<p>We&#8217;re going to use the same strategy for solving this as before, with a slight change. Here is the strategy we&#8217;ll use.</p>
<ol>
<li>First, we generate a random dictionary <img src='http://s0.wp.com/latex.php?latex=%5Chat%7BD%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;hat{D}' title='&#92;hat{D}' class='latex' />, subject to meeting the orthogonality constraints we&#8217;ve imposed on the dictionary atoms.</li>
<li>Holding this dictionary fixed, we solve the optimization problem for the weights <img src='http://s0.wp.com/latex.php?latex=%5Chat%7BW%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;hat{W}' title='&#92;hat{W}' class='latex' />. These optimization problems are now Chimera-structured QUBOs that fit exactly onto the hardware by construction.</li>
<li>Now we fix the weights to these values, and find the optimal dictionary <img src='http://s0.wp.com/latex.php?latex=%5Chat%7BD%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;hat{D}' title='&#92;hat{D}' class='latex' />, again subject to our constraints.</li>
</ol>
<p>We then iterate steps 2 and 3 until <img src='http://s0.wp.com/latex.php?latex=G&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G' title='G' class='latex' /> converges to a minimum.</p>
<p>Now we&#8217;re in a different regime than before &#8212; Step 2 requires the solution of a large number of Chimera-structured QUBOs, not fully connected QUBOs. So that makes those problems better fits to the hardware. But now we have to do some new things to allow for both Steps 1 and 3, and these extra steps have some cost.</p>
<p>The first of these is not too hard, and introduces a key concept we&#8217;ll use for Step 3 (which is harder). In this post I&#8217;ll go over how to do Step 1.</p>
<p><strong>Step 1: Setting up an initial random dictionary that obeys our constraints</strong></p>
<p>Alright so the first step we need to do is to figure out under what conditions we can achieve Step 1.</p>
<p>There is a very interesting result in a paper called <a href="http://www.math.uregina.ca/~kmeagher/DMRG/shaun/LSS89.pdf">Orthogonal Representations and Connectivity of Graphs</a>. Here is a short explanation of the result.</p>
<p>Imagine you have a graph on <img src='http://s0.wp.com/latex.php?latex=V&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='V' title='V' class='latex' /> vertices. In that graph, each vertex is connected to a bunch of others. Call <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' /> the connectivity (the number of neighbors) of the least connected vertex in the graph. Then this paper proves that you can define a set of real vectors in dimension <img src='http://s0.wp.com/latex.php?latex=V+-+p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='V - p' title='V - p' class='latex' /> such that non-adjacent vertices in the graph are assigned orthogonal vectors.</p>
<p>So what we want to do &#8212; find a random dictionary <img src='http://s0.wp.com/latex.php?latex=%5Chat%7BD%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;hat{D}' title='&#92;hat{D}' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=%5Cvec%7Bd%7D_j+%5Ccdot+%5Cvec%7Bd%7D_m+%3D+0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;vec{d}_j &#92;cdot &#92;vec{d}_m = 0' title='&#92;vec{d}_j &#92;cdot &#92;vec{d}_m = 0' class='latex' /> for all <img src='http://s0.wp.com/latex.php?latex=j%2C+m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='j, m' title='j, m' class='latex' /> not connected in hardware &#8212; can be done if the dimension of the vectors <img src='http://s0.wp.com/latex.php?latex=%5Cvec%7Bd%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;vec{d}' title='&#92;vec{d}' class='latex' /> is greater than <img src='http://s0.wp.com/latex.php?latex=V+-+p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='V - p' title='V - p' class='latex' />.</p>
<p>For Vesuvius, the number <img src='http://s0.wp.com/latex.php?latex=V&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='V' title='V' class='latex' /> is 512, and the lowest connectivity node in a Chimera graph is <img src='http://s0.wp.com/latex.php?latex=p+%3D+5&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p = 5' title='p = 5' class='latex' />. So as long as the dimension of the dictionary atoms is greater than 512 &#8211; 5 = 507, we can always perform Step 1.</p>
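<p>Here is a small Python sketch that reproduces those two numbers. It assumes the ideal Vesuvius layout is an 8 x 8 grid of unit cells, each a complete bipartite graph on 4 + 4 qubits, with one half of each cell coupled to the neighboring cells in one grid direction and the other half in the other direction. Which half goes with which direction, the qubit numbering, and the function name <code>chimera_adjacency</code> are assumptions of this sketch, and a real chip may have inoperable qubits that this ignores.</p>

```python
import numpy as np

def chimera_adjacency(rows=8, cols=8, shore=4):
    """Boolean adjacency matrix for a rows x cols grid of K_{shore,shore} unit cells."""
    n = rows * cols * 2 * shore
    adj = np.zeros((n, n), dtype=bool)

    def qubit(r, c, side, k):
        # side 0 and side 1 are the two halves of the bipartite unit cell
        return ((r * cols + c) * 2 + side) * shore + k

    def couple(i, j):
        adj[i, j] = adj[j, i] = True

    for r in range(rows):
        for c in range(cols):
            # Couplers inside the unit cell: every side-0 qubit to every side-1 qubit
            for a in range(shore):
                for b in range(shore):
                    couple(qubit(r, c, 0, a), qubit(r, c, 1, b))
            # Couplers between cells: side 0 to the next row, side 1 to the next column
            for k in range(shore):
                if r != rows - 1:
                    couple(qubit(r, c, 0, k), qubit(r + 1, c, 0, k))
                if c != cols - 1:
                    couple(qubit(r, c, 1, k), qubit(r, c + 1, 1, k))
    return adj

adj = chimera_adjacency()
degrees = adj.sum(axis=1)
V = adj.shape[0]            # number of vertices
p = int(degrees.min())      # connectivity of the least connected vertex
print(V, p, V - p)          # prints: 512 5 507
```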
<p>Here is a little more color on this very interesting result. Imagine you have to come up with two vectors <img src='http://s0.wp.com/latex.php?latex=%5Cvec%7Bg%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;vec{g}' title='&#92;vec{g}' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=%5Cvec%7Bh%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;vec{h}' title='&#92;vec{h}' class='latex' /> that are orthogonal (the dot product <img src='http://s0.wp.com/latex.php?latex=%5Cvec%7Bg%7D+%5Ccdot+%5Cvec%7Bh%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;vec{g} &#92;cdot &#92;vec{h}' title='&#92;vec{g} &#92;cdot &#92;vec{h}' class='latex' /> is zero). What&#8217;s the minimum dimension these vectors have to live in such that this can be done? Well imagine that they both live in one dimension &#8212; they are just numbers on a line. Then clearly you can&#8217;t do it, unless one of them is zero. However if you have two dimensions, you can. Here&#8217;s an example: <img src='http://s0.wp.com/latex.php?latex=%5Cvec%7Bg%7D+%3D+%5Chat%7Bx%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;vec{g} = &#92;hat{x}' title='&#92;vec{g} = &#92;hat{x}' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=%5Cvec%7Bh%7D+%3D+%5Chat%7By%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;vec{h} = &#92;hat{y}' title='&#92;vec{h} = &#92;hat{y}' class='latex' />. If you have more than two dimensions, you can also, and the choices you make in this case are not unique.</p>
<p>More generally, if you ask the question &#8220;how many mutually orthogonal vectors can I draw in a <img src='http://s0.wp.com/latex.php?latex=V&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='V' title='V' class='latex' />-dimensional space?&#8221;, the answer is <img src='http://s0.wp.com/latex.php?latex=V&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='V' title='V' class='latex' /> &#8212; one vector per dimension. So that is a key piece of the above result. If we had a graph with <img src='http://s0.wp.com/latex.php?latex=V&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='V' title='V' class='latex' /> vertices where NONE of the vertices were connected to any others (minimum vertex connectivity <img src='http://s0.wp.com/latex.php?latex=p+%3D+0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p = 0' title='p = 0' class='latex' />), and want to assign vectors to each vertex such that all of these vectors are orthogonal to all the others, that&#8217;s equivalent to asking &#8220;given <img src='http://s0.wp.com/latex.php?latex=V&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='V' title='V' class='latex' /> vectors, what&#8217;s the minimum dimension of the space they live in such that they are all orthogonal to each other?&#8221;, and the answer is <img src='http://s0.wp.com/latex.php?latex=V&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='V' title='V' class='latex' />.</p>
<p>Now imagine we start drawing edges between some of the vertices in the graph, and we don&#8217;t require that the vectors living on these vertices be orthogonal. Conceptually you can think of this as relaxing some constraints, and making it &#8216;easier&#8217; to find the desired set of vectors &#8212; so the minimum dimension of the vectors required so that this will work is reduced as the graph gets more connected. The fascinating result here is the very simple way this works. Just find the lowest connectivity node in the graph, call its connectivity <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' />, and then ask &#8220;given a graph on <img src='http://s0.wp.com/latex.php?latex=V&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='V' title='V' class='latex' /> vertices, where the minimum connectivity vertex has connectivity <img src='http://s0.wp.com/latex.php?latex=p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p' title='p' class='latex' />, what&#8217;s the minimum dimension of a set of vectors such that non-connected vertices in the graph are all assigned orthogonal vectors?&#8221;. The answer is <img src='http://s0.wp.com/latex.php?latex=V+-+p&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='V - p' title='V - p' class='latex' />.</p>
<p><strong>Null Space</strong></p>
<div id="attachment_2534" class="wp-caption alignright" style="width: 241px"><a href="http://dwave.files.wordpress.com/2013/04/null_space_box.jpg"><img class="size-medium wp-image-2534" alt="Null Space is also an ASCII-based adventure game: https://students.digipen.edu/~tbrosman/null_space_download.html ." src="http://dwave.files.wordpress.com/2013/04/null_space_box.jpg?w=231&#038;h=300" width="231" height="300" /></a><p class="wp-caption-text">Null Space is also an ASCII-based adventure game: <a href="https://students.digipen.edu/~tbrosman/null_space_download.html" rel="nofollow">https://students.digipen.edu/~tbrosman/null_space_download.html</a> .</p></div>
<p>Now just knowing we can do it isn&#8217;t enough. But thankfully it&#8217;s not hard to think of a constructive procedure to do this. Here is one:</p>
<ol>
<li>Generate a matrix <img src='http://s0.wp.com/latex.php?latex=%5Chat%7BD%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;hat{D}' title='&#92;hat{D}' class='latex' /> where all entries are random numbers between +1 and -1.</li>
<li>Renormalize each column such that each column&#8217;s norm is one.</li>
<li>For each column in <img src='http://s0.wp.com/latex.php?latex=%5Chat%7BD%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;hat{D}' title='&#92;hat{D}' class='latex' /> from the leftmost to the rightmost in order, compute the <a href="http://en.wikipedia.org/wiki/Kernel_%28matrix%29">null space</a> of that column, and then replace that column with a random column written in the null space basis.</li>
</ol>
<p>If you do this you will get an initial random orthonormal basis as required in our new procedure.</p>
<p>By the way, here is some Python code for computing a null space basis for a matrix <img src='http://s0.wp.com/latex.php?latex=%5Chat%7BA%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;hat{A}' title='&#92;hat{A}' class='latex' />. It&#8217;s easy, but when this was written there was no native function in numpy or scipy that does it; newer SciPy releases provide <code>scipy.linalg.null_space</code>.</p>
<p><span style="color:#0000ff;"><strong>import</strong> </span>numpy<br />
<strong><span style="color:#0000ff;">from</span></strong> scipy.linalg <strong><span style="color:#0000ff;">import</span> </strong>qr</p>
<p><span style="color:#0000ff;"><strong>def</strong></span> nullspace_qr(A):</p>
<p style="padding-left:30px;"># Assumes A has full row rank, so the trailing columns of Q span the null space of A.<br />
A = numpy.atleast_2d(A)<br />
Q, R = qr(A.T)<br />
ns = Q[:, R.shape[1]:].conj()<br />
<strong><span style="color:#0000ff;">return</span></strong> ns</p>
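<p>To make the three-step procedure above concrete, here is a rough sketch of one way to read it: each column is redrawn at random inside the null space of the earlier columns it has to be orthogonal to, that is, the columns belonging to non-connected vertices. The function names, the adjacency-matrix input and the use of an SVD for the null space (instead of the QR helper above) are all assumptions made for this illustration, and the greedy left-to-right pass is not guaranteed to succeed for every graph and dimension.</p>

```python
import numpy as np


def null_space_basis(A, tol=1e-10):
    """Orthonormal basis (as columns) for the null space of A, via SVD."""
    A = np.atleast_2d(A)
    _, s, vh = np.linalg.svd(A, full_matrices=True)
    rank = int(np.sum(s > tol))
    return vh[rank:].T


def random_structured_dictionary(adjacency, dim, seed=0):
    """Unit-norm columns; columns of non-adjacent vertices are orthogonal."""
    rng = np.random.default_rng(seed)
    n = adjacency.shape[0]
    D = rng.uniform(-1.0, 1.0, size=(dim, n))   # step 1: random entries
    D /= np.linalg.norm(D, axis=0)               # step 2: unit-norm columns
    for k in range(n):                           # step 3: left to right
        # Earlier columns that column k is required to be orthogonal to.
        fixed = [j for j in range(k) if not adjacency[j, k]]
        if not fixed:
            continue
        basis = null_space_basis(D[:, fixed].T)
        if basis.shape[1] == 0:
            raise ValueError("dim is too small for this graph")
        col = basis @ rng.uniform(-1.0, 1.0, size=basis.shape[1])
        D[:, k] = col / np.linalg.norm(col)
    return D


# Hypothetical example: a 4-cycle, which needs only V - p = 2 dimensions.
cycle = np.array([[0, 1, 0, 1],
                  [1, 0, 1, 0],
                  [0, 1, 0, 1],
                  [1, 0, 1, 0]], dtype=bool)
D = random_structured_dictionary(cycle, dim=2)
print(np.round(D.T @ D, 6))  # entries for non-connected pairs should be 0
```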
<p>OK so step 1 wasn&#8217;t too bad! Now we have to deal with step 3. This is a harder problem, which I&#8217;ll tackle in the next post.</p>
]]></content:encoded>
			<wfw:commentRss>http://dwave.wordpress.com/2013/04/29/sparse-coding-on-d-wave-hardware-structured-dictionaries/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/373bb2536951daebfdc760c9099af9af?s=96&#38;d=http%3A%2F%2F0.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D96" medium="image">
			<media:title type="html">Geordie</media:title>
		</media:content>

		<media:content url="http://dwave.files.wordpress.com/2013/04/null_space_box.jpg?w=231" medium="image">
			<media:title type="html">Null Space is also an ASCII-based adventure game: https://students.digipen.edu/~tbrosman/null_space_download.html .</media:title>
		</media:content>
	</item>
		<item>
		<title>Some new Rainier science</title>
		<link>http://dwave.wordpress.com/2013/04/17/some-new-rainier-science/</link>
		<comments>http://dwave.wordpress.com/2013/04/17/some-new-rainier-science/#comments</comments>
		<pubDate>Thu, 18 Apr 2013 05:21:51 +0000</pubDate>
		<dc:creator>Geordie</dc:creator>
				<category><![CDATA[D-Wave Science & Technology]]></category>
		<category><![CDATA[Quantum computing]]></category>
		<category><![CDATA[Superconducting Processors]]></category>

		<guid isPermaLink="false">http://dwave.wordpress.com/?p=2523</guid>
		<description><![CDATA[Here is a short break from the sparse coding mayhem. A recent paper by some interesting folks appeared today on the arxiv. They ran some experiments on the Rainier-based system at USC. Here is some of what they found: Our &#8230; <a href="http://dwave.wordpress.com/2013/04/17/some-new-rainier-science/">Continue reading <span class="meta-nav">&#8594;</span></a><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=dwave.wordpress.com&#038;blog=336042&#038;post=2523&#038;subd=dwave&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>Here is a short break from the sparse coding mayhem. A recent paper by some interesting folks appeared today on the arxiv. They ran some experiments on the Rainier-based system at USC.</p>
<p>Here is some of what they found:</p>
<blockquote><p>Our experiments have demonstrated that quantum annealing with more than one hundred qubits takes place in the D-Wave One device&#8230; the device has sufficient ground state quantum coherence to realise a quantum annealing of a transverse field Ising model.</p></blockquote>
<p>Here is a link to the arxiv paper.</p>
<h1><a href="http://arxiv.org/pdf/1304.4595v1.pdf">Quantum annealing with more than one hundred qubits</a></h1>
]]></content:encoded>
			<wfw:commentRss>http://dwave.wordpress.com/2013/04/17/some-new-rainier-science/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/373bb2536951daebfdc760c9099af9af?s=96&#38;d=http%3A%2F%2F0.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D96" medium="image">
			<media:title type="html">Geordie</media:title>
		</media:content>
	</item>
		<item>
		<title>Sparse coding on D-Wave hardware: things that don&#8217;t work</title>
		<link>http://dwave.wordpress.com/2013/04/17/sparse-coding-on-d-wave-hardware-things-that-dont-work/</link>
		<comments>http://dwave.wordpress.com/2013/04/17/sparse-coding-on-d-wave-hardware-things-that-dont-work/#comments</comments>
		<pubDate>Wed, 17 Apr 2013 15:46:48 +0000</pubDate>
		<dc:creator>Geordie</dc:creator>
				<category><![CDATA[Applications]]></category>
		<category><![CDATA[D-Wave Science & Technology]]></category>
		<category><![CDATA[Learning to program the D-Wave One]]></category>
		<category><![CDATA[Quantum computer programming]]></category>
		<category><![CDATA[Sparse Coding]]></category>

		<guid isPermaLink="false">http://dwave.wordpress.com/?p=2479</guid>
		<description><![CDATA[For Christmas this year, my dad bought me a book called Endurance: Shackleton&#8217;s Incredible Voyage, by Alfred Lansing. It is a true story about folks who survive incredible hardship for a long time. You should read it. Shackleton&#8217;s family motto &#8230; <a href="http://dwave.wordpress.com/2013/04/17/sparse-coding-on-d-wave-hardware-things-that-dont-work/">Continue reading <span class="meta-nav">&#8594;</span></a><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=dwave.wordpress.com&#038;blog=336042&#038;post=2479&#038;subd=dwave&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<div id="attachment_2500" class="wp-caption alignright" style="width: 310px"><a href="http://dwave.files.wordpress.com/2013/04/shackleton02_home_01.jpg"><img class="size-medium wp-image-2500" alt="Ice, ice baby." src="http://dwave.files.wordpress.com/2013/04/shackleton02_home_01.jpg?w=300&#038;h=225" width="300" height="225" /></a><p class="wp-caption-text">Ice, ice baby.</p></div>
<p>For Christmas this year, my dad bought me a book called <a href="http://www.amazon.com/Endurance-Shackletons-Incredible-Alfred-Lansing/dp/078670621X">Endurance: Shackleton&#8217;s Incredible Voyage, by Alfred Lansing.</a></p>
<p>It is a true story about folks who survive incredible hardship for a long time. You should read it.</p>
<p>Shackleton&#8217;s family motto was <a href="http://main.wgbh.org/imax/shackleton/sirernest.html">Fortitudine Vincimus &#8212; &#8220;by endurance we conquer&#8221;</a>. I like this a lot.</p>
<p>On April 22nd, we celebrate the 14th anniversary of the incorporation of D-Wave. Over these past 14 years, nearly everything we&#8217;ve tried hasn&#8217;t worked. While we haven&#8217;t had to eat penguin (yet), and to my knowledge no amputations have been necessary, it hasn&#8217;t been a walk in the park. The first ten things you think of always turn out to be dead ends or won&#8217;t work for some reason or other.</p>
<p>Here I&#8217;m going to share an example of this with the sparse coding problem by describing two things we tried that didn&#8217;t work, and why.</p>
<p><strong>Where we got to last time</strong></p>
<p>In the last post, we boiled down the hardness of L0-norm sparse coding to the solution of a large number of QUBOs of the form</p>
<p>Find <img src='http://s0.wp.com/latex.php?latex=%5Cvec%7Bw%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;vec{w}' title='&#92;vec{w}' class='latex' /> that minimizes</p>
<p><img src='http://s0.wp.com/latex.php?latex=G%28%5Cvec%7Bw%7D%3B+%5Clambda%29+%3D+%5Csum_%7Bj%3D1%7D%5E%7BK%7D+w_j+%5B+%5Clambda+%2B+%5Cvec%7Bd%7D_j+%5Ccdot+%28%5Cvec%7Bd%7D_j+-2+%5Cvec%7Bz%7D%29+%5D+%2B+2+%5Csum_%7Bj+%3C+m%7D%5EK+w_j+w_m+%5Cvec%7Bd%7D_j+%5Ccdot+%5Cvec%7Bd%7D_m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G(&#92;vec{w}; &#92;lambda) = &#92;sum_{j=1}^{K} w_j [ &#92;lambda + &#92;vec{d}_j &#92;cdot (&#92;vec{d}_j -2 &#92;vec{z}) ] + 2 &#92;sum_{j &lt; m}^K w_j w_m &#92;vec{d}_j &#92;cdot &#92;vec{d}_m' title='G(&#92;vec{w}; &#92;lambda) = &#92;sum_{j=1}^{K} w_j [ &#92;lambda + &#92;vec{d}_j &#92;cdot (&#92;vec{d}_j -2 &#92;vec{z}) ] + 2 &#92;sum_{j &lt; m}^K w_j w_m &#92;vec{d}_j &#92;cdot &#92;vec{d}_m' class='latex' /></p>
<p>I then showed that using this form has advantages (at least for getting a maximally sparse encoding of MNIST) over the more typical L1-norm version of sparse coding.</p>
<p>I also mentioned that we used a variant of <a href="https://projects.coin-or.org/metslib">tabu search</a> to solve these QUBOs. Here I&#8217;m going to outline two strategies we tried to use the hardware to beat tabu that ended up not working.</p>
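<p>For readers who want to see the QUBO above as actual numbers, here is a small sketch that builds its coefficients from a dictionary and one data object. The function names and the array layout (atoms as columns of <code>D</code>) are assumptions for this example. The cross term counts each pair of distinct atoms once, because for binary weights the squared terms are already absorbed into the linear part; with that convention the QUBO value equals the reconstruction error plus the regularization term, minus the constant <code>z . z</code>.</p>

```python
import numpy as np


def sparse_coding_qubo(D, z, lam):
    """Return (linear, quadratic) QUBO coefficients for one data object z.

    D is an (N, K) array whose columns are the dictionary atoms d_j.
    """
    gram = D.T @ D                               # gram[j, m] = d_j . d_m
    linear = lam + np.diag(gram) - 2.0 * (D.T @ z)
    quadratic = 2.0 * np.triu(gram, k=1)         # each pair once, j before m
    return linear, quadratic


def qubo_value(w, linear, quadratic):
    """Evaluate G(w; lambda) for a binary vector w."""
    return float(linear @ w + w @ quadratic @ w)


# Hypothetical random data, only to check the identity described above.
rng = np.random.default_rng(0)
D = rng.normal(size=(5, 4))
z = rng.normal(size=5)
w = np.array([1.0, 0.0, 1.0, 1.0])
linear, quadratic = sparse_coding_qubo(D, z, 0.3)
direct = np.sum((z - D @ w) ** 2) + 0.3 * w.sum() - z @ z
print(np.isclose(qubo_value(w, linear, quadratic), direct))  # prints True
```

<p>The off-diagonal entries of <code>quadratic</code> are exactly the dot products between atoms, which is why these problems come out fully connected unless the atoms are constrained.</p>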
<p><strong>These QUBOs are fully connected, and the hardware isn&#8217;t</strong></p>
<p>The terms in the QUBO that connect variables <img src='http://s0.wp.com/latex.php?latex=j&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='j' title='j' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m' title='m' class='latex' /> are proportional to the dot product of the <img src='http://s0.wp.com/latex.php?latex=j%5E%7Bth%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='j^{th}' title='j^{th}' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=m%5E%7Bth%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='m^{th}' title='m^{th}' class='latex' /> dictionary atoms <img src='http://s0.wp.com/latex.php?latex=%5Cvec%7Bd%7D_j&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;vec{d}_j' title='&#92;vec{d}_j' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=%5Cvec%7Bd%7D_m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;vec{d}_m' title='&#92;vec{d}_m' class='latex' />. Because we haven&#8217;t added any restrictions on what these atoms need to look like, these dot products can all be non-zero (the dictionary atoms don&#8217;t need to be, and in general won&#8217;t be, orthogonal). This means that the problems generated by the procedure are all fully connected &#8212; each variable is influenced by every other variable.</p>
<p>Unfortunately, when you build a physical quantum computing chip, this full connectivity can&#8217;t be achieved. The chip you get to work with connects any given variable with only a small number of other variables.</p>
<p>There are two ways we know of to get around the mismatch of the connectivity of a problem we want to solve, and the connectivity of the hardware. The first is called <strong>embedding</strong>, and the second is by using the hardware to perform a type of large neighborhood local search as a component of a hybrid algorithm we call <strong>BlackBox</strong>.</p>
<p><strong>Solving problems by embedding</strong></p>
<p>In a quantum computer, qubits are physically connected to only some of the other qubits. In the most recent spin of our design, each qubit is connected to at most 6 other qubits in a specific pattern which we call a Chimera graph. In our first product chip, Rainier, there were 128 qubits. In the current processor, Vesuvius, there are 512.</p>
<p>Chimera graphs are a way to use a regular repeating pattern to tile out a processor. In Rainier, the processor graph was a four by four tiling of an eight qubit unit cell. For Vesuvius, the same unit cell was used, but with an eight by eight tiling.</p>
<p>For a detailed overview of the rationale behind embedding, and how it works in practice for Chimera graphs, see <a href="http://dwave.wordpress.com/2008/10/21/the-128-qubit-rainier-chip-i-interconnect-topology/">here</a> and <a href="http://dwave.wordpress.com/2008/10/23/the-128-qubit-rainier-processor-ii-graph-embedding/">here</a>, which discuss embedding into the 128-qubit Rainier graph (Vesuvius is the same, just more qubits).</p>
<p>The short version is that an embedding is a map from the variables of the problem you wish to solve to the physical qubits in a processor, where the map can be one-to-many (each variable can be mapped to many physical qubits). To preserve the problem structure we strongly &#8216;lock together&#8217; qubits corresponding to the same variable.</p>
<p><a href="http://dwave.files.wordpress.com/2013/04/ramsey_number_embedding.png"><img class="alignright size-medium wp-image-2490" alt="ramsey_number_embedding" src="http://dwave.files.wordpress.com/2013/04/ramsey_number_embedding.png?w=300&#038;h=293" width="300" height="293" /></a>In the case of fully connected QUBOs like the ones we have here, it is known that you can always embed a fully connected graph with <img src='http://s0.wp.com/latex.php?latex=K&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='K' title='K' class='latex' /> vertices into a Chimera graph with <img src='http://s0.wp.com/latex.php?latex=%28K-1%29%5E2%2F2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='(K-1)^2/2' title='(K-1)^2/2' class='latex' /> physical qubits &#8212; Rainier can embed a fully connected 17 variable graph, while Vesuvius can embed a fully connected 33 variable graph. Shown to the right is an <a href="http://arxiv.org/pdf/1201.1842.pdf">embedding from this paper</a> into Rainier, for solving a problem that computes Ramsey numbers. In the processor graph, qubits colored the same represent the same computational variable.</p>
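<p>The 17 and 33 variable figures follow directly from the qubit count quoted above. Here is a quick sketch of the arithmetic; the function names are made up for this example, and the formula is simply the (K-1)^2/2 count from the text, assuming every qubit on the chip is usable.</p>

```python
import math


def qubits_needed(num_variables):
    """Physical qubits for a fully connected problem, using (K - 1)**2 / 2."""
    return (num_variables - 1) ** 2 / 2


def max_fully_connected_variables(num_qubits):
    """Largest K with (K - 1)**2 / 2 not exceeding num_qubits."""
    return math.isqrt(2 * num_qubits) + 1


print(max_fully_connected_variables(128))  # Rainier: prints 17
print(max_fully_connected_variables(512))  # Vesuvius: prints 33
print(qubits_needed(64))                   # prints 1984.5
```

<p>The last line shows why embedding gets expensive quickly: a fully connected problem with 64 variables would need roughly two thousand qubits under this count.</p>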
<p>So one way we could use Vesuvius to solve the sparse coding QUBOs is to restrict <img src='http://s0.wp.com/latex.php?latex=K&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='K' title='K' class='latex' /> to be 33 or less and embed these problems. However this is unsatisfactory for two (related) reasons. The first is that 33 dictionary atoms isn&#8217;t enough for what we ultimately want to do (sparse coding on big data sets). The second is that QUBOs generated by the procedure I&#8217;ve described are really easy for tabu search at that scale. For problems this small, tabu gives excellent performance with a per problem timeout of about 10 milliseconds (about the same as the runtime for a single problem on Vesuvius), and since it can be run in the cloud, we can take advantage of massive parallelism as well. So even though on a problem by problem basis, Vesuvius is competitive at this scale, when you gang up say 1,000 cores against it, Vesuvius loses (because there aren&#8217;t a thousand of them available&#8230; yet <img src='http://s0.wp.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' />  ).</p>
<p>So this option, while we can do it, is out. At the stage we&#8217;re at now this approach can&#8217;t compete with cloud-enabled tabu. Maybe when we have a lot more qubits.</p>
<p><strong>Solving sparse coding QUBOs using BlackBox</strong></p>
<p>BlackBox is an algorithm developed at D-Wave. <a href="http://www.dwavesys.com/en/dev-tutorial-blackbox.html">Here is a high level introduction to how it works.</a> It is designed to solve problems where all we&#8217;re given is a black box that converts possible answers to binary optimization problems into real numbers denoting how good those possible answers are. For example, the configuration of an airplane wing could be specified as a bit string, and to know how &#8216;good&#8217; that configuration was, we might need to actually construct that example and put it in a wind tunnel and measure it. Or maybe just doing a large-scale supercomputer simulation is enough. But the relationship between the settings of the binary variables and the quality of the answer in problems like this is not easily specified in a closed form, like we were able to do with the sparse coding QUBOs.</p>
<p>BlackBox is based on tabu search, but uses the hardware to generate a model of the objective function around each search point that expands possibilities for next moves beyond single bit flips. This modelling and sampling from hardware at each tabu step increases the time per step, but decreases the number of steps required to reach some target value of the objective function. As the cost of evaluating the objective function goes up, the gain in making fewer &#8216;steps&#8217; by making better moves at each tabu step goes up. However, if the objective function can be evaluated very quickly, tabu generally beats BlackBox: without the additional cost of the BlackBox modeling and hardware sampling step, it can make many more guesses per unit time.</p>
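<p>The tradeoff described above can be captured in a toy cost model. This is only a sketch under strong assumptions: total time is taken to be (number of steps) x (cost of one objective evaluation + fixed per-step overhead), plain tabu has no overhead, and all the numbers in the example are made up for illustration rather than measured.</p>

```python
def total_time(num_steps, eval_cost, overhead_per_step=0.0):
    """Toy model: every step pays one evaluation plus a fixed overhead."""
    return num_steps * (eval_cost + overhead_per_step)


def break_even_eval_cost(tabu_steps, blackbox_steps, overhead_per_step):
    """Evaluation cost above which the BlackBox-style search is faster."""
    if blackbox_steps >= tabu_steps:
        return float("inf")   # no saving in steps, so it never catches up
    return overhead_per_step * blackbox_steps / (tabu_steps - blackbox_steps)


# Hypothetical numbers: 10x fewer steps, 0.09 s of extra work per step.
threshold = break_even_eval_cost(1000, 100, 0.09)
print(total_time(1000, 1.0), total_time(100, 1.0, 0.09))        # expensive objective
print(total_time(1000, 0.0001), total_time(100, 0.0001, 0.09))  # cheap objective
```

<p>In this model the method with fewer steps only wins once a single evaluation costs more than the threshold, which is the regime of wind tunnels and supercomputer simulations rather than cheap closed-form QUBOs.</p>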
<p>BlackBox can be applied to fully connected QUBOs of arbitrary size, which makes it better than embedding here because we lose the restriction to small numbers of dictionary atoms. With BlackBox we can try any size problem and see how it does.</p>
<p>We did this, and unfortunately BlackBox on Vesuvius is not competitive with cloud-enabled tabu search for any of the problem sizes we tried (which were, admittedly, still pretty small &#8212; up to 50 variables). I suspect that this will continue to hold, no matter how large these problems get, for the following reasons:</p>
<ol>
<li>The inherently parallel nature of the sparse coding problem (<img src='http://s0.wp.com/latex.php?latex=S&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S' title='S' class='latex' /> independent QUBOs) means that we will always be up against multiple cores vs. a small number of Vesuvius processors. This factor can be significant &#8212; for a large problem with millions of data objects, this factor can easily be in the thousands or tens of thousands.</li>
<li>BlackBox is designed for objective functions that are really black boxes, so that there is no obvious way to attack the structure of the problem directly, and where it is very expensive to evaluate the objective function. This is not the case for these problems &#8212; they are QUBOs and this means that attacks can be made directly based on this known fact. For these problems, the current version of BlackBox, while it can certainly be used, is not in its sweet spot, and wouldn&#8217;t be expected to be competitive with tabu in the cloud.</li>
</ol>
<p>And this is exactly what we find &#8212; BlackBox on Vesuvius is not competitive with tabu on the cloud for any of the problem sizes we tried. Note that there is a small caveat here &#8212; it is possible (although I think unlikely) that for very large numbers of atoms (say low thousands) this could change, and BlackBox could start winning. However for both of the reasons listed above I would bet against this.</p>
<p><strong>What to do, what to do</strong></p>
<p>We tried both obvious tactics for using our gear to solve these problems, and both lost to a superior classical approach. So do we give up and go home? Of course not!</p>
<p><a href="http://www.youtube.com/watch?v=zNg4OjWkLns">We shall go on to the end&#8230; we shall never surrender!!!</a></p>
<p>We just need to do some mental gymnastics here and be creative.</p>
<p>In both of the approaches above, we tried to shoehorn the problem our application generates into the hardware. Neither solution was effective.</p>
<p>So let&#8217;s look at this from a different perspective. Is it possible to restrict the problems generated by sparse coding so that they exactly fit in hardware &#8212; so that we require the problems generated to exactly match the hardware graph? If we can achieve this, we may be able to beat the classical competition, as we know that Vesuvius is many orders of magnitude faster than anything that exists on earth for the native problems it&#8217;s solving.</p>
]]></content:encoded>
			<wfw:commentRss>http://dwave.wordpress.com/2013/04/17/sparse-coding-on-d-wave-hardware-things-that-dont-work/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/373bb2536951daebfdc760c9099af9af?s=96&#38;d=http%3A%2F%2F0.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D96" medium="image">
			<media:title type="html">Geordie</media:title>
		</media:content>

		<media:content url="http://dwave.files.wordpress.com/2013/04/shackleton02_home_01.jpg?w=300" medium="image">
			<media:title type="html">Ice, ice baby.</media:title>
		</media:content>

		<media:content url="http://dwave.files.wordpress.com/2013/04/ramsey_number_embedding.png?w=300" medium="image">
			<media:title type="html">ramsey_number_embedding</media:title>
		</media:content>
	</item>
		<item>
		<title>Sparse coding on D-Wave hardware: some results</title>
		<link>http://dwave.wordpress.com/2013/04/14/sparse-coding-on-d-wave-hardware-some-results/</link>
		<comments>http://dwave.wordpress.com/2013/04/14/sparse-coding-on-d-wave-hardware-some-results/#comments</comments>
		<pubDate>Sun, 14 Apr 2013 14:07:43 +0000</pubDate>
		<dc:creator>Geordie</dc:creator>
				<category><![CDATA[Applications]]></category>
		<category><![CDATA[D-Wave Science & Technology]]></category>
		<category><![CDATA[Sparse Coding]]></category>

		<guid isPermaLink="false">http://dwave.wordpress.com/?p=2446</guid>
		<description><![CDATA[Last week I described two variants of sparse coding. One, which is commonly used, attempts to find a dictionary of atoms which can be used to sparsely reconstruct data, where the reconstructions are linear combinations of these atoms multiplied by &#8230; <a href="http://dwave.wordpress.com/2013/04/14/sparse-coding-on-d-wave-hardware-some-results/">Continue reading <span class="meta-nav">&#8594;</span></a><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=dwave.wordpress.com&#038;blog=336042&#038;post=2446&#038;subd=dwave&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>Last week I described two variants of sparse coding. One, which is commonly used, attempts to find a dictionary of atoms which can be used to sparsely reconstruct data, where the reconstructions are linear combinations of these atoms multiplied by real numbers, and the regularization term has the L1-norm form. We called this L1-norm sparse coding. I described the way we solve the optimization problem underlying this method.</p>
<p>The second approach is generally not used, and I believe very little is known about it. The procedure is identical to the usual approach except for one small change. Instead of real numbers, we attempt to reconstruct the data using a linear combination of dictionary atoms where the weights are <strong>binary</strong>, and we use L0-norm regularization. We called this L0-norm sparse coding.</p>
<p>To solve the L0-norm sparse coding optimization problem, we use the same procedure &#8212; called block coordinate descent &#8212; that was successfully used for the L1 version. The difference is that in the L1 version the optimization over the weights can be done efficiently, whereas in the L0 version that optimization is NP-hard. The problem separates into <img src='http://s0.wp.com/latex.php?latex=S&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S' title='S' class='latex' /> independent problems of finding the optimal weights for each of the <img src='http://s0.wp.com/latex.php?latex=S&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S' title='S' class='latex' /> objects in our data set. These problems are of the form</p>
<p>Find <img src='http://s0.wp.com/latex.php?latex=%5Cvec%7Bw%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;vec{w}' title='&#92;vec{w}' class='latex' /> that minimizes</p>
<p><img src='http://s0.wp.com/latex.php?latex=G%28%5Cvec%7Bw%7D%3B+%5Clambda%29+%3D+%5Csum_%7Bj%3D1%7D%5E%7BK%7D+w_j+%5B+%5Clambda+%2B+%5Cvec%7Bd%7D_j+%5Ccdot+%28%5Cvec%7Bd%7D_j+-2+%5Cvec%7Bz%7D%29+%5D+%2B+2+%5Csum_%7Bj+%3C+m%7D%5EK+w_j+w_m+%5Cvec%7Bd%7D_j+%5Ccdot+%5Cvec%7Bd%7D_m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='G(&#92;vec{w}; &#92;lambda) = &#92;sum_{j=1}^{K} w_j [ &#92;lambda + &#92;vec{d}_j &#92;cdot (&#92;vec{d}_j -2 &#92;vec{z}) ] + 2 &#92;sum_{j &lt; m}^K w_j w_m &#92;vec{d}_j &#92;cdot &#92;vec{d}_m' title='G(&#92;vec{w}; &#92;lambda) = &#92;sum_{j=1}^{K} w_j [ &#92;lambda + &#92;vec{d}_j &#92;cdot (&#92;vec{d}_j -2 &#92;vec{z}) ] + 2 &#92;sum_{j &lt; m}^K w_j w_m &#92;vec{d}_j &#92;cdot &#92;vec{d}_m' class='latex' /></p>
<p>which is a QUBO. So to perform the optimization over the weights, we need to solve <img src='http://s0.wp.com/latex.php?latex=S&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S' title='S' class='latex' /> independent QUBOs at each iteration of the block descent. There are many ways to do this, including using D-Wave hardware. The results I&#8217;ll show below were obtained using a variant of tabu search.</p>
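<p>To show what the weights step looks like in code, here is a sketch that solves each of the independent problems by exhaustive enumeration. This is not the tabu variant used for the results below; brute force is only practical for a handful of atoms and is swapped in here to keep the example short. The function names and array layout are assumptions, and the code minimizes the full objective (reconstruction error plus regularization), which differs from the QUBO value only by a constant that does not depend on the weights.</p>

```python
import itertools

import numpy as np


def best_binary_weights(D, z, lam):
    """Exhaustively minimise ||z - D w||^2 + lam * sum(w) over binary w."""
    K = D.shape[1]
    best_w, best_cost = None, np.inf
    for bits in itertools.product((0, 1), repeat=K):
        w = np.array(bits, dtype=float)
        cost = float(np.sum((z - D @ w) ** 2) + lam * w.sum())
        if best_cost > cost:
            best_w, best_cost = w, cost
    return best_w, best_cost


def weights_step(D, Z, lam):
    """Solve one independent problem per column of Z (the S data objects)."""
    columns = [best_binary_weights(D, Z[:, s], lam)[0]
               for s in range(Z.shape[1])]
    return np.column_stack(columns)


# Hypothetical toy data: 3 atoms (identity columns) and 2 data objects.
D = np.eye(3)
Z = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
print(weights_step(D, Z, lam=0.1))
```

<p>Because the problems for different data objects never interact, the loop in <code>weights_step</code> can be spread across as many cores as you have, which is what makes the classical competition so strong.</p>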
<p>Here I&#8217;m going to show some interesting results on some differences between these two approaches on MNIST.</p>
<p><strong>Some initial results</strong></p>
<p>Sparse coding attempts to minimize the reconstruction error, which is the total summed difference between the initial data &#8212; the ground truth &#8212; and the system&#8217;s reconstructions of this data, using only linear combinations of the dictionary atoms. It does this subject to a condition which penalizes including too many of these atoms in any given reconstruction &#8212; this is the regularization term.</p>
<p>As the strength of the regularization increases &#8212; which means <img src='http://s0.wp.com/latex.php?latex=%5Clambda&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;lambda' title='&#92;lambda' class='latex' /> is made bigger &#8212; fewer and fewer atoms can be used to reconstruct each data object. Here we define the <strong>average sparsity</strong> to be the average number of dictionary atoms used in the reconstructions.</p>
<p>For many reasons, it is desirable to have as low an average sparsity as we can get away with. Now what &#8216;we can get away with&#8217; is highly context dependent. What we decide this means depends on what we are ultimately attempting to do. One of the ways we can decide what this means is to use these ideas for <a href="https://en.wikipedia.org/wiki/Lossy_compression">lossy compression </a>&#8211; we want to find a way to represent our data set with as few bits as we can, and are willing to accept some user-defined amount of degradation in our data to achieve this. To do this, we attempt to do the following:</p>
<p>Find the value of <img src='http://s0.wp.com/latex.php?latex=%5Clambda&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;lambda' title='&#92;lambda' class='latex' /> that minimizes average sparsity, subject to the requirement that the reconstruction error be lower than a user supplied threshold.</p>
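<p>Once you have run sparse coding for a few values of the regularization strength, the selection rule above is a simple filter-then-minimize. Here is a sketch, assuming the runs are summarized as (lambda, average sparsity, reconstruction error) tuples; that layout, the function name and the numbers in the example are all hypothetical.</p>

```python
def pick_lambda(results, max_error):
    """Pick the lambda with the lowest average sparsity among acceptable runs.

    results: iterable of (lam, average_sparsity, reconstruction_error).
    Returns None when no run has reconstruction error below max_error.
    """
    feasible = [r for r in results if max_error > r[2]]
    if not feasible:
        return None
    return min(feasible, key=lambda r: r[1])[0]


# Made-up numbers, only to show the mechanics.
runs = [(0.01, 6.0, 2000.0), (0.05, 3.0, 3500.0), (0.20, 1.5, 7000.0)]
print(pick_lambda(runs, max_error=5000.0))  # prints 0.05
```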
<p>One of the interesting initial results of comparing the L0 and L1-norm versions of sparse coding is that the L0 version gives much sparser representations for the same reconstruction error. Here is a plot of the total reconstruction error over <img src='http://s0.wp.com/latex.php?latex=S+%3D+60%2C000&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='S = 60,000' title='S = 60,000' class='latex' /> MNIST training images as a function of the average number of atoms per image used in these reconstructions, for both L1 and L0-norm sparse coding. In this experiment we used <img src='http://s0.wp.com/latex.php?latex=K+%3D+64&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='K = 64' title='K = 64' class='latex' /> dictionary atoms, and the data objects were representations of the raw MNIST images including the first 30 SVD modes (so each image was represented in its ground truth using 30 real numbers). Each data point corresponds to a different value of <img src='http://s0.wp.com/latex.php?latex=%5Clambda&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;lambda' title='&#92;lambda' class='latex' />, where the points to the left have larger <img src='http://s0.wp.com/latex.php?latex=%5Clambda&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;lambda' title='&#92;lambda' class='latex' /> (more sparse) with <img src='http://s0.wp.com/latex.php?latex=%5Clambda&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;lambda' title='&#92;lambda' class='latex' /> decreasing to the right (less sparse).</p>
<p><a href="http://dwave.files.wordpress.com/2013/04/20130401_reconstruction_error_l0_and_l1.png"><img alt="20130401_reconstruction_error_L0_and_L1" src="http://dwave.files.wordpress.com/2013/04/20130401_reconstruction_error_l0_and_l1.png?w=584&#038;h=391" width="584" height="391" /></a></p>
<p>This is a very cool result. For the same reconstruction error, the L0 version requires roughly one half the number of atoms. In addition, the L1 version&#8217;s weights are real numbers, which require several bits to specify (up to 64, but likely in practice 8 would be enough), whereas by construction the L0 version&#8217;s weights are binary. So if our objective is compression, L0 wins both from the perspective of number of atoms required and the number of bits per atom.</p>
<p>If we represent the L1 weights using 8 bits, and L0 uses half as many atoms to get the same quality reconstruction, this means that for the figure above L0 achieves about 16x more compression than L1 for the same reconstruction quality.</p>
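<p>The 16x figure is just two factors multiplied together, as this small sketch shows. It follows the accounting used above, counting only the bits spent on weights and ignoring the bits needed to say which atoms were used; the atom count is an arbitrary illustrative value.</p>

```python
def weight_bits(atoms_per_image, bits_per_weight):
    """Bits spent on weights for one image under this simple accounting."""
    return atoms_per_image * bits_per_weight


l0_atoms = 3.0                            # hypothetical average for L0
l1_bits = weight_bits(2 * l0_atoms, 8)    # L1: twice the atoms, 8-bit weights
l0_bits = weight_bits(l0_atoms, 1)        # L0: binary weights
print(l1_bits / l0_bits)                  # prints 16.0
```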
<p><strong>What a dictionary looks like</strong></p>
<p>Once the sparse coding procedure is complete, we can look at the dictionary atoms. Shown here are the dictionary atoms found during an L0 run with <img src='http://s0.wp.com/latex.php?latex=%5Clambda+%3D+0.035&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;lambda = 0.035' title='&#92;lambda = 0.035' class='latex' />, which gave an average sparsity of 2.25 atoms / image. In the figure, the 64 dictionary atoms are shown in the top part. The bottom part shows the ground truth for the first 48 images of the 60,000 total. The middle part shows the reconstructions, and which dictionary atoms were used in these reconstructions. If more than two atoms were used, the notation is &gt;&gt;#Y, where Y is the number of atoms.</p>
<p><a href="http://dwave.files.wordpress.com/2013/04/20130328_lambda_0p035_svd30_k64_tabu_10.jpeg"><img alt="20130328_lambda_0p035_svd30_k64_tabu_10" src="http://dwave.files.wordpress.com/2013/04/20130328_lambda_0p035_svd30_k64_tabu_10.jpeg?w=584&#038;h=584" width="584" height="584" /></a></p>
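<p>To make the role of lambda concrete, here is a small Python sketch. It assumes the L0 objective for a single image is the squared reconstruction error plus lambda times the number of atoms used, with binary weights. The two-atom dictionary, the data vector and the function names are invented for illustration, and the brute-force search is only a stand-in for the solvers used in the actual runs.</p>

```python
from itertools import product


def l0_objective(z, atoms, w, lam):
    """Squared reconstruction error plus lam times the number of atoms used."""
    recon = [sum(wk * atom[i] for wk, atom in zip(w, atoms)) for i in range(len(z))]
    err = sum((zi - ri) ** 2 for zi, ri in zip(z, recon))
    return err + lam * sum(w)


def best_binary_weights(z, atoms, lam):
    """Try every 0/1 weight vector and keep the one with the lowest objective."""
    candidates = product((0, 1), repeat=len(atoms))
    return min(candidates, key=lambda w: l0_objective(z, atoms, w, lam))


# Toy data (hypothetical): two atoms, one 2-component data vector.
atoms = [[1.0, 0.0], [0.0, 0.5]]
z = [1.0, 0.5]

for lam in (0.1, 0.5, 2.0):
    w = best_binary_weights(z, atoms, lam)
    print(lam, w, sum(w))
```

<p>On this toy problem the number of atoms used drops from two to one to zero as lambda grows, which is the same trade-off the plot sweeps out. Brute force costs 2^K evaluations per image, so it is only usable for tiny K.</p>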
<p>Next, let&#8217;s look at a close-up showing how the reconstructions work for three of the images. The way to understand the image below is that the atoms (on the right) are added together, and what this addition gives is the reconstruction (second column from left), which we can compare to the ground truth (leftmost column). One technical detail is that prior to doing the sparse coding procedure, we subtracted off the &#8220;average&#8221; value of the pixels over the entire 60,000 image data set, which gets added back here. So the reconstructions are always &#8216;average image&#8217; + sum over the atoms in the reconstruction. If you look at the middle one (the nine), the difference between atom #3 and the reconstruction of the nine is the average image.</p>
<p><a href="http://dwave.files.wordpress.com/2013/04/20130328_lambda_0p035_svd30_k64_tabu_10_v3.jpeg"><img alt="20130328_lambda_0p035_svd30_k64_tabu_10_v3" src="http://dwave.files.wordpress.com/2013/04/20130328_lambda_0p035_svd30_k64_tabu_10_v3.jpeg?w=584&#038;h=487" width="584" height="487" /></a></p>
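<p>The reconstruction rule described above (average image plus the sum of the selected atoms) can be sketched in a few lines of Python. The three-pixel "images" and the function name are hypothetical, chosen only to keep the example readable; the real data objects have 30 components each.</p>

```python
def reconstruct(mean_image, atoms, weights):
    """Return mean_image plus the sum of every atom whose binary weight is 1."""
    recon = list(mean_image)
    for atom, w in zip(atoms, weights):
        if w:
            recon = [r + a for r, a in zip(recon, atom)]
    return recon


# Toy stand-ins for the average image and a three-atom dictionary.
mean_image = [0.5, 0.5, 0.5]
atoms = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

# Reconstruction that uses atoms 0 and 2.
print(reconstruct(mean_image, atoms, [1, 0, 1]))  # [1.5, 0.5, 1.5]

# With a single atom, reconstruction minus that atom is the average image,
# as with the nine in the figure.
single = reconstruct(mean_image, atoms, [0, 1, 0])
print([s - a for s, a in zip(single, atoms[1])])  # [0.5, 0.5, 0.5]
```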
<p>Next, let&#8217;s look at a dictionary obtained from the L1 procedure, with <img src='http://s0.wp.com/latex.php?latex=%5Clambda+%3D+0.25&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;lambda = 0.25' title='&#92;lambda = 0.25' class='latex' /> chosen to give roughly the same average sparsity as above (here the average number of atoms / image was 2.16). You can see by eye that the jump in reconstruction error (from about 4,000 for L0 to about 6,500 for L1 in the first graph above) makes for poor reconstructions. Note that the L1 procedure doesn&#8217;t always give poor reconstructions; they can be quite excellent. You just need more atoms on average to get there.</p>
<p><a href="http://dwave.files.wordpress.com/2013/04/20130401_lambda_0p25_fss_v2.jpeg"><img alt="20130401_lambda_0p25_fss_v2" src="http://dwave.files.wordpress.com/2013/04/20130401_lambda_0p25_fss_v2.jpeg?w=584&#038;h=584" width="584" height="584" /></a></p>
<p><strong>This is great, but where does the hardware fit in?</strong></p>
<p>There is some debate about whether L0-norm sparse coding actually buys you anything over the L1 version in practice. While anecdotal results such as the above really don&#8217;t do much to settle this issue, it is nonetheless encouraging to see that recasting this basic workhorse procedure in this way does seem to provide a significant benefit &#8212; at least for compressing MNIST.</p>
<p>Of course the reason we&#8217;ve been thinking so hard about this issue is that we want to find problems we can apply quantum computation to. Given that the basic L0-norm sparse coding procedure seems to provide intriguing benefits over the L1 version, the second order question is whether quantum computers can be used to solve the optimization problems underlying this version more effectively than you could otherwise do. This will be the subject of my next post.</p>
]]></content:encoded>
			<wfw:commentRss>http://dwave.wordpress.com/2013/04/14/sparse-coding-on-d-wave-hardware-some-results/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/373bb2536951daebfdc760c9099af9af?s=96&#38;d=http%3A%2F%2F0.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D96" medium="image">
			<media:title type="html">Geordie</media:title>
		</media:content>

		<media:content url="http://dwave.files.wordpress.com/2013/04/20130401_reconstruction_error_l0_and_l1.png" medium="image">
			<media:title type="html">20130401_reconstruction_error_L0_and_L1</media:title>
		</media:content>

		<media:content url="http://dwave.files.wordpress.com/2013/04/20130328_lambda_0p035_svd30_k64_tabu_10.jpeg" medium="image">
			<media:title type="html">20130328_lambda_0p035_svd30_k64_tabu_10</media:title>
		</media:content>

		<media:content url="http://dwave.files.wordpress.com/2013/04/20130328_lambda_0p035_svd30_k64_tabu_10_v3.jpeg" medium="image">
			<media:title type="html">20130328_lambda_0p035_svd30_k64_tabu_10_v3</media:title>
		</media:content>

		<media:content url="http://dwave.files.wordpress.com/2013/04/20130401_lambda_0p25_fss_v2.jpeg" medium="image">
			<media:title type="html">20130401_lambda_0p25_fss_v2</media:title>
		</media:content>
	</item>
	</channel>
</rss>
