<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Ewin Tang</title>
    <description>Ewin&apos;s website</description>
    <link>https://www.ewintang.com/</link>
    <atom:link href="https://www.ewintang.com/feed.xml" rel="self" type="application/rss+xml" />
    
      <item>
        <title>Some open problems</title>
        <description>&lt;p&gt;At last year’s FOCS I participated in a &lt;a href=&quot;https://jerryzli.github.io/focs24-workshop.html&quot;&gt;workshop&lt;/a&gt; on quantum learning, which included an open problem session. Here are the open problems I contributed. Check out the &lt;a href=&quot;https://jerryzli.github.io/focs2024-slides/open_questions.pdf&quot;&gt;complete list&lt;/a&gt; for more, including the more “classic” problems in the field (e.g. entanglement testing, shadow tomography).&lt;/p&gt;

&lt;p&gt;The questions here are mathematically precise, but I don’t have ideas for how to prove them. &lt;a href=&quot;https://www.stat.berkeley.edu/~aldous/Research/OP/index.html&quot;&gt;David Aldous&lt;/a&gt; would call these “Type 2” problems. For clarity, I won’t explain the mathematical formalism of Hamiltonians, etc., but the glossary at the bottom of this page contains precise versions of the terms I’m using.&lt;/p&gt;

&lt;h2 id=&quot;why-work-on-quantum-learning&quot;&gt;Why work on quantum learning?&lt;/h2&gt;

&lt;p&gt;When I say quantum learning here, I mean questions in the following vein: we have an unknown quantum system. How can I run experiments on this quantum system to learn its behavior efficiently?&lt;/p&gt;

&lt;p&gt;There are a bunch of standard motivations for these questions which I’ve explained in more detail in my papers: (1) it can be used in practice (even, potentially, in the near-term); (2) it’s closely connected to basic questions about correlations in quantum systems; (3) it naturally generalizes various landmark results in classical learning theory. But beyond this, I think this field is natural and important, given the philosophical perspectives informing the development of quantum computing. These perspectives are famous in physics (see &lt;a href=&quot;https://www.science.org/doi/10.1126/science.177.4047.393&quot;&gt;More is different&lt;/a&gt; and &lt;a href=&quot;https://www.pnas.org/doi/10.1073/pnas.97.1.28&quot;&gt;The theory of everything&lt;/a&gt;) but I personally only learned about them recently, and they strongly inform my thinking about the topic now.&lt;/p&gt;

&lt;p&gt;The biggest successes in physics are in reducing the phenomena of everyday objects down to fundamental laws governing their constituent particles. To understand the behavior of a system, we can isolate a small piece, run experiments at the small scale, and extrapolate the findings out to the large-scale behavior. The example of interest to us is non-relativistic quantum mechanics, which is a theory of most things (notable exceptions being light and gravity). There are plenty of things that this theory explains, and by our current understanding, a few equations are enough to explain them fully.&lt;/p&gt;

&lt;p&gt;But this “reductionist” perspective is insufficient: with quantum mechanics, we understand the fundamental particles, and yet, when we put many of these particles together to form a material, a chemical reaction, or a person, we lose understanding of their behavior. The issue here is that extrapolating the behavior of particles to systems of many particles can be very hard. This is especially true for quantum mechanics, because it has an inherent exponentiality to its mathematical formalism.&lt;/p&gt;

&lt;p&gt;With quantum computers, we want to build a bridge between the two. One barrier (perhaps &lt;em&gt;the&lt;/em&gt; barrier) to understanding large quantum systems is computational intractability: simulating them from the laws of quantum mechanics is intractable. Using quantum computers, simulation can be done efficiently, and we can actually use our small-scale understanding for large systems.&lt;/p&gt;

&lt;p&gt;Quantum learning poses the same philosophical question from another perspective. If we have a large system that we’d like to understand, experiments to characterize it can become intractable, even though we know how to run experiments to characterize small-scale systems. There are practical reasons why big quantum systems are hard to wrangle, but even supposing we had perfect quantum control over our systems, extracting information about these systems could conceivably be inherently intractable, for the same reason that simulation is hard. Quantum learning asks whether we can relate the large-scale and the small-scale for &lt;em&gt;experiments&lt;/em&gt;, by making these experiments computationally efficient.&lt;/p&gt;

&lt;p&gt;That some kind of quantum learning works is paramount to the success of quantum computing as a program: after all, what is the point of working with large quantum systems if we do not have ways to extract insights from them? I do think there’s room to disagree on whether the current directions are the right ones towards the goal of, say, making scientific progress. Nevertheless, I hope the questions which I’m suggesting to you now are natural enough to stay true to the spirit of the program, while still producing ideas that can be specialized to practically relevant settings.&lt;/p&gt;

&lt;h2 id=&quot;1-learning-ground-states-in-polynomial-time&quot;&gt;1. Learning ground states in polynomial time&lt;/h2&gt;

&lt;p&gt;A striking phenomenon in learning theory is that, even for “hard” mathematical objects, learning can be a surprisingly easy task. Our dream is that quantum computers can help scientists who work with quantum systems, and the main computational task that such scientists struggle with is simulating ground states. For a Hamiltonian \(H\), its ground state is the eigenvector \(\ket{\Psi}\) with the largest eigenvalue. Simulation is known to be hard in the worst case, even on a quantum computer, so if we want efficient algorithms, we must find a way to weave through these computational barriers to find practical advantage.&lt;/p&gt;

&lt;p&gt;The dual task to simulating ground states is learning ground states. Unlike for simulation, for learning, there are no barriers: we can hope for algorithms which always just work.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Question.&lt;/strong&gt; Given copies of a ground state \(\ket{\Psi}\) of a geometrically local, \(n\)-qubit Hamiltonian \(H\), can we output a geometrically local Hamiltonian \(\widehat{H}\) whose ground state is close to \(\ket{\Psi}\), in polynomial time? (For this question, you might need the additional assumption that \(H\) is gapped.)&lt;/p&gt;

&lt;p&gt;Many Hamiltonians can have the same ground state, so the problem is phrased to allow for this non-uniqueness. The classical version of this problem is trivial (given a \(b \in \{0,1\}^n\) which is the best assignment for an unknown CSP, find a CSP for which it is the best solution), but in the quantum world, with non-commuting terms, things get much harder.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What’s known?&lt;/strong&gt; Getting polynomial sample complexity is straightforward: see &lt;a href=&quot;https://arxiv.org/abs/2303.08938&quot;&gt;this paper&lt;/a&gt; of Yu and Wei, for example, though it also follows from generic &lt;a href=&quot;https://arxiv.org/abs/1711.01053&quot;&gt;shadow tomography&lt;/a&gt;. These algorithms have exponential time complexity, though.&lt;/p&gt;

&lt;p&gt;There are also heuristic algorithms in the physics literature, under the keyword of &lt;a href=&quot;https://arxiv.org/abs/1802.07827&quot;&gt;parent Hamiltonians&lt;/a&gt;. These work for simple classes of Hamiltonians, like commuting and (I think even) frustration-free, but not general gapped Hamiltonians. We (Ainesh Bakshi, Allen Liu, Ankur Moitra, and I) had ideas for this general case but none gave a polynomial-time algorithm. This question seems connected to the question of &lt;a href=&quot;https://arxiv.org/abs/1301.1162&quot;&gt;area laws&lt;/a&gt;, but my bet is that this question doesn’t require making progress on that topic.&lt;/p&gt;

&lt;h2 id=&quot;2-does-amplitude-amplification-require-inverses&quot;&gt;2. Does amplitude amplification require inverses?&lt;/h2&gt;

&lt;p&gt;One of the first quantum algorithms you learn is &lt;a href=&quot;https://en.wikipedia.org/wiki/Grover%27s_algorithm&quot;&gt;Grover’s algorithm&lt;/a&gt;, along with the closely-related algorithms of &lt;a href=&quot;https://arxiv.org/abs/quant-ph/0005055&quot;&gt;amplitude amplification and estimation&lt;/a&gt;. These give a quadratic improvement for various kinds of search and estimation tasks.&lt;/p&gt;

&lt;p&gt;For example, to estimate the size of an entry of a unitary matrix, \(\left|\braket{0|U|0}\right|^2\), to \(\varepsilon\) error with probability \(0.99\), the naive approach of:&lt;/p&gt;
&lt;ol&gt;
  &lt;li&gt;Initialize to \(\ket{0}\);&lt;/li&gt;
  &lt;li&gt;Apply \(U\);&lt;/li&gt;
  &lt;li&gt;Measure in the computational basis;&lt;/li&gt;
  &lt;li&gt;Repeat.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;gives an \(\varepsilon\)-good estimate after \(O(1/\varepsilon^2)\) trials, applying \(U\) that many times. But with amplitude estimation, you only need \(O(1/\varepsilon)\) applications of \(U\) and \(U^\dagger\) to estimate the size of the entry.&lt;/p&gt;
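
&lt;p&gt;To make the naive baseline concrete, here is a minimal numpy sketch (mine, purely illustrative and not from any of the linked papers) of steps 1 through 4 above: given \(U\) as an explicit matrix, it simulates the measure-and-repeat loop with \(O(1/\varepsilon^2)\) trials.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;
import numpy as np

def naive_estimate(U, eps, rng=np.random.default_rng()):
    # Estimate |&lt;0|U|0&gt;|^2: prepare U|0&gt;, measure in the computational
    # basis, and average over O(1/eps^2) trials.
    probs = np.abs(U[:, 0]) ** 2         # output distribution of U|0&gt;
    probs = probs / probs.sum()          # guard against floating-point rounding
    trials = int(np.ceil(1 / eps ** 2))  # O(1/eps^2) applications of U
    outcomes = rng.choice(len(probs), size=trials, p=probs)
    return np.mean(outcomes == 0)        # empirical frequency of outcome |0&gt;
&lt;/code&gt;&lt;/pre&gt;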

&lt;p&gt;However, this improvement requires the ability to apply the inverse, \(U^\dagger\). This is a typical feature of these Grover-ish methods. It’s often not a big deal, since in many applications \(U\) is a circuit which can be straightforwardly inverted. But what if \(U\) is a unitary that Nature applies, and we can’t reverse the arrow of time? Do we need the ability to apply \(U^\dagger\)? Or can we get away with just applications of \(U\)?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Question.&lt;/strong&gt; Can we prove that amplitude amplification/estimation cannot be done as efficiently when we only have access to \(U\)?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What’s known?&lt;/strong&gt; This question was previously posed in &lt;a href=&quot;https://arxiv.org/abs/2207.08800&quot;&gt;this paper&lt;/a&gt; of van Apeldoorn, Cornelissen, Gilyén, and Nannicini, but it seems natural enough that it’s probably been posed even earlier. The main challenge here is that standard lower bound techniques are not able to distinguish \(U\) from \(U^\dagger\), and so cannot separate the inverse-ful and inverse-less access models.&lt;/p&gt;

&lt;p&gt;Note that, by &lt;a href=&quot;https://arxiv.org/abs/1810.06944&quot;&gt;this paper&lt;/a&gt; of Quintino, Dong, Shimbo, Soeda, and Murao, it is possible to convert \(d^3\) uses of \(U\) to one use of \(U^\dagger\) (where \(d\) is the dimension of \(U\)), so when \(d\) is small, amplitude amplification/estimation can be done without \(U^\dagger\). However, this conversion from \(U\) to \(U^\dagger\) becomes extremely expensive as \(d\) grows. So, there may still be a separation between the two settings in the regime where \(d\) is sufficiently large.&lt;/p&gt;

&lt;h2 id=&quot;3-lower-bounds-for-hamiltonian-learning&quot;&gt;3. Lower bounds for Hamiltonian learning&lt;/h2&gt;

&lt;p&gt;The next two open problems deal with &lt;em&gt;Hamiltonian learning from real-time evolutions&lt;/em&gt;. The standard setup (see e.g. &lt;a href=&quot;https://arxiv.org/abs/2210.03030&quot;&gt;[HTFS23]&lt;/a&gt; and &lt;a href=&quot;https://arxiv.org/abs/2108.04842&quot;&gt;[HKT22]&lt;/a&gt; ) is as follows. Let \(H = \sum_{a = 1}^m \lambda_a E_a\) be an \(n\)-qubit \(k\)-local (but &lt;em&gt;not necessarily geometrically local&lt;/em&gt;) Hamiltonian, where we are told the \(E_a\)’s but do not know the \(\lambda_a\)’s.&lt;/p&gt;

&lt;p&gt;We are given the ability to perform the channel \(\rho \mapsto e^{-iHt} \rho e^{iHt}\) for all \(t &amp;gt; 0\), and we want to learn a specified coefficient \(\lambda_a\) to \(\varepsilon\) error (with success probability \(\geq 0.99\), say). For such a learning algorithm we consider its total time evolution, i.e. the amount of evolution time of \(H\) used over the course of the algorithm.&lt;/p&gt;

&lt;p&gt;We know how to solve this problem with \(m/\varepsilon\) total evolution time (note that \(m = \operatorname{poly}(n)\)), and when the Hamiltonian has some underlying &lt;em&gt;geometric locality&lt;/em&gt;, we can exploit that to get an algorithm with mild-to-no dependence on the system size, something like \(\log(n)/\varepsilon\).&lt;/p&gt;

&lt;p&gt;It makes physical sense that assuming geometric locality would effectively remove the dependence on system size. If we think about evolving with respect to a local Hamiltonian, at infinitesimal times, the particular \(\lambda_a E_a\) term we’re interested in only affects the support of \(E_a\), so most of the \(n\) qubits play no role. But at infinitesimal times, we cannot learn the coefficient, since the effect size is similarly infinitesimal. On the other hand, at larger times, the effect of \(\lambda_a E_a\) is large, but we expect this effect to be “scrambled” across all \(n\) qubits. So, to extract out this information we’d need to pay a system size-type cost to “unscramble” it. If the Hamiltonian is geometrically local, then the “scrambling” only happens in a ball around the support of \(E_a\), and so to learn a coefficient you only have to understand that ball, which is independent of system size.&lt;/p&gt;

&lt;p&gt;This intuitive argument is not easy to formalize. So, we can ask, is the cost of Hamiltonian learning actually intrinsically linked to geometric locality? I would find such a result interesting, because real-time evolutions are a very strong form of access to \(H\); it’s not obvious to me that you really need geometric locality with such a strong form of access. The simplest version of this question is as follows:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Question.&lt;/strong&gt; Does learning local (but not necessarily geometrically local) Hamiltonians require a total time evolution of \(\operatorname{poly}(n)\)?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What’s known?&lt;/strong&gt; This question has been posed previously &lt;a href=&quot;https://arxiv.org/abs/2108.04842&quot;&gt;here&lt;/a&gt; and &lt;a href=&quot;https://arxiv.org/abs/2405.00082&quot;&gt;here&lt;/a&gt;. We know very little (embarrassingly little, imo) about lower bounds for Hamiltonian learning. The best lower bound is \(1/\varepsilon\), which follows from a simple “hybrid” argument: perturbing a coefficient by \(\varepsilon\) only affects the algorithm by \(O(\varepsilon T)\) where \(T\) is the total evolution time, so \(T\) has to be large enough to detect this difference.&lt;/p&gt;

&lt;p&gt;The task of proving stronger lower bounds has been tackled by papers of &lt;a href=&quot;https://arxiv.org/abs/2410.18928&quot;&gt;Huang, Tong, Fang, and Su&lt;/a&gt; and &lt;a href=&quot;https://arxiv.org/abs/2410.18928&quot;&gt;Ma, Flammia, Preskill, and Tong&lt;/a&gt;. Neither gives improved lower bounds on the problem stated here.&lt;/p&gt;

&lt;h2 id=&quot;4-learning-hamiltonians-from-long-time-evolutions&quot;&gt;4. Learning Hamiltonians from long time evolutions&lt;/h2&gt;

&lt;p&gt;This problem also has to do with Hamiltonian learning from real-time evolution; see the previous open problem for the setup.&lt;/p&gt;

&lt;p&gt;The literature on this question has gone in two different directions: on the one hand, you can ask for algorithms giving &lt;em&gt;stronger guarantees&lt;/em&gt;; on the other, you can ask for algorithms with &lt;em&gt;weaker assumptions on, or weaker access to,&lt;/em&gt; the input. This open question is in the latter category: it’s an interesting setting where we have less control in our access model and where, to my knowledge, no technique in the literature comes close to working.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Question.&lt;/strong&gt; Let \(H\) be a geometrically local Hamiltonian, and suppose we are only given the ability to perform the channel \(\rho \mapsto e^{-iHt} \rho e^{iHt}\) for all \(t &amp;gt; 100\). Can we still do Hamiltonian learning with a total time evolution of \(\operatorname{poly}(n/\varepsilon)\)? How about \(O(1/\varepsilon)\)?&lt;/p&gt;

&lt;p&gt;As discussed in the previous open problem, there’s a sense in which long time evolutions “scramble” the information we wish to learn. Mathematically, this translates to the relevant objects becoming more difficult to handle: local observables become non-local, power series converge slower, and sometimes they don’t converge at all. There’s an additional technical challenge related to &lt;em&gt;identifiability&lt;/em&gt;: for a sufficiently large constant \(t\), \(e^{-iHt} = e^{-iGt}\) does not imply that \(H = G\).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What’s known?&lt;/strong&gt; We (Ainesh Bakshi, Allen Liu, Ankur Moitra, and I) posed this question in our &lt;a href=&quot;https://arxiv.org/abs/2405.00082&quot;&gt;paper&lt;/a&gt;, but I don’t think we’re the first to do so. In that paper, we showed that Hamiltonian learning with total evolution time \(1/\varepsilon\) and time resolution \(\Theta(1)\) is possible, but the \(\Theta(1)\) is a small constant.&lt;/p&gt;

&lt;h2 id=&quot;glossary&quot;&gt;Glossary&lt;/h2&gt;

&lt;p&gt;A &lt;strong&gt;Hamiltonian&lt;/strong&gt; on \(n\) qubits is a Hermitian matrix \(H \in \mathbb{C}^{2^n \times 2^n}\). We typically imagine this matrix being specified as a linear combination of &lt;strong&gt;terms&lt;/strong&gt; \(E_a \in \mathbb{C}^{2^n \times 2^n}\) with &lt;strong&gt;coefficients&lt;/strong&gt; \(\lambda_a \in \mathbb{R}\). In other words,&lt;/p&gt;

\[H = \sum_{a=1}^m \lambda_a E_a.\]

&lt;p&gt;We call a Hamiltonian &lt;strong&gt;\(k\)-local&lt;/strong&gt; if every \(E_a\) is supported on at most \(k\) qubits. Throughout, we additionally impose the constraint that \(E_a\) is a tensor product of &lt;a href=&quot;https://en.wikipedia.org/wiki/Pauli_matrices&quot;&gt;Pauli matrices&lt;/a&gt;, like \(\sigma_X \otimes \sigma_Z \otimes I \otimes \dots \otimes I\). This can be done without loss of generality (though note that this blows up the number of terms for non-local \(H\)), and it’s important to specify some kind of basis like this so that we can make a sensible definition for what “learning” the Hamiltonian means.&lt;/p&gt;

&lt;p&gt;For normalization, we also assume that \(-1 \leq \lambda_a \leq 1\) for all \(a = 1,\dots,m\).&lt;/p&gt;
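
&lt;p&gt;As a concrete (toy) illustration of this formalism, here is a short numpy sketch, mine and purely hypothetical in its choice of terms and coefficients, that builds such an \(H\) from a few 2-local Pauli terms.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;
import numpy as np

I2 = np.eye(2)
X = np.array([[0, 1], [1, 0]])
Z = np.array([[1, 0], [0, -1]])

def pauli_term(paulis, n):
    # Tensor product with the given single-qubit Paulis on the specified
    # qubits and identities everywhere else.
    out = np.array([[1.0]])
    for q in range(n):
        out = np.kron(out, paulis.get(q, I2))
    return out

n = 3
terms = [({0: X, 1: X}, 0.5),   # 0.5 * X_0 X_1
         ({1: Z, 2: Z}, -0.3),  # -0.3 * Z_1 Z_2
         ({2: X}, 0.9)]         # 0.9 * X_2
H = sum(lam * pauli_term(p, n) for p, lam in terms)  # a 2^n-by-2^n Hermitian matrix
&lt;/code&gt;&lt;/pre&gt;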

&lt;p&gt;We call our Hamiltonian &lt;strong&gt;geometrically local&lt;/strong&gt; if there is some additional ‘geometric’ structure to the terms that appear. The strongest version of this assumption is that the \(n\) qubits are nodes on a lattice (represented by a graph \(G = ([n], E)\)), and every term \(E_a\) has support \(\operatorname{supp}(E_a)\) with diameter at most \(\Delta\).&lt;/p&gt;

&lt;p&gt;A Hamiltonian is &lt;strong&gt;gapped&lt;/strong&gt; if its largest eigenvalue and second largest eigenvalue differ by at least (say) a constant.&lt;/p&gt;

&lt;p&gt;For these open problems, we imagine that these parameters \(k\) (and \(\Delta\) if we assume geometric locality) are constant, say 2.&lt;/p&gt;
</description>
        <pubDate>Tue, 22 Apr 2025 00:00:00 +0000</pubDate>
        <link>https://www.ewintang.com/blog/2025/04/22/open/</link>
        <guid isPermaLink="true">https://www.ewintang.com/blog/2025/04/22/open/</guid>
      </item>
    
      <item>
        <title>Accessible TeX colors</title>
        <description>&lt;p&gt;&lt;strong&gt;Summary:&lt;/strong&gt; Instead of using &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;\textcolor{red}&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;green&lt;/code&gt;, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;blue&lt;/code&gt;, use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;purple&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;teal&lt;/code&gt;, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;blue&lt;/code&gt;. Want more colors? Use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;\usepackage{ninecolors}&lt;/code&gt;.&lt;/p&gt;

&lt;h1 id=&quot;which-default-latex-colors-are-readable&quot;&gt;Which default LaTeX colors are readable?&lt;/h1&gt;

&lt;p&gt;The set of &lt;a href=&quot;https://en.wikibooks.org/wiki/LaTeX/Colors#Predefined_colors&quot;&gt;LaTeX default colors&lt;/a&gt; is chosen primarily to be a mathematically clean subsample of the RGB space. For example, red, green, and blue are just &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;#FF0000&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;#00FF00&lt;/code&gt;, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;#0000FF&lt;/code&gt;—the codes look nice, but the actual colors have drastically different brightnesses &lt;span style=&quot;color: #FF0000&quot;&gt;■&lt;/span&gt;&lt;span style=&quot;color: #00FF00&quot;&gt;■&lt;/span&gt;&lt;span style=&quot;color: #0000FF&quot;&gt;■&lt;/span&gt;. This leads to some unintuitive behavior, like &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;\textcolor{green}&lt;/code&gt; making text pretty unreadable. &lt;span style=&quot;color: #00FF00&quot;&gt;(See this example?)&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;I wanted to find some simple-to-state advice for what to use instead, so I went through the list of default colors and checked which met the web standard for being readable against a white page.
(See the tables at the bottom for more specifics, and additional data for dvipsnames colors.)&lt;/p&gt;

&lt;p&gt;Six colors meet this standard for use as text color against a white background: &lt;span style=&quot;color: #0000FF&quot;&gt;blue&lt;/span&gt;, &lt;span style=&quot;color: #C00040&quot;&gt;purple&lt;/span&gt;, &lt;span style=&quot;color: #008080&quot;&gt;teal&lt;/span&gt;, &lt;span style=&quot;color: #800080&quot;&gt;violet&lt;/span&gt;, &lt;span style=&quot;color: #000000&quot;&gt;black&lt;/span&gt;, and &lt;span style=&quot;color: #404040&quot;&gt;darkgray&lt;/span&gt;. Note that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;red&lt;/code&gt; is not on this list! I think this makes sense: I often see default red in papers, but it is bright to the point that it slightly hurts readability, at least for me. &lt;span style=&quot;color: #FF0000&quot;&gt;(See this example?)&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;So, in short, if you only wanted to use default colors, purple, teal, and blue would serve as good replacements for red, green, and blue.&lt;/p&gt;

&lt;h1 id=&quot;other-color-advice&quot;&gt;Other color advice&lt;/h1&gt;

&lt;p&gt;Here are some other general strategies I use to pick colors for talks/papers.&lt;/p&gt;

&lt;h2 id=&quot;ninecolors-package&quot;&gt;ninecolors package&lt;/h2&gt;

&lt;p&gt;In 2021, the &lt;a href=&quot;https://ctan.org/pkg/ninecolors?lang=en&quot;&gt;ninecolors TeX package&lt;/a&gt; was released. It defines reasonable defaults for color choices (for example, here are their colors &lt;span style=&quot;color: #B82626&quot;&gt;red4&lt;/span&gt;, &lt;span style=&quot;color: #177017&quot;&gt;green4&lt;/span&gt;, and &lt;span style=&quot;color: #4C4CD9&quot;&gt;blue4&lt;/span&gt;), and I’ve used it for papers ever since. Recommend.&lt;/p&gt;

&lt;h2 id=&quot;steal-colors-from-data-visualization-software&quot;&gt;Steal colors from data visualization software&lt;/h2&gt;

&lt;p&gt;If I need more than one color, I often just copy existing sets of colors from matplotlib, Tableau, or Excel. They’ve already done all of the &lt;a href=&quot;https://www.youtube.com/watch?v=xAoljeRJ3lU&quot;&gt;hard work&lt;/a&gt; of making colors look nice together while also making them maximally distinguishable, even for colorblind viewers. This sort of software also often has excellent &lt;a href=&quot;https://seaborn.pydata.org/tutorial/color_palettes.html&quot;&gt;documentation&lt;/a&gt; about how to use colors in plots and things.&lt;/p&gt;

&lt;h2 id=&quot;dont-steal-color-schemes-from-text-editors&quot;&gt;Don’t steal color schemes from text editors&lt;/h2&gt;

&lt;p&gt;Once, I gave a talk using the &lt;a href=&quot;https://ethanschoonover.com/solarized/&quot;&gt;solarized dark&lt;/a&gt; theme, and then someone later told me that the slides were very hard to read. Since then, I’ve heard other people make this exact mistake as well. And I think it’s quite a bad mistake: even if everyone can technically make out the text on the slides if they try, once someone in the back starts daydreaming, you will not get them back unless your slide is immediately legible at a glance.&lt;/p&gt;

&lt;p&gt;Coding color schemes are often designed to have less contrast between light and dark. This means that they’re harder to read. And for talk slides in particular, you want your text to be easily readable, even for those sitting in the back, and for those with weaker eyesight. (And I think generally “light mode” is preferred over “dark mode”, since the former projects better across various lighting conditions.)&lt;/p&gt;

&lt;h2 id=&quot;dont-rely-on-color-too-much&quot;&gt;Don’t rely on color too much&lt;/h2&gt;

&lt;p&gt;As a final comment on accessibility, I want to point out the guideline that &lt;a href=&quot;https://www.w3.org/WAI/WCAG22/Techniques/general/G14&quot;&gt;no crucial information should be communicated solely through color&lt;/a&gt;. I highlight important things with colors all the time. But I try to follow the guideline that, if my document was printed in grayscale, someone could still understand it.&lt;/p&gt;

&lt;h1 id=&quot;appendix-methodology-and-data&quot;&gt;Appendix: methodology and data&lt;/h1&gt;

&lt;p&gt;There are &lt;a href=&quot;https://www.w3.org/TR/WCAG22/#contrast-minimum&quot;&gt;accessibility guidelines&lt;/a&gt; for how large the contrast of text needs to be against its background. They’re designed for the web, but since PDFs are typically viewed on screens, I used them as the standard.&lt;/p&gt;

&lt;p&gt;The guidelines give a formula for the &lt;a href=&quot;https://www.w3.org/TR/WCAG22/#dfn-relative-luminance&quot;&gt;relative luminance&lt;/a&gt; of each color, from which you compute the contrast ratio between the two. If the ratio is at least 4.5 to 1, the color meets the minimum bar for use as body text.&lt;/p&gt;

&lt;p&gt;There are 19 default LaTeX colors. I used a &lt;a href=&quot;https://webaim.org/resources/contrastchecker/&quot;&gt;color checker&lt;/a&gt; to check the contrast.&lt;/p&gt;
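
&lt;p&gt;Here’s a small Python sketch of that computation (my own, following the WCAG 2 formulas linked above); it reproduces the contrast-against-white numbers in the tables below.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;
def srgb_to_linear(c):
    # c is a channel value in [0, 1]; formula from the WCAG 2 definition
    return c / 12.92 if c &lt;= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

def relative_luminance(hex_code):
    r, g, b = (int(hex_code[i:i + 2], 16) / 255 for i in (0, 2, 4))
    return 0.2126 * srgb_to_linear(r) + 0.7152 * srgb_to_linear(g) + 0.0722 * srgb_to_linear(b)

def contrast_vs_white(hex_code):
    # White has relative luminance 1; contrast is (L_lighter + 0.05) / (L_darker + 0.05)
    return 1.05 / (relative_luminance(hex_code) + 0.05)

print(round(contrast_vs_white(&apos;0000FF&apos;), 2))  # blue: 8.59, as in the table below
&lt;/code&gt;&lt;/pre&gt;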

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;color&lt;/th&gt;
      &lt;th&gt;hex code&lt;/th&gt;
      &lt;th&gt;contrast ratio&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;red&lt;/td&gt;
      &lt;td&gt;FF0000 &lt;span style=&quot;color: #FF0000&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;3.99&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;green&lt;/td&gt;
      &lt;td&gt;00FF00 &lt;span style=&quot;color: #00FF00&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;1.37&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;blue&lt;/td&gt;
      &lt;td&gt;0000FF &lt;span style=&quot;color: #0000FF&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;&lt;strong&gt;8.59&lt;/strong&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;brown&lt;/td&gt;
      &lt;td&gt;C08040 &lt;span style=&quot;color: #C08040&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;3.27&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;lime&lt;/td&gt;
      &lt;td&gt;C0FF00 &lt;span style=&quot;color: #C0FF00&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;1.19&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;orange&lt;/td&gt;
      &lt;td&gt;FF8000 &lt;span style=&quot;color: #FF8000&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;2.51&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;pink&lt;/td&gt;
      &lt;td&gt;FFC0C0 &lt;span style=&quot;color: #FFC0C0&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;1.54&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;purple&lt;/td&gt;
      &lt;td&gt;C00040 &lt;span style=&quot;color: #C00040&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;&lt;strong&gt;6.33&lt;/strong&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;teal&lt;/td&gt;
      &lt;td&gt;008080 &lt;span style=&quot;color: #008080&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;&lt;strong&gt;4.77&lt;/strong&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;violet&lt;/td&gt;
      &lt;td&gt;800080 &lt;span style=&quot;color: #800080&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;&lt;strong&gt;9.41&lt;/strong&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;cyan&lt;/td&gt;
      &lt;td&gt;00FFFF &lt;span style=&quot;color: #00FFFF&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;1.25&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;magenta&lt;/td&gt;
      &lt;td&gt;FF00FF &lt;span style=&quot;color: #FF00FF&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;3.13&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;yellow&lt;/td&gt;
      &lt;td&gt;FFFF00 &lt;span style=&quot;color: #FFFF00&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;3.99&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;olive&lt;/td&gt;
      &lt;td&gt;808000 &lt;span style=&quot;color: #808000&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;4.19&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;black&lt;/td&gt;
      &lt;td&gt;000000 &lt;span style=&quot;color: #000000&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;&lt;strong&gt;21&lt;/strong&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;darkgray&lt;/td&gt;
      &lt;td&gt;404040 &lt;span style=&quot;color: #404040&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;&lt;strong&gt;10.36&lt;/strong&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;gray&lt;/td&gt;
      &lt;td&gt;808080 &lt;span style=&quot;color: #808080&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;3.94&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;lightgray&lt;/td&gt;
      &lt;td&gt;C0C0C0 &lt;span style=&quot;color: #C0C0C0&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;1.81&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;white&lt;/td&gt;
      &lt;td&gt;FFFFFF &lt;span style=&quot;color: #FFFFFF&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;1&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;And here are the colors named by dvips, sorted by contrast.&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;color&lt;/th&gt;
      &lt;th&gt;hex code&lt;/th&gt;
      &lt;th&gt;contrast ratio&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;White&lt;/td&gt;
      &lt;td&gt;FFFFFF	&lt;span style=&quot;color: #FFFFFF&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Yellow&lt;/td&gt;
      &lt;td&gt;FFF200	&lt;span style=&quot;color: #FFF200&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;1.16&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Goldenrod&lt;/td&gt;
      &lt;td&gt;FFDF42	&lt;span style=&quot;color: #FFDF42&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;1.32&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;GreenYellow&lt;/td&gt;
      &lt;td&gt;DFE674	&lt;span style=&quot;color: #DFE674&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;1.33&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;SpringGreen&lt;/td&gt;
      &lt;td&gt;C6DC67	&lt;span style=&quot;color: #C6DC67&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;1.51&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Dandelion&lt;/td&gt;
      &lt;td&gt;FDBC42	&lt;span style=&quot;color: #FDBC42&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;1.68&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Apricot&lt;/td&gt;
      &lt;td&gt;FBB982	&lt;span style=&quot;color: #FBB982&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;1.69&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;YellowGreen&lt;/td&gt;
      &lt;td&gt;98CC70	&lt;span style=&quot;color: #98CC70&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;1.87&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Lavender&lt;/td&gt;
      &lt;td&gt;F49EC4	&lt;span style=&quot;color: #F49EC4&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;1.99&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;LimeGreen&lt;/td&gt;
      &lt;td&gt;8DC73E	&lt;span style=&quot;color: #8DC73E&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;2.02&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;SkyBlue&lt;/td&gt;
      &lt;td&gt;46C5DD	&lt;span style=&quot;color: #46C5DD&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;2.04&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;YellowOrange&lt;/td&gt;
      &lt;td&gt;FAA21A	&lt;span style=&quot;color: #FAA21A&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;2.04&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Melon&lt;/td&gt;
      &lt;td&gt;F89E7B	&lt;span style=&quot;color: #F89E7B&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;2.06&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Peach&lt;/td&gt;
      &lt;td&gt;F7965A	&lt;span style=&quot;color: #F7965A&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;2.21&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Salmon&lt;/td&gt;
      &lt;td&gt;F69289	&lt;span style=&quot;color: #F69289&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;2.23&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;BurntOrange&lt;/td&gt;
      &lt;td&gt;F7921D	&lt;span style=&quot;color: #F7921D&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;2.31&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Tan&lt;/td&gt;
      &lt;td&gt;DA9D76	&lt;span style=&quot;color: #DA9D76&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;2.31&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;SeaGreen&lt;/td&gt;
      &lt;td&gt;3FBC9D	&lt;span style=&quot;color: #3FBC9D&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;2.36&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;CarnationPink&lt;/td&gt;
      &lt;td&gt;F282B4	&lt;span style=&quot;color: #F282B4&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;2.43&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;CornflowerBlue&lt;/td&gt;
      &lt;td&gt;41B0E4	&lt;span style=&quot;color: #41B0E4&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;2.45&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;ProcessBlue&lt;/td&gt;
      &lt;td&gt;00B0F0	&lt;span style=&quot;color: #00B0F0&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;2.47&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Turquoise&lt;/td&gt;
      &lt;td&gt;00B4CE	&lt;span style=&quot;color: #00B4CE&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;2.49&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Aquamarine&lt;/td&gt;
      &lt;td&gt;00B5BE	&lt;span style=&quot;color: #00B5BE&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;2.51&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Cyan&lt;/td&gt;
      &lt;td&gt;00AEEF	&lt;span style=&quot;color: #00AEEF&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;2.52&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;BlueGreen&lt;/td&gt;
      &lt;td&gt;00B3B8	&lt;span style=&quot;color: #00B3B8&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;2.57&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Orange&lt;/td&gt;
      &lt;td&gt;F58137	&lt;span style=&quot;color: #F58137&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;2.59&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Thistle&lt;/td&gt;
      &lt;td&gt;D883B7	&lt;span style=&quot;color: #D883B7&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;2.67&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;TealBlue&lt;/td&gt;
      &lt;td&gt;00AEB3	&lt;span style=&quot;color: #00AEB3&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;2.72&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Cerulean&lt;/td&gt;
      &lt;td&gt;00A2E3	&lt;span style=&quot;color: #00A2E3&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;2.88&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Emerald&lt;/td&gt;
      &lt;td&gt;00A99D	&lt;span style=&quot;color: #00A99D&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;2.93&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;JungleGreen&lt;/td&gt;
      &lt;td&gt;00A99A	&lt;span style=&quot;color: #00A99A&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;2.94&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Gray&lt;/td&gt;
      &lt;td&gt;949698	&lt;span style=&quot;color: #949698&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;2.96&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Green&lt;/td&gt;
      &lt;td&gt;00A64F	&lt;span style=&quot;color: #00A64F&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;3.19&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;VioletRed&lt;/td&gt;
      &lt;td&gt;EF58A0	&lt;span style=&quot;color: #EF58A0&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;3.19&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;RedOrange&lt;/td&gt;
      &lt;td&gt;F26035	&lt;span style=&quot;color: #F26035&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;3.23&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Rhodamine&lt;/td&gt;
      &lt;td&gt;EF559F	&lt;span style=&quot;color: #EF559F&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;3.24&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Orchid&lt;/td&gt;
      &lt;td&gt;AF72B0	&lt;span style=&quot;color: #AF72B0&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;3.58&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;ForestGreen&lt;/td&gt;
      &lt;td&gt;009B55	&lt;span style=&quot;color: #009B55&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;3.6&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;WildStrawberry&lt;/td&gt;
      &lt;td&gt;EE2967	&lt;span style=&quot;color: #EE2967&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;4.07&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Periwinkle&lt;/td&gt;
      &lt;td&gt;7977B8	&lt;span style=&quot;color: #7977B8&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;4.08&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Magenta&lt;/td&gt;
      &lt;td&gt;EC008C	&lt;span style=&quot;color: #EC008C&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;4.24&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;PineGreen&lt;/td&gt;
      &lt;td&gt;008B72	&lt;span style=&quot;color: #008B72&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;4.25&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;RubineRed&lt;/td&gt;
      &lt;td&gt;ED017D	&lt;span style=&quot;color: #ED017D&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;4.28&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;OrangeRed&lt;/td&gt;
      &lt;td&gt;ED135A	&lt;span style=&quot;color: #ED135A&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;4.33&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Red&lt;/td&gt;
      &lt;td&gt;ED1B23	&lt;span style=&quot;color: #ED1B23&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;4.39&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;CadetBlue&lt;/td&gt;
      &lt;td&gt;74729A	&lt;span style=&quot;color: #74729A&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;&lt;strong&gt;4.54&lt;/strong&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Bittersweet&lt;/td&gt;
      &lt;td&gt;C04F17	&lt;span style=&quot;color: #C04F17&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;&lt;strong&gt;4.8&lt;/strong&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;OliveGreen&lt;/td&gt;
      &lt;td&gt;3C8031	&lt;span style=&quot;color: #3C8031&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;&lt;strong&gt;4.85&lt;/strong&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;DarkOrchid&lt;/td&gt;
      &lt;td&gt;A4538A	&lt;span style=&quot;color: #A4538A&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;&lt;strong&gt;5.02&lt;/strong&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;RoyalBlue&lt;/td&gt;
      &lt;td&gt;0071BC	&lt;span style=&quot;color: #0071BC&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;&lt;strong&gt;5.13&lt;/strong&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;NavyBlue&lt;/td&gt;
      &lt;td&gt;006EB8	&lt;span style=&quot;color: #006EB8&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;&lt;strong&gt;5.35&lt;/strong&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Mulberry&lt;/td&gt;
      &lt;td&gt;A93C93	&lt;span style=&quot;color: #A93C93&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;&lt;strong&gt;5.59&lt;/strong&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Purple&lt;/td&gt;
      &lt;td&gt;99479B	&lt;span style=&quot;color: #99479B&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;&lt;strong&gt;5.63&lt;/strong&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;BrickRed&lt;/td&gt;
      &lt;td&gt;B6321C	&lt;span style=&quot;color: #B6321C&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;&lt;strong&gt;6.06&lt;/strong&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;MidnightBlue&lt;/td&gt;
      &lt;td&gt;006795	&lt;span style=&quot;color: #006795&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;&lt;strong&gt;6.22&lt;/strong&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Maroon&lt;/td&gt;
      &lt;td&gt;AF3235	&lt;span style=&quot;color: #AF3235&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;&lt;strong&gt;6.3&lt;/strong&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Mahogany&lt;/td&gt;
      &lt;td&gt;A9341F	&lt;span style=&quot;color: #A9341F&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;&lt;strong&gt;6.56&lt;/strong&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;RawSienna&lt;/td&gt;
      &lt;td&gt;974006	&lt;span style=&quot;color: #974006&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;&lt;strong&gt;6.88&lt;/strong&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Fuchsia&lt;/td&gt;
      &lt;td&gt;8C368C	&lt;span style=&quot;color: #8C368C&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;&lt;strong&gt;6.95&lt;/strong&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;RedViolet&lt;/td&gt;
      &lt;td&gt;A1246B	&lt;span style=&quot;color: #A1246B&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;&lt;strong&gt;7.04&lt;/strong&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Plum&lt;/td&gt;
      &lt;td&gt;92268F	&lt;span style=&quot;color: #92268F&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;&lt;strong&gt;7.25&lt;/strong&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;RoyalPurple&lt;/td&gt;
      &lt;td&gt;613F99	&lt;span style=&quot;color: #613F99&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;&lt;strong&gt;7.83&lt;/strong&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Violet&lt;/td&gt;
      &lt;td&gt;58429B	&lt;span style=&quot;color: #58429B&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;&lt;strong&gt;7.87&lt;/strong&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;BlueViolet&lt;/td&gt;
      &lt;td&gt;473992	&lt;span style=&quot;color: #473992&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;&lt;strong&gt;9.25&lt;/strong&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Brown&lt;/td&gt;
      &lt;td&gt;792500	&lt;span style=&quot;color: #792500&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;&lt;strong&gt;10.1&lt;/strong&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Blue&lt;/td&gt;
      &lt;td&gt;2D2F92	&lt;span style=&quot;color: #2D2F92&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;&lt;strong&gt;10.86&lt;/strong&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Sepia&lt;/td&gt;
      &lt;td&gt;671800	&lt;span style=&quot;color: #671800&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;&lt;strong&gt;12.29&lt;/strong&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Black&lt;/td&gt;
      &lt;td&gt;221E1F	&lt;span style=&quot;color: #221E1F&quot;&gt;■&lt;/span&gt;&lt;/td&gt;
      &lt;td&gt;&lt;strong&gt;16.48&lt;/strong&gt;&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;Bolded are the ones that meet the contrast guidelines against a white background. Overall, the colors that remain are pretty unsatisfying: for example, there’s only one green. This is why I was so happy to see the ninecolors package when it came out; it is much more usable as a set of colors than any other package I am familiar with.&lt;/p&gt;
</description>
        <pubDate>Sun, 12 Jan 2025 00:00:00 +0000</pubDate>
        <link>https://www.ewintang.com/blog/2025/01/12/colors/</link>
        <guid isPermaLink="true">https://www.ewintang.com/blog/2025/01/12/colors/</guid>
      </item>
    
      <item>
        <title>Some settings supporting efficient state preparation</title>
        <description>\[\gdef\BB#1{\mathbb{#1}}
\gdef\eps\varepsilon
\gdef\ket#1{|#1\rangle}
\gdef\bra#1{\langle#1|}\]

&lt;p&gt;I wrote most of this list to procrastinate on the flight back from TQC (which was great!).
So, for my own reference: here are some settings where efficient state preparation / data loading is possible, and classical versions of these protocols. Notes:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;There might be errors, especially in details of the quantum protocols, and some of the algorithms may be suboptimal (note the streaming setting, in particular). Let me know if you notice either of these.&lt;/li&gt;
  &lt;li&gt;Some relevant complexity research here is in &lt;a href=&quot;https://arxiv.org/abs/1607.05256&quot;&gt;QSampling&lt;/a&gt; (Section 4).&lt;/li&gt;
  &lt;li&gt;All these runtimes should have an extra \(O(\log n)\) factor, since we assume that indices and entries take \(\log n\) bits/qubits to specify.
However, I’m going to follow the convention from classical computing and ignore these factors, hopefully with little resulting confusion.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For all that follows, we are given \(v \in \mathbb{C}^n\) in some way and want to output&lt;/p&gt;
&lt;ol&gt;
  &lt;li&gt;for the quantum case, a copy of the state \(\ket{v} = \sum_{i=1}^n \frac{v_i}{\|v\|} \ket{i}\), and&lt;/li&gt;
  &lt;li&gt;for the classical case, the pair \((i,v_i)\) output with probability \(\frac{\vert v_i\vert^2}{\|v\|^2}\).&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;You could think about this as strong quantum simulation of state preparation protocols.&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th style=&quot;text-align: center&quot;&gt;type&lt;/th&gt;
      &lt;th style=&quot;text-align: center&quot;&gt;&lt;a href=&quot;#v-is-sparse&quot;&gt;sparse&lt;/a&gt;&lt;/th&gt;
      &lt;th style=&quot;text-align: center&quot;&gt;&lt;a href=&quot;#v-is-close-to-uniform&quot;&gt;uniform&lt;/a&gt;&lt;/th&gt;
      &lt;th style=&quot;text-align: center&quot;&gt;&lt;a href=&quot;#v-is-efficiently-integrable&quot;&gt;integrable&lt;/a&gt;&lt;/th&gt;
      &lt;th style=&quot;text-align: center&quot;&gt;&lt;a href=&quot;#v-is-stored-in-a-dynamic-data-structure&quot;&gt;QRAM&lt;/a&gt;&lt;/th&gt;
      &lt;th style=&quot;text-align: center&quot;&gt;&lt;a href=&quot;#v-is-streamed&quot;&gt;streamed&lt;/a&gt;&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;quantum&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;\(O(s)\)&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;\(O(C\log\frac1\delta)\)&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;\(O(I \log n)\)&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;\(O(\log n)\) depth&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;\(O(1)\) space with 2 passes&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;classical&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;\(O(s)\)&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;\(O(C^2\log\frac1\delta)\)&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;\(O(I\log n)\)&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;\(O(\log n)\)&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;\(O(1)\) space with 1 pass&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;Recall that if we want to prepare an arbitrary quantum state, we need at least \(\Omega(\sqrt{n})\) time by search lower bounds, so for some settings of the above constants, these protocols are exponentially faster than the naive strategy.
Further recall that state preparation and sampling both have easy protocols running in \(O(n)\) time.&lt;/p&gt;

&lt;h2 id=&quot;v-is-sparse&quot;&gt;\(v\) is sparse&lt;/h2&gt;
&lt;p&gt;We assume that \(v\) has at most \(s\) nonzero entries and we can access a list of the nonzero entries \(((i_1,v_{i_1}),(i_2,v_{i_2}),\ldots,(i_s,v_{i_s}))\).
Thus, we have the oracle \(a \to (i_a, v_{i_a})\).&lt;/p&gt;

&lt;p&gt;We can prepare the quantum state and classical sample by preparing the vector \(v&apos; \in \BB{C}^s\) where \(v_a&apos; = v_{i_a}\), and then using the oracle to swap out the index \(a\) with \(i_a\).
This gives \(O(s)\) classical and quantum time.&lt;/p&gt;
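
&lt;p&gt;Here’s a minimal Python sketch of the classical half (my own illustration, with made-up function names): given the list of nonzero entries, it outputs \((i, v_i)\) with probability \(\frac{\vert v_i\vert^2}{\|v\|^2}\) in \(O(s)\) time.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;
import numpy as np

def sample_sparse(nonzeros, rng=np.random.default_rng()):
    # nonzeros is the list ((i_1, v_{i_1}), ..., (i_s, v_{i_s}))
    weights = np.array([abs(val) ** 2 for _, val in nonzeros])
    a = rng.choice(len(nonzeros), p=weights / weights.sum())  # O(s) work
    return nonzeros[a]  # the pair (i_a, v_{i_a})
&lt;/code&gt;&lt;/pre&gt;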

&lt;h2 id=&quot;v-is-close-to-uniform&quot;&gt;\(v\) is close-to-uniform&lt;/h2&gt;
&lt;p&gt;We assume that \(\max\vert v_i\vert \leq C\frac{\|v\|}{\sqrt{n}}\) and we know \(C, \|v\|\).
Notice that we don’t give a lower bound on the size of entries, but we can’t have too many small entries, since this would lower the norm.
Also notice that \(C \geq 1\).&lt;/p&gt;

&lt;p&gt;Quantumly, given the typical oracle \(\ket{i}\ket{0} \to \ket{i}\ket{v_i}\) we can prepare the state&lt;/p&gt;

\[\frac{1}{\sqrt{n}}\sum_{i=1}^n \ket{i}\Big(\frac{v_i\sqrt{n}}{\|v\|C}\ket{0} + \sqrt{1-\frac{\vert v_i\vert ^2n}{\|v\|^2C^2}}\ket{1}\Big).\]

&lt;p&gt;Measuring the ancilla and post-selecting on 0 gives \(\ket{v}\).
This happens with probability \(\frac{1}{C^2}\), and with amplitude amplification this means we can get a copy of the state with probability \(\geq 1-\delta\) in \(O(C\log\frac1\delta)\) time.&lt;/p&gt;

&lt;p&gt;Classically, we perform rejection sampling from the uniform distribution: pick an index uniformly at random, and keep it with probability \(\frac{\vert v_i\vert^2 n}{\|v\|^2C^2}\); otherwise, restart.
This outputs the correct distribution and gives a sample in \(O(C^2\log\frac1\delta)\) time.&lt;/p&gt;
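
&lt;p&gt;Here’s a sketch of that classical rejection sampler in Python (my own; it assumes entry access via a callable, plus knowledge of \(C\) and \(\|v\|\) as above).&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;
import numpy as np

def sample_near_uniform(v, n, C, norm, rng=np.random.default_rng()):
    # v(i) returns the i-th entry; assumes |v(i)| &lt;= C * norm / sqrt(n).
    while True:
        i = rng.integers(n)  # propose an index uniformly at random
        accept = abs(v(i)) ** 2 * n / (norm ** 2 * C ** 2)
        if rng.random() &lt; accept:  # overall, each round accepts with probability 1/C^2
            return i, v(i)
&lt;/code&gt;&lt;/pre&gt;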

&lt;h2 id=&quot;v-is-efficiently-integrable&quot;&gt;\(v\) is efficiently integrable&lt;/h2&gt;
&lt;p&gt;We assume that, given \(1 \leq a \leq b \leq n\), I can compute \(\sqrt{\sum_{i=a}^b |v_i|^2}\) in \(O(I)\) time.
This assumption and the resulting quantum preparation routine comes from &lt;a href=&quot;https://arxiv.org/abs/quant-ph/0208112&quot;&gt;Grover-Rudolph&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The quantum algorithm uses one core subroutine: adding an extra qubit, sending \(\ket{v^{(k)}} \to \ket{v^{(k+1)}}\), where&lt;/p&gt;

\[\ket{v^{(k)}} := \sum_{b \in \{0,1\}^k} \ket{b}\sqrt{\sum_{i=b\cdot 0^{\log n-k}}^{b\cdot 1^{\log n-k}} |v_i|^2}\]

&lt;p&gt;All that’s necessary is to apply it \(O(\log n)\) times and add the phase at the end.
I haven’t worked it out, but I think you can run the subroutine efficiently using three calls to the integration oracle, giving \(O(I\log n)\) time.&lt;/p&gt;

&lt;p&gt;Classically, we can do essentially the same thing: the integration oracle means that we can compute marginal probabilities; that is,&lt;/p&gt;

\[\Pr_{s \sim v}[s\text{&apos;s bit representation starts with } b] = \sum_{i=b\cdot 0^{\log n-k}}^{b\cdot 1^{\log n-k}} |v_i|^2\]

&lt;p&gt;Thus, we can sample from the distribution on the first bit, then sample from the distribution on the second bit conditioned on our value of the first bit, and so on.
This also gives \(O(I\log n)\) time.&lt;/p&gt;
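
&lt;p&gt;Here’s a rough Python sketch of that classical procedure (my own; the callable stands in for the assumed integration oracle, here taking 0-indexed inclusive endpoints and returning \(\sqrt{\sum_{i=a}^b |v_i|^2}\)).&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;
import numpy as np

def sample_via_integration(interval_norm, n, rng=np.random.default_rng()):
    # Sample i with probability |v_i|^2 / ||v||^2, fixing i bit by bit;
    # uses O(log n) calls to interval_norm. Assumes v is nonzero.
    lo, hi = 0, n - 1                   # current interval of candidate indices
    total = interval_norm(lo, hi) ** 2  # squared norm of v on that interval
    while lo &lt; hi:
        mid = (lo + hi) // 2
        left = interval_norm(lo, mid) ** 2
        if rng.random() &lt; left / total:  # go left with probability (left mass)/(total mass)
            hi, total = mid, left
        else:
            lo, total = mid + 1, total - left
    return lo
&lt;/code&gt;&lt;/pre&gt;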

&lt;h2 id=&quot;v-is-stored-in-a-dynamic-data-structure&quot;&gt;\(v\) is stored in a dynamic data structure&lt;/h2&gt;
&lt;p&gt;We assume that our vector can be stored in a data structure that supports efficient updating of entries.
Namely, we use the standard binary search tree data structure (see, for example, &lt;a href=&quot;https://www2.eecs.berkeley.edu/Pubs/TechRpts/2014/EECS-2014-211.pdf&quot;&gt;Section 2.2.2 of Prakash’s thesis&lt;/a&gt;).
This is a simple data structure with many nice properties, including \(O(\log n)\) time updates.
If you want to prepare many states corresponding to similar vectors, this is a good option.&lt;/p&gt;

&lt;p&gt;There’s not much more to say, since the protocol is the same as the integrability protocol.
The only difference is that, instead of assuming we can compute interval sums efficiently, we precompute and store all of the integration oracle calls needed for the state preparation procedure in the data structure.&lt;/p&gt;
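
&lt;p&gt;Here’s a small Python sketch of such a structure (my own, in the spirit of the binary tree above rather than an exact transcription of it): a binary tree whose internal nodes cache the squared norms of their subtrees, so updates and samples both take \(O(\log n)\) time.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;
import numpy as np

class SampleTree:
    # Binary tree over n slots; node j has children 2j and 2j+1, leaves sit at
    # positions n..2n-1, and each internal node stores its subtree&apos;s squared norm.
    def __init__(self, n):
        self.n = n
        self.mass = np.zeros(2 * n)
        self.values = np.zeros(n, dtype=complex)

    def update(self, i, value):  # set v_i := value in O(log n) time
        self.values[i] = value
        j = self.n + i
        self.mass[j] = abs(value) ** 2
        while j &gt; 1:
            j //= 2
            self.mass[j] = self.mass[2 * j] + self.mass[2 * j + 1]

    def sample(self, rng=np.random.default_rng()):
        # Walk down from the root, branching by subtree mass; assumes v is nonzero.
        j = 1
        while j &lt; self.n:
            go_left = rng.random() &lt; self.mass[2 * j] / self.mass[j]
            j = 2 * j if go_left else 2 * j + 1
        i = j - self.n
        return i, self.values[i]
&lt;/code&gt;&lt;/pre&gt;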

&lt;p&gt;The classical runtime is \(O(\log n)\), and the &lt;a href=&quot;https://arxiv.org/abs/1812.00954&quot;&gt;quantum circuit&lt;/a&gt; takes \(O(n)\) gates but only \(O(\log n)\) depth.
The quantum algorithm is larger because here we need to query a linear number of memory cells, as opposed to the integrability assumption, where we only needed to run the integration oracle in superposition.&lt;/p&gt;

&lt;p&gt;While it may seem that the classical algorithm wins definitively here, the small depth leaves potential for this protocol to run in \(O(\log n)\) time in practice, matching the classical algorithm.&lt;/p&gt;

&lt;h2 id=&quot;v-is-streamed&quot;&gt;\(v\) is streamed&lt;/h2&gt;
&lt;p&gt;We assume that we can receive a stream of the entries of \(v\) in order; we wish to produce a state/sample using as little space as possible.&lt;/p&gt;

&lt;p&gt;Classically, we can do this with &lt;a href=&quot;https://en.wikipedia.org/wiki/Reservoir_sampling&quot;&gt;reservoir sampling&lt;/a&gt;.
The idea is that we maintain a sample \((s, v_s)\) from all of the entries we’ve seen before, along with their squared norm \(\lambda = \sum_{i=1}^k \vert v_i\vert^2\).
Then, when we receive a new entry \(v_{k+1}\), we swap our sample to \((k+1,v_{k+1})\) with probability \(\vert v_{k+1}\vert^2/(\lambda + \vert v_{k+1}\vert^2)\) and update our \(\lambda\) to \(\lambda + \vert v_{k+1}\vert^2\).
After we go through all of \(v\)’s entries, we get a sample only using \(O(1)\) space.
(This is a particularly nice algorithm for sampling from a vector, since it has good locality and can be generalized to get \(O(k)\) samples in \(O(k)\) space and one pass.)&lt;/p&gt;
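
&lt;p&gt;Here’s that one-pass sampler in Python (my own rendering of the idea above, with an illustrative function name).&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;
import numpy as np

def stream_sample(stream, rng=np.random.default_rng()):
    # stream yields the entries of v in order; one pass, O(1) space.
    sample, mass = None, 0.0
    for k, vk in enumerate(stream, start=1):
        w = abs(vk) ** 2
        mass += w
        if mass &gt; 0 and rng.random() &lt; w / mass:  # swap in (k, v_k) with probability |v_k|^2 / (mass so far)
            sample = (k, vk)
    return sample  # (i, v_i) with probability |v_i|^2 / ||v||^2
&lt;/code&gt;&lt;/pre&gt;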

&lt;p&gt;Quantumly, I only know how to prepare a state in one pass with sublinear space if the norm is known.
If you know \(\|v\|\), then you can prepare \(\ket{n}\), and as entries come in, rotate to get \(\frac{v_1}{\|v\|}\ket{1} + \sqrt{1-\frac{|v_1|^2}{\|v\|^2}}\ket{n}\), then \(\frac{v_1}{\|v\|}\ket{1} + \frac{v_2}{\|v\|}\ket{2} + \sqrt{1-\frac{|v_1|^2+|v_2|^2}{\|v\|^2}}\ket{n}\), and so on.
This uses only \(O(\log n)\) qubits, which I notate here as \(O(1)\) space.&lt;/p&gt;

&lt;p&gt;You can relax this assumption to just having an estimate \(\lambda\) of \(\|v\|\) such that \(\frac{1}{\text{poly}(n)} \leq \lambda/\|v\| \leq \text{poly}(n)\).
Finally, if you like, you can remove the assumption that you know the norm just by requiring two passes instead of one; in the first pass, compute the norm, and in the second pass, prepare the state.
But it’d be nice to remove the assumption entirely.&lt;/p&gt;

&lt;p&gt;So, &lt;strong&gt;is it possible to prepare a quantum state corresponding to a generic \(v \in \BB{C}^n\), given only one pass through it?&lt;/strong&gt; Thanks to &lt;a href=&quot;https://www.chunhaowang.com/&quot;&gt;Chunhao Wang&lt;/a&gt; and &lt;a href=&quot;https://www.cs.utexas.edu/~nai/&quot;&gt;Nai-Hui Chia&lt;/a&gt; for telling me about this problem.&lt;/p&gt;
</description>
        <pubDate>Thu, 13 Jun 2019 00:00:00 +0000</pubDate>
        <link>https://www.ewintang.com/blog/2019/06/13/some-settings-supporting-efficient-state-preparation/</link>
        <guid isPermaLink="true">https://www.ewintang.com/blog/2019/06/13/some-settings-supporting-efficient-state-preparation/</guid>
      </item>
    
      <item>
        <title>An overview of quantum-inspired classical sampling</title>
        <description>\[\gdef\SC#1{\mathcal{#1}} %katex
\gdef\BB#1{\mathbb{#1}}
\gdef\eps\varepsilon
\gdef\SQ{\operatorname{SQ}}
\gdef\Q{\operatorname{Q}}
\gdef\Tr{\operatorname{Tr}}
\gdef\ket#1{\left|#1\right\rangle}
\gdef\bra#1{\left\langle#1\right|}
\gdef\poly{\operatorname{poly}}
\gdef\polylog{\operatorname{polylog}}
%\newcommand{\SC}[1]{\mathcal{#1}} %mathjax
%\newcommand{\BB}[1]{\mathbb{#1}}
%\newcommand{\eps}{\varepsilon}
%\newcommand{\SQ}{\operatorname{SQ}}
%\newcommand{\Q}{\operatorname{Q}}
%\newcommand{\Tr}{\operatorname{Tr}}
%\newcommand{\ket}[1]{\left|#1\right\rangle}
%\newcommand{\bra}[1]{\left\langle#1\right|}
%\newcommand{\poly}{\operatorname{poly}}
%\newcommand{\polylog}{\operatorname{polylog}}\]

&lt;p&gt;&lt;strong&gt;Update [2020-08-26]:&lt;/strong&gt; The way I present this topic in this blog post is not exactly how I present it now. See my &lt;a href=&quot;https://www.youtube.com/watch?v=EsJEBJ2d1UY&quot;&gt;STOC 2020 talk&lt;/a&gt; for a more recent version. The differences are pretty minor: results are more general and there are fewer hidden caveats. If you’re up for a somewhat technical talk, check it out!&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;This is an adaptation of a talk I gave at Microsoft Research in November 2018.&lt;/p&gt;

&lt;p&gt;I exposit the \(\ell^2\) sampling techniques used in my recommendation systems work and its follow-ups in dequantized machine learning:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Tang – &lt;a href=&quot;https://arxiv.org/abs/1807.04271&quot;&gt;&lt;em&gt;A quantum-inspired algorithm for recommendation systems&lt;/em&gt;&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Tang – &lt;a href=&quot;https://arxiv.org/abs/1811.00414&quot;&gt;&lt;em&gt;Quantum-inspired classical algorithms for principal component analysis and supervised clustering&lt;/em&gt;&lt;/a&gt;;&lt;/li&gt;
  &lt;li&gt;Gilyén, Lloyd, Tang – &lt;a href=&quot;https://arxiv.org/abs/1811.04909&quot;&gt;&lt;em&gt;Quantum-inspired low-rank stochastic regression with logarithmic dependence on the dimension&lt;/em&gt;&lt;/a&gt;;&lt;/li&gt;
  &lt;li&gt;Chia, Lin, Wang – &lt;a href=&quot;https://arxiv.org/abs/1811.04852&quot;&gt;&lt;em&gt;Quantum-inspired sublinear classical algorithms for solving low-rank linear systems&lt;/em&gt;&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The core ideas used are super simple.
The goal of this blog post is to break down these ideas into intuition relevant for quantum researchers and to build a better understanding of this machine learning paradigm.&lt;/p&gt;

&lt;p&gt;Notation is defined in the &lt;a href=&quot;#glossary&quot;&gt;Glossary&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The intended audience is researchers comfortable with probability and linear algebra (SVD, in particular).
Basic quantum knowledge helps with intuition, but is not essential: everything from &lt;a href=&quot;#the-model&quot;&gt;The model&lt;/a&gt; onward is purely classical.
The appendix is optional and explains the dequantized techniques in more detail.&lt;/p&gt;

&lt;h2 id=&quot;an-introduction-to-dequantization&quot;&gt;An introduction to dequantization&lt;/h2&gt;

&lt;h3 id=&quot;motivation&quot;&gt;Motivation&lt;/h3&gt;

&lt;p&gt;The best, most sought-after quantum algorithms are those that take in raw, classical input and give some classical output.
For example, Shor’s algorithm for factoring takes this form.
These &lt;em&gt;classical-to-classical&lt;/em&gt; algorithms (a term I invented for this post) have the best chance to be efficiently implemented in practice: all you need is a scalable quantum computer. (It’s just that easy!)&lt;/p&gt;

&lt;p&gt;Nevertheless, many quantum algorithms aren’t so nice.
Most well-known QML algorithms convert input quantum states to a desired output state or value.
However, they do not provide a routine to prepare the necessary copies of these input states (a &lt;em&gt;state preparation&lt;/em&gt; routine) or a strategy to extract information from an output state.
Both are essential to making the algorithm useful.&lt;/p&gt;

&lt;p&gt;An example of an algorithm that is not classical-to-classical is the &lt;em&gt;swap test&lt;/em&gt;.
If we have many copies of the quantum states \(\ket{a},\ket{b} \in \BB{C}^n\), then the swap test \(\SC{S}\) estimates their inner product in time polylogarithmic in dimension.
While this routine seems much faster than naively computing \(\sum_{i=1}^n \bar{a}_ib_i\) classically, we can only run this algorithm if we know how to prepare the states \(\ket{a}\) and \(\ket{b}\).
It may well be the case that state preparation is too expensive for input vectors, making the quantum algorithm as slow as the classical algorithm.
This illustrates the format and failings of most QML algorithms.&lt;/p&gt;

&lt;p&gt;You might then ask: can we fill in the missing routines in QML algorithms to get a classical-to-classical algorithm that’s provably fast and useful?
This is an open research problem: see Scott Aaronson’s piece on QML&lt;sup id=&quot;fnref:aaronson15&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:aaronson15&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;.
We have a variety of partial results towards the affirmative, but as far as I know, they don’t answer the question unless you’re loose with your definitions of at least one of “classical”, “provably fast”, or “useful”.
So let’s settle for a simpler question.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How can we compare the speed of quantum algorithms with quantum input and quantum output to classical algorithms with classical input and classical output?&lt;/strong&gt;
Quantum machine learning algorithms can be exponentially faster than the best standard classical algorithms for similar tasks, but this comparison is unfair because the quantum algorithms get outside help through input state preparation.
We want a classical model that helps its algorithms stand a chance against quantum algorithms, while still ensuring that they can be run in nearly all circumstances one would run the quantum algorithm.
The answer I propose: &lt;strong&gt;compare quantum algorithms with quantum state preparation to classical algorithms with &lt;em&gt;sample and query access&lt;/em&gt; to input.&lt;/strong&gt;&lt;/p&gt;

&lt;h3 id=&quot;the-model&quot;&gt;The model&lt;/h3&gt;

&lt;p&gt;Before we proceed with definitions, we’ll establish some conventions.
First, we generally consider our input as being some vector in \(\BB{C}^n\) or \(\BB{R}^n\), subject to an access model to be described.
Second, we’ll only concern ourselves with an algorithm’s &lt;em&gt;query complexity&lt;/em&gt;, the number of accesses to the input.
Our algorithms will have query complexity independent of input dimensions and polynomial in other parameters.
If we assume that each access costs (say) \(O(1)\) or \(O(\log n)\), the time complexity is still polylogarithmic in input dimension and at most polynomially worse in other parameters.&lt;/p&gt;

&lt;p&gt;Now, we define query access to input; we can get query access simply by having the input in RAM.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Definition.&lt;/strong&gt;
We have &lt;em&gt;query access&lt;/em&gt; to \(x \in \BB{C}^n\) (denoted \(\Q(x)\)) if, given \(i \in [n]\), we can efficiently compute \(x_i\).&lt;/p&gt;

&lt;p&gt;If we have \(x\) stored normally as an array in our classical computer’s memory, we have \(\Q(x)\) because finding the \(i\)th entry of \(x\) can be done with the code &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;x[i]&lt;/code&gt;.
This notion of access can represent more than just memory: we can also have \(\Q(x)\) if \(x\) is &lt;em&gt;implicitly&lt;/em&gt; described.
For example, consider \(x\) the vector of squares: \(x_i = i^2\) for all \(i\).
We can have access to \(x\) without writing \(x\) in memory.
This will be important for the algorithms to come.&lt;/p&gt;
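
&lt;p&gt;A two-line sketch of this kind of implicit query access:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;def query_squares(i):
    # Q(x) for the implicit vector of squares: x_i = i^2 is computed on
    # demand, so no length-n array is ever written to memory.
    return i * i

print(query_squares(10**6))  # the millionth entry, with no stored vector
&lt;/code&gt;&lt;/pre&gt;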

&lt;p&gt;&lt;strong&gt;Definition.&lt;/strong&gt;
We have &lt;em&gt;sample and query access&lt;/em&gt; to \(x \in \BB{C}^n\) (denoted \(\SQ(x)\)) if we have query access to \(x\); can produce independent random samples \(i \in [n]\) where we sample \(i\) with probability \(|x_i|^2/\|x\|^2\); and can query for \(\|x\|\).&lt;/p&gt;

&lt;p&gt;Sampling and query access to \(x\) will be our classical analogue to assuming quantum state preparation of copies of \(\ket{x}\).
This should make some intuitive sense: our classical analogue \(\SQ(x)\) has the standard assumption of query access to input, along with samples, which are essentially measurements of \(\ket{x}\) in the computational basis.
Knowledge of \(\|x\|\) is for normalization issues, and is often assumed for quantum algorithms as well (though for both classical and quantum algorithms, often approximate knowledge suffices).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example.&lt;/strong&gt;
Like query access, we can get efficient sample and query access from an explicit memory structure.
To get \(\SQ(x)\) for a bit vector \(x \in \{0,1\}^n\), store the number of nonzero entries \(z\) and a sorted array of the 1-indices \(D\).
For example, we could store \(x = [1\;1\;0\;0\;1\;0\;0\;0]\) as&lt;/p&gt;

\[z, D = 3,\{1,2,5\}\]

&lt;p&gt;Then we can find \(x_i\) by checking if \(i \in D\), we can sample from \(x\) by picking an index from \(D\) uniformly at random, and we know \(\|x\|\), since it’s just \(\sqrt{z}\).
This generalizes to an efficient \(O(\log n)\) binary search tree data structure for \(\SQ(x)\) for any \(x \in \BB{C}^n\).&lt;/p&gt;
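
&lt;p&gt;Here is a minimal sketch of such a structure in Python: a complete binary tree whose leaves hold the squared magnitudes \(|x_i|^2\) and whose internal nodes hold subtree sums, so queries, norm reads, and samples each take \(O(\log n)\) time (the class name is just for illustration):&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;import math, random

class SQAccess:
    def __init__(self, x):
        # Pad to a power of two; leaves store |x_i|^2, internal nodes store
        # the sum of their children, and the root stores ||x||^2.
        self.size = 1
        while self.size &amp;lt; len(x):
            self.size *= 2
        self.x = list(x) + [0.0] * (self.size - len(x))
        self.tree = [0.0] * (2 * self.size)
        for i, v in enumerate(self.x):
            self.tree[self.size + i] = abs(v) ** 2
        for i in range(self.size - 1, 0, -1):
            self.tree[i] = self.tree[2 * i] + self.tree[2 * i + 1]

    def query(self, i):   # Q(x)
        return self.x[i]

    def norm(self):       # ||x||
        return math.sqrt(self.tree[1])

    def sample(self):     # index i with probability |x_i|^2 / ||x||^2
        node = 1
        while node &amp;lt; self.size:
            go_left = random.random() * self.tree[node] &amp;lt; self.tree[2 * node]
            node = 2 * node if go_left else 2 * node + 1
        return node - self.size

sq = SQAccess([1, 1, 0, 0, 1, 0, 0, 0])
print(sq.norm())     # sqrt(3)
print(sq.sample())   # 0, 1, or 4, each with probability 1/3
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Updating an entry only requires refreshing the \(O(\log n)\) sums along one root-to-leaf path, so the same structure also supports fast updates.&lt;/p&gt;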

&lt;p&gt;We can also define sample and query access to matrices as just sample and query access to vectors “in” the matrix.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Definition.&lt;/strong&gt;
For \(A \in \BB{C}^{m\times n}\), \(\SQ(A)\) is defined as \(\SQ(A_i)\) for \(A_i\) the rows of \(A\), along with \(\SQ(\tilde{A})\) for \(\tilde{A}\) the vector of row norms (so \(\tilde{A}_i = \|A_i\|\)).&lt;/p&gt;

&lt;p&gt;By replacing quantum states with these classical analogues, we form a model based on sample and query access which we codify with the informal definition of “dequantization”.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Definition.&lt;/strong&gt;
Let \(\SC{A}\) be a quantum algorithm with input \(\ket{\phi_1},\ldots,\ket{\phi_C}\) and output either a state \(\ket{\psi}\) or a value \(\lambda\).
We say we &lt;em&gt;dequantize&lt;/em&gt; \(\SC{A}\) if we describe a classical algorithm that, given \(\SQ(\phi_1),\ldots,\SQ(\phi_C)\), can evaluate queries to \(\SQ(\psi)\) or output \(\lambda\), with similar guarantees to \(\SC{A}\) and query complexity \(\poly(C)\).&lt;/p&gt;

&lt;p&gt;That is, given sample and query access to the inputs, we can output sample and query access to a desired vector or a desired value, with at most polynomially larger query complexity.&lt;/p&gt;

&lt;p&gt;We justify why this model is a reasonable point of comparison two sections from now, in &lt;a href=&quot;#implications&quot;&gt;Implications&lt;/a&gt;.
Next, though, we will jump into how to build these dequantized protocols.&lt;/p&gt;

&lt;h2 id=&quot;quantum-for-the-quantum-less&quot;&gt;Quantum for the quantum-less&lt;/h2&gt;

&lt;p&gt;So far, all dequantized results revolve around three dequantized protocols that we piece together into more useful tasks.
In query complexity independent of \(m\) and \(n\), we can perform the following:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;
    &lt;p&gt;(&lt;a href=&quot;#1-estimating-inner-products&quot;&gt;Inner Product&lt;/a&gt;)
For \(x,y \in \BB{C}^n\), given \(\SQ(x)\) and \(\Q(y)\), we can estimate \(\langle x,y\rangle\) to \(\|x\|\|y\|\eps\) error with probability \(\geq 1-\delta\) and \(\text{poly}(\frac1\eps, \log\frac1\delta)\) queries;&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;(&lt;a href=&quot;#2-thin-matrix-vector-product-with-rejection-sampling&quot;&gt;Thin Matrix-Vector&lt;/a&gt;)
For \(V \in \BB{C}^{n\times k}, w \in \BB{C}^k\), given \(\SQ(V^\dagger)\) and \(\Q(w)\), we can simulate \(\SQ(Vw)\) with \(\text{poly}(k)\) queries;&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;(&lt;a href=&quot;#3-low-rank-approximation-briefly&quot;&gt;Low-rank Approximation&lt;/a&gt;)
For \(A \in \BB{C}^{m\times n}\), given \(\SQ(A)\), a threshold \(k\), and an error parameter \(\eps\), we can output a description of a low-rank approximation of \(A\) with \(\text{poly}(k, \frac{1}{\eps})\) queries.&lt;/p&gt;

    &lt;p&gt;Specifically, our output is \(\SQ(S,\hat{U},\hat{\Sigma})\) for \(S \in \BB{C}^{\ell \times n}\), \(\hat{U} \in \BB{C}^{\ell \times k}\), and \(\hat{\Sigma} \in \BB{C}^{k\times k}\) (\(\ell = \poly(k,\frac{1}{\eps})\)), and this implicitly describes the low-rank approximation to \(A\), \(D := A(S^\dagger\hat{U}\hat{\Sigma}^{-1})(S^\dagger\hat{U}\hat{\Sigma}^{-1})^\dagger\) (notice rank \(D \leq k\)).&lt;/p&gt;

    &lt;p&gt;This matrix satisfies the following low-rank guarantee with probability \(\geq 1-\delta\): for \(\sigma := \sqrt{2/k}\|A\|_F\), and \(A_{\sigma} := \sum_{\sigma_i \geq \sigma} \sigma_iu_iv_i^\dagger\) (using \(A\)’s SVD),&lt;/p&gt;

\[\|A - D\|_F^2 \leq \|A - A_\sigma\|_F^2 + \eps^2\|A\|_F^2.\]

    &lt;p&gt;This guarantee is non-standard: instead of \(A_k\), we use \(A_\sigma\).
 This makes our promise weaker, since it is useless if \(A\) has no large singular values.&lt;/p&gt;

    &lt;p&gt;For intuition, it’s helpful to think of \(D\) as \(A\) multiplied with a “projector” \((S^\dagger\hat{U}\hat{\Sigma}^{-1})(S^\dagger\hat{U}\hat{\Sigma}^{-1})^\dagger\) that projects the rows of \(A\) onto the columns of \(S^\dagger\hat{U}\hat{\Sigma}^{-1}\), where these columns are “singular vectors” with corresponding “singular values” \(\hat{\sigma}_1,\ldots,\hat{\sigma}_k\) that are encoded in the diagonal matrix \(\hat{\Sigma}\).
 (For those interested, \(\hat{U}\) and \(\hat{\Sigma}\) are from the SVD of a &lt;em&gt;submatrix&lt;/em&gt; of \(A\), hence the evocative notation; see the &lt;a href=&quot;#3-low-rank-approximation-briefly&quot;&gt;appendix&lt;/a&gt; for more details.)&lt;/p&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The first two protocols are dequantized swap tests and the third is essentially a dequantized variant of phase estimation seen in quantum recommendation systems&lt;sup id=&quot;fnref:kp17&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:kp17&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;.&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;Now, we describe how these techniques are used to get the results for recommendation systems, supervised clustering, and low-rank matrix inversion.
We defer the important details of models and error analyses to &lt;a href=&quot;#implications&quot;&gt;Implications&lt;/a&gt;, instead focusing on the algorithms themselves and how they use dequantized protocols.&lt;/p&gt;

&lt;h3 id=&quot;supervised-clustering&quot;&gt;Supervised clustering&lt;/h3&gt;
&lt;p&gt;We want to find the distance from a point \(p \in \BB{R}^n\) to the centroid (average) of a cluster of points \(q_1,\ldots,q_{m-1} \in \BB{R}^{n}\).
If we assume sample and query access to the data points, computing \(\|p - \frac{1}{m-1}(q_1 + \cdots + q_{m-1})\|\) reduces to computing \(\|Mw\|\) for&lt;/p&gt;

\[M = \begin{bmatrix}
        \frac{p}{\|p\|} &amp;amp; \frac{q_1}{\|q_1\|} &amp;amp; \cdots &amp;amp; \frac{q_{m-1}}{\|q_{m-1}\|}
    \end{bmatrix}
    \qquad
    w = \begin{bmatrix} \|p\| \\ -\frac{\|q_1\|}{m-1} \\ \vdots \\ -\frac{\|q_{m-1}\|}{m-1} \end{bmatrix}.\]

&lt;p&gt;\(\SQ\) access to \(p,q_1,\ldots,q_{m-1}\) gives \(\SQ\) access to \(M^T\) and \(w\), so the supervised clustering problem reduces to the following:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Problem.&lt;/strong&gt;
For \(M \in \BB{R}^{m\times n}, w \in \BB{R}^n\), and \(\SQ(M^T,w)\), approximate \((Mw)^T(Mw)\) to additive \(\eps\) error.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Algorithm.&lt;/strong&gt;
We can write \((Mw)^TMw\) as an inner product of order-three tensors; through basic tensor arithmetic, it is equal to \(\langle u, v\rangle\), where \(u,v \in \BB{R}^{m\times n\times n}\) are&lt;/p&gt;

\[\begin{aligned} u &amp;amp;= \sum_{i=1}^m\sum_{j=1}^n\sum_{k=1}^n M_{ij}\|M^{(k)}\| e_{i,j,k} \text{ and} \\
v &amp;amp;= \sum_{i=1}^m\sum_{j=1}^n\sum_{k=1}^n \frac{w_jw_kM_{ik}}{\|M^{(k)}\|} e_{i,j,k}. \end{aligned}\]

&lt;p&gt;Applying the algorithm for inner product (1) gives the desired approximation with \(O(\|w\|^2\|M\|_F^2\frac{1}{\eps^2} \log\frac{1}{\delta})\) samples and queries.&lt;/p&gt;
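
&lt;p&gt;For concreteness, here is a brute-force NumPy sketch of this estimator; it holds \(M\) and \(w\) explicitly only to simulate the sampling access, takes a plain mean rather than the median-of-means used in the full analysis, and its function name is just for illustration:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;import numpy as np

rng = np.random.default_rng(0)

def estimate_sq_norm_Mw(M, w, num_samples=500_000):
    # Samples (i, j) with probability M_ij^2 / ||M||_F^2 and k with
    # probability ||M^(k)||^2 / ||M||_F^2, then averages the importance-
    # weighted estimator Z, whose expectation is ||Mw||^2.
    m, n = M.shape
    fro2 = np.sum(M**2)                     # ||M||_F^2
    col_sq = np.sum(M**2, axis=0)           # ||M^(k)||^2
    flat = rng.choice(m * n, size=num_samples, p=(M**2).ravel() / fro2)
    i, j = np.unravel_index(flat, (m, n))
    k = rng.choice(n, size=num_samples, p=col_sq / fro2)
    Z = w[j] * w[k] * M[i, k] * fro2**2 / (M[i, j] * col_sq[k])
    return Z.mean()

M = rng.standard_normal((50, 30))
w = rng.standard_normal(30)
print(estimate_sq_norm_Mw(M, w), np.linalg.norm(M @ w) ** 2)
&lt;/code&gt;&lt;/pre&gt;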

&lt;h3 id=&quot;recommendation-systems&quot;&gt;Recommendation systems&lt;/h3&gt;
&lt;p&gt;We want to randomly sample a product \(j \in [n]\) that is a good recommendation for a particular user \(i \in [m]\), given incomplete data on user-product preferences.
If we store this data in a matrix \(A \in \BB{R}^{m\times n}\) with sampling and query access, in the right model, finding good recommendations reduces to:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Problem.&lt;/strong&gt;
For a matrix \(A \in \BB{R}^{m\times n}\) along with a row \(i \in [m]\), given \(\SQ(A)\), approximately sample from \(D_i\) where \(D\) is a sufficiently good low-rank approximation of \(A\).&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Remark.&lt;/em&gt; This task is essentially a variant of PCA, since a low-rank decomposition is dimensionality reduction of the matrix, viewed as a set of row vectors.
This is the “dequantized PCA” I refer to in other work&lt;sup id=&quot;fnref:tang18b&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:tang18b&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Algorithm.&lt;/strong&gt;
Apply (3) to get \(\SQ(S,\hat{U},\hat{\Sigma})\) for a low-rank approximation \(D = AS^T \hat{U}\hat{\Sigma}^{-1}(\hat{\Sigma}^{-1})^T\hat{U}^T S\).
It turns out that this low-rank approximation is good enough to get good recommendations.
So it suffices to sample from \(D_i = A_iS^TMS\), where \(A_i \in \BB{R}^{1 \times n}, S \in \BB{R}^{\ell \times n}, M = \hat{U}\hat{\Sigma}^{-1}(\hat{\Sigma}^{-1})^T\hat{U}^T \in \BB{R}^{\ell \times \ell}\) with \(\ell = \poly(k)\).&lt;/p&gt;

\[\begin{bmatrix}
    \; \cdots &amp;amp; A_i &amp;amp; \cdots \;
\end{bmatrix} \begin{bmatrix}
    &amp;amp; \vdots &amp;amp; \\ &amp;amp; S^T &amp;amp; \\ &amp;amp; \vdots &amp;amp;
\end{bmatrix} \begin{bmatrix}
    &amp;amp; &amp;amp; \\ &amp;amp; M &amp;amp; \\ &amp;amp; &amp;amp;
\end{bmatrix} \begin{bmatrix}
    &amp;amp; &amp;amp; \\ \; \cdots &amp;amp; S &amp;amp; \cdots \; \\ &amp;amp; &amp;amp;
\end{bmatrix}\]

&lt;p&gt;Approximate \(A_iS^T\) to \(\ell^2\) norm using \(\ell\) inner product protocols (1).
Next, compute \((A_iS^T)M\) with naive matrix-vector multiplication.
Finally, sample from \((A_iS^TM)S\), which is a thin matrix-vector product (2).&lt;/p&gt;

&lt;p&gt;&lt;em&gt;An aside.&lt;/em&gt;
This gives an exponential speedup over previous classical results from 15-20 years ago&lt;sup id=&quot;fnref:dkr02&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:dkr02&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;4&lt;/a&gt;&lt;/sup&gt;.
The story here is quite odd.
From what I can tell, researchers at the time knew the important (read: hard) part of the algorithm, how to compute low-rank approximations fast, but didn’t notice that the resulting knowledge of \(S\) and \(\hat{U}\) could be used to sample the desired recommendations in sublinear time, which I think is much easier to understand.
This gave me anxiety during research, since I figured there was no way this would have been overlooked.
I’m glad these fears were unfounded; it’s cool that this quantum perspective made this step natural and obvious!&lt;/p&gt;

&lt;h3 id=&quot;low-rank-matrix-inversion&quot;&gt;Low-rank matrix inversion&lt;/h3&gt;
&lt;p&gt;The goal here is to mimic a quantum algorithm that can solve systems of equations \(Ax = b\) for \(A\) low-rank.
The dequantized version of this is:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Problem.&lt;/strong&gt;
For a low-rank matrix \(A \in \BB{R}^{m\times n}\) and a vector \(b \in \BB{R}^n\), given \(\SQ(A), \SQ(b)\), (approximately) respond to requests for \(\SQ(A^+b)\), where \(A^+\) is the pseudoinverse of \(A\).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Algorithm.&lt;/strong&gt;
Use the low-rank approximation protocol (3) to get \(\SQ(S,\hat{U}, \hat{\Sigma})\).
From applying the matrix-vector protocol (2), we have \(\SQ(\hat{V})\), where \(\hat{V} := S^T\hat{U}\hat{\Sigma}^{-1}\); with some analysis we can show that the columns of \(\hat{V}\) behave like the right singular vectors of \(A\).
Further, the diagonal entries \(\hat{\Sigma}_{ii}\) behave like the corresponding approximate singular values.
Using this information, we can approximate the vector we want to sample from:&lt;/p&gt;

\[A^+b= (A^TA)^+A^Tb \approx \sum_{i=1}^k \frac{1}{\hat{\Sigma}_{ii}^2}\hat{v}_i\hat{v}_i^T A^Tb\]

&lt;p&gt;We approximate \(\hat{v}_i^TA^Tb\) to additive error for all \(i\) by noticing that \(\hat{v}_i^TA^Tb = \Tr(A^Tb\hat{v}_i^T)\) is an inner product of the order two tensors \(A^T\) and \(b\hat{v}_i^T\).
Thus, we can apply (1), since being given \(\SQ(A)\) implies \(\SQ(A^T)\) for \(A^T\) viewed as a long vector.
Finally, using (2), sample from the linear combination defined by these estimates and the \(\hat{\sigma}_i\).&lt;/p&gt;
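
&lt;p&gt;Written out densely with NumPy, the approximation \(\sum_{i=1}^k \frac{1}{\hat{\Sigma}_{ii}^2}\hat{v}_i\hat{v}_i^T A^Tb\) is the following short computation (a sketch for intuition only; the actual algorithm just estimates the \(k\) inner products \(\hat{v}_i^TA^Tb\) and then samples via (2), never forming anything of dimension \(n\) explicitly):&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;import numpy as np

def approx_pinv_apply(V_hat, Sigma_hat, A, b):
    # sum_i (1 / sigma_i^2) * v_i * (v_i^T A^T b), using the approximate right
    # singular vectors V_hat and singular values on the diagonal of Sigma_hat.
    coeffs = V_hat.T @ (A.T @ b)
    return V_hat @ (coeffs / np.diag(Sigma_hat) ** 2)
&lt;/code&gt;&lt;/pre&gt;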

&lt;h2 id=&quot;implications&quot;&gt;Implications&lt;/h2&gt;

&lt;p&gt;We have just described examples of dequantized algorithms for the following problems:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Recommendation systems&lt;sup id=&quot;fnref:tang18a&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:tang18a&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;5&lt;/a&gt;&lt;/sup&gt;&lt;sup id=&quot;fnref:kp17:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:kp17&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt; (this classical algorithm &lt;em&gt;exponentially improves&lt;/em&gt; on the previous best!)&lt;/li&gt;
  &lt;li&gt;PCA&lt;sup id=&quot;fnref:tang18b:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:tang18b&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt;&lt;sup id=&quot;fnref:lmr14&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:lmr14&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;6&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
  &lt;li&gt;Supervised clustering&lt;sup id=&quot;fnref:tang18b:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:tang18b&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt;&lt;sup id=&quot;fnref:lmr13&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:lmr13&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;7&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
  &lt;li&gt;Low-rank matrix inversion&lt;sup id=&quot;fnref:rsml16&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:rsml16&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;8&lt;/a&gt;&lt;/sup&gt;&lt;sup id=&quot;fnref:glt18&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:glt18&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;9&lt;/a&gt;&lt;/sup&gt;&lt;sup id=&quot;fnref:clw18&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:clw18&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;10&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We address here what to take away from these results.&lt;/p&gt;

&lt;h3 id=&quot;for-quantum-computing&quot;&gt;For quantum computing&lt;/h3&gt;
&lt;p&gt;The most important conclusion, in my opinion, is a heuristic:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Heuristic 1.&lt;/strong&gt;
Linear algebra problems in low-dimensional spaces (constant, say, or polylogarithmic) likely can be dequantized.&lt;/p&gt;

&lt;p&gt;The intuition for this heuristic is that, if your problem operates in a subspace of such low dimension, the main challenge is “finding” this subspace and rotating to it.
Then, we can think about our problem as lying in \(\BB{C}^d\) where \(d\) is small, and can solve it with a simple polynomial-time (in \(d\)) algorithm.
Finding the subspace is an unordered search problem if you squint, so it can’t be sped up much by quantum computers.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Remark.&lt;/em&gt; There are high-dimensional problems that cannot be dequantized; for example, given \(\SQ(v)\), it takes \(\Omega(n)\) queries to approximately sample from \(Hv\), where \(H\) is the Hadamard matrix (this is the Fourier Sampling problem&lt;sup id=&quot;fnref:ac16&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:ac16&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;11&lt;/a&gt;&lt;/sup&gt;).&lt;/p&gt;

&lt;p&gt;Why do we care about dequantizing algorithms?
As the name suggests, I argue that this is a reasonable classical analogue to quantum machine learning algorithms.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Heuristic 2.&lt;/strong&gt;
For machine learning problems, SQ assumptions are more reasonable than state preparation assumptions.&lt;/p&gt;

&lt;p&gt;That is, the practical task of preparing quantum states is probably always harder than the practical task of preparing sample and query access.
Practically, this makes sense, since for state preparation we need, well, quantum computers.&lt;/p&gt;

&lt;blockquote&gt;&lt;p&gt;
        Quantum computing applications that are realizable with zero qubits!
  &lt;/p&gt;&lt;footer&gt;– Scott Aaronson&apos;s &quot;elevator pitch&quot; of my work, paraphrased&lt;/footer&gt;
&lt;/blockquote&gt;

&lt;p&gt;Even assuming the existence of a practical quantum computer, there is evidence that state preparation assumptions are still harder to satisfy than sample and query access, up to polynomial slowdown.
For example, preparing a generic quantum state \(\ket{v}\) corresponding to an input vector \(v\) takes \(\Omega(\sqrt{n})\) quantum queries to \(v\) in general, while responding to \(\SQ(v)\) accesses takes \(\Theta(n)\) classical queries.
Since dequantized algorithms run in time polynomial in \(\log n\), getting SQ access to a generic vector is much more expensive than running the algorithm itself.&lt;/p&gt;

&lt;p&gt;Of course, we can also consider special classes of vectors where quantum state preparation is easier, but generally SQ access gets proportionally faster as well.
For example, we can quickly prepare vectors where all entries have roughly equal magnitude (think vectors whose entries are either \(+1\) or \(-1\)), but correspondingly, we can compute SQ accesses to such vectors similarly quickly.&lt;/p&gt;

&lt;p&gt;On the classical side, the assumption of SQ access is on par with other typical assumptions to make machine learning algorithms sublinear:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;There is a classical dynamic data structure that supports SQ access, fast updates, and sparsity in log time.&lt;/li&gt;
  &lt;li&gt;Given an input vector as a list of nonzero entries, sampling from it takes time linear in sparsity.&lt;/li&gt;
  &lt;li&gt;\(k\) independent samples can be prepared with one pass through the data in \(O(k)\) space.&lt;/li&gt;
&lt;/ul&gt;

&lt;hr /&gt;

&lt;p&gt;To summarize these heuristics: quantum machine learning for &lt;em&gt;low-dimensional datasets&lt;/em&gt; will probably never get speedups as significant as, say, Shor’s algorithm, even in best-case scenarios.
Unfortunately, QML algorithms for low-dimensional problems were the most practical ones in the literature, so with this research it’s unclear what the state of the field is today.&lt;/p&gt;

&lt;p&gt;The story might not be over, though.
We know that quantum computers can “efficiently solve” high-dimensional linear algebra problems&lt;sup id=&quot;fnref:hhl08&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:hhl08&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;12&lt;/a&gt;&lt;/sup&gt;; however, this assumes that we have some way to evolve a quantum system precisely according to input data, a much harder problem than the linear algebra itself.
Nevertheless, I hold out hope that this result can be applied to achieve exponential speedups in machine learning or elsewhere.&lt;/p&gt;

&lt;h3 id=&quot;for-classical-computing&quot;&gt;For classical computing&lt;/h3&gt;

&lt;p&gt;I am cautiously optimistic about the implications of this work for classical computing.
The major advantage of dequantized algorithms is sheer speed (asymptotically, at least).
However, the issues listed below prevent dequantized algorithms from being strict improvements over current algorithms.&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Gaining SQ access to input typically requires preliminary data processing or the use of a data structure.
This means that dequantized algorithms can’t be plugged into existing systems without large amounts of computation.&lt;/li&gt;
  &lt;li&gt;SQ access to output might not always be useful or practical.&lt;/li&gt;
  &lt;li&gt;Current dequantized algorithms have large error compared to standard techniques.&lt;/li&gt;
  &lt;li&gt;Current algorithms have large theoretical exponents, so right now we don’t know whether they run quickly in practice.
I expect we can cut down these exponents greatly.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If I had to guess, the best chance for success in dequantized techniques remains recommendation systems, since speed matters significantly in that context.
I view the other algorithms as significantly less likely to see use in practice, though probably more likely than their corresponding quantum algorithms.&lt;/p&gt;

&lt;p&gt;Regardless, these works fit nicely into the classical literature: dequantized quantum machine learning is just a nicely modular, quantum-inspired form of randomized numerical linear algebra.&lt;/p&gt;

&lt;h2 id=&quot;appendix-more-details&quot;&gt;Appendix: More details&lt;/h2&gt;

&lt;p&gt;As a reminder, here are the three techniques:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Inner Product&lt;/li&gt;
  &lt;li&gt;Thin Matrix-Vector&lt;/li&gt;
  &lt;li&gt;Low-rank Approximation&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Below, we explain (1) and (2) fully, and give a rough sketch of (3).&lt;/p&gt;

&lt;h3 id=&quot;1-estimating-inner-products&quot;&gt;1. Estimating inner products&lt;/h3&gt;

&lt;p&gt;First, we give a basic way of estimating the mean of an arbitrary distribution with finite variance.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fact.&lt;/strong&gt;
For \(\{X_{i,j}\}\) i.i.d. random variables with mean \(\mu\) and variance \(\sigma^2\), let&lt;/p&gt;

\[Y := \underset{j \in [6\log 1/\delta]}{\operatorname{median}}\;\underset{i \in [6/\eps^2]}{\operatorname{mean}}\;X_{i,j}\]

&lt;p&gt;Then \(\vert Y - \mu\vert \leq \eps\sigma\) with probability \(\geq 1-\delta\), using only \(O(\frac{1}{\eps^2}\log\frac{1}{\delta})\) copies of \(X\).&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Proof sketch.&lt;/em&gt;
The proof follows from two facts: first, the median of \(C_1,\ldots,C_n\) is at least \(\lambda\) precisely when at least half of the \(C_i\) are at least \(\lambda\); second, &lt;a href=&quot;https://en.wikipedia.org/wiki/Chebyshev%27s_inequality#Probabilistic_statement&quot;&gt;Chebyshev’s inequality&lt;/a&gt; (applied to the mean).&lt;/p&gt;

&lt;p&gt;Estimating the inner product is just a basic corollary of this estimator.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Proposition.&lt;/strong&gt;
For \(x,y \in \BB{C}^n\), given \(\SQ(x)\) and \(\Q(y)\), we can estimate \(\langle x,y\rangle\) to \(\eps\|x\|\|y\|\) error with probability \(\geq 1-\delta\) with query complexity \(O(\frac{1}{\eps^2}\log\frac{1}{\delta})\).&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Proof.&lt;/em&gt;
Sample \(s\) from \(x\) and let \(Z = \bar{x}_s y_s\frac{\|x\|^2}{|x_s|^2}\).
Apply the Fact with \(X_{i,j}\) being independent copies of \(Z\).&lt;/p&gt;
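
&lt;p&gt;A minimal NumPy sketch of this estimator, holding \(x\) explicitly only to simulate \(\SQ(x)\) and \(\Q(y)\) (the function name is illustrative):&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;import numpy as np

rng = np.random.default_rng(1)

def sq_inner_product(x, y, eps=0.1, delta=0.01):
    # Median of means of Z = x_s * y_s * ||x||^2 / x_s^2, where the index s is
    # drawn with probability x_s^2 / ||x||^2 (real case; complex adds conjugates).
    sq_norm = float(np.sum(x * x))
    probs = (x * x) / sq_norm
    reps = int(np.ceil(6 * np.log(1 / delta)))
    batch = int(np.ceil(6 / eps**2))
    means = []
    for _ in range(reps):
        s = rng.choice(len(x), size=batch, p=probs)
        means.append(np.mean(y[s] * sq_norm / x[s]))
    return float(np.median(means))

x = rng.standard_normal(100_000)
y = x + 0.1 * rng.standard_normal(100_000)
print(sq_inner_product(x, y, eps=0.05), float(x @ y))
&lt;/code&gt;&lt;/pre&gt;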

&lt;h3 id=&quot;2-thin-matrix-vector-product-with-rejection-sampling&quot;&gt;2. Thin matrix-vector product with rejection sampling&lt;/h3&gt;

&lt;p&gt;We first go over rejection sampling, a naive way to efficiently generate samples from a specified distribution from samples from another distribution.&lt;/p&gt;

&lt;p&gt;Input: samples from distribution \(P\)&lt;br /&gt;
Output: samples from distribution \(Q\)&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Pull a sample \(s\) from \(P\);&lt;/li&gt;
  &lt;li&gt;Compute \(r_s = \frac{Q(s)}{MP(s)}\) for some constant \(M\);&lt;/li&gt;
  &lt;li&gt;Output \(s\) with probability \(r_s\) and restart otherwise.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Fact.&lt;/strong&gt;
If \(r_i \leq 1\) for all \(i\), then the above procedure is well-defined and outputs a sample from \(Q\) in \(M\) iterations in expectation.&lt;/p&gt;
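
&lt;p&gt;In code, the loop is only a few lines; here is a generic sketch (the helper names are illustrative):&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;import random

def rejection_sample(sample_p, accept_prob):
    # sample_p() draws from P; accept_prob(s) returns r_s = Q(s) / (M * P(s)),
    # assumed to be at most 1. The output is distributed exactly as Q, and
    # the expected number of iterations is M.
    while True:
        s = sample_p()
        if random.random() &amp;lt; accept_prob(s):
            return s

# Example: turn a uniform sampler over {0, 1, 2} into a sampler for
# Q = (1/6, 2/6, 3/6), using M = 3/2 so that each r_s is at most 1.
q = [1 / 6, 2 / 6, 3 / 6]
print(rejection_sample(lambda: random.randrange(3), lambda s: q[s] / (1.5 * (1 / 3))))
&lt;/code&gt;&lt;/pre&gt;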

&lt;hr /&gt;

&lt;p&gt;&lt;strong&gt;Proposition.&lt;/strong&gt;
For \(V \in \BB{R}^{n\times k}\) and \(w \in \BB{R}^k\), given \(\SQ(V)\) and \(\Q(w)\), we can simulate \(\SQ(Vw)\) with expected query complexity \(O(k^2C(V,w))\), where&lt;/p&gt;

\[C(V,w) := \frac{\sum_{i=1}^k\|w_iV^{(i)}\|^2}{\|Vw\|^2}.\]

&lt;p&gt;We can compute entries \((Vw)_i\) with \(O(k)\) queries.&lt;br /&gt;
We can sample using rejection sampling:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;\(P\) is the distribution formed by sampling from \(V^{(j)}\) with probability proportional to \(\|w_jV^{(j)}\|^2\);&lt;/li&gt;
  &lt;li&gt;\(Q\) is the target \(Vw\).&lt;/li&gt;
&lt;/ul&gt;

\[r_i = \frac{(Vw)_i^2}{k \sum_{j=1}^k (w_jV_{ij})^2} = \frac{Q(i)}{kC(V,w)P(i)}\]

&lt;p&gt;Notice that we can compute these \(r_i\)’s (even though we cannot compute the probabilities \(Q(i)\) of the target distribution themselves), and that the rejection sampling guarantee \(r_i \leq 1\) is satisfied (via Cauchy-Schwarz).&lt;/p&gt;

&lt;p&gt;The probability of success is \(\frac{\|Vw\|^2}{k\sum_{i=1}^k\|w_iV^{(i)}\|^2}\).
Thus, to estimate the norm of \(Vw\), it suffices to estimate the probability of success of this rejection sampling process.
We can view this as estimating the heads probability of a biased coin, where the coin is heads if rejection sampling succeeds and tails otherwise.
Through a &lt;a href=&quot;https://en.wikipedia.org/wiki/Chernoff_bound#Multiplicative_form_(relative_error)&quot;&gt;Chernoff bound&lt;/a&gt;, we see that the average of \(O(kC(V,w)\frac{1}{\eps^2}\log\frac{1}{\delta})\) “coin flips” estimates this success probability, and hence \(\|Vw\|^2\), to \((1\pm\eps)\) multiplicative error with probability \(\geq 1-\delta\), where each coin flip costs \(O(k)\) queries and samples.&lt;/p&gt;
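
&lt;p&gt;Here is a brute-force NumPy sketch of this procedure, with \(V\) and \(w\) held explicitly to stand in for the access model (the function name is illustrative):&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;import numpy as np

rng = np.random.default_rng(2)

def sample_Vw(V, w):
    # Output i with probability (Vw)_i^2 / ||Vw||^2 via rejection sampling:
    # draw a column j with probability proportional to ||w_j V^(j)||^2, draw i
    # from the squared entries of that column, then accept with probability r_i.
    n, k = V.shape
    col_sq = np.sum(V**2, axis=0)                   # ||V^(j)||^2
    col_probs = (w**2) * col_sq
    col_probs = col_probs / col_probs.sum()
    while True:
        j = rng.choice(k, p=col_probs)
        i = rng.choice(n, p=V[:, j] ** 2 / col_sq[j])
        r = (V[i] @ w) ** 2 / (k * np.sum((w * V[i]) ** 2))
        if rng.random() &amp;lt; r:
            return i

V = rng.standard_normal((1000, 3))
w = rng.standard_normal(3)
samples = [sample_Vw(V, w) for _ in range(10_000)]
# Empirical frequencies of samples should track (V @ w)**2 / ||V @ w||^2.
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Each iteration of the loop queries \(O(k)\) entries, and the expected number of iterations is \(kC(V,w)\), which is where the \(O(k^2C(V,w))\) bound above comes from.&lt;/p&gt;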

&lt;h3 id=&quot;3-low-rank-approximation-briefly&quot;&gt;3. Low-rank approximation, briefly&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Proposition.&lt;/strong&gt;
For \(A \in \BB{C}^{m\times n}\), given \(\SQ(A)\) and some threshold \(k\), we can output a description of a low-rank approximation of \(A\).&lt;/p&gt;

&lt;p&gt;Specifically, our output is \(\SQ(S,\hat{U}, \hat{\Sigma})\) for \(S \in \BB{C}^{\ell \times n}\), \(\hat{U} \in \BB{C}^{\ell \times k}\), \(\hat{\Sigma} \in \BB{C}^{k\times k}\) (\(\ell = \poly(k,\frac{1}{\eps})\)), and this implicitly describes the low-rank approximation to \(A\), \(D := A(S^\dagger\hat{U}\hat{\Sigma}^{-1})(S^\dagger\hat{U}\hat{\Sigma}^{-1})^\dagger\) (notice rank \(D \leq k\)).&lt;/p&gt;

&lt;p&gt;This matrix satisfies the following low-rank guarantee with probability \(\geq1-\delta\): for \(\sigma := \sqrt{2/k}\|A\|_F\), and \(A_{\sigma} := \sum_{\sigma_i \geq \sigma} \sigma_iu_iv_i^\dagger\) (using SVD),&lt;/p&gt;

\[\|A - D\|_F^2 \leq \|A - A_\sigma\|_F^2 + \eps^2\|A\|_F^2.\]

&lt;p&gt;This algorithm comes from the 1998 paper of Frieze, Kannan, and Vempala&lt;sup id=&quot;fnref:fkv04&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:fkv04&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;13&lt;/a&gt;&lt;/sup&gt;.
See the recent survey&lt;sup id=&quot;fnref:kv17&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:kv17&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;14&lt;/a&gt;&lt;/sup&gt; by Kannan and Vempala for more on these techniques, and see Woodruff’s textbook&lt;sup id=&quot;fnref:woodruff14&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:woodruff14&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;15&lt;/a&gt;&lt;/sup&gt; for a discussion of more general techniques.
The form I state above is a simple variant that I discuss in my recommendation systems paper&lt;sup id=&quot;fnref:tang18a:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:tang18a&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;5&lt;/a&gt;&lt;/sup&gt;.&lt;/p&gt;

&lt;p&gt;The core piece of analysis is the following theorem (sometimes called the &lt;em&gt;Approximate Matrix Product&lt;/em&gt; property in the literature).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Theorem.&lt;/strong&gt; 
Let \(S^TS = \sum_{j=1}^\ell S_jS_j^T\), where \(S_j\) is \(\frac{A_i\|A\|_F}{\sqrt{\ell}\,\|A_i\|}\) with probability \(\frac{\|A_i\|^2}{\|A\|_F^2}\) (so \(i\) is sampled from \(\tilde{A}\), and \(\BB{E}[S^TS] = A^TA\)). For sufficiently small \(\eps\) and \(\ell = \Omega(\frac{1}{\eps^2}\log\frac1\delta)\), with probability \(\geq 1-\delta\),&lt;/p&gt;

\[\|S^TS - A^TA\|_F \leq \eps\|A\|_F^2.\]

&lt;p&gt;This looks like a further higher-order (two order two tensor inner product) generalization of inner product (two order one tensor inner product) and thin matrix-vector (order two and order one tensor inner product); it’s possible that a clever rephrasing of this result in the \(SQ\) model could make the low-rank approximation result more quantum-ic.&lt;/p&gt;

&lt;p&gt;We now sketch the algorithm along with intuition: it’s most useful to consider the low-rank approximation task as one of finding large approximate singular vectors.
First, sample \(\ell\) rows of \(A\) according to \(\ell^2\) norm, and consider the matrix \(S \in \BB{C}^{\ell \times n}\) of these rows, all renormalized to have the same length.
This is the \(S\) that we output.
By the above theorem, \(\|S^TS - A^TA\|_F \leq \eps\|A\|_F^2\) with good probability, which implies that the large right singular vectors of \(S\) (eigenvectors of \(S^TS\)) approximate the large right singular vectors of \(A\) (eigenvectors of \(A^TA\)).&lt;/p&gt;

&lt;p&gt;Next, we can apply the same process to \(S^T\): sample rows of \(S^T\) (that is, columns of \(S\)) and get a normalized submatrix \(W \in \BB{C}^{\ell \times \ell}\) such that \(\|WW^T-SS^T\|_F \leq \eps\|A\|_F^2\).
Since \(W\) is a constant-sized matrix, we can compute \(\hat{U}\) and \(\hat{\Sigma}\), the large left singular vectors and values of \(W\), which approximate the large left singular vectors and values of \(S\).
Then, \(S^T\hat{U}\hat{\Sigma}^{-1}\) translates these large left singular vectors to their corresponding right singular vectors and rescales them accordingly, giving the approximate singular vectors of \(A\) as desired.&lt;/p&gt;
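
&lt;p&gt;A brute-force NumPy sketch of this two-stage subsampling, with explicit access to \(A\), without the failure-probability bookkeeping, and with parameters chosen for illustration rather than from the \(\eps\) analysis:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;import numpy as np

rng = np.random.default_rng(3)

def fkv_low_rank(A, k, ell):
    # Stage 1: sample ell rows of A proportional to squared row norms,
    # rescaled so that S^T S matches A^T A in expectation.
    m, n = A.shape
    fro = np.linalg.norm(A)
    row_norms = np.linalg.norm(A, axis=1)
    rows = rng.choice(m, size=ell, p=row_norms**2 / fro**2)
    S = A[rows] * (fro / (np.sqrt(ell) * row_norms[rows]))[:, None]
    # Stage 2: the same subsampling applied to the columns of S gives a small
    # matrix W with W W^T matching S S^T in expectation.
    col_norms = np.linalg.norm(S, axis=0)
    cols = rng.choice(n, size=ell, p=col_norms**2 / np.linalg.norm(S) ** 2)
    W = S[:, cols] * (np.linalg.norm(S) / (np.sqrt(ell) * col_norms[cols]))[None, :]
    # The top-k left singular vectors/values of W approximate those of S, and
    # S^T U_hat Sigma_hat^{-1} approximates the top right singular vectors of A.
    U_hat, sig, _ = np.linalg.svd(W)
    return S, U_hat[:, :k], np.diag(sig[:k])

A = rng.standard_normal((500, 10)) @ rng.standard_normal((10, 300))  # rank 10
S, U_hat, Sigma_hat = fkv_low_rank(A, k=10, ell=400)
V_hat = S.T @ U_hat @ np.linalg.inv(Sigma_hat)
D = A @ V_hat @ V_hat.T
print(np.linalg.norm(A - D) / np.linalg.norm(A))  # approximation error; improves as ell grows
&lt;/code&gt;&lt;/pre&gt;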

&lt;h2 id=&quot;glossary&quot;&gt;Glossary&lt;/h2&gt;

&lt;p&gt;For natural numbers \(m, n\), vector \(v \in \BB{C}^n\) and \(A \in \BB{C}^{m\times n}\):&lt;/p&gt;

&lt;p&gt;\([n]\) denotes \(\{1,2,\ldots,n\}\);
\(O(\cdot)\) and \(\Omega(\cdot)\) are &lt;a href=&quot;https://en.wikipedia.org/wiki/Big_O_notation&quot;&gt;big O notation&lt;/a&gt;;
\(A_i\) and \(A^{(j)}\) denote the \(i\)th row of \(A\) and the \(j\)th column of \(A\);
\(\|v\|\) denotes the \(\ell^2\) norm of \(v\), \(\sqrt{|v_1|^2 + \cdots + |v_n|^2}\);&lt;/p&gt;

&lt;p&gt;\(\ket{\psi}\) is &lt;a href=&quot;https://en.wikipedia.org/wiki/Bra%E2%80%93ket_notation&quot;&gt;bra-ket notation&lt;/a&gt;: kets are column vectors \(\ket{\psi} \in \BB{C}^{n \times 1}\), bras are row vectors \(\bra{\psi} := (\ket{\psi})^\dagger\), standard basis vectors are denoted \(\ket{1},\ldots,\ket{n}\), and the tensor product of \(\ket{\alpha}\) and \(\ket{\beta}\) is denoted \(\ket{\alpha}\ket{\beta}\).
Of course, these are all really quantum states, but that’s only relevant for quantum algorithms: for my purposes, I use \(\ket{\phi}\) and \(\phi\) interchangeably to refer to vectors.
(I ignore normalization, but those issues can be dealt with.)&lt;/p&gt;

&lt;p&gt;The &lt;a href=&quot;https://en.wikipedia.org/wiki/Singular_value_decomposition&quot;&gt;singular value decomposition&lt;/a&gt; (SVD) of \(A \in \BB{C}^{m\times n}\) is a decomposition \(A = U\Sigma V^\dagger\), where \(U \in \BB{C}^{m\times m}\) and \(V \in \BB{C}^{n\times n}\) are unitary and \(\Sigma \in \BB{R}^{m\times n}\) is diagonal.
In other words, for \(u_i\) and \(v_i\) the columns of \(U\) and \(V\), respectively, and \(\sigma_i\) the diagonal entries of \(\Sigma\), \(A = \sum \sigma_iu_iv_i^\dagger\).
By convention, \(\sigma_1 \geq \ldots \geq \sigma_{\min(m,n)} \geq 0\).&lt;/p&gt;

&lt;p&gt;Using \(A\)’s SVD, we can define basic linear algebraic objects.
\(\|A\|_2 = \max_{v \in \BB{C}^n} \|Av\|/\|v\| = \sigma_1\) is the spectral norm of \(A\).
\(\|A\|_F = \sqrt{\sum_{i=1}^m\sum_{j=1}^n |A_{ij}|^2} = \sqrt{\sigma_1^2 + \cdots + \sigma_{\min(m,n)}^2}\) is the Frobenius norm of \(A\).
\(A_k = \sum_{i=1}^k \sigma_iu_iv_i^\dagger\) is an optimal rank \(k\) approximation to \(A\) in both spectral and Frobenius norm.
\(A^+ = \sum_{\sigma_i &amp;gt; 0} \frac{1}{\sigma_i}v_iu_i^\dagger\) is \(A\)’s &lt;a href=&quot;https://en.wikipedia.org/wiki/Moore%E2%80%93Penrose_inverse&quot;&gt;pseudoinverse&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;I define \(\SQ(v)\), \(\SQ(A)\), and \(\Q(v)\) in &lt;a href=&quot;#an-introduction-to-dequantization&quot;&gt;An introduction to dequantization&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;references&quot;&gt;References&lt;/h2&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:aaronson15&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Scott Aaronson. &lt;em&gt;Read the fine print&lt;/em&gt;. Nature Physics 11.4, 2015. &lt;a href=&quot;https://www.scottaaronson.com/papers/qml.pdf&quot;&gt;Link&lt;/a&gt; &lt;a href=&quot;#fnref:aaronson15&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:kp17&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Iordanis Kerenidis, Anupam Prakash. &lt;em&gt;Quantum recommendation systems&lt;/em&gt;. &lt;a href=&quot;https://arxiv.org/abs/1603.08675&quot;&gt;arXiv:1603.08675&lt;/a&gt;. &lt;a href=&quot;#fnref:kp17&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt; &lt;a href=&quot;#fnref:kp17:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:tang18b&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Ewin Tang. &lt;em&gt;Quantum-inspired classical algorithms for principal component analysis and supervised clustering&lt;/em&gt;. &lt;a href=&quot;https://arxiv.org/abs/1811.00414&quot;&gt;arXiv:1811.00414&lt;/a&gt;, 2018. &lt;a href=&quot;#fnref:tang18b&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt; &lt;a href=&quot;#fnref:tang18b:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt; &lt;a href=&quot;#fnref:tang18b:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;3&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:dkr02&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Petros Drineas, Iordanis Kerenidis, Prabhakar Raghavan. &lt;em&gt;Competitive recommendation systems&lt;/em&gt;. STOC, 2002. &lt;a href=&quot;https://www.irif.fr/~jkeren/jkeren/CV_Pubs_files/DKR02.pdf&quot;&gt;Link&lt;/a&gt;. &lt;a href=&quot;#fnref:dkr02&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:tang18a&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Ewin Tang. &lt;em&gt;A quantum-inspired algorithm for recommendation systems&lt;/em&gt;. &lt;a href=&quot;https://arxiv.org/abs/1807.04271&quot;&gt;arXiv:1807.04271&lt;/a&gt;, 2018. &lt;a href=&quot;#fnref:tang18a&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt; &lt;a href=&quot;#fnref:tang18a:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:lmr14&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Seth Lloyd, Masoud Mohseni, Patrick Rebentrost. &lt;em&gt;Quantum principal component analysis&lt;/em&gt;. &lt;a href=&quot;https://arxiv.org/abs/1307.0401&quot;&gt;arXiv:1307.0401&lt;/a&gt;, 2013. &lt;a href=&quot;#fnref:lmr14&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:lmr13&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Seth Lloyd, Masoud Mohseni, Patrick Rebentrost. &lt;em&gt;Quantum algorithms for supervised and unsupervised machine learning&lt;/em&gt;. &lt;a href=&quot;https://arxiv.org/abs/1307.0411&quot;&gt;arXiv:1307.0411&lt;/a&gt;, 2013. &lt;a href=&quot;#fnref:lmr13&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:rsml16&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Patrick Rebentrost, Adrian Steffens, Iman Marvian, Seth Lloyd. &lt;em&gt;Quantum singular-value decomposition of nonsparse low-rank matrices&lt;/em&gt;. &lt;a href=&quot;https://arxiv.org/abs/1607.05404&quot;&gt;arXiv:1607.05404&lt;/a&gt;, 2016. &lt;a href=&quot;#fnref:rsml16&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:glt18&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;András Gilyén, Seth Lloyd, Ewin Tang. &lt;em&gt;Quantum-inspired low-rank stochastic regression with logarithmic dependence on the dimension&lt;/em&gt;. &lt;a href=&quot;https://arxiv.org/abs/1811.04909&quot;&gt;arXiv:1811.04909&lt;/a&gt;, 2018. &lt;a href=&quot;#fnref:glt18&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:clw18&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Nai-Hui Chia, Han-Hsuan Lin, Chunhao Wang. &lt;em&gt;Quantum-inspired sublinear classical algorithms for solving low-rank linear systems&lt;/em&gt;. &lt;a href=&quot;https://arxiv.org/abs/1811.04852&quot;&gt;arXiv:1811.04852&lt;/a&gt;, 2018. &lt;a href=&quot;#fnref:clw18&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:ac16&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Scott Aaronson and Lijie Chen. &lt;em&gt;Complexity-theoretic foundations of quantum supremacy experiments&lt;/em&gt;. &lt;a href=&quot;https://arxiv.org/abs/1612.05903&quot;&gt;arXiv:1612.05903&lt;/a&gt;, 2016. &lt;a href=&quot;#fnref:ac16&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:hhl08&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Aram W. Harrow, Avinatan Hassidim, Seth Lloyd. &lt;em&gt;Quantum algorithm for solving linear systems of equations&lt;/em&gt;. &lt;a href=&quot;https://arxiv.org/abs/0811.3171&quot;&gt;arXiv:0811.3171&lt;/a&gt;, 2008. &lt;a href=&quot;#fnref:hhl08&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:fkv04&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Alan Frieze, Ravindran Kannan, Santosh Vempala. &lt;em&gt;Fast monte-carlo algorithms for finding low-rank approximations&lt;/em&gt;. &lt;em&gt;Journal of the ACM&lt;/em&gt;, vol. 51, no. 6, 2004. &lt;a href=&quot;https://www.math.cmu.edu/~af1p/Texfiles/SVD.pdf&quot;&gt;Link&lt;/a&gt;. &lt;a href=&quot;#fnref:fkv04&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:kv17&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Ravindran Kannan and Santosh Vempala. &lt;em&gt;Randomized algorithms in numerical linear algebra&lt;/em&gt;. Acta Numerica 26, 2017. &lt;a href=&quot;https://www.cc.gatech.edu/~vempala/papers/acta_survey.pdf&quot;&gt;Link&lt;/a&gt;. &lt;a href=&quot;#fnref:kv17&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:woodruff14&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;David P. Woodruff. &lt;em&gt;Sketching as a tool for numerical linear algebra&lt;/em&gt;. Foundations and Trends in Theoretical Computer Science 10.1–2, 2014. &lt;a href=&quot;https://researcher.watson.ibm.com/researcher/files/us-dpwoodru/wNow.pdf&quot;&gt;Link&lt;/a&gt;. &lt;a href=&quot;#fnref:woodruff14&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;
</description>
        <pubDate>Mon, 28 Jan 2019 00:00:00 +0000</pubDate>
        <link>https://www.ewintang.com/blog/2019/01/28/an-overview-of-quantum-inspired-sampling/</link>
        <guid isPermaLink="true">https://www.ewintang.com/blog/2019/01/28/an-overview-of-quantum-inspired-sampling/</guid>
      </item>
    
  </channel>
</rss>
