Topological Quantum Color Code Model on Infinite Lattice

Our work, topological quantum color code model on infinite lattices, co-authored with Shiyu Cao and Sheng Tan, has been posted on arXiv1. In this study, we carry out a rigorous Doplicher–Roberts–Haag (DHR) analysis for the quantum color code model—a framework central to understanding the structure of topological order in infinite lattice.

It is known that the topological phases of the color code model are equivalent to those of the double-layer toric code model. In this work, we present a rigorous proof of this equivalence within the framework of quasi-local C*-algebras.

We classify the model’s irreducible anyon superselection sectors and construct explicit string operators that generate anyonic excitations from the ground state. We further examine the fusion and braiding properties of these excitations and show that the resulting category is equivalent to \mathsf{Rep}(D(\mathbb{Z}_2 \times \mathbb{Z}_2)). We also prove the Haag duality for the gound state of color code model.

  1. Since I have recently been moving and settling down in Changsha and have been somewhat nomadic, this post comes out a little bit late. ↩︎

Temporal Kirkwood-Dirac Quasiprobability Distribution and Unification of Temporal State Formalisms through Temporal Bloch Tomography

Recently, we post the paper “temporal kirkwood-dirac quasiprobability distribution and unification of temporal state formalisms through temporal bloch tomography” on arXiv. This is joint work with Kavan Modi and Dagomir Kaszlikowski, in which we extend the Kirkwood–Dirac (KD) quasiprobability distribution to the temporal setting and develop a unified framework for temporal states.

The traditional (spatial) KD quasiprobability distribution is defined for density operators, which serve as a phase-space–like representation of a quantum state (see this post for a brief introduction). In this work, we generalize the KD formalism to temporal quantum processes, which consist of an initial state together with a sequence of quantum evolutions, \mathfrak{P} = (\rho_{t_0}, \mathcal{E}_{t_1 \leftarrow t_0}, \ldots, \mathcal{E}_{t_n \leftarrow t_{n-1}}), where \rho_{t_0} is the initial state and each \mathcal{E}_{t_j \leftarrow t_{j-1}} denotes a completely positive trace-preserving (CPTP) map describing the evolution from t_{j-1} to t_j. The system state at an intermediate time t_k is then obtained recursively as \rho_{t_k} = \mathcal{E}_{t_k \leftarrow t_{k-1}} \circ \cdots \circ \mathcal{E}_{t_1 \leftarrow t_0}(\rho_{t_0}).

Within this framework, we define the multi-time doubled temporal KD quasiprobability distribution by

\overleftrightarrow{Q}_{KD}(a_n,\ldots,a_0; b_n,\ldots,b_0)

= \mathrm{Tr}[ \Pi_{a_n|A_n}^{t_n} \mathcal{E}_{t_n \leftarrow t_{n-1}}(\cdots \Pi_{a_1|A_1}^{t_1}( \mathcal{E}_{t_1 \leftarrow t_0}( \Pi_{a_0|A_0}^{t_0} \rho_{t_0} \Pi_{b_0|B_0}^{t_0})) \Pi_{b_1|B_1}^{t_1} \cdots) \Pi_{b_n|B_n}^{t_n}]

By taking the left and right marginals of this doubled distribution, we recover the corresponding temporal KD quasiprobability distributions \overleftarrow{Q}_{KD}(a_n,\ldots,a_0) and \overrightarrow{Q}_{KD}(b_n,\ldots,b_0). These two distributions are related by complex conjugation, reflecting the intrinsic temporal structure encoded in the doubled formalism. The temporal Margenau-Hill (MH) quasiprobability distribution is defined as real part of the temporal KD distribution.

The temporal KD quasiprobability may take values outside the interval [0,1]. This phenomenon reflects the intrinsically quantum nature of multi-time processes, in direct analogy with the negativity and nonclassical features of the spatial KD quasiprobability distribution. Using the Heisenberg picture, we provide a general and unified explanation for the origin of this quantumness.

A crucial application of temporal KD quasiprobability distribution is that it provides a general framework for temporal state.

There exist several formalisms of temporal states. In our paper, we show that using the temporal KD quasiprobability distribution, we can compute correlators of Pauli operators at different time steps. Then, by applying the Bloch representation of the state, we can derive several distinct temporal state formalisms. This provides a unified picture of temporal states.

For example, with doubled temporal KD distribution, we obtain the doubled KD temporal state:

\overleftrightarrow{\Upsilon}= \frac{1}{d^{n+1}}\sum_{\mu_0,\ldots,\mu_n \atop \nu_0,\ldots,\nu_n=0}^{d^2-1}\overleftrightarrow{T}^{\mu_n,\ldots,\mu_0;\,\nu_n,\ldots,\nu_0}\left(\bigotimes_{i=0}^n \sigma_{\mu_i}\right)\otimes\left(\bigotimes_{j=0}^n \sigma_{\nu_j}\right)

This is the most information-complete temporal state, and we show that it coincides with the doubled density operator. The temporal states obtained from the left or right temporal KD distribution correspond to the left and right reduced states of the doubled KD temporal state. Similarly, the left and right MH distributions coincide and give the left/right MH temporal states. For the two-time case, the left and right MH temporal states are identical to the pseudo-density operator; however, for general multi-time processes, they are no longer the same. In general, the MH temporal states can be understood as the Hermitianization of the corresponding KD temporal state.

Kirkwood-Dirac Quasiprobability Distribution

Recently, I have been working on several topics related to the Kirkwood–Dirac quasiprobability distribution (often abbreviated as the KD distribution), which has inspired me to write down some informal notes.

The KD distribution is a fascinating conceptual tool that sits at the boundary between classical probability theory and the strange, counterintuitive structure of quantum mechanics. It provides a way to represent quantum states that goes beyond classical statistics, allowing for features such as negativity and even complex values. These properties have no classical analogue, yet they encode genuinely quantum aspects of measurement, interference, and contextuality.

In this sense, the KD distribution offers a particularly transparent window into the nonclassical nature of quantum theory, making it an invaluable framework for both foundational studies and modern applications in quantum information.

What Are Quasiprobability Distributions in Quantum Mechanics?

Before turning to the KD distribution itself, it is helpful to place it within a broader conceptual landscape.

In classical physics, the state of a system can be described directly in phase space (concept developed in the late 19th century by Ludwig BoltzmannHenri Poincaré, and Josiah Willard Gibbs). A point (x,p) specifies the position and momentum of a particle, and our uncertainty about the system is encoded in a genuine probability density \omega(x,p). Such distributions are always non-negative real functions and satisfy a normalization condition, reflecting the familiar rules of classical probability theory. All classical observables are functions on phase space, F(x,p), and their expectation values are given by
\langle F \rangle = \int F(x,p)\,\omega(x,p)\,dx\,dp.

Quantum mechanics, however, adopts a very different language. The fundamental object is the wavefunction \psi(x) (or, more generally, a density operator), which does not represent a probability distribution in phase space. While |\psi(x)|^2 gives the probability density for position and |\psi(p)|^2 gives the probability density for momentum, there is no single function that simultaneously assigns probabilities to both position and momentum. This obstruction is not merely technical but fundamental, rooted in the noncommutativity of quantum observables and formalized by the Heisenberg uncertainty principle.

Nevertheless, it is natural to ask whether quantum states can still be represented in a phase-space-like manner, even if some classical features must be sacrificed. This motivation led to the development of quasiprobability distributions. These objects resemble classical probability distributions in form, but they relax one crucial requirement: they need not be non-negative, or even real-valued. Allowing for negative or complex values makes it possible to encode quantum interference, coherence, and other genuinely nonclassical effects within a phase-space framework.

Well-known examples include:

The Kirkwood–Dirac (KD) quasiprobability distribution belongs to this same family, but with an important distinction. It can be defined for arbitrary pairs of observables (in fact the Wigner function can also be extended in this way), not just position and momentum, and it is generically complex-valued. These features make the KD distribution especially powerful for studying quantum measurements, temporal correlations, and applications in quantum information processing.

Historical Background: From Kirkwood to Dirac

The KD distribution has its roots in the early days of quantum mechanics, when physicists were grappling with how to reconcile quantum phenomena with classical intuition.

  • John G. Kirkwood’s Contribution (1933): American physicist John Gamble Kirkwood first introduced the distribution in his paper Quantum Statistics of Almost Classical Assemblies. Kirkwood aimed to describe quantum systems that are nearly classical, using a phase-space representation. His version was essentially a complex-valued joint distribution for position and momentum.
  • Paul Dirac’s Independent Discovery (1945): The legendary physicist Paul A. M. Dirac independently rediscovered the distribution in his paper On the Analogy Between Classical and Quantum Mechanics. Dirac highlighted its role in illustrating analogies between classical and quantum theories, particularly for noncommuting operators.

The real part of KD distribution is called Margenau-Hill quasiprobability distribution, which was proposed by by H. Margenau and R. N. Hill in 1961 in their paper “Correlation between Measurements in Quantum Theory“.

For decades, the KD distribution remained relatively obscure, overshadowed by the Wigner function. However, in recent years—especially since the 2010s—it has gained renewed attention due to its applications in quantum information theory, where it serves as a powerful tool to quantify nonclassical resources such as coherence and contextuality. See the review paper “Properties and applications of the Kirkwood–Dirac distribution” by Arvidsson-Shukur et al for more details.

Mathematical Formulation: Defining the KD Distribution

Let’s dive into the mathematics. We’ll assume a basic familiarity with quantum mechanics notation (e.g., density operators \hat{\rho}, bras \langle \cdot |, kets | \cdot \rangle), but I will explain as we go.

Basic Definition for Two Observables

Consider a quantum system in a state described by the density operator \hat{\rho} (which may be pure or mixed). Let \hat{A} and \hat{B} be two Hermitian observables with eigenbases {|a_i\rangle} and {|b_j\rangle}, respectively, assuming a finite-dimensional Hilbert space of dimension d for simplicity.

The Kirkwood–Dirac (KD) quasiprobability distribution is defined as:

Q_{i,j}(\hat{\rho}) = \langle b_j | a_i \rangle \langle a_i | \hat{\rho} | b_j\rangle

Here:

  • \langle b_j | a_i \rangle is the overlap between the eigenvectors.
  • This expression is generally complex because the overlaps can introduce phases.

For projectors \hat{\Pi}_a = |a\rangle\langle a| and \hat{\Pi}_b =|b\rangle\langle b|, it can also be written as:

Q(a,b; \hat{\rho}) = \mathrm{Tr}(\hat{\Pi}_b \hat{\Pi}_a \hat{\rho})

In continuous variables (e.g., position x and momentum p):

Q(x, p; \hat{\rho}) = \langle p | x \rangle \langle x | \hat{\rho} | p \rangle

The real part of KD distribution is called Margenau-Hill quasiprobability distribution, which by definition is a real-valued quasiprobability distribution.

Generalizations

The KD distribution extends naturally to more general scenarios:

  • Multiple observables (k observables): Q_{i_1, \dots, i_k}(\hat{\rho}) = \mathrm{Tr} (\hat{\Pi}_{i_k}^{(k)} \cdots \hat{\Pi}^{(1)}_{i_1} \hat{\rho} )
  • Positive-operator-valued measures (POVMs) {\hat{M}^{(l)}_{i_l}}: Q_{i_1, \dots, i_k}(\hat{\rho}) = \mathrm{Tr} ( \hat{M}_{i_k}^{(k)} \cdots \hat{M}^{(1)}_{i_1} \hat{\rho} )

Key Mathematical Properties

  • Normalization:
    \sum_{i,j} Q_{i,j}(\hat{\rho}) = 1
  • Marginals: The sums over one index reproduce the standard quantum probabilities (Born rule):
    \sum_j Q_{i,j}(\hat{\rho}) = \langle a_i | \hat{\rho} | a_i \rangle, \quad \sum_i Q_{i,j}(\hat{\rho}) = \langle b_j | \hat{\rho} | b_j \rangle
    For multiple observable case, this property becomes Kolmogorov consistency condition.
  • State Reconstruction: Assuming all overlaps \langle a_i | b_j \rangle are non-zero, the density operator can be recovered from the KD distribution: \hat{\rho} = \sum_{i,j} Q_{i,j}(\hat{\rho}) \frac{|b_j\rangle\langle a_i|}{\langle a_i | b_j \rangle^*}

Properties: What Makes the KD Distribution Nonclassical?

The KD distribution is particularly powerful because it captures features of quantum states that have no classical analogue:

  • Complex Values and Negativity: Entries of the KD distribution can be negative (in their real part) or contain imaginary components. Negativity is a direct signature of quantum interference, while the imaginary parts reflect the disturbance caused by measurement.
  • Quantifying Nonclassicality: One common measure is the total non-positivity:
    N(Q) = \sum_{i,j} |Q_{i,j}| - 1 \ge 0,
    where N(Q) = 0 for states that behave classically with respect to the chosen observables.

Connections to Quantum Foundations:

  • Negativity in the KD distribution can serve as a witness of contextuality, showing that measurement outcomes cannot be explained by noncontextual hidden-variable models.
  • It signals the incompatibility of observables, reflecting the fundamentally noncommutative structure of quantum mechanics.
  • It is also related to violations of macrorealism, for example as quantified by Leggett–Garg inequalities.

Applications in Modern Quantum Research

  • Quantum Metrology: The imaginary part enables ultra-sensitive parameter estimation.
  • Weak Measurements: Negativity gives rise to anomalous weak values, allowing signal amplification.
  • Quantum Thermodynamics: Defines coherent work distributions and fluctuation theorems.
  • Quantum Information: Quantifies coherence, entanglement, and information scrambling via out-of-time-order correlators (OTOCs). It can also be used for direct quantum state measurement, requiring fewer resources than full state tomography.
  • Quantum Computing: The nonclassicality of the KD distribution is closely related to the computational power of magic-state injection schemes in quantum computation. (This connection has been extensively studied within the Wigner distribution framework, as it has been recognized that the advantage of quantum computation is intimately tied to quantum contextuality; both the KD distribution and the Wigner distribution are closely related to quantum contextuality.)

Kitaev Algorithm for Quantum Phase Estimation

In this post, we introduce the quantum phase estimation based on quantum Fourier transform. There is another way to implement quantum phase estimation via an interferometric scheme, usually called Kitaev algorithm.

As before, we assume that we can prepare an eigenstate |\psi\rangle of a unitary operator U,

U|\psi\rangle = e^{2\pi i \theta} |\psi\rangle ,

where we take \theta \in [-1/2,1/2) for convenience. Kitaev’s idea is to encode the phase information into the amplitudes of certain superposition states and extract \theta using the interferometric scheme.

Fig 1: interferometric scheme for cosine part.

The quantum circuit is shown in Figure 1.

Acting with a Hadamard gate on the control qubit gives

\frac{1}{\sqrt{2}}(|0\rangle|\psi\rangle + |1\rangle|\psi\rangle).

Applying the controlled-U gate yields

\frac{1}{\sqrt{2}}(|0\rangle|\psi\rangle + e^{2\pi i\theta}|1\rangle|\psi\rangle),

so the phase is now encoded in the relative amplitude. A final Hadamard gate transforms the state to

|\Phi_f^{c}\rangle = \frac{1}{\sqrt{2}}(|+\rangle|\psi\rangle + e^{2\pi i\theta}|-\rangle|\psi\rangle)

= \frac{1+e^{2\pi i \theta}}{2}|0\rangle|\psi\rangle +  \frac{1-e^{2\pi i\theta}}{2}|1\rangle|\psi\rangle

Thus the measurement probabilities are

p(0) = \left|\frac{1+e^{2\pi i\theta}}{2}\right|^2 = \frac{1+\cos(2\pi\theta)}{2}

p(1) = \left|\frac{1-e^{2\pi i\theta}}{2}\right|^2= \frac{1-\cos(2\pi\theta)}{2}

These measurements allow us to extract the cosine of the phase,

\theta = \frac{1}{2\pi}\cos^{-1}(2p(0)-1)

or equivalently,

\theta = \frac{1}{2\pi}\cos^{-1}(1-2p(1))

However, we know that \cos(2\pi \theta) cannot distinguish between
\theta and -\theta.
It is therefore natural to incorporate the sine component \sin(2\pi\theta) to resolve this ambiguity.
By first determining the sign of \theta from the sine measurement, we can then output the final estimate as

\theta = \pm \frac{1}{2\pi}\cos^{-1}(2p(0)-1),

where the sign is chosen according to whether \sin(2\pi\theta) is positive or negative.

Fig 2. interferometric scheme for sine part.

To obtain the sine component, we slightly modify the circuit by inserting a phase gate S on the control qubit prior to the controlled-U operation, see Figure 2.

A parallel computation yields

|\Phi_f^{s}\rangle = \frac{1+ie^{2\pi i\theta}}{2}|0\rangle|\psi\rangle+\frac{1-ie^{2\pi i\theta}}{2}|1\rangle|\psi\rangle.

The measurement probabilities are now

p(0) = \left|\frac{1+ie^{2\pi i\theta}}{2}\right|^2= \frac{1-\sin(2\pi\theta)}{2}, \quad p(1) = \left|\frac{1-ie^{2\pi i\theta}}{2}\right|^2= \frac{1+\sin(2\pi\theta)}{2}

Therefore,

\theta = \frac{1}{2\pi}\sin^{-1}(1-2p(0))

or equivalently,

\theta = \frac{1}{2\pi}\sin^{-1}(2p(1)-1)

with \sin^{-1} taking values in [-\pi,\pi).

Combining the estimates of \cos(2\pi\theta) and \sin(2\pi\theta) allows us to reconstruct \theta uniquely in the range [-1/2,1/2).

Shor’s Algorithm

Shor’s algorithm, introduced by Peter Shor in 1994 (algorithms for quantum computation: discrete logarithms and factoring), is the breakthrough quantum algorithm that factors large integers in polynomial time, a task widely believed to be intractable for classical computers. Given an integer N, the algorithm finds its prime factors in O\left((\log N)^2 (\log \log N)(\log \log \log N)\right) time on a quantum computer. This is exponentially faster than the best classical algorithms, such as the general number field sieve, which runs in sub-exponential time O\left(e^{c (\log N)^{1/3} (\log \log N)^{2/3}}\right), where c = \frac{1}{3}(92 + 26\sqrt{13})^{1/3}.

Peter Shor, Photo from this link.

The discovery of Shor’s algorithm showed that a large quantum computer would be capable of breaking widely used public-key cryptosystems, most notably RSA, whose security relies on the assumed hardness of integer factorization. This result initiated both the global effort to build scalable quantum computers and the rapid development of post-quantum cryptography, which seeks cryptographic protocols secure even against quantum attackers.

RSA Cryptosystem

Since Shor’s algorithm was primarily designed to break the RSA public-key cryptosystem. For completeness, we first review the core principles of RSA. The RSA cryptosystem, along with its variants, is one of the most widely deployed methods for securely transmitting messages over public channels, such as the Internet, where it protects confidentiality and enables digital signatures. Its security relies fundamentally on a single, long-standing computational assumption: that factoring a large composite integer into its prime factors is extraordinarily difficult for classical computers.

It is worth emphasizing that testing whether a number is prime is computationally feasible, whereas factoring a large composite number into its prime factors remains extraordinarily difficult. RSA cryptography leverages precisely this asymmetry: while anyone can efficiently verify the primality of the factors used in key generation, recovering those factors from the public modulus is, for classical computers, practically intractable. This one-way computational difficulty forms the cornerstone of RSA’s security, enabling secure encryption and decryption of messages.

The RSA cryptosystem operates as follows:

  1. Alice prepares two very large prime numbers p and q, which she keeps secret. She computes N = pq and publishes N. She also chooses a number, called the public exponent (or encoding exponent), e < N, which is required to be relatively prime to (p-1)(q-1), that is, \mathbf{gcd}(e,(p-1)(q-1)) = 1. Alice then computes the multiplicative inverse of e modulo (p-1)(q-1), namely a number d satisfying

de \equiv 1 \pmod{(p-1)(q-1)}

She keeps d secret, thus called the secret exponent (or decoding exponent).

  1. When Bob wants to send a message to Alice, he first encodes the message as an integer m with m < N. He encrypts it as

M = m^{e} \pmod{N}

  1. Upon receiving M, Alice decrypts it using

m \equiv M^{d} \pmod{N}

where d is the multiplicative inverse of the public exponent e modulo (p-1)(q-1).

To see that the RSA cryptosystem correctly recovers the message m using m \equiv M^{d} \pmod{N}, we need to check

m \equiv (m^e)^{d} \pmod{N}.

From de \equiv 1 \pmod{(p-1)(q-1)}, there exists an integer k such that de = k (p-1)(q-1) + 1. Hence

(m^e)^{d} = m \bigl[ m^{k(q-1)} \bigr]^{p-1}.

Suppose m is not a multiple of p. Fermat’s little theorem gives

\bigl[ m^{k(q-1)} \bigr]^{p-1} \equiv 1 \pmod{p}.

If m is a multiple of p, then m^{de} \equiv 0 \pmod{p}.

Similarly,

\bigl[ m^{k(p-1)} \bigr]^{q-1} \equiv 1 \pmod{q}

when m is not a multiple of q, and m^{de} \equiv 0 \pmod{q} if m is a multiple of q.

We now verify m^{de} \equiv m \pmod{N} by considering several cases.

  1. If m is a multiple of both p and q, then it is automatically a multiple of N = pq, and m^{de} \equiv m \pmod{N} holds trivially.
  2. If m is a multiple of p but not of q, then

m \equiv 0 \pmod{p}, \quad m^{de} \equiv 0 \pmod{p},

and

\bigl[m^{k(p-1)}\bigr]^{q-1} \equiv 1 \pmod{q}.

Hence,

m^{de} \equiv m \pmod{q}, \quad m^{de} \equiv m \pmod{p}.

By the Chinese remainder theorem{}^1,

m^{de} \equiv m \pmod{N}.

We are using a basic consequence of the Chinese remainder theorem: if two integers a and b satisfy a \equiv b \pmod{p} and a \equiv b \pmod{q}, with \gcd(p,q) = 1, then a \equiv b \pmod{pq}. In particular, if p and q are prime, it is easier to see p \mid (a-b) and q \mid (a-b) implies pq \mid (a-b).

  1. If m is a multiple of q but not of p, the argument is symmetric, giving

m^{de} \equiv m \pmod{N}.

  1. If m is not a multiple of either p or q, then Fermat’s little theorem gives

m^{de} \equiv m \pmod{p}, \quad m^{de} \equiv m \pmod{q}.

Again, by the Chinese remainder theorem,

m^{de} \equiv m \pmod{N}.

This completes the proof of m^{de} \equiv m \pmod{N}.

We see from the above analysis that if one could factor N efficiently, then the RSA system would be completely compromised. Indeed, factoring N reveals the primes p and q, from which one can compute (p-1)(q-1), determine the public exponent e, and then recover the secret exponent d. With d in hand, any encrypted message can be decrypted.

Shor’s Algorithm

Definition (The Factoring Problem)
The input to Shor’s algorithm is an odd composite integer N = p q, where p and q are distinct large primes. The goal is to find the prime factors p and q.

Shor’s algorithm is a hybrid quantum-classical algorithm that factors N in polynomial time on a quantum computer. It consists of two main phases:

  • Classical reduction: Transform factoring N into an instance of the order-finding problem (see this post).
  • Quantum order-finding: Solve the order-finding problem efficiently using a quantum computer.

The classical reduction (which succeeds with probability > 1/2 for random choices) is as follows:

1. Choose a random integer a \in {2, 3, \dots, N-1} and compute d = \gcd(a, N):

  • If 1 < d < N, then d is a non-trivial factor of N. Factoring complete.
  • If d = 1, continue.

2. Find the multiplicative order r of a modulo N, i.e., the smallest positive integer r such that
a^r \equiv 1 \pmod{N}.

3. If r is odd, return to the first step with a new a.

4. If r is even, compute b = a^{r/2} \mod N:

  • If b \equiv -1 \pmod{N}, return to the first step.
  • If b \not\equiv \pm 1 \pmod{N}, proceed.

5. Compute the factors:
p = \gcd(b - 1, N), q = \gcd(b + 1, N).
At least one of these is a non-trivial factor of N.

The only step that is computationally infeasible on a classical computer is step 2: finding the order r.
Shor’s breakthrough contribution is a quantum algorithm that performs this order-finding step in polynomial time using quantum phase estimation on the unitary operator x \mapsto a x \mod N.

Once r is obtained quantumly, the remaining classical post-processing immediately yields the prime factors p and q. Thus, on a sufficiently large fault-tolerant quantum computer, Shor’s algorithm solves the integer factorization problem efficiently, rendering RSA and related cryptosystems insecure.

Remark (Why the Reduction Works)

Assume that r is even and that a^{r/2} \not\equiv -1 \pmod{N} (the cases explicitly filtered out by the algorithm). Since r is the order of a modulo N, we have
a^r \equiv 1 \pmod{N}.
This implies
a^r - 1 = (a^{r/2} - 1)(a^{r/2} + 1) \equiv 0 \pmod{N},
so N \mid (a^{r/2} - 1)(a^{r/2} + 1), or equivalently,
pq \mid (a^{r/2} - 1)(a^{r/2} + 1).

Because p and q are distinct primes, the prime factors of the product (a^{r/2}-1)(a^{r/2}+1) can be distributed in only four logical ways:

  1. p divides a^{r/2}-1 and q divides a^{r/2}+1
  2. p divides a^{r/2}+1 and q divides a^{r/2}-1
  3. pq divides a^{r/2}-1 (i.e., N divides a^{r/2}-1)
  4. pq divides a^{r/2}+1 (i.e., N divides a^{r/2}+1)

Cases 1 and 2 are precisely what we want: they yield non-trivial factors via the gcd computations
p = \gcd(a^{r/2}-1, N) and q = \gcd(a^{r/2}+1, N) (or vice versa).
Cases 3 and 4, however, must be excluded:

  • Case 3 is impossible: If N divides a^{r/2}-1, then a^{r/2} \equiv 1 \pmod{N}.
    But r is the order of a modulo N, so r would have to divide r/2, which is impossible unless r = 0, a contradiction.
  • Case 4 is impossible under our assumption: If N divides a^{r/2}+1, then a^{r/2} \equiv -1 \pmod{N}.
    This situation is explicitly detected and rejected in the algorithm (step 5).

Thus, whenever the algorithm reaches step 6, we are guaranteed to be in Case 1 or Case 2, and the two gcd operations recover the prime factors p and q of N.

Example

Let us illustrate Shor’s algorithm with the integer N = 15.

  1. Choose a random number: Let a = 7. Compute \gcd(7, 15) = 1.
    Since \gcd(7, 15) = 1, we proceed to the next step.
  2. Find the period r: Using a quantum computer, we determine the period of the function
    f(x) = 7^x \bmod 15.
    The result is r = 4, since 7^4 \equiv 1 \pmod{15}.
  3. Use the period to find the factors:
    • Since r = 4 is even, compute 7^{r/2} - 1 = 7^2 - 1 = 48.
    • Compute the gcd: \gcd(48, 15) = 3, which is one factor of 15.
    • Similarly, 7^{r/2} + 1 = 7^2 + 1 = 50, and \gcd(50, 15) = 5, which is the other factor.

Thus, we have successfully factored 15 = 3 \times 5.

Grover’s Search Algorithm

Grover’s search algorithm, also known as the quantum search algorithm, is a cornerstone of quantum computation, offering a quadratic speedup for unstructured search problems. It was introduced by Lov Kumar Grover in 1996 in his paper A fast quantum mechanical algorithm for database search [arXiv:quant-ph/9605043]. Whereas classical algorithms require O(N) queries to search an unsorted database of N elements, Grover’s algorithm achieves the same task with only O(\sqrt{N}) queries by exploiting quantum parallelism and amplitude amplification.

Fig. 1. Lov Kumar Grover, pioneer of algorithms for quantum computing. Photo is from this link.

Searching an unstructured database is a fundamental task in many practical applications, such as finding the lowest-priced product, determining the shortest path to a destination via transfers, and solving other optimization problems. In this post, we begin with the simplest case of a single marked item and later consider the general case with multiple marked items. The core of Grover’s algorithm is the Grover iteration. The algorithm’s precision depends on the ratio M/N, where M is the number of marked items and N is the total number of items.

Definition.(Unstructured search problem) The goal of Grover’s algorithm is to locate the marked element in an unsorted database. Given an oracle function

f : \{0, 1\}^n \rightarrow \{0, 1\},

where f(x) = 1 for the marked item x' and f(x') = 0 for all other items$, the task is to find the marked elements x' using fewer oracle queries than would be required classically.

Quantum oracle

To solve this problem quantumly, we also need to introduce a quantum oracle.

Recall that the classical oracle is a function f : \{0, 1\}^n \rightarrow \{0, 1\}, with f(x) = 1 for the marked item x_{\text{marked}} and f(x) = 0 for all other items.

We can generalize this to a quantum oracle U_f, which is a unitary operation that flips the sign of the amplitude of the marked state:

U_f |x\rangle = (-1)^{f(x)} |x\rangle,

where |x\rangle = |x_1\rangle |x_2\rangle \cdots |x_n\rangle corresponds to the binary string x.

This construction is the same as the oracles encountered in the Deutsch–Jozsa and Bernstein–Vazirani algorithms. By introducing an ancillary qubit initialized in the state |-\rangle, we define the oracle

\tilde{U}_f |x\rangle |y\rangle = |x\rangle |y \oplus f(x)\rangle,

so that

\tilde{U}_f |x\rangle (|0\rangle - |1\rangle) = (-1)^{f(x)} |x\rangle (|0\rangle - |1\rangle) = (U_f |x\rangle) (|0\rangle - |1\rangle).

Dropping the ancillary qubit then recovers U_f |x\rangle = (-1)^{f(x)} |x\rangle.

In particular, for the marked state x', f(x') = 1, so its phase is flipped: U_f |x'\rangle = - |x'\rangle. For all unmarked elements, f(x) = 0, so their amplitudes remain unchanged.

Grover iteration

The algorithm proceeds by repeatedly applying a procedure known as the Grover iteration or Grover operator to amplify the amplitude of the marked state. As we shall see, each Grover iteration corresponds to a rotation within a two-dimensional plane. The rotation angle at each step depends on the ratio M/N between the number of marked items and the total number of items.

Suppose the search space contains N=2^n items, of which M are marked. Define the states

|\alpha\rangle = \frac{1}{\sqrt{N-M}} \sum_{x: \text{ unmarked}} |x\rangle,

representing an equal superposition of all unmarked items, and

|\beta\rangle = \frac{1}{\sqrt{M}} \sum_{x: \text{ marked}} |x\rangle,

representing an equal superposition of all marked items.

The uniform superposition over all items,

|\psi\rangle = \frac{1}{\sqrt{N}} \sum_x |x\rangle = \frac{1}{\sqrt{2^n}} \sum_{x\in {0,1}^n} |x\rangle = H^{\otimes n} |0\rangle^{\otimes n},

can be expressed as a superposition of |\alpha\rangle and |\beta\rangle:

|\psi\rangle = \frac{\sqrt{N-M}}{\sqrt{N}} |\alpha\rangle + \frac{\sqrt{M}}{\sqrt{N}} |\beta\rangle.

Fig. 2. Illustration of a Grover iteration. Each iteration rotates the state vector in the two-dimensional plane spanned by |\alpha\rangle and |\beta\rangle, gradually increasing the amplitude of the marked state. Notice that for Grover’s search, the initial state is the uniform superposition over all items |\psi\rangle.

We assume that M/N is small. In this regime, the coefficient c_{\alpha} = \frac{\sqrt{N-M}}{\sqrt{N}} corresponding to the unmarked states is much larger than the coefficient c_{\beta} = \frac{\sqrt{M}}{\sqrt{N}} associated with the marked states. Consequently, the initial state |\psi\rangle is predominantly aligned along |\alpha\rangle, and the Grover iteration rotates the state vector slowly toward |\beta\rangle, allowing gradual amplitude amplification of the marked state.

It is convenient to introduce an angle \theta defined by

\cos \frac{\theta}{2} = \frac{\sqrt{N-M}}{\sqrt{N}}, \qquad \sin \frac{\theta}{2} = \frac{\sqrt{M}}{\sqrt{N}}.

This angle quantifies the initial projection of |\psi\rangle onto the marked subspace and will be useful for understanding Grover iteration geometrically.

Each Grover iteration consists of two main steps:

  1. Oracle Query (Phase Flip): Apply the oracle U_f, which flips the phase of the marked state:
    U_f |x\rangle = (-1)^{f(x)} |x\rangle. Geometrically, this can be viewed as a reflection about the state |\alpha\rangle. For any state in the two-dimensional subspace spanned by |\alpha\rangle and |\beta\rangle,
    |\varphi\rangle = a |\alpha\rangle + b |\beta\rangle,
    the action of the oracle gives
    |\varphi'\rangle = U_f |\varphi\rangle = a |\alpha\rangle - b |\beta\rangle,
    showing that the |\beta\rangle component is reflected.
  2. Amplitude Amplification (Grover Diffusion Operator): Apply the diffusion operator
    D = 2 |\psi\rangle \langle \psi| - I,
    where |\psi\rangle is the uniform superposition over all items as defined earlier. This operation can be interpreted as a reflection about |\psi\rangle in the plane spanned by |\alpha\rangle and |\beta\rangle. To see this, define
    |\psi^\perp\rangle = -c_\beta |\alpha\rangle + c_\alpha |\beta\rangle,
    where c_\alpha and c_\beta are the coefficients of |\psi\rangle in the |\alpha\rangle, |\beta\rangle basis, so that |\psi^\perp\rangle is orthogonal to |\psi\rangle in the plane. Any state in this plane can be expressed as
    |\chi\rangle = a |\psi\rangle + b |\psi^\perp\rangle. Acting with D yields
    |\chi'\rangle = D |\chi\rangle = a |\psi\rangle - b |\psi^\perp\rangle,
    confirming that D performs a reflection about |\psi\rangle.

The combined Grover iteration operator is defined as

G = D U_f = (2 |\psi\rangle\langle \psi| - I)\, U_f.

Within the two-dimensional subspace spanned by |\alpha\rangle and |\beta\rangle, consider an initial state

|\Phi\rangle = \cos\frac{\theta}{2}\, |\alpha\rangle + \sin\frac{\theta}{2}\, |\beta\rangle,

whose angle from |\alpha\rangle is \theta/2. Applying the oracle U_f first reflects the state about |\alpha\rangle, and subsequently applying D reflects the resulting state about |\psi\rangle. The composition of these two reflections produces a net rotation by an angle 3\theta/2 toward |\beta\rangle.

In Grover’s search, the initial state is the uniform superposition |\psi\rangle, which lies at an angle \theta/2 from |\alpha\rangle. Each application of G therefore increases this angle by \theta, gradually amplifying the amplitude of |\beta\rangle and increasing the probability of measuring the marked state.

Exercise. Verify that G |\psi\rangle performs the rotation illustrated in Fig. 2. Show explicitly that the Grover iteration acts as a rotation matrix in the basis {|\alpha\rangle, |\beta\rangle}.

From Fig. 2, it is clear that \theta becomes small when the ratio M/N is small. Consequently, each Grover iteration corresponds to a small but precise rotation, making the algorithm particularly powerful for large databases with relatively few marked items.

Grover’s Search Algorithm

The detailed procedure of Grover’s search algorithm is given below.

1. Initialization

  • Prepare n qubits in the state |0\rangle^{\otimes n}.
  • Apply the Hadamard transform H^{\otimes n} to obtain the equal superposition
    |\psi\rangle = H^{\otimes n} |0\rangle^{\otimes n} = \frac{1}{\sqrt{N}} \sum_{x=0}^{N-1} |x\rangle.

2. Grover iterations

Apply the Grover iteration operator G for O(\sqrt{N}) rounds. Each round consists of:

  • Oracle U_f, flipping the phase of the marked state.
  • Diffusion operator D, performing amplitude amplification.

3. Measurement

After approximately O(\sqrt{N}) iterations, the amplitude of the marked state is close to one. Measuring the qubits yields marked item x' with high probability.

Fig. 3. Quantum circuit for Grover’s search algorithm.

To analyze the behavior of the algorithm, let us assume for simplicity that there is exactly one marked item, M=1. Initially, all basis states have amplitude \frac{1}{\sqrt{N}}. After applying the oracle and the diffusion operator, the amplitude of the marked state increases, while the amplitudes of unmarked states decrease.

Each Grover iteration acts as a rotation in the two-dimensional subspace spanned by the unmarked-state superposition |\alpha\rangle and the marked state |\beta\rangle. The rotation angle is \theta \approx 2 \sin^{-1}\left( \frac{1}{\sqrt{N}} \right). After k iterations, the state becomes
G^k |\psi\rangle = \cos\left( \frac{2k+1}{2}\theta \right)|\alpha\rangle + \sin\left( \frac{2k+1}{2}\theta \right)|\beta\rangle.

Thus, repeated applications of G rotate the initial state toward the marked state. The optimal number of iterations is
k_{\mathrm{opt}} \approx \frac{\pi}{4}\sqrt{N},
which maximizes the probability of obtaining |\beta\rangle upon measurement. Applying G more times causes an overshoot and reduces the success probability.

Bernstein–Vazirani Algorithm

The Bernstein–Vazirani algorithm is one of the earliest and clearest examples of a quantum algorithm that significantly outperforms any classical counterpart for a well-defined problem. It can be viewed as an application of the Deutsch-Jozsa algorithm. The algorithm was first introduced by Ethan Bernstein and Umesh Vazirani in their 1993 paper and later discussed again in their 1997 paper. The algorithm treats the following problem:

Fig 1. Umesh Vazirani (photo from this talk). I could not find any photo of Ethan Bernstein online; if you come across one, please email me.

Definition (Bernstein–Vazirani problem): Let f be a function from n-bit strings to a single bit, f:\{0,1\}^n \to \{0,1\}, such that there exists a secret bit string s \in \{0,1\}^n with the property that, for every input x \in \{0,1\}^n, f(x) = x \cdot s, where the dot product is defined by x \cdot s = x_1 s_1 \oplus x_2 s_2 \oplus \cdots \oplus x_n s_n, with x = x_1 x_2 \cdots x_n and s = s_1 s_2 \cdots s_n. The task is to determine the secret string s using as few queries to f as possible.

Exercise. 1. For the 3-bit string s = 011, compute x \cdot s for all x \in \{0,1\}^3. As an example, take x = 001. Then x \cdot s = (0 \times 0) \oplus (0 \times 1) \oplus (1 \times 1) = 1 The function defined by f(x) := x \cdot s is an example of a function satisfying the Bernstein–Vazirani property.

2. Show that the majority function
\mathrm{maj}(x) = \begin{cases} 1, \text{ if majority of bits in } x \text{ are 1} \\ 0, \text{ otherwise} \end{cases}
is a counterexample: it does not satisfy the Bernstein–Vazirani property.

Hint: Any function f(x) = x \cdot s is linear over \mathbf{F}_2, i.e. it obeys

f(x \oplus y) = f(x) \oplus f(y) \quad \text{for all } x, y.

To disprove that \mathrm{maj} has the Bernstein–Vazirani form, find explicit x, y for which

\mathrm{maj}(x \oplus y) \neq \mathrm{maj}(x) \oplus \mathrm{maj}(y).

For instance, take x = 110 and y = 011. Then x \oplus y = 101, while

\mathrm{maj}(110) = 1, \quad \mathrm{maj}(011) = 1, \quad \mathrm{maj}(101) = 1,

so \mathrm{maj}(110) \oplus \mathrm{maj}(011) = 0 \neq 1 = \mathrm{maj}(101), showing the required contradiction. ∎

Classically, we can solve the problem by considering a complete basis of bit strings:

10\cdots 0, \quad 010\cdots 0, \quad \dots, \quad 0\cdots 01.

Fig 2. Quantum circuit for the Bernstein-Vazirani algorithm.

If we query the function f(x) := x \cdot s on these strings, we obtain

10\cdots 0 \cdot s = s_1,

010\cdots 0 \cdot s = s_2,

\vdots

0\cdots 01 \cdot s = s_n.

From this, it is clear that the secret string s can be determined with n queries. Because the n bits of s are independent, there is no way to do better classically. Remarkably, a quantum computer allows us to solve the same problem with far fewer queries.

From the expression
H^{\otimes n} |x\rangle = \frac{1}{\sqrt{2^n}} \sum_{z \in \{0,1\}^n} (-1)^{x \cdot z} |z\rangle,
we observe that an inner product appears in the phase factor (-1)^{x \cdot z}. If we can prepare the corresponding state, applying Hadamard gates directly yields s. This state naturally appears just before the final Hadamard transforms in the Deutsch–Jozsa algorithm(see this post), so the same circuit can be used to solve the Bernstein–Vazirani problem with only a single query.

To implement this, we prepare an n-qubit input register in the state |0\rangle^{\otimes n} and a single auxiliary qubit in the state |1\rangle. Applying Hadamard gates to all n+1 qubits yields

|\psi\rangle = \frac{1}{\sqrt{2^n}} \sum_{x \in {0,1}^n} |x\rangle \otimes \frac{1}{\sqrt{2}} \bigl(|0\rangle - |1\rangle\bigr)

We then apply the quantum oracle U_f encoding f:

U_f \, |x\rangle |y\rangle = |x\rangle \, |y \oplus f(x)\rangle

After acting on |\psi\rangle, the combined state becomes

\frac{1}{\sqrt{2^n}} \sum_{x \in {0,1}^n} (-1)^{f(x)} |x\rangle \otimes \frac{1}{\sqrt{2}} \bigl(|0\rangle - |1\rangle\bigr)

This has been carefully discussed in Deutsch–Jozsa algorithm(see this post).

On the input register, this action is equivalently described by the operator Q_f:

Q_f \frac{1}{\sqrt{2^n}} \sum_{x \in {0,1}^n} |x\rangle = \frac{1}{\sqrt{2^n}} \sum_{x \in {0,1}^n} (-1)^{f(x)} |x\rangle

Recall that f(x) := x \cdot s, then apply quantum oracle Q_f to get

\frac{1}{\sqrt{2^n}} \sum_{x \in {0,1}^n} (-1)^{x \cdot s} |x\rangle

Then applying Hadamard gates H^{\otimes n} will give |s\rangle. By measuring the final state, we can read out s.

The quantum circuit of the Bernstein-Vazirani algorithm is completely the same as that for Deutsch-Jozsa. A simpler version is given in Figure 2 based on Q_f.

Deutsch-Jozsa Algorithm

The Deutsch-Jozsa algorithm, introduced by David Deutsch and Richard Jozsa in their 1992 paper Rapid solution of problems by quantum computation, marked an important early milestone in the development of quantum algorithms. The formulation familiar to us today—featuring the standard quantum circuit model and oracle—was later refined in 1998 by Richard Cleve, Artur Ekert, Chiara Macchiavello, and Michele Mosca in their paper Quantum algorithms revisited. Both papers are highly readable, and we encourage interested readers to consult them directly.

Fig.1 David Deutsch(Photo Credits: Lulie Tanett, see this link) and Richard Jozsa (Photo from this link)

As a deterministic quantum algorithm, the Deutsch-Jozsa algorithm has limited direct practical applications. Its significance lies in demonstrating, for the first time, that quantum computation can achieve an exponential speedup over all deterministic classical algorithms. Studying this algorithm offers valuable insight into how quantum algorithms are structured and why they can outperform classical methods.

It is worth noting that this exponential separation holds only when compared with deterministic classical algorithms. If classical randomness is allowed, such an exponential advantage no longer appears. The first quantum algorithm to establish an exponential speedup over any classical stochastic algorithm is Simon’s algorithm, see this post.

Definition (Problem of the Deutsch-Jozsa Algorithm). Consider a function

f: \{0,1\}^n \to \{0,1\}

that accepts n-bit binary inputs and outputs either 0 or 1. It is guaranteed that f is either constant (producing the same output for all inputs) or balanced (outputting 1 for exactly half of the inputs and 0 for the other half). The objective is to determine whether f is constant or balanced by querying the oracle.

For example, consider a one-bit input function f: \{0,1\} \to \{0,1\}. The function is constant if f(0) = f(1), and balanced if f(0) \neq f(1) (equivalently, f(0) \oplus f(1) = 1, where \oplus denotes addition modulo 2). Determining whether f is constant or balanced is thus equivalent to evaluating f(0) \oplus f(1). This can also be viewed as a quantum XOR game: if the result is 0, then f is constant; otherwise, f is balanced.

Deutsch-Jozsa Algorithm: n=1 Case

Let us begin with the simplest example, namely the n=1 case of the Deutsch-Jozsa algorithm, to gain some intuition about how quantum algorithms work.

Classically, we would need to evaluate f twice to solve the problem, namely compute f(0) and f(1). There is no way around this. Quantum mechanically, we can do better. To begin, we need to “promote” the function f to a unitary operation U_f, defined by

U_f |x\rangle |y\rangle = |x\rangle |y \oplus f(x)\rangle

We do not concern ourselves here with how to implement U_f in an actual quantum circuit. Instead, we treat it as a black box that can be used freely. Such a black box is called a quantum oracle, and we will encounter this concept frequently in many other quantum algorithms, like Simon’s algorithm, Grover search algorith, Shor’s algorithm, etc.

Exercise. As a simple practice, prove that U_f defined above is a unitary operator for both balanced and constant functions f.

Proof. Consider the action of U_f on the four computational basis states:

|0\rangle |0\rangle \mapsto |0\rangle |0 \oplus f(0)\rangle

|0\rangle |1\rangle \mapsto |0\rangle |1 \oplus f(0)\rangle

|1\rangle |0\rangle \mapsto |1\rangle |0 \oplus f(1)\rangle

|1\rangle |1\rangle \mapsto |1\rangle |1 \oplus f(1)\rangle

Notice that 0 \oplus f(0) \neq 1 \oplus f(0) regardless of the value of f(0), and similarly 0 \oplus f(1) \neq 1 \oplus f(1).

Hence, the images of the basis states under U_f form an orthonormal basis, which shows that U_f is indeed unitary. ∎

With the quantum oracle U_f, the Deutsch-Jozsa algorithm works as follows.
We prepare two qubits: one for the input of f and one for the output. They are initialized in the state |0\rangle |1\rangle. Then we perform the following steps:

  1. Apply Hadamard gates to both qubits: (H \otimes H) \, |0\rangle |1\rangle.
  2. Apply the oracle: (U_f \otimes I) \big( (H \otimes H) \, |0\rangle |1\rangle \big).
  3. Apply a Hadamard gate to the first qubit: (H \otimes I) \, (U_f \otimes I) \big( (H \otimes H) \, |0\rangle |1\rangle \big).
  4. Measure the first qubit in the computational basis.

Let us examine what happens step by step. After steps 1 and 2, we obtain

|0\rangle |1\rangle \overset{H\otimes H}{\longrightarrow} |+\rangle |-\rangle = \frac{1}{2} \Big( |0\rangle (|0\rangle - |1\rangle) + |1\rangle (|0\rangle - |1\rangle) \Big)

\overset{U_f\otimes I}{\longrightarrow} \frac{1}{2}\Big( |0\rangle \big( |0 \oplus f(0)\rangle - |1 \oplus f(0)\rangle \big) + |1\rangle \big( |0 \oplus f(1)\rangle - |1 \oplus f(1)\rangle \big) \Big).

Next, we use a simple but important trick: we rewrite the action of the oracle so that the value of f(x) appears as a phase. The key identity is

|0 \oplus z\rangle - |1 \oplus z\rangle = (-1)^z \big( |0\rangle - |1\rangle \big),

valid for z = 0,1. The state after applying the oracle can be rewritten as

\frac{1}{2} \Big( |0\rangle (|0 \oplus f(0)\rangle - |1 \oplus f(0)\rangle) + |1\rangle (|0 \oplus f(1)\rangle - |1 \oplus f(1)\rangle) \Big)

= \frac{1}{2} \Big( (-1)^{f(0)} |0\rangle (|0\rangle - |1\rangle) + (-1)^{f(1)} |1\rangle (|0\rangle - |1\rangle) \Big)

= (-1)^{f(0)} \frac{1}{2} \Big( |0\rangle + (-1)^{f(0)\oplus f(1)} |1\rangle \Big) (|0\rangle - |1\rangle).

Ignoring the second qubit and the global phase, the first qubit is in the state

\frac{1}{\sqrt{2}} \Big( |0\rangle + (-1)^{f(0) \oplus f(1)} |1\rangle \Big).

Fig.2 Quantum circuit for Deutsch–Jozsa algorithm (n=1 case).

Our goal is to determine f(0) \oplus f(1), which is encoded in the relative phase (-1)^{f(0) \oplus f(1)} of the first qubit. To read out this information, we apply a Hadamard gate to the first qubit and then measure it in the computational basis. This is an example of an interferometric measurement scheme, a standard method for extracting the coefficient of a state in a given basis; you will encounter this technique in many other contexts as well.

After applying the Hadamard, the state of the first qubit becomes

\frac{1}{2} \Big( |0\rangle + |1\rangle + (-1)^{f(0)\oplus f(1)} |0\rangle - (-1)^{f(0)\oplus f(1)} |1\rangle \Big)

= \frac{1}{2} \Big( \big(1 + (-1)^{f(0)\oplus f(1)}\big) |0\rangle + \big(1 - (-1)^{f(0)\oplus f(1)}\big) |1\rangle \Big).

The probabilities of measuring the first qubit in the computational basis are then

p(0) = \frac{1}{4} \big| 1 + (-1)^{f(0) \oplus f(1)} \big|^2,
p(1) = \frac{1}{4} \big| 1 - (-1)^{f(0) \oplus f(1)} \big|^2.

Hence, if f(0)\oplus f(1) = 0 (the function is constant), we will certainly measure |0\rangle.
If f(0)\oplus f(1) = 1 (the function is balanced), we will measure |1\rangle.

In other words, a single query to the quantum oracle suffices to determine with certainty whether f is constant or balanced. In contrast, a classical algorithm would require evaluating f twice.

The key mechanism at work here is quantum parallelism: in step 2, the oracle evaluates ff f across both inputs in parallel, storing the outcomes in superposition. The concluding Hadamard transformation then translates this superposed information into observable probability amplitudes, from which we can efficiently determine the answer.

Deutsch-Jozsa Algorithm

Now let us turn to the general case in which f has an n-bit input. As in the n = 1 case, the classical oracle must be replaced by a quantum oracle U_f that coherently evaluates the function. By definition, the quantum oracle acts as

Fig. 3 Quantum circuit for Deutsch–Jozsa algorithm.

U_f\,|x\rangle |y\rangle = |x\rangle\, |y \oplus f(x)\rangle,

where x = x_1 \cdots x_n is an n-bit string, and we write

|x\rangle := |x_1\rangle \cdots |x_n\rangle.

This means we use the canonical encoding that maps each bit string to a corresponding computational basis state.

Exercise. Prove that the quantum oracle U_f defined by U_f\,|x\rangle|y\rangle = |x\rangle|y\oplus f(x)\rangle is unitary.

Proof. The operator U_f permutes the computational basis states. Indeed, for each basis vector |x\rangle|y\rangle the image |x\rangle|y\oplus f(x)\rangle is again a computational-basis vector, and distinct input basis vectors have distinct images because y\mapsto y\oplus f(x) is a bijection on ${0,1}$. Hence the set of images of the computational basis is another orthonormal basis of the Hilbert space. A linear map that sends an orthonormal basis to an orthonormal basis preserves inner products and therefore is unitary. Thus U_f is unitary (Equivalently, one can note that U_f^2 = I since (y\oplus f(x))\oplus f(x)=y, so U_f is invertible with U_f^{-1}=U_f, and a permutation matrix with real entries has its inverse equal to its adjoint; either way U_f is unitary.) ∎

The quantum circuit for the Deutsch–Jozsa algorithm is shown in Figure 3. The algorithm proceeds as follows:

  1. Initial State: Prepare an n-qubit input register in the state |0\rangle^{\otimes n} and a single output qubit in the state |1\rangle:
    |0\rangle^{\otimes n} \otimes |1\rangle.
  2. Hadamard Transform: Apply the Hadamard gate to each qubit:
    \frac{1}{\sqrt{2^n}} \sum_{x \in \{0,1\}^n} |x\rangle \otimes \frac{1}{\sqrt{2}} (|0\rangle - |1\rangle).
  3. Oracle Query: Apply the quantum oracle U_f:
    U_f |x\rangle |y\rangle = |x\rangle |y \oplus f(x)\rangle.
    After applying the oracle, the state becomes
    \frac{1}{\sqrt{2^n}} \sum_{x \in \{0,1\}^n} (-1)^{f(x)} |x\rangle \otimes \frac{1}{\sqrt{2}} (|0\rangle - |1\rangle),
    using the identity
    |0 \oplus f(x)\rangle - |1 \oplus f(x)\rangle = (-1)^{f(x)} (|0\rangle - |1\rangle).
    The output qubit is no longer entangled with the input register, so we can ignore it, leaving
    \frac{1}{\sqrt{2^n}} \sum_{x \in \{0,1\}^n} (-1)^{f(x)} |x\rangle.
  4. Second Hadamard Transform: Apply H^{\otimes n} to the input register:
    H^{\otimes n} |x\rangle = \frac{1}{\sqrt{2^n}} \sum_{z \in \{0,1\}^n} (-1)^{x \cdot z} |z\rangle,
    giving
    \frac{1}{2^n} \sum_{z \in \{0,1\}^n} \left( \sum_{x \in \{0,1\}^n} (-1)^{f(x) + x \cdot z} \right) |z\rangle.
  5. Measurement: Measure the n-qubit input register. The outcome reveals the type of function:
  • If f is constant, interference ensures only |0\rangle^{\otimes n} appears.
  • If f is balanced, destructive interference prevents |0\rangle^{\otimes n} from appearing.

Measuring |0\rangle^{\otimes n} indicates a constant function; any other outcome indicates a balanced function.

Exercise: Verify that the states at each step of the n-qubit Deutsch-Jozsa algorithm match the evolution described above for a given function f.

Classically, if we want to solve the problem by brute force, we need to check about half of the inputs, thus evaluating f up to 2^{n-1} times in the worst case. The Deutsch-Jozsa algorithm solves the problem with only one evaluation of the oracle, compared to the exponentially growing number of evaluations required by a classical algorithm. The quantum algorithm achieves this exponential speedup by leveraging quantum superposition and interference.

Quantum Order-Finding Algorithm

The order-finding problem is a fundamental problem that plays a central role in several quantum algorithms; in particular, it underlies Shor’s celebrated factoring algorithm. Classically, order-finding is hard—no efficient classical algorithm is known—but quantum computers can solve it efficiently. Efficiently determining the order of an integer modulo N allows integers to be factored in polynomial time, which would compromise RSA encryption, as we will discuss in the subsequent sections.

The order-finding problem is defined rigorously as follows:

Definition (Order-Finding Problem) Let N \in \mathbf{Z}_+ be a positive integer, and let a \in \mathbf{Z}_N^{\times} be an integer coprime to N (that is, the greatest common divisor \gcd(a,N) of a and N equals one, \gcd(a,N) = 1). The notation \mathbf{Z}_N^\times denotes the set of invertible elements in \mathbf{Z}_N under multiplication—those integers that are coprime to N. The order of a modulo N, denoted \mathrm{ord}_N(a) or simply r, is the smallest positive integer r such that
a^r \equiv 1 (\mathrm{mod}\, N)
The order-finding problem is: given a and N, compute r = \mathrm{ord}_N(a).

At first glance, this definition may appear somewhat abstract.
Do not worry—shortly, we will unpack each of the mathematical ingredients and illustrate their meaning with concrete examples.

The quantum algorithm for order-finding exploits the periodic structure of the function

f_{a,N}(x) \equiv a^x (\mathrm{mod}\, N) \ \ \ \ \ (1)

which is periodic with period exactly r. By preparing a quantum superposition over all possible exponents and applying the quantum phase estimation algorithm to the unitary operator

U |y\rangle = |a y (\mathrm{mod}\, N)\rangle,

we obtain an efficient quantum procedure that succeeds with high probability.

Preliminaries of Modular Arithmetic

Before presenting the complete quantum algorithm, let us first review a few essential results from number theory. This section may appear somewhat technical compared to others; if you prefer not to delve into the details, simply remember that for coprime integers a and N, there exists a positive order r, and the function in Eq.(1) is periodic and turn to the next section.

An integer N > 1 is called a prime number if it has no nontrivial divisors; for example, N = 2, 3, 5, 7, 11, \dots are prime numbers. Otherwise, N is called a composite number. We use a \mid b to denote that b is divisible by a, for instance 3 \mid 12, 2 \mid 6. Similarly, a \nmid b means that a does not divide b, for example 3 \nmid 14.

Every integer N > 1 can be uniquely expressed as a product of prime powers:

N = p_1^{e_1} \times p_2^{e_2} \times \cdots \times p_m^{e_m},

where p_1, \ldots, p_m are distinct prime numbers and e_1, \ldots, e_m are positive integers. This result is known as the Fundamental Theorem of Arithmetic.

To obtain the prime factorization of a given number N, one typically follows these steps:

  1. Divide N by 2 repeatedly until the quotient is no longer divisible by 2.
    Let e_2 be the number of times we divide by 2, and denote the resulting quotient by N_2 = N / 2^{e_2}, for which 2 \nmid N_2.
  2. For the quotient N_2, proceed with the next prime, 3, and divide repeatedly until the quotient is no longer divisible by 3.
    Let e_3 be the number of divisions, and denote the resulting quotient by N_3 = N_2 / 3^{e_3}.
  3. Continue in this manner with successive primes 5, 7, 11, \dots until the quotient becomes 1.

Exercise. Decompose N = 84 into its prime factors: 84 = 2^2 \times 3 \times 7.

The key operation we will use is modular arithmetic.
In everyday life we count linearly: 1, 2, 3, …, but in modular arithmetic we count on a circle.
Take a clock that shows only 12 hours. When the hand reaches 12, it jumps back to 0.
Mathematically, we say time is measured modulo 12.
This simple idea—wrapping around after a fixed modulus N—is crucial in number theory and its applications in modern public-key cryptography.

Definition (Congruence) Fix an integer N \ge 2, called the modulus.
Two integers a and b are congruent modulo N if their difference is a multiple of N:

a \equiv b (\mathrm{mod}\, N) \quad \Longleftrightarrow \quad N \mid (a - b) \quad \Longleftrightarrow \quad \exists\, k \in \mathbf{Z} \text{ such that } a = b + kN.

We read a \equiv b (\mathrm{mod}\, N) as “a is congruent to b modulo N”.

The congruence class of a is usually represented by its smallest non-negative residue:

a (\mathrm{mod}\, N) \in \mathbf{Z}_N := \{0, 1, 2, \dots, N-1\}

It is also convenient to use [a] to denote the congruence class whenever there is no ambiguity about the modulus.

Example. Let N = 7. We have \mathbf{Z}_7 :=\{0, 1, 2, \dots, 6\}. Then 1 \equiv 8 (\mathrm{mod}\, 7) and 2 \equiv 9 (\mathrm{mod}\, 7). The congruence class [0] consists of \{0, 7, 14, 21, \dots\}, the congruence class [1] consists of \{1, 8, 15, 22, \dots\}, and [2] consists of \{2, 9, 16, 23, \dots\}, etc.

The order-finding problem considers the modular exponentiation for given integers a and N:

a^x \equiv b (\mathrm{mod}\, N)

The remainder b can take values in 0, 1, \dots, N-1:

a \equiv 0 (\mathrm{mod}\, N)
a \equiv 1 (\mathrm{mod}\, N)
\vdots
a \equiv N-1 (\mathrm{mod}\, N)

However, not all values of b = 0, 1, \dots, N-1 are necessarily attainable.

For example, let a = 2 and N = 6. We find that

2^0 \equiv 1 (\mathrm{mod}\, 6)
2^1 \equiv 2 (\mathrm{mod}\, 6)
2^2 \equiv 4 (\mathrm{mod}\, 6)
2^3 \equiv 2 (\mathrm{mod}\, 6)
2^4 \equiv 4 (\mathrm{mod}\, 6)
2^5 \equiv 2 (\mathrm{mod}\, 6)
\vdots

We observe a repeating pattern. In this case, the residues 0, 3, and 5 never appear.

Note also that a^0 \equiv 1 (\mathrm{mod}\, N) for any integers a and N. Hence, the order of a modulo N is defined as the smallest positive integer r such that a^r \equiv 1 (\mathrm{mod}\, N). However, we must still ensure that such an order actually exists. In fact, if \gcd(a, N) = d > 1, then no positive integer r satisfies a^r \equiv 1 (\mathrm{mod}\, N). This can be seen in the previous example with a = 2 and N = 6, where the sequence never returns to 1 modulo 6.

Exercise. Let N > 1 and \gcd(a, N) = d > 1. Then the only integer r satisfying

a^r \equiv 1 (\mathrm{mod}\, N)

is r = 0.

Proof. Suppose there exists an integer r such that a^r \equiv 1 (\mathrm{mod}\, N).
Then N divides a^r - 1, so in particular

d \mid (a^r - 1).

Since d \mid a, we also have d \mid a^r, and therefore

d \mid (a^r - 1) \implies d \mid 1,

which contradicts d > 1. Hence no such r \neq 0 exists.

For r = 0, the convention a^0 = 1 gives

a^0 = 1 \equiv 1 (\mathrm{mod}\, N)

trivially. Thus r = 0 is the only solution ∎.

In Shor’s algorithm, we either assume \gcd(a, N) = 1, or first compute the greatest common divisor classically, which already yields a nontrivial factor of N if d > 1.

Another useful concept we will use is Euler’s Totient function: \varphi(N) counts the number of integers k \in {1, 2, \dots, N-1} coprime to N.
For example, \varphi(1) = 1, \varphi(2) = 1, \varphi(3) = 2, \varphi(4) = 2, etc.

For fixed N, consider \mathbf{Z}_N.
We call a \in \mathbf{Z}_N invertible if there exists an integer c such that a c \equiv 1 (\mathrm{mod}\, N).

Proposition (Bézout’s Identity). Let a and b be integers with greatest common divisor d = \gcd(a,b). Then there exist integers x and y such that

a x + b y = d.

Moreover, the integers of the form a z + b t are exactly the multiples of d.

The set \mathbf{Z}_N is not a group under multiplication modulo N in general. We denote by (\mathbf{Z}_N^{\times}, \times) the set of invertible elements, which indeed forms a group. By Bézout’s identity, \mathbf{Z}_N^{\times} consists of all integers coprime to N, and this group has order \phi(N), where \phi denotes Euler’s totient function.

From a group-theoretic perspective, for any a \in \mathbf{Z}_N^{\times}, there exists a positive integer r such that

a^r \equiv 1 (\mathrm{mod}\, N).

Since a must be invertible, it follows that \gcd(a, N) = 1. In the multiplicative group \mathbf{Z}_N^{\times}, the order of any element divides the order of the group, that is,

r \mid |\mathbf{Z}_N^{\times}| = \phi(N).

Theorem (Euler’s Theorem). If \gcd(a, N) = 1, then

a^{\phi(N)} \equiv 1 (\mathrm{mod}\, N)

and hence the order of a modulo N divides \phi(N).

By Euler’s theorem, for integers a and N with \gcd(a, N) = 1, there always exists a positive integer r such that

a^r \equiv 1 (\mathrm{mod}\, N).

The order r = \mathrm{ord}_N(a) always satisfies 1 \le r \le \phi(N).

Since in the cyclic subgroup generated by [a] \in \mathbf{Z}_N^{\times} we have

{[a^0], [a^1], [a^2], \dots, [a^{r-1}]}, \quad \text{with } [a^r] = [a^0] = [1],

the sequence a^n (\mathrm{mod}\, N) is periodic with period equal to the order r:

a^0 \equiv 1 (\mathrm{mod}\, N)
a^1 \equiv a (\mathrm{mod}\, N)
\dots
a^{r-1} \not\equiv 1 (\mathrm{mod}\, N)
a^r \equiv 1 (\mathrm{mod}\, N)
a^{r+1} \equiv a (\mathrm{mod}\, N)
a^{r+2} \equiv a^2 (\mathrm{mod}\, N)
a^{r+3} \equiv a^3 (\mathrm{mod}\, N)

Thus, the sequence repeats with period r.

Example. Set a = 2 and N = 7. We have f_{a,N}(x) = 2^x (\mathrm{mod}\, 7) as:

2^0 \equiv 1 (\mathrm{mod}\, 7)
2^1 \equiv 2 (\mathrm{mod}\, 7)
2^2 \equiv 4 (\mathrm{mod}\, 7)
2^3 \equiv 8 \equiv 1 (\mathrm{mod}\, 7)
2^4 \equiv 16 \equiv 2 (\mathrm{mod}\, 7)
2^5 \equiv 32 \equiv 4 (\mathrm{mod}\, 7)
2^6 \equiv 64 \equiv 1 (\mathrm{mod}\, 7)
2^7 \equiv 128 \equiv 2 (\mathrm{mod}\, 7)
2^8 \equiv 256 \equiv 4 (\mathrm{mod}\, 7)

We see that the order is r = \mathrm{ord}_7(2) = 3.

Quantum Order-Finding Algorithm

The quantum order-finding algorithm can be divided into two main steps:

  1. Encode the order into the phase of an eigenvalue of a suitable unitary operator, thereby reducing the order-finding problem to a quantum phase estimation problem.
  2. Recover the order r from the estimated phase \varphi using the method of continued fractions.

Let us first examine how to encode the order into the phase of a unitary operator.

Fix integers a and N such that \gcd(a, N) = 1.
Define the unitary operator U_{a,N} acting on computational basis states |y\rangle by

U_{a,N} |y\rangle = | a y (\mathrm{mod}\, N) \rangle, \quad y = 0, 1, \dots, N-1

We will see that the order can be mapped to the phase of an eigenvalue of U_{a,N}.

Proposition. Let N \ge 2 and let a be an integer coprime to N.
The operator U_{a,N} on \mathbf{C}^N defined by

U_{a,N} |y\rangle = | a y (\mathrm{mod}\, N) \rangle, \quad y = 0, 1, \dots, N-1

is unitary.

Proof. Since \gcd(a, N) = 1, multiplication by a modulo N defines a bijection on the set {0, 1, \dots, N-1}.
Thus U_{a,N} merely permutes the computational basis { |0\rangle, |1\rangle, \dots, |N-1\rangle }. Any permutation of an orthonormal basis extends to a unitary operator, so U_{a,N} is unitary.

Alternatively, suppose U_{a,N} |y\rangle = U_{a,N} |y'\rangle. Then

a y \equiv a y' (\mathrm{mod}\, N) \implies a (y - y') \equiv 0 (\mathrm{mod}\, N) \implies N \mid a (y - y').

Because \gcd(a, N) = 1, we may cancel a (more precisely, multiply both sides by the modular inverse of a modulo N) to obtain

N \mid (y - y').

Since 0 \le y, y' < N, the only possibility is y - y' = 0, so y = y'.
Thus U_{a,N} is injective on a basis of \mathbf{C}^N.
Being an isometry that maps a basis to a basis, it is unitary ∎.

Proposition. Let r be the order of a modulo N, i.e., the smallest positive integer such that

a^r \equiv 1 (\mathrm{mod}\, N).

For each s = 0, 1, \dots, r-1, define the state

|u_s\rangle := \frac{1}{\sqrt{r}} \sum_{k=0}^{r-1} e^{-2\pi i s k / r} \, | a^k (\mathrm{mod}\, N) \rangle.

Then, for each s = 0, 1, \dots, r-1, |u_s\rangle is an eigenstate of U_{a,N} with eigenvalue

e^{2\pi i s / r}.

Proof. Apply U_{a,N} to |u_s\rangle:

U_{a,N} |u_s\rangle = \frac{1}{\sqrt{r}} \sum_{k=0}^{r-1} e^{-2\pi i s k / r} \, U_{a,N} | a^k (\mathrm{mod}\, N) \rangle

\quad\; = \frac{1}{\sqrt{r}} \sum_{k=0}^{r-1} e^{-2\pi i s k / r} \, | a^{k+1} (\mathrm{mod}\, N) \rangle

Reindex the sum by letting k \mapsto k+1:

\quad\; = \frac{1}{\sqrt{r}} \sum_{k=1}^{r} e^{-2\pi i s (k-1) / r} \, | a^k (\mathrm{mod}\, N) \rangle

\quad\; = \frac{1}{\sqrt{r}} \sum_{k=1}^{r} e^{-2\pi i s k / r} \, e^{2\pi i s / r} \, | a^k (\mathrm{mod}\, N) \rangle

\quad\; = e^{2\pi i s / r} \frac{1}{\sqrt{r}} \sum_{k=1}^{r} e^{-2\pi i s k / r} \, | a^k (\mathrm{mod}\, N) \rangle

The sum now runs from k = 1 to r, but the k = r term is

e^{-2\pi i s r / r} \, | a^r (\mathrm{mod}\, N) \rangle = e^{-2\pi i s} \, | 1 (\mathrm{mod}\, N) \rangle = | 1 (\mathrm{mod}\, N) \rangle,

while the k = 0 term in the original definition of |u_s\rangle is

e^0 \, | a^0 (\mathrm{mod}\, N) \rangle = | 1 (\mathrm{mod}\, N) \rangle.

Thus the missing k = 0 term is exactly replaced by the k = r term, and we obtain

U_{a,N} |u_s\rangle = e^{2\pi i s / r} \frac{1}{\sqrt{r}} \sum_{k=0}^{r-1} e^{-2\pi i s k / r} \, | a^k (\mathrm{mod}\, N) \rangle = e^{2\pi i s / r} |u_s\rangle.

This holds for every s = 0, 1, \dots, r-1 ∎.

Now the procedure becomes “straightforward”. We apply quantum phase estimation (see this post) to the operators U_a^{2^j} with the eigenstate |u_s\rangle, which yields the phases

\theta_s = \frac{s}{r}, \quad s = 0, 1, \dots, r-1

The output of the phase estimation algorithm is thus

0.\theta_0 \theta_1 \dots \theta_{n}

represented as binary numbers. A classical algorithm called continued fractions can be applied to obtain the order r.

Quantum Phase Estimation

A key application of the quantum Fourier transform (see this post) is the Quantum Phase Estimation algorithm. This algorithm allows one to estimate the phase (or eigenvalue) of a unitary operator and serves as a central component of many quantum algorithms, including Shor’s factoring algorithm and quantum simulation.

Consider a unitary matrix U. Since U^\dagger U = I, all eigenvalues of U have the form e^{2\pi i \theta} with 0 \leq \theta < 1. Expressing \theta in binary gives

\theta = 0.\theta_1 \theta_2 \theta_3 \cdots = \sum_{k=1}^\infty \theta_k \, 2^{-k},

where each \theta_k \in {0,1}. The task is to estimate \theta to n binary digits of precision.

More precisely, the quantum phase estimation algorithm aims to determine the phase \theta in the eigenvalue equation

U |\psi\rangle = e^{2\pi i \theta} |\psi\rangle,

where U is the given unitary operator, |\psi\rangle is an eigenstate of U that can be prepared as an input state, and \theta is the phase to be determined. We further assume that controlled-U^{2^l} operations can be implemented for arbitrary non-negative integer l. The goal is to estimate \theta to the desired number of digits with high probability.

Now suppose that we express \theta in n binary digits,

\theta = 0.\theta_1\theta_2\cdots \theta_n.

Then, we have

U^{2^l} \lvert \psi \rangle = e^{2 \pi i 0.\theta_1\theta_2\cdots\theta_n 2^l} \lvert \psi \rangle = e^{2 \pi i \theta_1\cdots\theta_l.\theta_{l+1}\cdots\theta_n} \lvert \psi \rangle

where the integer part of \theta_1\cdots \theta_l.\theta_{l+1}\cdots \theta_n can be omitted, as it contributes only a trivial phase factor of 1.

For the controlled unitary C(U^{2^l}), its action on |+\rangle|\psi\rangle (with |+\rangle the control qubit) is:

C(U^{2^l}) \frac{1}{\sqrt{2}} ( \lvert 0 \rangle + \lvert 1 \rangle ) \lvert \psi \rangle = \frac{1}{\sqrt{2}} \lvert 0 \rangle \otimes \lvert \psi \rangle + e^{2 \pi i 0.\theta_{l+1}\cdots \theta_n} \lvert 1 \rangle \otimes \lvert \psi \rangle

= \frac{1}{\sqrt{2}} (|0\rangle + e^{2\pi i 0.\theta_{l+1}\cdots \theta_n} |1\rangle) \otimes |\psi\rangle

It is convenient to view this control qubit as representing the $(l+1)$-th digit of a binary number k = k_1 k_2 \cdots k_n = k_1 \times 2^0 + k_2 \times 2^1 + \cdots + k_n \times 2^{n-1}.

With this notation, the above expression can be rewritten as

C(U^{2^l}) \frac{1}{\sqrt{2}} (\lvert0\rangle + \lvert1\rangle)\lvert\psi\rangle = \frac{1}{\sqrt{2}} \sum_{k_{l+1}=0}^1 e^{2 \pi i \theta k_{l+1} 2^l} \lvert k_{l+1}\rangle \lvert\psi\rangle

This expression for the first qubit is familiar from our earlier discussion of the quantum Fourier transform, and it naturally suggests that the quantum Fourier transform should be applied here. The eigenstate \lvert \psi \rangle therefore remains unchanged and can be reused in subsequent steps.

Fig 1: quantum circuit for quantum phase estimation algorithm.

The quantum phase estimation algorithm uses two quantum registers as shown in Figure 1:

  • First register (phase record): Contains n qubits, which will store the estimated phase.
  • Second register (eigenstate): Contains the eigenstate \lvert \psi \rangle, which is an eigenstate of the unitary operator U.