Zhian Jia

Topological Quantum Color Code Model on Infinite Lattice

Our work, topological quantum color code model on infinite lattices, co-authored with Shiyu Cao and Sheng Tan, has been posted on arXiv¹. In this study, we carry out a rigorous Doplicher–Roberts–Haag (DHR) analysis for the quantum color code model—a framework central to understanding the structure of topological order in infinite lattice.

It is known that the topological phases of the color code model are equivalent to those of the double-layer toric code model. In this work, we present a rigorous proof of this equivalence within the framework of quasi-local C*-algebras.

We classify the model’s irreducible anyon superselection sectors and construct explicit string operators that generate anyonic excitations from the ground state. We further examine the fusion and braiding properties of these excitations and show that the resulting category is equivalent to $\mathsf{Rep}(D(\mathbb{Z}_2 \times \mathbb{Z}_2))$ . We also prove the Haag duality for the gound state of color code model.

Since I have recently been moving and settling down in Changsha and have been somewhat nomadic, this post comes out a little bit late. ↩︎

Temporal Kirkwood-Dirac Quasiprobability Distribution and Unification of Temporal State Formalisms through Temporal Bloch Tomography

Recently, we post the paper “temporal kirkwood-dirac quasiprobability distribution and unification of temporal state formalisms through temporal bloch tomography” on arXiv. This is joint work with Kavan Modi and Dagomir Kaszlikowski, in which we extend the Kirkwood–Dirac (KD) quasiprobability distribution to the temporal setting and develop a unified framework for temporal states.

The traditional (spatial) KD quasiprobability distribution is defined for density operators, which serve as a phase-space–like representation of a quantum state (see this post for a brief introduction). In this work, we generalize the KD formalism to temporal quantum processes, which consist of an initial state together with a sequence of quantum evolutions, $\mathfrak{P} = (\rho_{t_0}, \mathcal{E}_{t_1 \leftarrow t_0}, \ldots, \mathcal{E}_{t_n \leftarrow t_{n-1}})$ , where $\rho_{t_0}$ is the initial state and each $\mathcal{E}_{t_j \leftarrow t_{j-1}}$ denotes a completely positive trace-preserving (CPTP) map describing the evolution from $t_{j-1}$ to $t_j$ . The system state at an intermediate time $t_k$ is then obtained recursively as $\rho_{t_k} = \mathcal{E}_{t_k \leftarrow t_{k-1}} \circ \cdots \circ \mathcal{E}_{t_1 \leftarrow t_0}(\rho_{t_0})$ .

Within this framework, we define the multi-time doubled temporal KD quasiprobability distribution by

$\overleftrightarrow{Q}_{KD}(a_n,\ldots,a_0; b_n,\ldots,b_0)$

$= \mathrm{Tr}[ \Pi_{a_n|A_n}^{t_n} \mathcal{E}_{t_n \leftarrow t_{n-1}}(\cdots \Pi_{a_1|A_1}^{t_1}( \mathcal{E}_{t_1 \leftarrow t_0}( \Pi_{a_0|A_0}^{t_0} \rho_{t_0} \Pi_{b_0|B_0}^{t_0})) \Pi_{b_1|B_1}^{t_1} \cdots) \Pi_{b_n|B_n}^{t_n}]$

By taking the left and right marginals of this doubled distribution, we recover the corresponding temporal KD quasiprobability distributions $\overleftarrow{Q}_{KD}(a_n,\ldots,a_0)$ and $\overrightarrow{Q}_{KD}(b_n,\ldots,b_0)$ . These two distributions are related by complex conjugation, reflecting the intrinsic temporal structure encoded in the doubled formalism. The temporal Margenau-Hill (MH) quasiprobability distribution is defined as real part of the temporal KD distribution.

The temporal KD quasiprobability may take values outside the interval $[0,1]$ . This phenomenon reflects the intrinsically quantum nature of multi-time processes, in direct analogy with the negativity and nonclassical features of the spatial KD quasiprobability distribution. Using the Heisenberg picture, we provide a general and unified explanation for the origin of this quantumness.

A crucial application of temporal KD quasiprobability distribution is that it provides a general framework for temporal state.

There exist several formalisms of temporal states. In our paper, we show that using the temporal KD quasiprobability distribution, we can compute correlators of Pauli operators at different time steps. Then, by applying the Bloch representation of the state, we can derive several distinct temporal state formalisms. This provides a unified picture of temporal states.

For example, with doubled temporal KD distribution, we obtain the doubled KD temporal state:

$\overleftrightarrow{\Upsilon}= \frac{1}{d^{n+1}}\sum_{\mu_0,\ldots,\mu_n \atop \nu_0,\ldots,\nu_n=0}^{d^2-1}\overleftrightarrow{T}^{\mu_n,\ldots,\mu_0;\,\nu_n,\ldots,\nu_0}\left(\bigotimes_{i=0}^n \sigma_{\mu_i}\right)\otimes\left(\bigotimes_{j=0}^n \sigma_{\nu_j}\right)$

This is the most information-complete temporal state, and we show that it coincides with the doubled density operator. The temporal states obtained from the left or right temporal KD distribution correspond to the left and right reduced states of the doubled KD temporal state. Similarly, the left and right MH distributions coincide and give the left/right MH temporal states. For the two-time case, the left and right MH temporal states are identical to the pseudo-density operator; however, for general multi-time processes, they are no longer the same. In general, the MH temporal states can be understood as the Hermitianization of the corresponding KD temporal state.

Kirkwood-Dirac Quasiprobability Distribution

Recently, I have been working on several topics related to the Kirkwood–Dirac quasiprobability distribution (often abbreviated as the KD distribution), which has inspired me to write down some informal notes.

The KD distribution is a fascinating conceptual tool that sits at the boundary between classical probability theory and the strange, counterintuitive structure of quantum mechanics. It provides a way to represent quantum states that goes beyond classical statistics, allowing for features such as negativity and even complex values. These properties have no classical analogue, yet they encode genuinely quantum aspects of measurement, interference, and contextuality.

In this sense, the KD distribution offers a particularly transparent window into the nonclassical nature of quantum theory, making it an invaluable framework for both foundational studies and modern applications in quantum information.

What Are Quasiprobability Distributions in Quantum Mechanics?

Before turning to the KD distribution itself, it is helpful to place it within a broader conceptual landscape.

In classical physics, the state of a system can be described directly in phase space (concept developed in the late 19th century by Ludwig Boltzmann, Henri Poincaré, and Josiah Willard Gibbs). A point $(x,p)$ specifies the position and momentum of a particle, and our uncertainty about the system is encoded in a genuine probability density $\omega(x,p)$ . Such distributions are always non-negative real functions and satisfy a normalization condition, reflecting the familiar rules of classical probability theory. All classical observables are functions on phase space, $F(x,p)$ , and their expectation values are given by
$\langle F \rangle = \int F(x,p)\,\omega(x,p)\,dx\,dp$ .

Quantum mechanics, however, adopts a very different language. The fundamental object is the wavefunction $\psi(x)$ (or, more generally, a density operator), which does not represent a probability distribution in phase space. While $|\psi(x)|^2$ gives the probability density for position and $|\psi(p)|^2$ gives the probability density for momentum, there is no single function that simultaneously assigns probabilities to both position and momentum. This obstruction is not merely technical but fundamental, rooted in the noncommutativity of quantum observables and formalized by the Heisenberg uncertainty principle.

Nevertheless, it is natural to ask whether quantum states can still be represented in a phase-space-like manner, even if some classical features must be sacrificed. This motivation led to the development of quasiprobability distributions. These objects resemble classical probability distributions in form, but they relax one crucial requirement: they need not be non-negative, or even real-valued. Allowing for negative or complex values makes it possible to encode quantum interference, coherence, and other genuinely nonclassical effects within a phase-space framework.

Well-known examples include:

The Wigner quasiprobability distribution, introduced by Eugene Wigner in 1932 (see this very readable paper), which is real-valued but can take negative values.
The Husimi $Q$ -function and the Glauber–Sudarshan $P$ -function, widely used in quantum optics.

The Kirkwood–Dirac (KD) quasiprobability distribution belongs to this same family, but with an important distinction. It can be defined for arbitrary pairs of observables (in fact the Wigner function can also be extended in this way), not just position and momentum, and it is generically complex-valued. These features make the KD distribution especially powerful for studying quantum measurements, temporal correlations, and applications in quantum information processing.

Historical Background: From Kirkwood to Dirac

The KD distribution has its roots in the early days of quantum mechanics, when physicists were grappling with how to reconcile quantum phenomena with classical intuition.

John G. Kirkwood’s Contribution (1933): American physicist John Gamble Kirkwood first introduced the distribution in his paper “Quantum Statistics of Almost Classical Assemblies“. Kirkwood aimed to describe quantum systems that are nearly classical, using a phase-space representation. His version was essentially a complex-valued joint distribution for position and momentum.
Paul Dirac’s Independent Discovery (1945): The legendary physicist Paul A. M. Dirac independently rediscovered the distribution in his paper “On the Analogy Between Classical and Quantum Mechanics“. Dirac highlighted its role in illustrating analogies between classical and quantum theories, particularly for noncommuting operators.

The real part of KD distribution is called Margenau-Hill quasiprobability distribution, which was proposed by by H. Margenau and R. N. Hill in 1961 in their paper “Correlation between Measurements in Quantum Theory“.

For decades, the KD distribution remained relatively obscure, overshadowed by the Wigner function. However, in recent years—especially since the 2010s—it has gained renewed attention due to its applications in quantum information theory, where it serves as a powerful tool to quantify nonclassical resources such as coherence and contextuality. See the review paper “Properties and applications of the Kirkwood–Dirac distribution” by Arvidsson-Shukur et al for more details.

Mathematical Formulation: Defining the KD Distribution

Let’s dive into the mathematics. We’ll assume a basic familiarity with quantum mechanics notation (e.g., density operators $\hat{\rho}$ , bras $\langle \cdot |$ , kets $| \cdot \rangle$ ), but I will explain as we go.

Basic Definition for Two Observables

Consider a quantum system in a state described by the density operator $\hat{\rho}$ (which may be pure or mixed). Let $\hat{A}$ and $\hat{B}$ be two Hermitian observables with eigenbases ${|a_i\rangle}$ and ${|b_j\rangle}$ , respectively, assuming a finite-dimensional Hilbert space of dimension $d$ for simplicity.

The Kirkwood–Dirac (KD) quasiprobability distribution is defined as:

$Q_{i,j}(\hat{\rho}) = \langle b_j | a_i \rangle \langle a_i | \hat{\rho} | b_j\rangle$

Here:

$\langle b_j | a_i \rangle$ is the overlap between the eigenvectors.
This expression is generally complex because the overlaps can introduce phases.

For projectors $\hat{\Pi}_a = |a\rangle\langle a|$ and $\hat{\Pi}_b =|b\rangle\langle b|$ , it can also be written as:

$Q(a,b; \hat{\rho}) = \mathrm{Tr}(\hat{\Pi}_b \hat{\Pi}_a \hat{\rho})$

In continuous variables (e.g., position $x$ and momentum $p$ ):

$Q(x, p; \hat{\rho}) = \langle p | x \rangle \langle x | \hat{\rho} | p \rangle$

The real part of KD distribution is called Margenau-Hill quasiprobability distribution, which by definition is a real-valued quasiprobability distribution.

Generalizations

The KD distribution extends naturally to more general scenarios:

Multiple observables ( $k$ observables): $Q_{i_1, \dots, i_k}(\hat{\rho}) = \mathrm{Tr} (\hat{\Pi}_{i_k}^{(k)} \cdots \hat{\Pi}^{(1)}_{i_1} \hat{\rho} )$
Positive-operator-valued measures (POVMs) ${\hat{M}^{(l)}_{i_l}}$ : $Q_{i_1, \dots, i_k}(\hat{\rho}) = \mathrm{Tr} ( \hat{M}_{i_k}^{(k)} \cdots \hat{M}^{(1)}_{i_1} \hat{\rho} )$

Key Mathematical Properties

Normalization:
$\sum_{i,j} Q_{i,j}(\hat{\rho}) = 1$
Marginals: The sums over one index reproduce the standard quantum probabilities (Born rule):
$\sum_j Q_{i,j}(\hat{\rho}) = \langle a_i | \hat{\rho} | a_i \rangle, \quad \sum_i Q_{i,j}(\hat{\rho}) = \langle b_j | \hat{\rho} | b_j \rangle$
For multiple observable case, this property becomes Kolmogorov consistency condition.
State Reconstruction: Assuming all overlaps $\langle a_i | b_j \rangle$ are non-zero, the density operator can be recovered from the KD distribution: $\hat{\rho} = \sum_{i,j} Q_{i,j}(\hat{\rho}) \frac{|b_j\rangle\langle a_i|}{\langle a_i | b_j \rangle^*}$

Properties: What Makes the KD Distribution Nonclassical?

The KD distribution is particularly powerful because it captures features of quantum states that have no classical analogue:

Complex Values and Negativity: Entries of the KD distribution can be negative (in their real part) or contain imaginary components. Negativity is a direct signature of quantum interference, while the imaginary parts reflect the disturbance caused by measurement.
Quantifying Nonclassicality: One common measure is the total non-positivity:
$N(Q) = \sum_{i,j} |Q_{i,j}| - 1 \ge 0,$
where $N(Q) = 0$ for states that behave classically with respect to the chosen observables.

Connections to Quantum Foundations:

Negativity in the KD distribution can serve as a witness of contextuality, showing that measurement outcomes cannot be explained by noncontextual hidden-variable models.
It signals the incompatibility of observables, reflecting the fundamentally noncommutative structure of quantum mechanics.
It is also related to violations of macrorealism, for example as quantified by Leggett–Garg inequalities.

Applications in Modern Quantum Research

Quantum Metrology: The imaginary part enables ultra-sensitive parameter estimation.
Weak Measurements: Negativity gives rise to anomalous weak values, allowing signal amplification.
Quantum Thermodynamics: Defines coherent work distributions and fluctuation theorems.
Quantum Information: Quantifies coherence, entanglement, and information scrambling via out-of-time-order correlators (OTOCs). It can also be used for direct quantum state measurement, requiring fewer resources than full state tomography.
Quantum Computing: The nonclassicality of the KD distribution is closely related to the computational power of magic-state injection schemes in quantum computation. (This connection has been extensively studied within the Wigner distribution framework, as it has been recognized that the advantage of quantum computation is intimately tied to quantum contextuality; both the KD distribution and the Wigner distribution are closely related to quantum contextuality.)

Kitaev Algorithm for Quantum Phase Estimation

In this post, we introduce the quantum phase estimation based on quantum Fourier transform. There is another way to implement quantum phase estimation via an interferometric scheme, usually called Kitaev algorithm.

As before, we assume that we can prepare an eigenstate $|\psi\rangle$ of a unitary operator $U$ ,

$U|\psi\rangle = e^{2\pi i \theta} |\psi\rangle ,$

where we take $\theta \in [-1/2,1/2)$ for convenience. Kitaev’s idea is to encode the phase information into the amplitudes of certain superposition states and extract $\theta$ using the interferometric scheme.

^{Fig 1: interferometric scheme for cosine part.}

The quantum circuit is shown in Figure 1.

Acting with a Hadamard gate on the control qubit gives

$\frac{1}{\sqrt{2}}(|0\rangle|\psi\rangle + |1\rangle|\psi\rangle).$

Applying the controlled- $U$ gate yields

$\frac{1}{\sqrt{2}}(|0\rangle|\psi\rangle + e^{2\pi i\theta}|1\rangle|\psi\rangle),$

so the phase is now encoded in the relative amplitude. A final Hadamard gate transforms the state to

$|\Phi_f^{c}\rangle = \frac{1}{\sqrt{2}}(|+\rangle|\psi\rangle + e^{2\pi i\theta}|-\rangle|\psi\rangle)$

$= \frac{1+e^{2\pi i \theta}}{2}|0\rangle|\psi\rangle + \frac{1-e^{2\pi i\theta}}{2}|1\rangle|\psi\rangle$

Thus the measurement probabilities are

$p(0) = \left|\frac{1+e^{2\pi i\theta}}{2}\right|^2 = \frac{1+\cos(2\pi\theta)}{2}$

$p(1) = \left|\frac{1-e^{2\pi i\theta}}{2}\right|^2= \frac{1-\cos(2\pi\theta)}{2}$

These measurements allow us to extract the cosine of the phase,

$\theta = \frac{1}{2\pi}\cos^{-1}(2p(0)-1)$

or equivalently,

$\theta = \frac{1}{2\pi}\cos^{-1}(1-2p(1))$

However, we know that $\cos(2\pi \theta)$ cannot distinguish between
$\theta$ and $-\theta$ .
It is therefore natural to incorporate the sine component $\sin(2\pi\theta)$ to resolve this ambiguity.
By first determining the sign of $\theta$ from the sine measurement, we can then output the final estimate as

$\theta = \pm \frac{1}{2\pi}\cos^{-1}(2p(0)-1),$

where the sign is chosen according to whether $\sin(2\pi\theta)$ is positive or negative.

^{Fig 2. interferometric scheme for sine part.}

To obtain the sine component, we slightly modify the circuit by inserting a phase gate $S$ on the control qubit prior to the controlled- $U$ operation, see Figure 2.

A parallel computation yields

$|\Phi_f^{s}\rangle = \frac{1+ie^{2\pi i\theta}}{2}|0\rangle|\psi\rangle+\frac{1-ie^{2\pi i\theta}}{2}|1\rangle|\psi\rangle.$

The measurement probabilities are now

$p(0) = \left|\frac{1+ie^{2\pi i\theta}}{2}\right|^2= \frac{1-\sin(2\pi\theta)}{2}, \quad p(1) = \left|\frac{1-ie^{2\pi i\theta}}{2}\right|^2= \frac{1+\sin(2\pi\theta)}{2}$

Therefore,

$\theta = \frac{1}{2\pi}\sin^{-1}(1-2p(0))$

or equivalently,

$\theta = \frac{1}{2\pi}\sin^{-1}(2p(1)-1)$

with $\sin^{-1}$ taking values in $[-\pi,\pi)$ .

Combining the estimates of $\cos(2\pi\theta)$ and $\sin(2\pi\theta)$ allows us to reconstruct $\theta$ uniquely in the range $[-1/2,1/2)$ .

Shor’s Algorithm

Shor’s algorithm, introduced by Peter Shor in 1994 (algorithms for quantum computation: discrete logarithms and factoring), is the breakthrough quantum algorithm that factors large integers in polynomial time, a task widely believed to be intractable for classical computers. Given an integer $N$ , the algorithm finds its prime factors in $O\left((\log N)^2 (\log \log N)(\log \log \log N)\right)$ time on a quantum computer. This is exponentially faster than the best classical algorithms, such as the general number field sieve, which runs in sub-exponential time $O\left(e^{c (\log N)^{1/3} (\log \log N)^{2/3}}\right)$ , where $c = \frac{1}{3}(92 + 26\sqrt{13})^{1/3}$ .

The discovery of Shor’s algorithm showed that a large quantum computer would be capable of breaking widely used public-key cryptosystems, most notably RSA, whose security relies on the assumed hardness of integer factorization. This result initiated both the global effort to build scalable quantum computers and the rapid development of post-quantum cryptography, which seeks cryptographic protocols secure even against quantum attackers.

RSA Cryptosystem

Since Shor’s algorithm was primarily designed to break the RSA public-key cryptosystem. For completeness, we first review the core principles of RSA. The RSA cryptosystem, along with its variants, is one of the most widely deployed methods for securely transmitting messages over public channels, such as the Internet, where it protects confidentiality and enables digital signatures. Its security relies fundamentally on a single, long-standing computational assumption: that factoring a large composite integer into its prime factors is extraordinarily difficult for classical computers.

It is worth emphasizing that testing whether a number is prime is computationally feasible, whereas factoring a large composite number into its prime factors remains extraordinarily difficult. RSA cryptography leverages precisely this asymmetry: while anyone can efficiently verify the primality of the factors used in key generation, recovering those factors from the public modulus is, for classical computers, practically intractable. This one-way computational difficulty forms the cornerstone of RSA’s security, enabling secure encryption and decryption of messages.

The RSA cryptosystem operates as follows:

Alice prepares two very large prime numbers $p$ and $q$ , which she keeps secret. She computes $N = pq$ and publishes $N$ . She also chooses a number, called the public exponent (or encoding exponent), $e < N$ , which is required to be relatively prime to $(p-1)(q-1)$ , that is, $\mathbf{gcd}(e,(p-1)(q-1)) = 1$ . Alice then computes the multiplicative inverse of $e$ modulo $(p-1)(q-1)$ , namely a number $d$ satisfying

$de \equiv 1 \pmod{(p-1)(q-1)}$

She keeps $d$ secret, thus called the secret exponent (or decoding exponent).

When Bob wants to send a message to Alice, he first encodes the message as an integer $m$ with $m < N$ . He encrypts it as

$M = m^{e} \pmod{N}$

Upon receiving $M$ , Alice decrypts it using

$m \equiv M^{d} \pmod{N}$

where $d$ is the multiplicative inverse of the public exponent $e$ modulo $(p-1)(q-1)$ .

To see that the RSA cryptosystem correctly recovers the message $m$ using $m \equiv M^{d} \pmod{N}$ , we need to check

$m \equiv (m^e)^{d} \pmod{N}$ .

From $de \equiv 1 \pmod{(p-1)(q-1)}$ , there exists an integer $k$ such that $de = k (p-1)(q-1) + 1$ . Hence

$(m^e)^{d} = m \bigl[ m^{k(q-1)} \bigr]^{p-1}$ .

Suppose $m$ is not a multiple of $p$ . Fermat’s little theorem gives

$\bigl[ m^{k(q-1)} \bigr]^{p-1} \equiv 1 \pmod{p}$ .

If $m$ is a multiple of $p$ , then $m^{de} \equiv 0 \pmod{p}$ .

Similarly,

$\bigl[ m^{k(p-1)} \bigr]^{q-1} \equiv 1 \pmod{q}$

when $m$ is not a multiple of $q$ , and $m^{de} \equiv 0 \pmod{q}$ if $m$ is a multiple of $q$ .

We now verify $m^{de} \equiv m \pmod{N}$ by considering several cases.

If $m$ is a multiple of both $p$ and $q$ , then it is automatically a multiple of $N = pq$ , and $m^{de} \equiv m \pmod{N}$ holds trivially.
If $m$ is a multiple of $p$ but not of $q$ , then

$m \equiv 0 \pmod{p}, \quad m^{de} \equiv 0 \pmod{p}$ ,

and

$\bigl[m^{k(p-1)}\bigr]^{q-1} \equiv 1 \pmod{q}$ .

Hence,

$m^{de} \equiv m \pmod{q}, \quad m^{de} \equiv m \pmod{p}$ .

By the Chinese remainder theorem ${}^1$ ,

$m^{de} \equiv m \pmod{N}$ .

We are using a basic consequence of the Chinese remainder theorem: if two integers $a$ and $b$ satisfy $a \equiv b \pmod{p}$ and $a \equiv b \pmod{q}$ , with $\gcd(p,q) = 1$ , then $a \equiv b \pmod{pq}$ . In particular, if $p$ and $q$ are prime, it is easier to see $p \mid (a-b)$ and $q \mid (a-b)$ implies $pq \mid (a-b)$ .

If $m$ is a multiple of $q$ but not of $p$ , the argument is symmetric, giving

$m^{de} \equiv m \pmod{N}$ .

If $m$ is not a multiple of either $p$ or $q$ , then Fermat’s little theorem gives

$m^{de} \equiv m \pmod{p}, \quad m^{de} \equiv m \pmod{q}$ .

Again, by the Chinese remainder theorem,

$m^{de} \equiv m \pmod{N}$ .

This completes the proof of $m^{de} \equiv m \pmod{N}$ .

We see from the above analysis that if one could factor $N$ efficiently, then the RSA system would be completely compromised. Indeed, factoring $N$ reveals the primes $p$ and $q$ , from which one can compute $(p-1)(q-1)$ , determine the public exponent $e$ , and then recover the secret exponent $d$ . With $d$ in hand, any encrypted message can be decrypted.

Shor’s Algorithm

Definition (The Factoring Problem)
The input to Shor’s algorithm is an odd composite integer $N = p q$ , where $p$ and $q$ are distinct large primes. The goal is to find the prime factors $p$ and $q$ .

Shor’s algorithm is a hybrid quantum-classical algorithm that factors $N$ in polynomial time on a quantum computer. It consists of two main phases:

Classical reduction: Transform factoring $N$ into an instance of the order-finding problem (see this post).
Quantum order-finding: Solve the order-finding problem efficiently using a quantum computer.

The classical reduction (which succeeds with probability $> 1/2$ for random choices) is as follows:

1. Choose a random integer $a \in {2, 3, \dots, N-1}$ and compute $d = \gcd(a, N)$ :

If $1 < d < N$ , then $d$ is a non-trivial factor of $N$ . Factoring complete.
If $d = 1$ , continue.

2. Find the multiplicative order $r$ of $a$ modulo $N$ , i.e., the smallest positive integer $r$ such that
$a^r \equiv 1 \pmod{N}$ .

3. If $r$ is odd, return to the first step with a new $a$ .

4. If $r$ is even, compute $b = a^{r/2} \mod N$ :

If $b \equiv -1 \pmod{N}$ , return to the first step.
If $b \not\equiv \pm 1 \pmod{N}$ , proceed.

5. Compute the factors:
$p = \gcd(b - 1, N)$ , $q = \gcd(b + 1, N)$ .
At least one of these is a non-trivial factor of $N$ .

The only step that is computationally infeasible on a classical computer is step 2: finding the order $r$ .
Shor’s breakthrough contribution is a quantum algorithm that performs this order-finding step in polynomial time using quantum phase estimation on the unitary operator $x \mapsto a x \mod N$ .

Once $r$ is obtained quantumly, the remaining classical post-processing immediately yields the prime factors $p$ and $q$ . Thus, on a sufficiently large fault-tolerant quantum computer, Shor’s algorithm solves the integer factorization problem efficiently, rendering RSA and related cryptosystems insecure.

Remark (Why the Reduction Works)

Assume that $r$ is even and that $a^{r/2} \not\equiv -1 \pmod{N}$ (the cases explicitly filtered out by the algorithm). Since $r$ is the order of $a$ modulo $N$ , we have
$a^r \equiv 1 \pmod{N}$ .
This implies
$a^r - 1 = (a^{r/2} - 1)(a^{r/2} + 1) \equiv 0 \pmod{N}$ ,
so $N \mid (a^{r/2} - 1)(a^{r/2} + 1)$ , or equivalently,
$pq \mid (a^{r/2} - 1)(a^{r/2} + 1)$ .

Because $p$ and $q$ are distinct primes, the prime factors of the product $(a^{r/2}-1)(a^{r/2}+1)$ can be distributed in only four logical ways:

$p$ divides $a^{r/2}-1$ and $q$ divides $a^{r/2}+1$
$p$ divides $a^{r/2}+1$ and $q$ divides $a^{r/2}-1$
$pq$ divides $a^{r/2}-1$ (i.e., $N$ divides $a^{r/2}-1$ )
$pq$ divides $a^{r/2}+1$ (i.e., $N$ divides $a^{r/2}+1$ )

Cases 1 and 2 are precisely what we want: they yield non-trivial factors via the gcd computations
$p = \gcd(a^{r/2}-1, N)$ and $q = \gcd(a^{r/2}+1, N)$ (or vice versa).
Cases 3 and 4, however, must be excluded:

Case 3 is impossible: If $N$ divides $a^{r/2}-1$ , then $a^{r/2} \equiv 1 \pmod{N}$ .
But $r$ is the order of $a$ modulo $N$ , so $r$ would have to divide $r/2$ , which is impossible unless $r = 0$ , a contradiction.
Case 4 is impossible under our assumption: If $N$ divides $a^{r/2}+1$ , then $a^{r/2} \equiv -1 \pmod{N}$ .
This situation is explicitly detected and rejected in the algorithm (step 5).

Thus, whenever the algorithm reaches step 6, we are guaranteed to be in Case 1 or Case 2, and the two gcd operations recover the prime factors $p$ and $q$ of $N$ .

Example

Let us illustrate Shor’s algorithm with the integer $N = 15$ .

Choose a random number: Let $a = 7$ . Compute $\gcd(7, 15) = 1$ .
Since $\gcd(7, 15) = 1$ , we proceed to the next step.
Find the period $r$ : Using a quantum computer, we determine the period of the function
$f(x) = 7^x \bmod 15$ .
The result is $r = 4$ , since $7^4 \equiv 1 \pmod{15}$ .
Use the period to find the factors:
- Since $r = 4$ is even, compute $7^{r/2} - 1 = 7^2 - 1 = 48$ .
- Compute the gcd: $\gcd(48, 15) = 3$ , which is one factor of $15$ .
- Similarly, $7^{r/2} + 1 = 7^2 + 1 = 50$ , and $\gcd(50, 15) = 5$ , which is the other factor.

Thus, we have successfully factored $15 = 3 \times 5$ .

Grover’s Search Algorithm

Grover’s search algorithm, also known as the quantum search algorithm, is a cornerstone of quantum computation, offering a quadratic speedup for unstructured search problems. It was introduced by Lov Kumar Grover in 1996 in his paper A fast quantum mechanical algorithm for database search [arXiv:quant-ph/9605043]. Whereas classical algorithms require $O(N)$ queries to search an unsorted database of $N$ elements, Grover’s algorithm achieves the same task with only $O(\sqrt{N})$ queries by exploiting quantum parallelism and amplitude amplification.

^{Fig. 1. Lov Kumar Grover, pioneer of algorithms for quantum computing. Photo is from this link.}

Searching an unstructured database is a fundamental task in many practical applications, such as finding the lowest-priced product, determining the shortest path to a destination via transfers, and solving other optimization problems. In this post, we begin with the simplest case of a single marked item and later consider the general case with multiple marked items. The core of Grover’s algorithm is the Grover iteration. The algorithm’s precision depends on the ratio $M/N$ , where $M$ is the number of marked items and $N$ is the total number of items.

Definition.(Unstructured search problem) The goal of Grover’s algorithm is to locate the marked element in an unsorted database. Given an oracle function

$f : \{0, 1\}^n \rightarrow \{0, 1\},$

where $f(x) = 1$ for the marked item $x'$ and $f(x') = 0$ for all other items$, the task is to find the marked elements $x'$ using fewer oracle queries than would be required classically.

Quantum oracle

To solve this problem quantumly, we also need to introduce a quantum oracle.

Recall that the classical oracle is a function $f : \{0, 1\}^n \rightarrow \{0, 1\},$ with $f(x) = 1$ for the marked item $x_{\text{marked}}$ and $f(x) = 0$ for all other items.

We can generalize this to a quantum oracle $U_f$ , which is a unitary operation that flips the sign of the amplitude of the marked state:

$U_f |x\rangle = (-1)^{f(x)} |x\rangle,$

where $|x\rangle = |x_1\rangle |x_2\rangle \cdots |x_n\rangle$ corresponds to the binary string $x$ .

This construction is the same as the oracles encountered in the Deutsch–Jozsa and Bernstein–Vazirani algorithms. By introducing an ancillary qubit initialized in the state $|-\rangle$ , we define the oracle

$\tilde{U}_f |x\rangle |y\rangle = |x\rangle |y \oplus f(x)\rangle,$

so that

$\tilde{U}_f |x\rangle (|0\rangle - |1\rangle) = (-1)^{f(x)} |x\rangle (|0\rangle - |1\rangle) = (U_f |x\rangle) (|0\rangle - |1\rangle).$

Dropping the ancillary qubit then recovers $U_f |x\rangle = (-1)^{f(x)} |x\rangle.$

In particular, for the marked state $x'$ , $f(x') = 1$ , so its phase is flipped: $U_f |x'\rangle = - |x'\rangle.$ For all unmarked elements, $f(x) = 0$ , so their amplitudes remain unchanged.

Grover iteration

The algorithm proceeds by repeatedly applying a procedure known as the Grover iteration or Grover operator to amplify the amplitude of the marked state. As we shall see, each Grover iteration corresponds to a rotation within a two-dimensional plane. The rotation angle at each step depends on the ratio $M/N$ between the number of marked items and the total number of items.

Suppose the search space contains $N=2^n$ items, of which $M$ are marked. Define the states

$|\alpha\rangle = \frac{1}{\sqrt{N-M}} \sum_{x: \text{ unmarked}} |x\rangle,$

representing an equal superposition of all unmarked items, and

$|\beta\rangle = \frac{1}{\sqrt{M}} \sum_{x: \text{ marked}} |x\rangle,$

representing an equal superposition of all marked items.

The uniform superposition over all items,

$|\psi\rangle = \frac{1}{\sqrt{N}} \sum_x |x\rangle = \frac{1}{\sqrt{2^n}} \sum_{x\in {0,1}^n} |x\rangle = H^{\otimes n} |0\rangle^{\otimes n},$

can be expressed as a superposition of $|\alpha\rangle$ and $|\beta\rangle$ :

$|\psi\rangle = \frac{\sqrt{N-M}}{\sqrt{N}} |\alpha\rangle + \frac{\sqrt{M}}{\sqrt{N}} |\beta\rangle.$

^{Fig. 2. Illustration of a Grover iteration. Each iteration rotates the state vector in the two-dimensional plane spanned by and , gradually increasing the amplitude of the marked state. Notice that for Grover’s search, the initial state is the uniform superposition over all items .}

We assume that $M/N$ is small. In this regime, the coefficient $c_{\alpha} = \frac{\sqrt{N-M}}{\sqrt{N}}$ corresponding to the unmarked states is much larger than the coefficient $c_{\beta} = \frac{\sqrt{M}}{\sqrt{N}}$ associated with the marked states. Consequently, the initial state $|\psi\rangle$ is predominantly aligned along $|\alpha\rangle$ , and the Grover iteration rotates the state vector slowly toward $|\beta\rangle$ , allowing gradual amplitude amplification of the marked state.

It is convenient to introduce an angle $\theta$ defined by

$\cos \frac{\theta}{2} = \frac{\sqrt{N-M}}{\sqrt{N}}, \qquad \sin \frac{\theta}{2} = \frac{\sqrt{M}}{\sqrt{N}}.$

This angle quantifies the initial projection of $|\psi\rangle$ onto the marked subspace and will be useful for understanding Grover iteration geometrically.

Each Grover iteration consists of two main steps:

Oracle Query (Phase Flip): Apply the oracle $U_f$ , which flips the phase of the marked state:
$U_f |x\rangle = (-1)^{f(x)} |x\rangle$ . Geometrically, this can be viewed as a reflection about the state $|\alpha\rangle$ . For any state in the two-dimensional subspace spanned by $|\alpha\rangle$ and $|\beta\rangle$ ,
$|\varphi\rangle = a |\alpha\rangle + b |\beta\rangle$ ,
the action of the oracle gives
$|\varphi'\rangle = U_f |\varphi\rangle = a |\alpha\rangle - b |\beta\rangle$ ,
showing that the $|\beta\rangle$ component is reflected.
Amplitude Amplification (Grover Diffusion Operator): Apply the diffusion operator
$D = 2 |\psi\rangle \langle \psi| - I$ ,
where $|\psi\rangle$ is the uniform superposition over all items as defined earlier. This operation can be interpreted as a reflection about $|\psi\rangle$ in the plane spanned by $|\alpha\rangle$ and $|\beta\rangle$ . To see this, define
$|\psi^\perp\rangle = -c_\beta |\alpha\rangle + c_\alpha |\beta\rangle$ ,
where $c_\alpha$ and $c_\beta$ are the coefficients of $|\psi\rangle$ in the $|\alpha\rangle$ , $|\beta\rangle$ basis, so that $|\psi^\perp\rangle$ is orthogonal to $|\psi\rangle$ in the plane. Any state in this plane can be expressed as
$|\chi\rangle = a |\psi\rangle + b |\psi^\perp\rangle$ . Acting with $D$ yields
$|\chi'\rangle = D |\chi\rangle = a |\psi\rangle - b |\psi^\perp\rangle$ ,
confirming that $D$ performs a reflection about $|\psi\rangle$ .

The combined Grover iteration operator is defined as

$G = D U_f = (2 |\psi\rangle\langle \psi| - I)\, U_f$ .

Within the two-dimensional subspace spanned by $|\alpha\rangle$ and $|\beta\rangle$ , consider an initial state

$|\Phi\rangle = \cos\frac{\theta}{2}\, |\alpha\rangle + \sin\frac{\theta}{2}\, |\beta\rangle$ ,

whose angle from $|\alpha\rangle$ is $\theta/2$ . Applying the oracle $U_f$ first reflects the state about $|\alpha\rangle$ , and subsequently applying $D$ reflects the resulting state about $|\psi\rangle$ . The composition of these two reflections produces a net rotation by an angle $3\theta/2$ toward $|\beta\rangle$ .

In Grover’s search, the initial state is the uniform superposition $|\psi\rangle$ , which lies at an angle $\theta/2$ from $|\alpha\rangle$ . Each application of $G$ therefore increases this angle by $\theta$ , gradually amplifying the amplitude of $|\beta\rangle$ and increasing the probability of measuring the marked state.

Exercise. Verify that $G |\psi\rangle$ performs the rotation illustrated in Fig. 2. Show explicitly that the Grover iteration acts as a rotation matrix in the basis ${|\alpha\rangle, |\beta\rangle}$ .

From Fig. 2, it is clear that $\theta$ becomes small when the ratio $M/N$ is small. Consequently, each Grover iteration corresponds to a small but precise rotation, making the algorithm particularly powerful for large databases with relatively few marked items.

Grover’s Search Algorithm

The detailed procedure of Grover’s search algorithm is given below.

1. Initialization

Prepare $n$ qubits in the state $|0\rangle^{\otimes n}$ .

Apply the Hadamard transform $H^{\otimes n}$ to obtain the equal superposition
$|\psi\rangle = H^{\otimes n} |0\rangle^{\otimes n} = \frac{1}{\sqrt{N}} \sum_{x=0}^{N-1} |x\rangle$ .

2. Grover iterations

Apply the Grover iteration operator $G$ for $O(\sqrt{N})$ rounds. Each round consists of:

Oracle $U_f$ , flipping the phase of the marked state.

Diffusion operator $D$ , performing amplitude amplification.

3. Measurement

After approximately $O(\sqrt{N})$ iterations, the amplitude of the marked state is close to one. Measuring the qubits yields marked item $x'$ with high probability.

^{Fig. 3. Quantum circuit for Grover’s search algorithm.}

To analyze the behavior of the algorithm, let us assume for simplicity that there is exactly one marked item, $M=1$ . Initially, all basis states have amplitude $\frac{1}{\sqrt{N}}$ . After applying the oracle and the diffusion operator, the amplitude of the marked state increases, while the amplitudes of unmarked states decrease.

Each Grover iteration acts as a rotation in the two-dimensional subspace spanned by the unmarked-state superposition $|\alpha\rangle$ and the marked state $|\beta\rangle$ . The rotation angle is $\theta \approx 2 \sin^{-1}\left( \frac{1}{\sqrt{N}} \right)$ . After $k$ iterations, the state becomes
$G^k |\psi\rangle = \cos\left( \frac{2k+1}{2}\theta \right)|\alpha\rangle + \sin\left( \frac{2k+1}{2}\theta \right)|\beta\rangle$ .

Thus, repeated applications of $G$ rotate the initial state toward the marked state. The optimal number of iterations is
$k_{\mathrm{opt}} \approx \frac{\pi}{4}\sqrt{N}$ ,
which maximizes the probability of obtaining $|\beta\rangle$ upon measurement. Applying $G$ more times causes an overshoot and reduces the success probability.

Bernstein–Vazirani Algorithm

The Bernstein–Vazirani algorithm is one of the earliest and clearest examples of a quantum algorithm that significantly outperforms any classical counterpart for a well-defined problem. It can be viewed as an application of the Deutsch-Jozsa algorithm. The algorithm was first introduced by Ethan Bernstein and Umesh Vazirani in their 1993 paper and later discussed again in their 1997 paper. The algorithm treats the following problem:

^{Fig 1. Umesh Vazirani (photo from this talk). I could not find any photo of Ethan Bernstein online; if you come across one, please email me.}

Definition (Bernstein–Vazirani problem): Let $f$ be a function from n-bit strings to a single bit, $f:\{0,1\}^n \to \{0,1\}$ , such that there exists a secret bit string $s \in \{0,1\}^n$ with the property that, for every input $x \in \{0,1\}^n$ , $f(x) = x \cdot s$ , where the dot product is defined by $x \cdot s = x_1 s_1 \oplus x_2 s_2 \oplus \cdots \oplus x_n s_n$ , with $x = x_1 x_2 \cdots x_n$ and $s = s_1 s_2 \cdots s_n$ . The task is to determine the secret string $s$ using as few queries to $f$ as possible.

Exercise. 1. For the 3-bit string $s = 011$ , compute $x \cdot s$ for all $x \in \{0,1\}^3$ . As an example, take $x = 001$ . Then $x \cdot s = (0 \times 0) \oplus (0 \times 1) \oplus (1 \times 1) = 1$ The function defined by $f(x) := x \cdot s$ is an example of a function satisfying the Bernstein–Vazirani property.

2. Show that the majority function
$\mathrm{maj}(x) = \begin{cases} 1, \text{ if majority of bits in } x \text{ are 1} \\ 0, \text{ otherwise} \end{cases}$
is a counterexample: it does not satisfy the Bernstein–Vazirani property.

Hint: Any function $f(x) = x \cdot s$ is linear over $\mathbf{F}_2$ , i.e. it obeys

$f(x \oplus y) = f(x) \oplus f(y) \quad \text{for all } x, y.$

To disprove that $\mathrm{maj}$ has the Bernstein–Vazirani form, find explicit $x, y$ for which

$\mathrm{maj}(x \oplus y) \neq \mathrm{maj}(x) \oplus \mathrm{maj}(y).$

For instance, take $x = 110$ and $y = 011$ . Then $x \oplus y = 101$ , while

$\mathrm{maj}(110) = 1, \quad \mathrm{maj}(011) = 1, \quad \mathrm{maj}(101) = 1,$

so $\mathrm{maj}(110) \oplus \mathrm{maj}(011) = 0 \neq 1 = \mathrm{maj}(101)$ , showing the required contradiction. ∎

Classically, we can solve the problem by considering a complete basis of bit strings:

$10\cdots 0, \quad 010\cdots 0, \quad \dots, \quad 0\cdots 01.$

^{Fig 2. Quantum circuit for the Bernstein-Vazirani algorithm.}

If we query the function $f(x) := x \cdot s$ on these strings, we obtain

$10\cdots 0 \cdot s = s_1,$

$010\cdots 0 \cdot s = s_2,$

$\vdots$

$0\cdots 01 \cdot s = s_n.$

From this, it is clear that the secret string $s$ can be determined with $n$ queries. Because the $n$ bits of $s$ are independent, there is no way to do better classically. Remarkably, a quantum computer allows us to solve the same problem with far fewer queries.

From the expression
$H^{\otimes n} |x\rangle = \frac{1}{\sqrt{2^n}} \sum_{z \in \{0,1\}^n} (-1)^{x \cdot z} |z\rangle$ ,
we observe that an inner product appears in the phase factor $(-1)^{x \cdot z}$ . If we can prepare the corresponding state, applying Hadamard gates directly yields $s$ . This state naturally appears just before the final Hadamard transforms in the Deutsch–Jozsa algorithm(see this post), so the same circuit can be used to solve the Bernstein–Vazirani problem with only a single query.

To implement this, we prepare an $n$ -qubit input register in the state $|0\rangle^{\otimes n}$ and a single auxiliary qubit in the state $|1\rangle$ . Applying Hadamard gates to all $n+1$ qubits yields

$|\psi\rangle = \frac{1}{\sqrt{2^n}} \sum_{x \in {0,1}^n} |x\rangle \otimes \frac{1}{\sqrt{2}} \bigl(|0\rangle - |1\rangle\bigr)$

We then apply the quantum oracle $U_f$ encoding $f$ :

$U_f \, |x\rangle |y\rangle = |x\rangle \, |y \oplus f(x)\rangle$

After acting on $|\psi\rangle$ , the combined state becomes

$\frac{1}{\sqrt{2^n}} \sum_{x \in {0,1}^n} (-1)^{f(x)} |x\rangle \otimes \frac{1}{\sqrt{2}} \bigl(|0\rangle - |1\rangle\bigr)$

This has been carefully discussed in Deutsch–Jozsa algorithm(see this post).

On the input register, this action is equivalently described by the operator $Q_f$ :

$Q_f \frac{1}{\sqrt{2^n}} \sum_{x \in {0,1}^n} |x\rangle = \frac{1}{\sqrt{2^n}} \sum_{x \in {0,1}^n} (-1)^{f(x)} |x\rangle$

Recall that $f(x) := x \cdot s$ , then apply quantum oracle $Q_f$ to get

$\frac{1}{\sqrt{2^n}} \sum_{x \in {0,1}^n} (-1)^{x \cdot s} |x\rangle$

Then applying Hadamard gates $H^{\otimes n}$ will give $|s\rangle$ . By measuring the final state, we can read out $s$ .

The quantum circuit of the Bernstein-Vazirani algorithm is completely the same as that for Deutsch-Jozsa. A simpler version is given in Figure 2 based on $Q_f$ .

Deutsch-Jozsa Algorithm

The Deutsch-Jozsa algorithm, introduced by David Deutsch and Richard Jozsa in their 1992 paper Rapid solution of problems by quantum computation, marked an important early milestone in the development of quantum algorithms. The formulation familiar to us today—featuring the standard quantum circuit model and oracle—was later refined in 1998 by Richard Cleve, Artur Ekert, Chiara Macchiavello, and Michele Mosca in their paper Quantum algorithms revisited. Both papers are highly readable, and we encourage interested readers to consult them directly.

^{Fig.1 David Deutsch(Photo Credits: Lulie Tanett, see this link) and Richard Jozsa (Photo from this link)}

As a deterministic quantum algorithm, the Deutsch-Jozsa algorithm has limited direct practical applications. Its significance lies in demonstrating, for the first time, that quantum computation can achieve an exponential speedup over all deterministic classical algorithms. Studying this algorithm offers valuable insight into how quantum algorithms are structured and why they can outperform classical methods.

It is worth noting that this exponential separation holds only when compared with deterministic classical algorithms. If classical randomness is allowed, such an exponential advantage no longer appears. The first quantum algorithm to establish an exponential speedup over any classical stochastic algorithm is Simon’s algorithm, see this post.

Definition (Problem of the Deutsch-Jozsa Algorithm). Consider a function

$f: \{0,1\}^n \to \{0,1\}$

that accepts $n$ -bit binary inputs and outputs either 0 or 1. It is guaranteed that $f$ is either constant (producing the same output for all inputs) or balanced (outputting 1 for exactly half of the inputs and $0$ for the other half). The objective is to determine whether $f$ is constant or balanced by querying the oracle.

For example, consider a one-bit input function $f: \{0,1\} \to \{0,1\}$ . The function is constant if $f(0) = f(1)$ , and balanced if $f(0) \neq f(1)$ (equivalently, $f(0) \oplus f(1) = 1$ , where $\oplus$ denotes addition modulo 2). Determining whether $f$ is constant or balanced is thus equivalent to evaluating $f(0) \oplus f(1)$ . This can also be viewed as a quantum XOR game: if the result is 0, then $f$ is constant; otherwise, $f$ is balanced.

Deutsch-Jozsa Algorithm: n=1 Case

Let us begin with the simplest example, namely the $n=1$ case of the Deutsch-Jozsa algorithm, to gain some intuition about how quantum algorithms work.

Classically, we would need to evaluate $f$ twice to solve the problem, namely compute $f(0)$ and $f(1)$ . There is no way around this. Quantum mechanically, we can do better. To begin, we need to “promote” the function $f$ to a unitary operation $U_f$ , defined by

$U_f |x\rangle |y\rangle = |x\rangle |y \oplus f(x)\rangle$

We do not concern ourselves here with how to implement $U_f$ in an actual quantum circuit. Instead, we treat it as a black box that can be used freely. Such a black box is called a quantum oracle, and we will encounter this concept frequently in many other quantum algorithms, like Simon’s algorithm, Grover search algorith, Shor’s algorithm, etc.

Exercise. As a simple practice, prove that $U_f$ defined above is a unitary operator for both balanced and constant functions $f$ .

Proof. Consider the action of $U_f$ on the four computational basis states:

$|0\rangle |0\rangle \mapsto |0\rangle |0 \oplus f(0)\rangle$

$|0\rangle |1\rangle \mapsto |0\rangle |1 \oplus f(0)\rangle$

$|1\rangle |0\rangle \mapsto |1\rangle |0 \oplus f(1)\rangle$

$|1\rangle |1\rangle \mapsto |1\rangle |1 \oplus f(1)\rangle$

Notice that $0 \oplus f(0) \neq 1 \oplus f(0)$ regardless of the value of $f(0)$ , and similarly $0 \oplus f(1) \neq 1 \oplus f(1)$ .

Hence, the images of the basis states under $U_f$ form an orthonormal basis, which shows that $U_f$ is indeed unitary. ∎

With the quantum oracle $U_f$ , the Deutsch-Jozsa algorithm works as follows.
We prepare two qubits: one for the input of $f$ and one for the output. They are initialized in the state $|0\rangle |1\rangle$ . Then we perform the following steps:

Apply Hadamard gates to both qubits: $(H \otimes H) \, |0\rangle |1\rangle$ .
Apply the oracle: $(U_f \otimes I) \big( (H \otimes H) \, |0\rangle |1\rangle \big)$ .
Apply a Hadamard gate to the first qubit: $(H \otimes I) \, (U_f \otimes I) \big( (H \otimes H) \, |0\rangle |1\rangle \big)$ .
Measure the first qubit in the computational basis.

Let us examine what happens step by step. After steps 1 and 2, we obtain

$|0\rangle |1\rangle \overset{H\otimes H}{\longrightarrow} |+\rangle |-\rangle = \frac{1}{2} \Big( |0\rangle (|0\rangle - |1\rangle) + |1\rangle (|0\rangle - |1\rangle) \Big)$

$\overset{U_f\otimes I}{\longrightarrow} \frac{1}{2}\Big( |0\rangle \big( |0 \oplus f(0)\rangle - |1 \oplus f(0)\rangle \big) + |1\rangle \big( |0 \oplus f(1)\rangle - |1 \oplus f(1)\rangle \big) \Big).$

Next, we use a simple but important trick: we rewrite the action of the oracle so that the value of $f(x)$ appears as a phase. The key identity is

$|0 \oplus z\rangle - |1 \oplus z\rangle = (-1)^z \big( |0\rangle - |1\rangle \big),$

valid for $z = 0,1$ . The state after applying the oracle can be rewritten as

$\frac{1}{2} \Big( |0\rangle (|0 \oplus f(0)\rangle - |1 \oplus f(0)\rangle) + |1\rangle (|0 \oplus f(1)\rangle - |1 \oplus f(1)\rangle) \Big)$

$= \frac{1}{2} \Big( (-1)^{f(0)} |0\rangle (|0\rangle - |1\rangle) + (-1)^{f(1)} |1\rangle (|0\rangle - |1\rangle) \Big)$

$= (-1)^{f(0)} \frac{1}{2} \Big( |0\rangle + (-1)^{f(0)\oplus f(1)} |1\rangle \Big) (|0\rangle - |1\rangle).$

Ignoring the second qubit and the global phase, the first qubit is in the state

$\frac{1}{\sqrt{2}} \Big( |0\rangle + (-1)^{f(0) \oplus f(1)} |1\rangle \Big).$

^{Fig.2 Quantum circuit for Deutsch–Jozsa algorithm (n=1 case).}

Our goal is to determine $f(0) \oplus f(1)$ , which is encoded in the relative phase $(-1)^{f(0) \oplus f(1)}$ of the first qubit. To read out this information, we apply a Hadamard gate to the first qubit and then measure it in the computational basis. This is an example of an interferometric measurement scheme, a standard method for extracting the coefficient of a state in a given basis; you will encounter this technique in many other contexts as well.

After applying the Hadamard, the state of the first qubit becomes

$\frac{1}{2} \Big( |0\rangle + |1\rangle + (-1)^{f(0)\oplus f(1)} |0\rangle - (-1)^{f(0)\oplus f(1)} |1\rangle \Big)$

$= \frac{1}{2} \Big( \big(1 + (-1)^{f(0)\oplus f(1)}\big) |0\rangle + \big(1 - (-1)^{f(0)\oplus f(1)}\big) |1\rangle \Big).$

The probabilities of measuring the first qubit in the computational basis are then

$p(0) = \frac{1}{4} \big| 1 + (-1)^{f(0) \oplus f(1)} \big|^2,$
$p(1) = \frac{1}{4} \big| 1 - (-1)^{f(0) \oplus f(1)} \big|^2.$

Hence, if $f(0)\oplus f(1) = 0$ (the function is constant), we will certainly measure $|0\rangle$ .
If $f(0)\oplus f(1) = 1$ (the function is balanced), we will measure $|1\rangle$ .

In other words, a single query to the quantum oracle suffices to determine with certainty whether $f$ is constant or balanced. In contrast, a classical algorithm would require evaluating $f$ twice.

The key mechanism at work here is quantum parallelism: in step 2, the oracle evaluates ff f across both inputs in parallel, storing the outcomes in superposition. The concluding Hadamard transformation then translates this superposed information into observable probability amplitudes, from which we can efficiently determine the answer.

Deutsch-Jozsa Algorithm

Now let us turn to the general case in which $f$ has an $n$ -bit input. As in the $n = 1$ case, the classical oracle must be replaced by a quantum oracle $U_f$ that coherently evaluates the function. By definition, the quantum oracle acts as

$U_f\,|x\rangle |y\rangle = |x\rangle\, |y \oplus f(x)\rangle,$

where $x = x_1 \cdots x_n$ is an $n$ -bit string, and we write

$|x\rangle := |x_1\rangle \cdots |x_n\rangle.$

This means we use the canonical encoding that maps each bit string to a corresponding computational basis state.

Exercise. Prove that the quantum oracle $U_f$ defined by $U_f\,|x\rangle|y\rangle = |x\rangle|y\oplus f(x)\rangle$ is unitary.

Proof. The operator $U_f$ permutes the computational basis states. Indeed, for each basis vector $|x\rangle|y\rangle$ the image $|x\rangle|y\oplus f(x)\rangle$ is again a computational-basis vector, and distinct input basis vectors have distinct images because $y\mapsto y\oplus f(x)$ is a bijection on ${0,1}$. Hence the set of images of the computational basis is another orthonormal basis of the Hilbert space. A linear map that sends an orthonormal basis to an orthonormal basis preserves inner products and therefore is unitary. Thus $U_f$ is unitary (Equivalently, one can note that $U_f^2 = I$ since $(y\oplus f(x))\oplus f(x)=y$ , so $U_f$ is invertible with $U_f^{-1}=U_f$ , and a permutation matrix with real entries has its inverse equal to its adjoint; either way $U_f$ is unitary.) ∎

The quantum circuit for the Deutsch–Jozsa algorithm is shown in Figure 3. The algorithm proceeds as follows:

Initial State: Prepare an n-qubit input register in the state $|0\rangle^{\otimes n}$ and a single output qubit in the state $|1\rangle$ :
$|0\rangle^{\otimes n} \otimes |1\rangle$ .
Hadamard Transform: Apply the Hadamard gate to each qubit:
$\frac{1}{\sqrt{2^n}} \sum_{x \in \{0,1\}^n} |x\rangle \otimes \frac{1}{\sqrt{2}} (|0\rangle - |1\rangle)$ .
Oracle Query: Apply the quantum oracle $U_f$ :
$U_f |x\rangle |y\rangle = |x\rangle |y \oplus f(x)\rangle$ .
After applying the oracle, the state becomes
$\frac{1}{\sqrt{2^n}} \sum_{x \in \{0,1\}^n} (-1)^{f(x)} |x\rangle \otimes \frac{1}{\sqrt{2}} (|0\rangle - |1\rangle)$ ,
using the identity
$|0 \oplus f(x)\rangle - |1 \oplus f(x)\rangle = (-1)^{f(x)} (|0\rangle - |1\rangle)$ .
The output qubit is no longer entangled with the input register, so we can ignore it, leaving
$\frac{1}{\sqrt{2^n}} \sum_{x \in \{0,1\}^n} (-1)^{f(x)} |x\rangle$ .
Second Hadamard Transform: Apply $H^{\otimes n}$ to the input register:
$H^{\otimes n} |x\rangle = \frac{1}{\sqrt{2^n}} \sum_{z \in \{0,1\}^n} (-1)^{x \cdot z} |z\rangle$ ,
giving
$\frac{1}{2^n} \sum_{z \in \{0,1\}^n} \left( \sum_{x \in \{0,1\}^n} (-1)^{f(x) + x \cdot z} \right) |z\rangle$ .
Measurement: Measure the n-qubit input register. The outcome reveals the type of function:

If $f$ is constant, interference ensures only $|0\rangle^{\otimes n}$ appears.
If $f$ is balanced, destructive interference prevents $|0\rangle^{\otimes n}$ from appearing.

Measuring $|0\rangle^{\otimes n}$ indicates a constant function; any other outcome indicates a balanced function.

Exercise: Verify that the states at each step of the n-qubit Deutsch-Jozsa algorithm match the evolution described above for a given function $f$ .

Classically, if we want to solve the problem by brute force, we need to check about half of the inputs, thus evaluating $f$ up to $2^{n-1}$ times in the worst case. The Deutsch-Jozsa algorithm solves the problem with only one evaluation of the oracle, compared to the exponentially growing number of evaluations required by a classical algorithm. The quantum algorithm achieves this exponential speedup by leveraging quantum superposition and interference.

Quantum Order-Finding Algorithm

The order-finding problem is a fundamental problem that plays a central role in several quantum algorithms; in particular, it underlies Shor’s celebrated factoring algorithm. Classically, order-finding is hard—no efficient classical algorithm is known—but quantum computers can solve it efficiently. Efficiently determining the order of an integer modulo $N$ allows integers to be factored in polynomial time, which would compromise RSA encryption, as we will discuss in the subsequent sections.

The order-finding problem is defined rigorously as follows:

Definition (Order-Finding Problem) Let $N \in \mathbf{Z}_+$ be a positive integer, and let $a \in \mathbf{Z}_N^{\times}$ be an integer coprime to $N$ (that is, the greatest common divisor $\gcd(a,N)$ of $a$ and $N$ equals one, $\gcd(a,N) = 1$ ). The notation $\mathbf{Z}_N^\times$ denotes the set of invertible elements in $\mathbf{Z}_N$ under multiplication—those integers that are coprime to $N$ . The order of $a$ modulo $N$ , denoted $\mathrm{ord}_N(a)$ or simply $r$ , is the smallest positive integer $r$ such that
$a^r \equiv 1 (\mathrm{mod}\, N)$
The order-finding problem is: given $a$ and $N$ , compute $r = \mathrm{ord}_N(a)$ .

At first glance, this definition may appear somewhat abstract.
Do not worry—shortly, we will unpack each of the mathematical ingredients and illustrate their meaning with concrete examples.

The quantum algorithm for order-finding exploits the periodic structure of the function

$f_{a,N}(x) \equiv a^x (\mathrm{mod}\, N) \ \ \ \ \ (1)$

which is periodic with period exactly $r$ . By preparing a quantum superposition over all possible exponents and applying the quantum phase estimation algorithm to the unitary operator

$U |y\rangle = |a y (\mathrm{mod}\, N)\rangle,$

we obtain an efficient quantum procedure that succeeds with high probability.

Preliminaries of Modular Arithmetic

Before presenting the complete quantum algorithm, let us first review a few essential results from number theory. This section may appear somewhat technical compared to others; if you prefer not to delve into the details, simply remember that for coprime integers $a$ and $N$ , there exists a positive order $r$ , and the function in Eq.(1) is periodic and turn to the next section.

An integer $N > 1$ is called a prime number if it has no nontrivial divisors; for example, $N = 2, 3, 5, 7, 11, \dots$ are prime numbers. Otherwise, $N$ is called a composite number. We use $a \mid b$ to denote that $b$ is divisible by $a$ , for instance $3 \mid 12$ , $2 \mid 6$ . Similarly, $a \nmid b$ means that $a$ does not divide $b$ , for example $3 \nmid 14$ .

Every integer $N > 1$ can be uniquely expressed as a product of prime powers:

$N = p_1^{e_1} \times p_2^{e_2} \times \cdots \times p_m^{e_m},$

where $p_1, \ldots, p_m$ are distinct prime numbers and $e_1, \ldots, e_m$ are positive integers. This result is known as the Fundamental Theorem of Arithmetic.

To obtain the prime factorization of a given number $N$ , one typically follows these steps:

Divide $N$ by $2$ repeatedly until the quotient is no longer divisible by $2$ .
Let $e_2$ be the number of times we divide by $2$ , and denote the resulting quotient by $N_2 = N / 2^{e_2}$ , for which $2 \nmid N_2$ .
For the quotient $N_2$ , proceed with the next prime, $3$ , and divide repeatedly until the quotient is no longer divisible by $3$ .
Let $e_3$ be the number of divisions, and denote the resulting quotient by $N_3 = N_2 / 3^{e_3}$ .
Continue in this manner with successive primes $5, 7, 11, \dots$ until the quotient becomes $1$ .

Exercise. Decompose $N = 84$ into its prime factors: $84 = 2^2 \times 3 \times 7$ .

The key operation we will use is modular arithmetic.
In everyday life we count linearly: 1, 2, 3, …, but in modular arithmetic we count on a circle.
Take a clock that shows only 12 hours. When the hand reaches 12, it jumps back to 0.
Mathematically, we say time is measured modulo 12.
This simple idea—wrapping around after a fixed modulus $N$ —is crucial in number theory and its applications in modern public-key cryptography.

Definition (Congruence) Fix an integer $N \ge 2$ , called the modulus.
Two integers $a$ and $b$ are congruent modulo $N$ if their difference is a multiple of $N$ :

$a \equiv b (\mathrm{mod}\, N) \quad \Longleftrightarrow \quad N \mid (a - b) \quad \Longleftrightarrow \quad \exists\, k \in \mathbf{Z} \text{ such that } a = b + kN.$

We read $a \equiv b (\mathrm{mod}\, N)$ as “ $a$ is congruent to $b$ modulo $N$ ”.

The congruence class of $a$ is usually represented by its smallest non-negative residue:

$a (\mathrm{mod}\, N) \in \mathbf{Z}_N := \{0, 1, 2, \dots, N-1\}$

It is also convenient to use $[a]$ to denote the congruence class whenever there is no ambiguity about the modulus.

Example. Let $N = 7$ . We have $\mathbf{Z}_7 :=\{0, 1, 2, \dots, 6\}$ . Then $1 \equiv 8 (\mathrm{mod}\, 7)$ and $2 \equiv 9 (\mathrm{mod}\, 7)$ . The congruence class $[0]$ consists of $\{0, 7, 14, 21, \dots\}$ , the congruence class $[1]$ consists of $\{1, 8, 15, 22, \dots\}$ , and $[2]$ consists of $\{2, 9, 16, 23, \dots\}$ , etc.

The order-finding problem considers the modular exponentiation for given integers $a$ and $N$ :

$a^x \equiv b (\mathrm{mod}\, N)$

The remainder $b$ can take values in $0, 1, \dots, N-1$ :

$a \equiv 0 (\mathrm{mod}\, N)$
$a \equiv 1 (\mathrm{mod}\, N)$
$\vdots$
$a \equiv N-1 (\mathrm{mod}\, N)$

However, not all values of $b = 0, 1, \dots, N-1$ are necessarily attainable.

For example, let $a = 2$ and $N = 6$ . We find that

$2^0 \equiv 1 (\mathrm{mod}\, 6)$
$2^1 \equiv 2 (\mathrm{mod}\, 6)$
$2^2 \equiv 4 (\mathrm{mod}\, 6)$
$2^3 \equiv 2 (\mathrm{mod}\, 6)$
$2^4 \equiv 4 (\mathrm{mod}\, 6)$
$2^5 \equiv 2 (\mathrm{mod}\, 6)$
$\vdots$

We observe a repeating pattern. In this case, the residues $0$ , $3$ , and $5$ never appear.

Note also that $a^0 \equiv 1 (\mathrm{mod}\, N)$ for any integers $a$ and $N$ . Hence, the order of $a$ modulo $N$ is defined as the smallest positive integer $r$ such that $a^r \equiv 1 (\mathrm{mod}\, N)$ . However, we must still ensure that such an order actually exists. In fact, if $\gcd(a, N) = d > 1$ , then no positive integer $r$ satisfies $a^r \equiv 1 (\mathrm{mod}\, N)$ . This can be seen in the previous example with $a = 2$ and $N = 6$ , where the sequence never returns to $1$ modulo $6$ .

Exercise. Let $N > 1$ and $\gcd(a, N) = d > 1$ . Then the only integer $r$ satisfying

$a^r \equiv 1 (\mathrm{mod}\, N)$

is $r = 0$ .

Proof. Suppose there exists an integer $r$ such that $a^r \equiv 1 (\mathrm{mod}\, N)$ .
Then $N$ divides $a^r - 1$ , so in particular

$d \mid (a^r - 1)$ .

Since $d \mid a$ , we also have $d \mid a^r$ , and therefore

$d \mid (a^r - 1) \implies d \mid 1$ ,

which contradicts $d > 1$ . Hence no such $r \neq 0$ exists.

For $r = 0$ , the convention $a^0 = 1$ gives

$a^0 = 1 \equiv 1 (\mathrm{mod}\, N)$

trivially. Thus $r = 0$ is the only solution ∎.

In Shor’s algorithm, we either assume $\gcd(a, N) = 1$ , or first compute the greatest common divisor classically, which already yields a nontrivial factor of $N$ if $d > 1$ .

Another useful concept we will use is Euler’s Totient function: $\varphi(N)$ counts the number of integers $k \in {1, 2, \dots, N-1}$ coprime to $N$ .
For example, $\varphi(1) = 1$ , $\varphi(2) = 1$ , $\varphi(3) = 2$ , $\varphi(4) = 2$ , etc.

For fixed $N$ , consider $\mathbf{Z}_N$ .
We call $a \in \mathbf{Z}_N$ invertible if there exists an integer $c$ such that $a c \equiv 1 (\mathrm{mod}\, N)$ .

Proposition (Bézout’s Identity). Let $a$ and $b$ be integers with greatest common divisor $d = \gcd(a,b)$ . Then there exist integers $x$ and $y$ such that

$a x + b y = d$ .

Moreover, the integers of the form $a z + b t$ are exactly the multiples of $d$ .

The set $\mathbf{Z}_N$ is not a group under multiplication modulo $N$ in general. We denote by $(\mathbf{Z}_N^{\times}, \times)$ the set of invertible elements, which indeed forms a group. By Bézout’s identity, $\mathbf{Z}_N^{\times}$ consists of all integers coprime to $N$ , and this group has order $\phi(N)$ , where $\phi$ denotes Euler’s totient function.

From a group-theoretic perspective, for any $a \in \mathbf{Z}_N^{\times}$ , there exists a positive integer $r$ such that

$a^r \equiv 1 (\mathrm{mod}\, N)$ .

Since $a$ must be invertible, it follows that $\gcd(a, N) = 1$ . In the multiplicative group $\mathbf{Z}_N^{\times}$ , the order of any element divides the order of the group, that is,

$r \mid |\mathbf{Z}_N^{\times}| = \phi(N)$ .

Theorem (Euler’s Theorem). If $\gcd(a, N) = 1$ , then

$a^{\phi(N)} \equiv 1 (\mathrm{mod}\, N)$

and hence the order of $a$ modulo $N$ divides $\phi(N)$ .

By Euler’s theorem, for integers $a$ and $N$ with $\gcd(a, N) = 1$ , there always exists a positive integer $r$ such that

$a^r \equiv 1 (\mathrm{mod}\, N).$

The order $r = \mathrm{ord}_N(a)$ always satisfies $1 \le r \le \phi(N)$ .

Since in the cyclic subgroup generated by $[a] \in \mathbf{Z}_N^{\times}$ we have

${[a^0], [a^1], [a^2], \dots, [a^{r-1}]}, \quad \text{with } [a^r] = [a^0] = [1],$

the sequence $a^n (\mathrm{mod}\, N)$ is periodic with period equal to the order $r$ :

$a^0 \equiv 1 (\mathrm{mod}\, N)$
$a^1 \equiv a (\mathrm{mod}\, N)$
$\dots$
$a^{r-1} \not\equiv 1 (\mathrm{mod}\, N)$
$a^r \equiv 1 (\mathrm{mod}\, N)$
$a^{r+1} \equiv a (\mathrm{mod}\, N)$
$a^{r+2} \equiv a^2 (\mathrm{mod}\, N)$
$a^{r+3} \equiv a^3 (\mathrm{mod}\, N)$

Thus, the sequence repeats with period $r$ .

Example. Set $a = 2$ and $N = 7$ . We have $f_{a,N}(x) = 2^x (\mathrm{mod}\, 7)$ as:

$2^0 \equiv 1 (\mathrm{mod}\, 7)$
$2^1 \equiv 2 (\mathrm{mod}\, 7)$
$2^2 \equiv 4 (\mathrm{mod}\, 7)$
$2^3 \equiv 8 \equiv 1 (\mathrm{mod}\, 7)$
$2^4 \equiv 16 \equiv 2 (\mathrm{mod}\, 7)$
$2^5 \equiv 32 \equiv 4 (\mathrm{mod}\, 7)$
$2^6 \equiv 64 \equiv 1 (\mathrm{mod}\, 7)$
$2^7 \equiv 128 \equiv 2 (\mathrm{mod}\, 7)$
$2^8 \equiv 256 \equiv 4 (\mathrm{mod}\, 7)$

We see that the order is $r = \mathrm{ord}_7(2) = 3$ .

Quantum Order-Finding Algorithm

The quantum order-finding algorithm can be divided into two main steps:

Encode the order into the phase of an eigenvalue of a suitable unitary operator, thereby reducing the order-finding problem to a quantum phase estimation problem.
Recover the order $r$ from the estimated phase $\varphi$ using the method of continued fractions.

Let us first examine how to encode the order into the phase of a unitary operator.

Fix integers $a$ and $N$ such that $\gcd(a, N) = 1$ .
Define the unitary operator $U_{a,N}$ acting on computational basis states $|y\rangle$ by

$U_{a,N} |y\rangle = | a y (\mathrm{mod}\, N) \rangle, \quad y = 0, 1, \dots, N-1$

We will see that the order can be mapped to the phase of an eigenvalue of $U_{a,N}$ .

Proposition. Let $N \ge 2$ and let $a$ be an integer coprime to $N$ .
The operator $U_{a,N}$ on $\mathbf{C}^N$ defined by

$U_{a,N} |y\rangle = | a y (\mathrm{mod}\, N) \rangle, \quad y = 0, 1, \dots, N-1$

is unitary.

Proof. Since $\gcd(a, N) = 1$ , multiplication by $a$ modulo $N$ defines a bijection on the set ${0, 1, \dots, N-1}$ .
Thus $U_{a,N}$ merely permutes the computational basis ${ |0\rangle, |1\rangle, \dots, |N-1\rangle }$ . Any permutation of an orthonormal basis extends to a unitary operator, so $U_{a,N}$ is unitary.

Alternatively, suppose $U_{a,N} |y\rangle = U_{a,N} |y'\rangle$ . Then

$a y \equiv a y' (\mathrm{mod}\, N) \implies a (y - y') \equiv 0 (\mathrm{mod}\, N) \implies N \mid a (y - y')$ .

Because $\gcd(a, N) = 1$ , we may cancel $a$ (more precisely, multiply both sides by the modular inverse of $a$ modulo $N$ ) to obtain

$N \mid (y - y')$ .

Since $0 \le y, y' < N$ , the only possibility is $y - y' = 0$ , so $y = y'$ .
Thus $U_{a,N}$ is injective on a basis of $\mathbf{C}^N$ .
Being an isometry that maps a basis to a basis, it is unitary ∎.

Proposition. Let $r$ be the order of $a$ modulo $N$ , i.e., the smallest positive integer such that

$a^r \equiv 1 (\mathrm{mod}\, N)$ .

For each $s = 0, 1, \dots, r-1$ , define the state

$|u_s\rangle := \frac{1}{\sqrt{r}} \sum_{k=0}^{r-1} e^{-2\pi i s k / r} \, | a^k (\mathrm{mod}\, N) \rangle.$

Then, for each $s = 0, 1, \dots, r-1$ , $|u_s\rangle$ is an eigenstate of $U_{a,N}$ with eigenvalue

$e^{2\pi i s / r}$ .

Proof. Apply $U_{a,N}$ to $|u_s\rangle$ :

$U_{a,N} |u_s\rangle = \frac{1}{\sqrt{r}} \sum_{k=0}^{r-1} e^{-2\pi i s k / r} \, U_{a,N} | a^k (\mathrm{mod}\, N) \rangle$

$\quad\; = \frac{1}{\sqrt{r}} \sum_{k=0}^{r-1} e^{-2\pi i s k / r} \, | a^{k+1} (\mathrm{mod}\, N) \rangle$

Reindex the sum by letting $k \mapsto k+1$ :

$\quad\; = \frac{1}{\sqrt{r}} \sum_{k=1}^{r} e^{-2\pi i s (k-1) / r} \, | a^k (\mathrm{mod}\, N) \rangle$

$\quad\; = \frac{1}{\sqrt{r}} \sum_{k=1}^{r} e^{-2\pi i s k / r} \, e^{2\pi i s / r} \, | a^k (\mathrm{mod}\, N) \rangle$

$\quad\; = e^{2\pi i s / r} \frac{1}{\sqrt{r}} \sum_{k=1}^{r} e^{-2\pi i s k / r} \, | a^k (\mathrm{mod}\, N) \rangle$

The sum now runs from $k = 1$ to $r$ , but the $k = r$ term is

$e^{-2\pi i s r / r} \, | a^r (\mathrm{mod}\, N) \rangle = e^{-2\pi i s} \, | 1 (\mathrm{mod}\, N) \rangle = | 1 (\mathrm{mod}\, N) \rangle,$

while the $k = 0$ term in the original definition of $|u_s\rangle$ is

$e^0 \, | a^0 (\mathrm{mod}\, N) \rangle = | 1 (\mathrm{mod}\, N) \rangle.$

Thus the missing $k = 0$ term is exactly replaced by the $k = r$ term, and we obtain

$U_{a,N} |u_s\rangle = e^{2\pi i s / r} \frac{1}{\sqrt{r}} \sum_{k=0}^{r-1} e^{-2\pi i s k / r} \, | a^k (\mathrm{mod}\, N) \rangle = e^{2\pi i s / r} |u_s\rangle.$

This holds for every $s = 0, 1, \dots, r-1$ ∎.

Now the procedure becomes “straightforward”. We apply quantum phase estimation (see this post) to the operators $U_a^{2^j}$ with the eigenstate $|u_s\rangle$ , which yields the phases

$\theta_s = \frac{s}{r}, \quad s = 0, 1, \dots, r-1$

The output of the phase estimation algorithm is thus

$0.\theta_0 \theta_1 \dots \theta_{n}$

represented as binary numbers. A classical algorithm called continued fractions can be applied to obtain the order $r$ .

Quantum Phase Estimation

A key application of the quantum Fourier transform (see this post) is the Quantum Phase Estimation algorithm. This algorithm allows one to estimate the phase (or eigenvalue) of a unitary operator and serves as a central component of many quantum algorithms, including Shor’s factoring algorithm and quantum simulation.

Consider a unitary matrix $U$ . Since $U^\dagger U = I$ , all eigenvalues of $U$ have the form $e^{2\pi i \theta}$ with $0 \leq \theta < 1$ . Expressing $\theta$ in binary gives

$\theta = 0.\theta_1 \theta_2 \theta_3 \cdots = \sum_{k=1}^\infty \theta_k \, 2^{-k},$

where each $\theta_k \in {0,1}$ . The task is to estimate $\theta$ to $n$ binary digits of precision.

More precisely, the quantum phase estimation algorithm aims to determine the phase $\theta$ in the eigenvalue equation

$U |\psi\rangle = e^{2\pi i \theta} |\psi\rangle$ ,

where $U$ is the given unitary operator, $|\psi\rangle$ is an eigenstate of $U$ that can be prepared as an input state, and $\theta$ is the phase to be determined. We further assume that controlled- $U^{2^l}$ operations can be implemented for arbitrary non-negative integer $l$ . The goal is to estimate $\theta$ to the desired number of digits with high probability.

Now suppose that we express $\theta$ in $n$ binary digits,

$\theta = 0.\theta_1\theta_2\cdots \theta_n$ .

Then, we have

$U^{2^l} \lvert \psi \rangle = e^{2 \pi i 0.\theta_1\theta_2\cdots\theta_n 2^l} \lvert \psi \rangle = e^{2 \pi i \theta_1\cdots\theta_l.\theta_{l+1}\cdots\theta_n} \lvert \psi \rangle$

where the integer part of $\theta_1\cdots \theta_l.\theta_{l+1}\cdots \theta_n$ can be omitted, as it contributes only a trivial phase factor of $1$ .

For the controlled unitary $C(U^{2^l})$ , its action on $|+\rangle|\psi\rangle$ (with $|+\rangle$ the control qubit) is:

$C(U^{2^l}) \frac{1}{\sqrt{2}} ( \lvert 0 \rangle + \lvert 1 \rangle ) \lvert \psi \rangle = \frac{1}{\sqrt{2}} \lvert 0 \rangle \otimes \lvert \psi \rangle + e^{2 \pi i 0.\theta_{l+1}\cdots \theta_n} \lvert 1 \rangle \otimes \lvert \psi \rangle$

$= \frac{1}{\sqrt{2}} (|0\rangle + e^{2\pi i 0.\theta_{l+1}\cdots \theta_n} |1\rangle) \otimes |\psi\rangle$

It is convenient to view this control qubit as representing the $(l+1)$-th digit of a binary number $k = k_1 k_2 \cdots k_n = k_1 \times 2^0 + k_2 \times 2^1 + \cdots + k_n \times 2^{n-1}$ .

With this notation, the above expression can be rewritten as

$C(U^{2^l}) \frac{1}{\sqrt{2}} (\lvert0\rangle + \lvert1\rangle)\lvert\psi\rangle = \frac{1}{\sqrt{2}} \sum_{k_{l+1}=0}^1 e^{2 \pi i \theta k_{l+1} 2^l} \lvert k_{l+1}\rangle \lvert\psi\rangle$

This expression for the first qubit is familiar from our earlier discussion of the quantum Fourier transform, and it naturally suggests that the quantum Fourier transform should be applied here. The eigenstate $\lvert \psi \rangle$ therefore remains unchanged and can be reused in subsequent steps.

Fig 1: quantum circuit for quantum phase estimation algorithm.

The quantum phase estimation algorithm uses two quantum registers as shown in Figure 1:

First register (phase record): Contains $n$ qubits, which will store the estimated phase.
Second register (eigenstate): Contains the eigenstate $\lvert \psi \rangle$ , which is an eigenstate of the unitary operator $U$ .

The quantum phase estimation algorithm consists of the following steps:

Prepare Initial State

The system starts with two registers:

The first register is initialized in the state $\lvert 0 \rangle^{\otimes n}$ .
The second register is initialized in the eigenstate $\lvert \psi \rangle$ of $U$ .

The initial state is: $\lvert 0 \rangle^{\otimes n} \otimes \lvert \psi \rangle$ .

Apply Hadamard Gates

Apply Hadamard gates to each qubit in the first register, thereby creating a uniform superposition over all possible computational basis states:

$\frac{1}{\sqrt{2^n}} \sum_{k=0}^{2^n-1} \lvert k \rangle \otimes \lvert \psi \rangle$

Here, $k = k_1 k_2 \cdots k_n$ denotes the binary representation of the integer $k$ , with the bits encoded from the bottom to the top qubit in the first register.

Apply Controlled- $U^{2^l}$ Operations

For each qubit $k_l$ in the first register, apply the controlled- $U^{2^l}$ operation.
This step entangles the first and second registers, yielding the state:

$\frac{1}{\sqrt{2^n}} \sum_{k=0}^{2^n-1} \lvert k \rangle \otimes U^k \lvert \psi \rangle$

Since $U \lvert \psi \rangle = e^{2 \pi i \theta} \lvert \psi \rangle$ , this becomes:

$\frac{1}{\sqrt{2^n}} \sum_{k=0}^{2^n-1} e^{2 \pi i k \theta} \lvert k \rangle \otimes \lvert \psi \rangle$

At this stage, the second register is no longer needed, as all phase information is now encoded in the first register.

Apply the Inverse Quantum Fourier Transform

Apply the inverse quantum Fourier transform ( $U_{\rm QFT}^\dagger$ ) on the first register to extract the phase information:

$\frac{1}{\sqrt{2^n}} \sum_{k=0}^{2^n-1} e^{2 \pi i k \theta} \lvert k \rangle$

Measure the First Register

Finally, measure the qubits in the first register. The measurement will yield the binary representation of the phase $\theta$ with an $n$ -bit approximation.
The output will be an $n$ -bit approximation of the phase $\theta$ , such that:

$\theta \approx 0.\theta_1 \theta_2 \dots \theta_n$

where $\theta_1, \theta_2, \dots, \theta_n$ are the binary digits of $\theta$ .