Simon’s Algorithm

Simon’s algorithm is a quantum algorithm designed to find a hidden bit string s in a particular class of functions, achieving an exponential speedup over any classical algorithm. In this setting, the hidden string s enters through a bitwise-addition relation that plays a role analogous to the period of a function. For this reason, the problem is sometimes referred to as a period-finding problem.

Simon’s algorithm holds historical significance as the first to demonstrate an exponential quantum advantage, in terms of the number of oracle queries, over any classical algorithm, including probabilistic ones. It revealed how quantum computation can surpass classical methods in well-defined computational tasks. The algorithm was proposed by Daniel R. Simon in 1994 in the paper On the Power of Quantum Computation.

Simon later recounted the story behind his algorithm in a blog article (Grant Salton, Daniel Simon, and Cedric Lin, Exploring Simon’s Algorithm with Daniel Simon, October 11, 2021.):

“I submitted my algorithm to a theoretical computer science conference (STOC 1993), but it was rejected. However, Peter Shor was on the program committee for that conference, and immediately saw the potential (that I had had absolutely no inkling of, to be honest) for applying the same general algorithm structure to concrete problems like factoring and discrete logarithms. After trying unsuccessfully to persuade the committee to accept my paper, he got to work on his own, and submitted it to the next major theoretical computer science conference (FOCS 1993)—in parallel with mine, resubmitted. We had in fact agreed to merge the papers if only his was accepted, but the committee fortunately agreed to accept them both, giving us each credit for our respective portions of the resulting achievement.”

From this account, we see that Simon’s algorithm served as a precursor to Peter Shor’s celebrated factoring algorithm. It laid the conceptual foundation for many of the seminal advances in quantum computation that followed. The problem addressed by Simon’s algorithm can be rigorously stated as follows:

Simon’s problem. We are given oracle access to a function f: \{0,1\}^n \to \{0,1\}^n, together with the promise that there exists a bit string s \in \{0,1\}^n such that f(x) = f(y) if and only if y = x or y = x \oplus s. The task is to determine s. If s = 0^n, the function is one-to-one; otherwise it is two-to-one.

The secret bit string s is also referred to as a mask (a term commonly used in computer science) and, since it is applied to the inputs via bitwise XOR, it is often called an XOR mask. It is important to distinguish between periodicity under ordinary addition and periodicity under bitwise addition modulo two, as the difference can be subtle for first-time learners. For real-valued periodic functions, if T is a period, then 2T = T + T is also a period. Likewise, if we treat a bit string s as a binary number and the function is periodic under ordinary addition, then sums such as s + s, s + s + s, and so forth are also periods. In contrast, for functions defined using bitwise addition modulo two, any bit string s satisfies s \oplus s = 0. This property leads to an important consequence: if a nonzero period exists for a two-to-one function of this kind, it must be unique.
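
As a concrete illustration, take the three-bit string s = 011 (a mask chosen purely for this example). Under bitwise addition modulo two,

s \oplus s = 000, \qquad s \oplus s \oplus s = s,

so repeated addition of s only ever produces 000 or s itself, in contrast to the unbounded sequence of periods T, 2T, 3T, \ldots of a real-valued periodic function.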

In the classical setting, we can find the secret XOR mask s by identifying a collision—a pair of inputs that produce the same output. More precisely, we query the oracle by sending a bit string x_1 as input and observing the corresponding output f(x_1). We then repeat this process for another input x_2, obtaining f(x_2), and continue in this manner for additional queries. By keeping a record of the function values, we eventually find a pair of input strings x_i and x_j such that f(x_i) = f(x_j). According to the defining property of Simon’s problem, this equality implies that the two inputs differ by the secret string, i.e., x_j = x_i \oplus s. Using the property that any bit string satisfies x \oplus x = 0, we can verify that \displaystyle s = x_i \oplus x_j. Thus, the secret string s can be recovered.
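
The following short Python sketch illustrates this classical collision search on a toy instance. The function simon_f, the three-bit mask SECRET_S, and the helper find_mask_by_collision are made up purely for illustration and are not part of the original discussion; any function satisfying the promise of Simon’s problem would do.

```python
# Classical collision search for a toy instance of Simon's problem.
# The hidden mask and the function below are illustrative choices.

SECRET_S = 0b011  # hidden XOR mask (n = 3)

def simon_f(x: int) -> int:
    """A two-to-one function with f(x) = f(x ^ SECRET_S): map each pair {x, x ^ s}
    to a single representative (here, the smaller of the two)."""
    return min(x, x ^ SECRET_S)

def find_mask_by_collision(f, n: int) -> int:
    seen = {}                      # output value -> first input that produced it
    for x in range(2 ** n):        # query the oracle input by input
        w = f(x)
        if w in seen:              # collision found: f(x) == f(seen[w])
            return x ^ seen[w]     # the two colliding inputs differ exactly by s
        seen[w] = x
    return 0                       # no collision: f is one-to-one, so s = 0

print(find_mask_by_collision(simon_f, 3))  # prints 3, i.e. binary 011
```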

One may now ask how many queries are required to solve Simon’s problem—that is, to determine the secret XOR mask s. We have seen that s can be obtained by finding a collision pair, two distinct inputs x_i and x_j such that f(x_i) = f(x_j). The question, then, becomes: how many queries does it take to find such a pair?

A straightforward approach is to query the oracle for every possible input. Since f is defined on bit strings of length n, there are 2^n possible inputs in total. By exhaustively querying all of them, we are guaranteed to find a matching pair and thus recover s. However, this brute-force strategy requires up to 2^n queries. In fact, because f is a two-to-one function, its outputs take only 2^{n-1} distinct values, so by the pigeonhole principle a collision must appear after at most 2^{n-1} + 1 queries. For example, with n = 3 there are only 4 distinct output values, so 5 queries always suffice to find a collision and determine s.

Can we do better than brute-force querying? The answer is yes, if we query f with random inputs, but the improvement is limited. By reasoning similar to that used in the birthday paradox (which considers the probability that at least two people in a group share the same birthday), we can draw an analogy to Simon’s problem. Roughly speaking, the total number of bit strings, 2^{n}, plays the role of the number of days in a year, and the number of queries corresponds to the number of people. More precisely, suppose we query one input x and then another, randomly chosen input y. The probability that they form a collision is p[f(x) = f(y)] = \frac{1}{2^n - 1}, since y collides with x only when y = x \oplus s, which is one of the 2^n - 1 bit strings we have not yet queried. If we query k inputs, there are \binom{k}{2} pairs of inputs. Hence, the probability of finding at least one collision is approximately p_{\rm suc} \sim \frac{\binom{k}{2}}{2^n - 1}. Requiring a reasonable success probability, for example p_{\rm suc} \ge \tfrac{2}{3}, a simple calculation (shown below) gives roughly k = O(2^{n/2}) queries. This quadratic improvement, from 2^n down to about 2^{n/2} queries, arises from the probabilistic nature of collisions: as we collect outputs of f, the number of pairs, and hence the chance of a repeated value, grows roughly with the square of the number of queries.
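
For completeness, the simple calculation goes as follows. Writing \binom{k}{2} = \frac{k(k-1)}{2}, the requirement p_{\rm suc} \ge \tfrac{2}{3} becomes

\frac{k(k-1)}{2(2^n - 1)} \ge \frac{2}{3} \;\Longrightarrow\; k(k-1) \ge \frac{4}{3}\,(2^n - 1) \;\Longrightarrow\; k \gtrsim \sqrt{\tfrac{4}{3}\, 2^n} = O(2^{n/2}).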

However, Simon’s algorithm performs significantly better, achieving an exponential speedup over these classical approaches. It requires a quantum oracle U_f, given to us as a black box, which has the same form as the oracle used in the Deutsch-Jozsa algorithm: U_f|x\rangle |y\rangle =|x\rangle |y\oplus f(x)\rangle. Here x, y, and f(x) are n-bit strings stored in n-qubit registers, and the bitwise addition \oplus indicates that y and f(x) are combined bit by bit modulo two rather than added as ordinary binary numbers.
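
To make the action of U_f on computational basis states concrete, the following sketch emulates it classically on integers standing in for bit strings. The helper apply_Uf is made up for this illustration, and simon_f and SECRET_S are the same toy choices as in the earlier collision-search snippet; this shows the oracle’s action, not a quantum implementation.

```python
# Classical emulation of the oracle action U_f |x>|y> = |x>|y XOR f(x)>
# on computational basis states, using the toy simon_f from the earlier sketch.

SECRET_S = 0b011

def simon_f(x: int) -> int:
    return min(x, x ^ SECRET_S)        # two-to-one with mask SECRET_S

def apply_Uf(x: int, y: int) -> tuple[int, int]:
    """Map the basis state |x>|y> to |x>|y XOR f(x)>; the input register is untouched."""
    return x, y ^ simon_f(x)

# U_f is its own inverse: applying it twice returns the original basis state.
x, y = 0b101, 0b000
assert apply_Uf(*apply_Uf(x, y)) == (x, y)
print(apply_Uf(x, y))   # with y = 0, the output register ends up holding f(x)
```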

Simon’s algorithm mainly consists of two parts: (i) the quantum part, implemented by the quantum circuit shown in the following Figure (a programmatic sketch of this circuit is given below the Figure), which is run t = O(n) times; and (ii) the classical postprocessing, which operates on the measurement outcomes, runs in time polynomial in n, and determines the secret XOR mask s.

Quantum circuit for Simon’s algorithm
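
As a rough programmatic counterpart of the circuit in the Figure, the sketch below builds the same structure in Qiskit: Hadamards on the input register, an oracle for a chosen mask, Hadamards again, and measurement of the input register. The oracle construction used here (copying x with CNOTs and then XOR-ing in s controlled on one input qubit) is one common textbook-style way to realize a two-to-one f with mask s; it is an illustrative choice rather than the only possibility, the function name simon_circuit is made up, and qubit/bit ordering conventions are glossed over.

```python
from qiskit import QuantumCircuit

def simon_circuit(s: str) -> QuantumCircuit:
    """Simon circuit for an n-bit mask s given as a string like '011' (illustrative sketch)."""
    n = len(s)
    qc = QuantumCircuit(2 * n, n)        # n input qubits, n output qubits, n classical bits
    qc.h(range(n))                       # Hadamards on the input register
    for i in range(n):                   # oracle, part 1: copy x into the output register
        qc.cx(i, n + i)
    j = s.find('1')                      # oracle, part 2: controlled on one input qubit
    if j != -1:                          # where s has a 1, XOR the mask s into the output
        for i, bit in enumerate(s):      # register, making f two-to-one with period s
            if bit == '1':
                qc.cx(j, n + i)
    qc.h(range(n))                       # Hadamards on the input register again
    qc.measure(range(n), range(n))       # measure the input register: each shot yields a z with z.s = 0
    return qc

print(simon_circuit('011'))              # draws an ASCII picture of the circuit
```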

Exercise. The formula
H^{\otimes n}|x\rangle= \frac{1}{\sqrt{2^n}} \sum_{z=0}^{2^n-1} (-1)^{x \cdot z}  |z\rangle
is a crucial trick that we will use repeatedly hereinafter; prove it and remember it. Notice that we have used the convention x \cdot z = x_1 z_1 \oplus \cdots \oplus x_n z_n.
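
A quick route to the proof: for a single qubit, H|x_i\rangle = \frac{1}{\sqrt{2}}\bigl(|0\rangle + (-1)^{x_i}|1\rangle\bigr) = \frac{1}{\sqrt{2}} \sum_{z_i \in \{0,1\}} (-1)^{x_i z_i} |z_i\rangle. Taking the tensor product over all n qubits gives

H^{\otimes n}|x\rangle = \frac{1}{\sqrt{2^n}} \sum_{z_1, \ldots, z_n \in \{0,1\}} (-1)^{x_1 z_1 + \cdots + x_n z_n} |z_1 \cdots z_n\rangle = \frac{1}{\sqrt{2^n}} \sum_{z=0}^{2^n-1} (-1)^{x \cdot z} |z\rangle,

where the sign depends only on the parity x \cdot z of the exponent.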

The following is a detailed breakdown of the algorithm’s steps:

  1. Initial Setup
    Prepare two quantum registers: An n-qubit input register initialized to |0\rangle^{\otimes n}. An n-qubit output register also initialized to |0\rangle^{\otimes n}. The initial state of the system is |0\rangle^{\otimes n} |0\rangle^{\otimes n}.
  2. Apply Hadamard Gates
    Apply the Hadamard gate H to all qubits in the input register to create a superposition of all possible input states: H^{\otimes n} |0\rangle^{\otimes n} = \frac{1}{\sqrt{2^n}} \sum_{x=0}^{2^n-1} |x\rangle.
    The state now becomes \frac{1}{\sqrt{2^n}} \sum_{x=0}^{2^n-1} |x\rangle |0\rangle^{\otimes n}.
  3. Query the Oracle
    Use the oracle U_f to compute f(x) for each x: U_f |x\rangle |y\rangle = |x\rangle |y \oplus f(x)\rangle.
    After applying the oracle, the state becomes \frac{1}{\sqrt{2^n}} \sum_{x=0}^{2^n-1} |x\rangle |f(x)\rangle.
  4. Apply Hadamard Gates
    Apply the Hadamard gate H to all qubits in the input register, which gives \frac{1}{2^n} \sum_{z} \sum_x (-1)^{x \cdot z} |z\rangle |f(x)\rangle, where we have used the formula from the Exercise above.
  5. Measure the Output Register
    Measure the output register and obtain the value f(x) = w. There are two possible cases:
    One-to-One Case: If f is one-to-one, there is only a single x such that f(x) = w. The post-measurement state becomes \frac{1}{\sqrt{2^n}} \sum_{z} (-1)^{x \cdot z} |z\rangle |f(x)=w\rangle.
    Two-to-One Case: If f is two-to-one, there exist x and y = x \oplus s such that f(x) = f(y) = w. The post-measurement state is \frac{1}{\sqrt{2^n}} \sum_{z} \frac{1}{\sqrt{2}} \Big[ (-1)^{x \cdot z} + (-1)^{y \cdot z} \Big] |z\rangle |f(x) = w\rangle.
    Notice that this measurement step is not strictly necessary, as the result itself is not used in subsequent computations; however, we include it here because it simplifies the analysis and helps illustrate the underlying idea.
  6. Generate Linear Equations
    The measurement outcomes yield linear equations involving the secret string s. Now measure the input register of the state obtained in the previous step. If f is one-to-one, this yields a uniformly random bit string z. If f is two-to-one, however, the measurement returns a random bit string z satisfying x \cdot z = y \cdot z \pmod{2}, since otherwise the amplitude \displaystyle  (-1)^{x \cdot z} + (-1)^{y \cdot z} cancels out. Explicitly, (-1)^{x \cdot z} + (-1)^{y \cdot z} =\begin{cases} 0, & x \cdot z \neq y \cdot z,\\  \pm 2, & x \cdot z = y \cdot z.\end{cases}
    So only the basis states |z\rangle satisfying x \cdot z = y \cdot z \pmod{2} have nonzero amplitude. Since y = x \oplus s, we find that x \cdot z = (x \oplus s) \cdot z \pmod{2} \;\Rightarrow\; s \cdot z = 0 \pmod{2}. Thus, each measurement produces a uniformly random bit string z orthogonal to s (a small numerical sketch of this sampling step is given after this list).
  7. Repeat the Process
    Repeat steps 1–6 t = O(n) times. Each iteration yields an equation of the form z \cdot s = 0 for the measured string z. The individual z’s are not guaranteed to be linearly independent, but after O(n) repetitions we collect, with high probability, n - 1 linearly independent equations, which is enough to determine the nonzero string s (the all-zeros string trivially satisfies every equation). Compared with the classical O(2^{n/2}) queries, this is an exponential reduction in the number of queries needed.
  8. Classical Postprocessing: Solve the System of Equations
    Once a sufficient number of equations have been collected:
    z_1 \cdot s = 0, z_2 \cdot s = 0, … , z_m \cdot s = 0.
    Use classical linear algebra over the binary field, such as Gaussian elimination modulo 2, to solve this homogeneous system for the hidden string s; an illustrative solver is sketched after this list.
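
To connect the analysis above to something executable, here is a small classical simulation of one run of the quantum part; it is a sketch built on the toy simon_f and SECRET_S from the earlier snippets, and sample_z is a made-up helper, so this is an illustration of the analysis rather than an implementation for quantum hardware. Following steps 3–6, it samples an output value w, forms the superposition over the preimage of w, applies the Hadamard-transform amplitudes from the Exercise, and samples z; every returned z satisfies z \cdot s = 0 \pmod 2.

```python
import random

SECRET_S, N = 0b011, 3                        # toy mask and register size (illustrative)

def simon_f(x: int) -> int:
    return min(x, x ^ SECRET_S)               # two-to-one with mask SECRET_S

def dot(a: int, b: int) -> int:
    return bin(a & b).count("1") % 2          # bitwise inner product modulo 2

def sample_z(f, n: int) -> int:
    """Simulate one run of the circuit: return the measured value z of the input register."""
    x = random.randrange(2 ** n)              # measuring the output register picks w = f(x)
    w = f(x)
    preimage = [u for u in range(2 ** n) if f(u) == w]
    # After the final Hadamard layer, the (unnormalized) amplitude of |z> is the sum of
    # (-1)^{u . z} over the preimage of w; sample z with probability proportional to amplitude^2.
    amps = [sum((-1) ** dot(u, z) for u in preimage) for z in range(2 ** n)]
    return random.choices(range(2 ** n), weights=[a * a for a in amps], k=1)[0]

samples = [sample_z(simon_f, N) for _ in range(10)]
print(samples)
print(all(dot(z, SECRET_S) == 0 for z in samples))   # True: every z is orthogonal to s
```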
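
For the classical postprocessing of step 8, the sketch below solves the homogeneous system z_i \cdot s = 0 over the binary field by Gaussian elimination modulo 2 and returns a nonzero solution. The function solve_mask is an illustrative implementation (any linear-algebra routine over GF(2) would serve), and the final two lines reuse sample_z, simon_f, and N from the previous sketch.

```python
def solve_mask(zs: list[int], n: int) -> int:
    """Return a nonzero s (as an integer) with z . s = 0 (mod 2) for every z in zs,
    or 0 if only the trivial solution exists (i.e., f is one-to-one)."""
    rows, pivots = [], []                       # reduced rows and their pivot columns
    for z in zs:                                # forward elimination modulo 2
        for r, c in zip(rows, pivots):
            if (z >> c) & 1:
                z ^= r                          # clear pivot column c of z
        if z:
            rows.append(z)
            pivots.append(z.bit_length() - 1)   # pivot = highest set bit of the reduced row
    free = [c for c in range(n) if c not in pivots]
    if not free:
        return 0                                # full rank: only s = 0 satisfies all equations
    s = 1 << free[0]                            # set one free variable to 1 ...
    for r, c in zip(reversed(rows), reversed(pivots)):
        if bin(r & s).count("1") % 2:           # ... and flip pivot bits so that every
            s ^= 1 << c                         #     equation r . s = 0 (mod 2) holds
    return s

zs = [sample_z(simon_f, N) for _ in range(3 * N)]  # reuse the sampler from the previous sketch
print(bin(solve_mask(zs, N)))                      # with high probability prints 0b11, i.e. s = 011
```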