Chapter 2: The Mathematics of Information

Quantum computing lives at the intersection of physics and mathematics. Before we can understand qubits, superposition, or entanglement, we need a small but powerful toolkit of mathematical ideas. None of them are individually difficult, but together they form the language in which quantum computing is written.

This chapter introduces five mathematical pillars: number systems, probability, vectors, matrices, and complex numbers. If you have basic algebra under your belt, you have everything you need to follow along. We will build each idea from scratch, connect it to concrete intuition, and show you why quantum computing demands it. By the end, you will have every mathematical tool required for the rest of this textbook.

2.1 Counting and Number Systems

We grow up counting in base 10 - the decimal system - because we happen to have ten fingers. But there is nothing sacred about the number ten. A number system is just a way of representing quantities using a fixed set of symbols, and different bases turn out to be useful for different purposes.

Decimal (Base 10)

In decimal, we have ten digits: 0 through 9. The position of each digit tells you which power of 10 it represents. For example, the number 347 means:

$$347 = 3 \times 10^2 + 4 \times 10^1 + 7 \times 10^0 = 300 + 40 + 7$$

Each position is worth ten times more than the position to its right. This is so familiar that we rarely think about it, but the same principle works for any base.

Binary (Base 2)

Computers use base 2 because their fundamental building blocks - transistors - have two states: on and off, high voltage and low voltage, 1 and 0. In binary, we have only two digits (called bits), and each position represents a power of 2.

$$1011_2 = 1 \times 2^3 + 0 \times 2^2 + 1 \times 2^1 + 1 \times 2^0 = 8 + 0 + 2 + 1 = 11_{10}$$

The powers of 2 that you will see constantly are: 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024. Memorizing this sequence up to $2^{10} = 1024$ will save you time throughout this textbook.

Key Concept.

A single bit stores exactly one binary digit: 0 or 1. With $n$ bits, you can represent $2^n$ distinct values. This exponential relationship between bits and states is the seed from which quantum computing's power will grow. A qubit, as we will see, exploits this same exponential scaling in a fundamentally richer way.
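The $2^n$ count can be checked by brute force. The following is a minimal sketch; the helper name `all_bit_strings` is our own choice for illustration, not a standard function.

```python
from itertools import product

def all_bit_strings(n):
    """Return every n-bit string, in counting order."""
    return ["".join(bits) for bits in product("01", repeat=n)]

three_bit = all_bit_strings(3)
print(three_bit)       # ['000', '001', '010', '011', '100', '101', '110', '111']
print(len(three_bit))  # 8, which is 2**3
```

Each extra bit doubles the length of the list, which is the exponential growth described above.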

Hexadecimal (Base 16)

Hexadecimal (hex) uses sixteen symbols: 0-9 and A-F (where A = 10, B = 11, ..., F = 15). Hex is popular in computing because each hex digit corresponds to exactly four binary digits, making it a compact way to write binary data.

$$\text{1F}_{16} = 1 \times 16^1 + 15 \times 16^0 = 16 + 15 = 31_{10} = 11111_2$$

The conversion between hex and binary is mechanical: replace each hex digit with its four-bit binary equivalent. For instance, $\text{A3}_{16}$ becomes $1010\;0011_2$ because A = 1010 and 3 = 0011.
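The digit-by-digit replacement can be sketched in a few lines of Python. The function name `hex_to_binary` is hypothetical; `int(digit, 16)` and `format(value, "04b")` are built-in Python features.

```python
def hex_to_binary(hex_string):
    """Replace each hex digit with its four-bit binary pattern."""
    groups = [format(int(digit, 16), "04b") for digit in hex_string]
    return " ".join(groups)

print(hex_to_binary("A3"))  # 1010 0011
print(hex_to_binary("1F"))  # 0001 1111
```

Note that the leading zeros in `0001 1111` do not change the value: it is the same number as $11111_2$.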

Why Computers Use Binary, Why Programmers Use Hex

Binary is the natural language of hardware because transistors are reliable switches: clearly on or clearly off. Trying to distinguish ten different voltage levels would introduce far more errors than distinguishing just two. But binary strings get long quickly - the decimal number 255 is $11111111_2$ (eight digits). Hexadecimal compresses this to just $\text{FF}_{16}$ (two digits), because each hex digit encodes exactly one group of four bits. This makes hex a convenient shorthand for programmers who need to inspect raw binary data.

Converting Between Bases

Any base to decimal: expand the positional notation and add up the terms, as shown above.

Decimal to another base: repeatedly divide by the target base and collect the remainders from bottom to top. For example, to convert $25_{10}$ to binary:

$25 \div 2 = 12$ remainder $1$
$12 \div 2 = 6$ remainder $0$
$6 \div 2 = 3$ remainder $0$
$3 \div 2 = 1$ remainder $1$
$1 \div 2 = 0$ remainder $1$

Reading the remainders from bottom to top: $25_{10} = 11001_2$. You can verify: $1 \times 16 + 1 \times 8 + 0 \times 4 + 0 \times 2 + 1 \times 1 = 25$. Correct.

Hex to/from binary: simply replace each hex digit with its four-bit pattern (or vice versa). No division required.
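The repeated-division procedure translates almost line for line into code. This is a sketch under the assumption that the input is a non-negative integer and the base is between 2 and 16; the names `to_base` and `DIGITS` are ours.

```python
DIGITS = "0123456789ABCDEF"

def to_base(number, base):
    """Convert a non-negative integer to a digit string in the given base (2 to 16)."""
    if number == 0:
        return "0"
    remainders = []
    while number > 0:
        number, remainder = divmod(number, base)  # quotient and remainder in one step
        remainders.append(DIGITS[remainder])
    # The first remainder is the least significant digit, so read them in reverse.
    return "".join(reversed(remainders))

print(to_base(25, 2))    # 11001
print(to_base(255, 16))  # FF
print(int("11001", 2))   # 25 (Python's built-in conversion back to decimal)
```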

2.2 Probability: Quantifying Uncertainty

In everyday life, we deal with uncertainty constantly: will it rain tomorrow? What are the odds of rolling a six? Probability gives us a precise language for reasoning about uncertain outcomes, and it turns out to be inseparable from quantum mechanics. When you measure a qubit, the outcome is in general not determined in advance - quantum mechanics tells you only the probability of each result. The math of probability is therefore not optional; it is woven into the very fabric of quantum computing.

Sample Spaces and Events

A sample space $S$ is the set of all possible outcomes of an experiment. For a coin flip, $S = \{H, T\}$. For a six-sided die, $S = \{1, 2, 3, 4, 5, 6\}$. An event is any subset of the sample space. For example, "rolling an even number" is the event $\{2, 4, 6\}$.

The probability of an event $A$, written $P(A)$, is a number between 0 and 1:

$$0 \leq P(A) \leq 1$$

$P(A) = 0$ means the event is impossible. $P(A) = 1$ means it is certain. For a fair die, each face has probability $\frac{1}{6}$, and the probability of rolling an even number is:

$$P(\text{even}) = P(2) + P(4) + P(6) = \frac{1}{6} + \frac{1}{6} + \frac{1}{6} = \frac{1}{2}$$

A fundamental requirement is that the probabilities of all outcomes in the sample space must add up to 1:

$$\sum_{x \in S} P(x) = 1$$
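To see these definitions at work, here is a small sketch for the fair die. It uses Python's `fractions.Fraction` to keep the arithmetic exact; the names `P` and `probability` are our own.

```python
from fractions import Fraction

sample_space = [1, 2, 3, 4, 5, 6]
P = {face: Fraction(1, 6) for face in sample_space}  # fair die: each face has probability 1/6

def probability(event):
    """Probability of an event, given as a collection of outcomes."""
    return sum(P[outcome] for outcome in event)

even = {2, 4, 6}
print(probability(even))          # 1/2
print(probability(sample_space))  # 1
```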

Key Concept.

The rule that probabilities sum to 1 has a direct quantum analogue. When we describe a qubit's state, the squared magnitudes of its amplitudes must also sum to 1. This is called the normalization condition, and it ensures that the probabilities of all possible measurement outcomes sum to 1, so a measurement always yields exactly one of them.

Probability Rules

Three basic rules let you compute the probability of compound events:

Complement rule. The probability that an event does not happen is one minus the probability that it does:

$$P(\text{not } A) = 1 - P(A)$$

Addition rule. For two events $A$ and $B$, the probability that at least one occurs is:

$$P(A \text{ or } B) = P(A) + P(B) - P(A \text{ and } B)$$

We subtract $P(A \text{ and } B)$ to avoid double-counting outcomes that belong to both events. If $A$ and $B$ are mutually exclusive (they cannot both happen), then $P(A \text{ and } B) = 0$ and the formula simplifies to $P(A \text{ or } B) = P(A) + P(B)$.

Multiplication rule. For independent events - those where the occurrence of one does not affect the other - the probability that both occur is:

$$P(A \text{ and } B) = P(A) \times P(B)$$

For example, the probability of flipping two heads in a row with a fair coin is $\frac{1}{2} \times \frac{1}{2} = \frac{1}{4}$.
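All three rules can be verified by enumerating a small sample space. The sketch below uses two dice; the events $A$ ("first die is even") and $B$ ("second die shows six") are our own illustrative choices, picked because they are independent.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))  # 36 equally likely (first, second) pairs

def prob(event):
    """Probability of an event given as a true/false test on an outcome."""
    favorable = [o for o in outcomes if event(o)]
    return Fraction(len(favorable), len(outcomes))

def A(o):
    return o[0] % 2 == 0  # first die is even

def B(o):
    return o[1] == 6      # second die shows six

p_a, p_b = prob(A), prob(B)
p_and = prob(lambda o: A(o) and B(o))
p_or = prob(lambda o: A(o) or B(o))
p_not_a = prob(lambda o: not A(o))

print(p_not_a == 1 - p_a)         # True (complement rule)
print(p_or == p_a + p_b - p_and)  # True (addition rule)
print(p_and == p_a * p_b)         # True (multiplication rule for independent events)
```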

Conditional Probability and Bayes' Rule

Sometimes events are not independent. The conditional probability of $A$ given $B$, written $P(A \mid B)$, is the probability that $A$ occurs if we know that $B$ has occurred. It is defined whenever $P(B) > 0$:

$$P(A \mid B) = \frac{P(A \text{ and } B)}{P(B)}$$

Bayes' rule lets you "reverse" conditional probabilities. If you know $P(B \mid A)$ and want $P(A \mid B)$, Bayes' rule says:

$$P(A \mid B) = \frac{P(B \mid A) \, P(A)}{P(B)}$$

Here is a concrete example that shows why Bayes' rule matters. Suppose a medical test for a rare disease is 99% accurate: it correctly identifies someone with the disease 99% of the time (sensitivity) and correctly clears someone without it 99% of the time (specificity). The disease affects 1 in 1000 people. If you test positive, what is the probability you actually have the disease?

Let $D$ = having the disease and $+$ = testing positive. Then:

$P(D) = 0.001$ (prevalence)
$P(+ \mid D) = 0.99$ (true positive rate)
$P(+ \mid \text{no } D) = 0.01$ (false positive rate)

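The denominator $P(+)$ must count both ways of testing positive, with or without the disease: $P(+) = P(+ \mid D) \, P(D) + P(+ \mid \text{no } D) \, P(\text{no } D)$, where $P(\text{no } D) = 1 - 0.001 = 0.999$. Substituting:

$$P(D \mid +) = \frac{P(+ \mid D) \, P(D)}{P(+)} = \frac{0.99 \times 0.001}{0.99 \times 0.001 + 0.01 \times 0.999} = \frac{0.00099}{0.01098} \approx 0.09$$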

Despite the test being 99% accurate, a positive result only means about a 9% chance of having the disease. The low prevalence overwhelms the test's accuracy. This kind of counterintuitive reasoning appears in quantum measurement theory, where prior knowledge about a quantum state affects how we interpret measurement results.
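The calculation is short enough to reproduce directly. In this sketch the function name `posterior` and its parameter names are hypothetical; the denominator is $P(+)$ written out as the sum over the "disease" and "no disease" cases.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """P(D | +) from Bayes' rule, with P(+) summed over both cases."""
    p_positive = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / p_positive

p = posterior(prior=0.001, sensitivity=0.99, false_positive_rate=0.01)
print(round(p, 4))  # 0.0902
```

Try raising `prior` to 0.1: the same test then gives a far more convincing positive result, which shows how strongly the base rate drives the answer.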

Common Misconception.

Many people confuse the accuracy of a test with the probability of having the condition given a positive result. These are fundamentally different quantities. Bayes' rule teaches us that the base rate (how common the condition is) matters enormously. In quantum computing, a similar principle applies: the prior state of a qubit determines the probability distribution of measurement outcomes.

Random Variables, Expectation, and Variance

A random variable $X$ assigns a numerical value to each outcome in a sample space. For a die roll, $X$ might simply be the number showing on the face. The expected value (or mean) of $X$ is the long-run average you would observe over many repetitions:

$$E(X) = \sum_{x} x \cdot P(X = x)$$

For a fair die: $E(X) = 1 \cdot \frac{1}{6} + 2 \cdot \frac{1}{6} + \cdots + 6 \cdot \frac{1}{6} = \frac{21}{6} = 3.5$. Note that the expected value need not be a value the random variable can actually take.

The variance measures how spread out the values are around the mean:

$$\text{Var}(X) = E\!\left[(X - E(X))^2\right] = E(X^2) - [E(X)]^2$$

A small variance means the outcomes cluster tightly around the mean; a large variance means they are widely spread.
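As a check on both formulas, the sketch below computes the mean and variance of a fair die exactly, and confirms that the two expressions for the variance agree. The variable names are ours.

```python
from fractions import Fraction

faces = range(1, 7)
p = Fraction(1, 6)  # probability of each face of a fair die

mean = sum(x * p for x in faces)
mean_of_squares = sum(x**2 * p for x in faces)

variance = mean_of_squares - mean**2                        # E(X^2) - [E(X)]^2
variance_direct = sum((x - mean) ** 2 * p for x in faces)   # E[(X - E(X))^2]

print(mean)                         # 7/2
print(variance)                     # 35/12
print(variance == variance_direct)  # True
```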

Note.

In quantum computing, measurement outcomes are inherently probabilistic. The Born rule - the quantum analogue of classical probability - tells us that the probability of measuring a particular outcome equals the squared magnitude of the corresponding amplitude. The expected value of a measurement is called an expectation value in quantum mechanics, and it is computed using exactly the same formula you just learned.

Use the widget below to explore the sample space of rolling two dice. Click on cells in the grid to highlight events and see their probabilities.

2.3 Vectors: Arrows That Encode Information

A vector is one of the most useful objects in all of mathematics. At its simplest, a vector is an arrow - it has a direction and a length (magnitude). But vectors are much more than geometric arrows. They are the fundamental language for describing quantum states. When we write the state of a qubit, we write it as a vector. Understanding vectors is therefore not just helpful for quantum computing - it is essential.

Vectors in Two Dimensions

Picture a 2D coordinate plane. A vector $\vec{v}$ can be drawn as an arrow from the origin to a point, and we write it as a pair of numbers stacked vertically:

$$\vec{v} = \begin{pmatrix} v_x \\ v_y \end{pmatrix}$$

The number $v_x$ is the horizontal component and $v_y$ is the vertical component. For example, $\vec{v} = \begin{pmatrix} 3 \\ 2 \end{pmatrix}$ is an arrow pointing 3 units right and 2 units up from the origin.

Vector Addition

To add two vectors, add their corresponding components:

$$\begin{pmatrix} a_x \\ a_y \end{pmatrix} + \begin{pmatrix} b_x \\ b_y \end{pmatrix} = \begin{pmatrix} a_x + b_x \\ a_y + b_y \end{pmatrix}$$

Geometrically, this is the "tip-to-tail" rule: place the tail of $\vec{b}$ at the tip of $\vec{a}$, and the sum $\vec{a} + \vec{b}$ is the arrow from the origin to the new tip. Drawn from the same origin, $\vec{a}$ and $\vec{b}$ form two sides of a parallelogram whose diagonal is the sum - which is why this is also called the parallelogram rule.

Concrete example: $\begin{pmatrix} 3 \\ 1 \end{pmatrix} + \begin{pmatrix} 1 \\ 3 \end{pmatrix} = \begin{pmatrix} 4 \\ 4 \end{pmatrix}$.

Scalar Multiplication

Multiplying a vector by a number (a scalar) $c$ scales every component:

$$c \begin{pmatrix} v_x \\ v_y \end{pmatrix} = \begin{pmatrix} c \, v_x \\ c \, v_y \end{pmatrix}$$

If $c > 1$, the arrow gets longer. If $0 < c < 1$, it gets shorter. If $c < 0$, it flips to point the opposite way and its length is scaled by $|c|$. If $c = 0$, it collapses to the zero vector.
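Both operations work component by component, so they are one-liners in code. A minimal sketch, representing vectors as plain Python lists (the function names `add` and `scale` are our own):

```python
def add(a, b):
    """Add two vectors component by component."""
    return [x + y for x, y in zip(a, b)]

def scale(c, v):
    """Multiply every component of v by the scalar c."""
    return [c * x for x in v]

print(add([3, 1], [1, 3]))  # [4, 4]
print(scale(2, [3, 2]))     # [6, 4]
print(scale(-1, [3, 2]))    # [-3, -2]
print(scale(0, [3, 2]))     # [0, 0]
```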

Magnitude (Norm) and Direction

The magnitude (or length, or norm) of a vector $\vec{v} = \begin{pmatrix} v_x \\ v_y \end{pmatrix}$ is given by the Pythagorean theorem:

$$\|\vec{v}\| = \sqrt{v_x^2 + v_y^2}$$

For example, $\left\|\begin{pmatrix} 3 \\ 4 \end{pmatrix}\right\| = \sqrt{9 + 16} = \sqrt{25} = 5$.

The direction of a vector is the angle it makes with the positive $x$-axis. Two vectors can have the same direction but different magnitudes, or the same magnitude but different directions.

Unit Vectors and Normalization

A unit vector is a vector with magnitude 1. You can turn any nonzero vector into a unit vector by dividing by its magnitude:

$$\hat{v} = \frac{\vec{v}}{\|\vec{v}\|}$$

For instance, normalizing $\begin{pmatrix} 3 \\ 4 \end{pmatrix}$: $\hat{v} = \frac{1}{5}\begin{pmatrix} 3 \\ 4 \end{pmatrix} = \begin{pmatrix} 0.6 \\ 0.8 \end{pmatrix}$. You can verify: $\sqrt{0.6^2 + 0.8^2} = \sqrt{0.36 + 0.64} = 1$.
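The same computation in code, as a sketch with hypothetical helper names `norm` and `normalize`. The zero vector is rejected because it has no direction to preserve.

```python
import math

def norm(v):
    """Length of a vector: square root of the sum of squared components."""
    return math.sqrt(sum(component**2 for component in v))

def normalize(v):
    """Return the unit vector pointing in the same direction as v."""
    length = norm(v)
    if length == 0:
        raise ValueError("the zero vector cannot be normalized")
    return [component / length for component in v]

v = [3, 4]
print(norm(v))       # 5.0
print(normalize(v))  # [0.6, 0.8]
```

Because `norm` sums over all components, the same two functions work unchanged for vectors with more than two components.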

Key Concept.

In quantum computing, the state of a qubit is always represented by a unit vector. The requirement that the state vector has length 1 is precisely the normalization condition from probability: it ensures measurement probabilities sum to 1. This is why unit vectors are so important - they are the only physically valid quantum states.

The Dot Product (Inner Product)

The dot product of two vectors measures how much they point in the same direction:

$$\vec{a} \cdot \vec{b} = a_x b_x + a_y b_y = \|\vec{a}\| \, \|\vec{b}\| \cos\theta$$

where $\theta$ is the angle between them. The second form reveals the geometric meaning: the dot product is large and positive when vectors point in similar directions, zero when they are perpendicular, and negative when they point in roughly opposite directions.

If the dot product is zero, the vectors are orthogonal (perpendicular). For example, $\begin{pmatrix} 1 \\ 0 \end{pmatrix} \cdot \begin{pmatrix} 0 \\ 1 \end{pmatrix} = 1 \cdot 0 + 0 \cdot 1 = 0$ - these two vectors are indeed perpendicular.

In two dimensions, any two orthogonal unit vectors form a basis - a coordinate system in which any vector can be expressed as a combination of them. The standard basis vectors $\vec{e}_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $\vec{e}_2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$ are orthogonal unit vectors, and any 2D vector can be written as $\vec{v} = v_x \vec{e}_1 + v_y \vec{e}_2$.
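Here is a short sketch of the dot product and the angle it encodes. The helper names are ours; the clamp guards against tiny rounding errors that could push the cosine just outside the range $[-1, 1]$.

```python
import math

def dot(a, b):
    """Sum of products of corresponding components."""
    return sum(x * y for x, y in zip(a, b))

def angle_degrees(a, b):
    """Angle between two nonzero vectors, from a . b = |a||b| cos(theta)."""
    cos_theta = dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.degrees(math.acos(cos_theta))

e1, e2 = [1, 0], [0, 1]
v = [3, 2]

print(dot(e1, e2))                      # 0, so e1 and e2 are orthogonal
print(dot(v, e1), dot(v, e2))           # 3 2, the components of v in the standard basis
print(round(angle_degrees(e1, e2), 6))  # 90.0
```

Taking the dot product of $\vec{v}$ with each basis vector recovers its components, which is exactly the decomposition $\vec{v} = v_x \vec{e}_1 + v_y \vec{e}_2$.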

Note.

In quantum computing, the standard basis for a single qubit is $|0\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $|1\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$. These are exactly the standard basis vectors with fancy notation (called "ket" notation). Any qubit state can be written as $|\psi\rangle = \alpha|0\rangle + \beta|1\rangle$, which is a linear combination of basis vectors - the same idea you just learned.

Beyond Two Dimensions

Everything above extends naturally. A 3D vector has three components: $\vec{v} = \begin{pmatrix} v_x \\ v_y \\ v_z \end{pmatrix}$, with magnitude $\|\vec{v}\| = \sqrt{v_x^2 + v_y^2 + v_z^2}$. In quantum computing, we work with vectors that have 2, 4, 8, or in general $2^n$ components - one for each of the $2^n$ basis states of $n$ qubits. The algebra stays the same; only the number of components grows.

Use the visualizer below to explore vectors in 2D. Draw vectors, see their sum, and compute dot products.

2.4 Matrices: Transformations on Vectors

If vectors are the nouns of linear algebra, matrices are the verbs. A matrix is a rectangular grid of numbers that describes a transformation - a way of turning one vector into another. In quantum computing, every operation on a qubit (every quantum gate) is a matrix. When we apply a gate to a qubit, we are multiplying the qubit's state vector by a matrix. This is why understanding matrices is absolutely essential.

What Is a Matrix?

A matrix is a rectangular array of numbers arranged in rows and columns. A matrix with $m$ rows and $n$ columns is called an $m \times n$ matrix. For quantum computing, we will mostly work with $2 \times 2$ matrices (which act on single-qubit states). A $2 \times 2$ matrix looks like:

$$M = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$$

The entry in row $i$, column $j$ is denoted $M_{ij}$. In the matrix above, $M_{11} = a$, $M_{12} = b$, $M_{21} = c$, and $M_{22} = d$.

Matrix-Vector Multiplication: Step by Step

The most important matrix operation for us is multiplying a matrix by a vector. For a $2 \times 2$ matrix and a 2D vector:

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} ax + by \\ cx + dy \end{pmatrix}$$

Each entry of the result is a dot product of one row of the matrix with the input vector. Let us work through a concrete example. Consider the matrix $M = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}$ and the vector $\vec{v} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$:

$$M\vec{v} = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 1 \cdot 1 + 2 \cdot 0 \\ 3 \cdot 1 + 4 \cdot 0 \end{pmatrix} = \begin{pmatrix} 1 \\ 3 \end{pmatrix}$$

The matrix has transformed $(1, 0)$ into $(1, 3)$. Now let us try a rotation. The matrix that rotates vectors by 90 degrees counterclockwise is:

$$R_{90} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$$

Applying it to $\vec{v} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$:

$$R_{90}\vec{v} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$$

The vector $(1, 0)$ pointing right has been rotated to $(0, 1)$ pointing up - exactly 90 degrees counterclockwise. A matrix is a machine that eats a vector and produces a new one.
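The "row dot vector" recipe can be sketched as follows, storing a matrix as a list of rows. The function name `mat_vec` is our own choice.

```python
def mat_vec(M, v):
    """Multiply matrix M (a list of rows) by vector v: one dot product per row."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

M = [[1, 2], [3, 4]]
R90 = [[0, -1], [1, 0]]  # rotation by 90 degrees counterclockwise

print(mat_vec(M, [1, 0]))    # [1, 3]
print(mat_vec(R90, [1, 0]))  # [0, 1]
print(mat_vec(R90, [0, 1]))  # [-1, 0]
```

The last line applies the rotation once more: the vector pointing up is turned to point left, as a further 90-degree counterclockwise turn should do.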

Matrix-Matrix Multiplication

You can also multiply two matrices together. The result is a new matrix that represents doing both transformations in sequence, with the right-hand matrix acting first:

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} e & f \\ g & h \end{pmatrix} = \begin{pmatrix} ae+bg & af+bh \\ ce+dg & cf+dh \end{pmatrix}$$

Each entry of the result is a dot product of a row from the first matrix with a column from the second.

Common Misconception.

Matrix multiplication is not commutative. In general, $AB \neq BA$. The order matters. Rotating then scaling is different from scaling then rotating. In quantum computing, this means the order in which you apply gates matters.

Key Concept.

In quantum computing, applying gate $A$ followed by gate $B$ is represented by the matrix product $BA$ (note the reversed order - the gate applied first goes on the right, closest to the state vector). We evaluate $B(A|\psi\rangle) = (BA)|\psi\rangle$.
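A small experiment makes the ordering concrete. In this sketch, `R` is the 90-degree rotation from earlier and `S` is a horizontal stretch that we chose purely for illustration; `mat_mul` and `mat_vec` are hypothetical helper names.

```python
def mat_mul(A, B):
    """Matrix product AB: each entry is a row of A dotted with a column of B."""
    columns_of_B = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in columns_of_B] for row in A]

def mat_vec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

R = [[0, -1], [1, 0]]  # rotate 90 degrees counterclockwise
S = [[2, 0], [0, 1]]   # stretch horizontally by a factor of 2

print(mat_mul(S, R))  # [[0, -2], [1, 0]]  (rotate first, then stretch)
print(mat_mul(R, S))  # [[0, -1], [2, 0]]  (stretch first, then rotate)

v = [1, 0]
# Applying R and then S step by step matches the single product SR.
print(mat_vec(S, mat_vec(R, v)) == mat_vec(mat_mul(S, R), v))  # True
```

The two products differ, and the transformation applied first sits on the right, just as with the gates $A$ and $B$ above.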

The Identity Matrix

The identity matrix $I$ is the "do nothing" transformation:

$$I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$

For any vector $\vec{v}$, $I\vec{v} = \vec{v}$. For any matrix $M$, $MI = IM = M$. It is the matrix equivalent of multiplying by 1.

The Determinant

The determinant of a $2 \times 2$ matrix is a single number whose absolute value tells you how the matrix scales areas:

$$\det\begin{pmatrix} a & b \\ c & d \end{pmatrix} = ad - bc$$

If the determinant is positive, the matrix preserves orientation. If negative, it flips orientation (like a reflection). If zero, the matrix collapses 2D space into a line or a point - it squashes information irreversibly.

Example: $\det\begin{pmatrix} 2 & 1 \\ 0 & 3 \end{pmatrix} = 2 \cdot 3 - 1 \cdot 0 = 6$. This matrix scales areas by a factor of 6.

Matrix Inverse

A matrix $M$ has an inverse $M^{-1}$ if $M M^{-1} = M^{-1} M = I$. The inverse "undoes" the transformation. For a $2 \times 2$ matrix $M = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$:

$$M^{-1} = \frac{1}{ad - bc} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$$

This formula only works when $ad - bc \neq 0$ (the determinant is nonzero). A matrix with zero determinant has no inverse - it is called singular.
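The determinant and inverse formulas can be sketched as below. To keep the arithmetic exact, this version assumes integer entries and uses `fractions.Fraction`; the function names are our own.

```python
from fractions import Fraction

def det2(M):
    (a, b), (c, d) = M
    return a * d - b * c

def inverse2(M):
    """Inverse of a 2x2 integer matrix via the ad - bc formula."""
    (a, b), (c, d) = M
    determinant = det2(M)
    if determinant == 0:
        raise ValueError("matrix is singular: no inverse exists")
    factor = Fraction(1, determinant)
    return [[d * factor, -b * factor], [-c * factor, a * factor]]

def mat_mul(A, B):
    columns_of_B = list(zip(*B))
    return [[sum(x * y for x, y in zip(row, col)) for col in columns_of_B] for row in A]

M = [[2, 1], [0, 3]]
M_inv = inverse2(M)

print(det2(M))                                # 6
print(mat_mul(M, M_inv) == [[1, 0], [0, 1]])  # True
print(mat_mul(M_inv, M) == [[1, 0], [0, 1]])  # True
```

Calling `inverse2([[1, 2], [2, 4]])` raises the error, because that matrix has determinant $1 \cdot 4 - 2 \cdot 2 = 0$.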

Special Matrices

Two types of special matrices will appear frequently in later chapters:

Symmetric matrices satisfy $M = M^T$ (the matrix equals its transpose, which is obtained by swapping rows and columns). A symmetric matrix with real entries has only real eigenvalues.
Orthogonal matrices satisfy $M^T M = I$ (the transpose is the inverse). They preserve lengths and angles - think of rotations and reflections. These are the real-number version of unitary matrices, which are central to quantum computing.
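Both definitions are easy to test numerically. The sketch below checks that the 90-degree rotation matrix is orthogonal and that a sample matrix `S` (an arbitrary example of ours) is symmetric; the helper names are hypothetical.

```python
def transpose(M):
    """Swap rows and columns."""
    return [list(row) for row in zip(*M)]

def mat_mul(A, B):
    columns_of_B = list(zip(*B))
    return [[sum(x * y for x, y in zip(row, col)) for col in columns_of_B] for row in A]

IDENTITY = [[1, 0], [0, 1]]
R90 = [[0, -1], [1, 0]]  # rotation by 90 degrees counterclockwise
S = [[2, 1], [1, 3]]     # an arbitrary symmetric example

print(transpose(R90))                            # [[0, 1], [-1, 0]]
print(mat_mul(transpose(R90), R90) == IDENTITY)  # True, so R90 is orthogonal
print(transpose(S) == S)                         # True, so S is symmetric
```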

Note.

Quantum gates are always reversible - every quantum gate matrix has an inverse. In fact, quantum gate matrices are unitary, which means their inverse equals their conjugate transpose ($M^{-1} = M^\dagger$). This is a stronger condition than merely being invertible, and it is what preserves the normalization of quantum states. We will explore unitary matrices in detail when we meet our first quantum gates.

Use the calculator below to experiment with $2 \times 2$ matrix-vector multiplication and see the geometric transformation.

2.5 Complex Numbers: The Missing Piece

If you have worked only with real numbers, you might wonder: why would we ever need anything else? The answer comes from a simple equation that has no real solution:

$$x^2 = -1$$

No real number, when squared, gives a negative result. Mathematicians resolved this by inventing a new number, $i$, defined by the property:

$$i^2 = -1$$

The number $i$ is called the imaginary unit. This is an unfortunate name - there is nothing imaginary about it. Complex numbers are as real and useful as any other kind of number. They show up whenever you need to describe phenomena involving rotation, oscillation, or waves - which is to say, they show up everywhere in physics, and especially in quantum mechanics.

The Anatomy of a Complex Number

A complex number $z$ has the form:

$$z = a + bi$$

where $a$ is the real part and $b$ is the imaginary part. Both $a$ and $b$ are ordinary real numbers. We write $\text{Re}(z) = a$ and $\text{Im}(z) = b$.

Arithmetic with Complex Numbers

Arithmetic follows the rules you already know, with the extra rule that $i^2 = -1$:

Addition: $(a + bi) + (c + di) = (a+c) + (b+d)i$
Subtraction: $(a + bi) - (c + di) = (a-c) + (b-d)i$
Multiplication: $(a + bi)(c + di) = ac + adi + bci + bdi^2 = (ac - bd) + (ad + bc)i$

Concrete example: $(3 + 2i)(1 - i) = 3 \cdot 1 + 3 \cdot (-i) + 2i \cdot 1 + 2i \cdot (-i) = 3 - 3i + 2i - 2i^2 = 3 - i - 2(-1) = 5 - i$.
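Python has complex numbers built in, writing the imaginary unit as `j` rather than $i$. The sketch below repeats the worked example and also applies the multiplication formula by hand; the function name `multiply` is our own.

```python
z = 3 + 2j
w = 1 - 1j

print(z + w)  # (4+1j)
print(z - w)  # (2+3j)
print(z * w)  # (5-1j)

def multiply(a, b, c, d):
    """(a + bi)(c + di) as a (real, imaginary) pair, using i^2 = -1."""
    return (a * c - b * d, a * d + b * c)

print(multiply(3, 2, 1, -1))  # (5, -1)
```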

The Complex Plane

Every complex number can be plotted on a 2D plane, with the real part on the horizontal axis and the imaginary part on the vertical axis. This is the complex plane (also called the Argand diagram). The complex number $3 + 2i$ sits at the point $(3, 2)$.

Notice the connection to vectors: a complex number $a + bi$ can be thought of as a vector $\begin{pmatrix} a \\ b \end{pmatrix}$ in the complex plane. This geometric view will help us understand what complex multiplication really does.

Magnitude (Modulus) and Conjugate

The magnitude (or modulus, or absolute value) of $z = a + bi$ is the distance from the origin to the point $(a, b)$:

$$|z| = \sqrt{a^2 + b^2}$$

The complex conjugate of $z = a + bi$ is:

$$\bar{z} = z^* = a - bi$$

It reflects $z$ across the real axis. A crucial identity links the conjugate to the magnitude:

$$z \bar{z} = (a + bi)(a - bi) = a^2 + b^2 = |z|^2$$

This identity is used constantly in quantum mechanics to compute probabilities.
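A quick numerical check of the identity, using Python's built-in `conjugate()` method and `abs()`. The value $3 + 4i$ is an example of our choosing.

```python
z = 3 + 4j

print(z.conjugate())      # (3-4j)
print(abs(z))             # 5.0
print(z * z.conjugate())  # (25+0j)
print(abs(z) ** 2)        # 25.0
```

The product $z \bar{z}$ has zero imaginary part, so it can be read as an ordinary real number.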

Key Concept.

In quantum mechanics, the probability of a measurement outcome is given by $|z|^2$, where $z$ is a complex amplitude. This is the Born rule. The complex conjugate is essential because $|z|^2 = z \bar{z}$ is how we extract real-valued probabilities from complex-valued amplitudes. Complex numbers carry both magnitude (probability information) and phase (interference information).

Polar Form

Instead of specifying a complex number by its real and imaginary parts (rectangular form), we can use its magnitude $r = |z|$ and its angle $\theta$ (measured counterclockwise from the positive real axis):

$$z = r(\cos\theta + i\sin\theta)$$

The angle $\theta$ is called the phase or argument of $z$. Two complex numbers with the same magnitude but different phases sit on the same circle centered at the origin, but at different positions on that circle.

Converting between forms: given $z = a + bi$, we have $r = \sqrt{a^2 + b^2}$ and $\theta = \arctan(b/a)$ (with appropriate attention to which quadrant the point lies in). Conversely, $a = r\cos\theta$ and $b = r\sin\theta$.
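The quadrant caveat is easiest to see in an example. The sketch below uses the standard-library functions `cmath.polar` and `cmath.rect`, which handle the quadrant correctly, and compares them with a naive $\arctan(b/a)$ for the point $-1 + i$ (an example of our choosing, lying in the second quadrant).

```python
import cmath
import math

z = -1 + 1j

r, theta = cmath.polar(z)  # magnitude and phase, quadrant-aware
print(round(r, 6), round(math.degrees(theta), 6))  # 1.414214 135.0

naive = math.atan(z.imag / z.real)  # ignores the quadrant
print(round(math.degrees(naive), 6))  # -45.0, which points the opposite way

back = cmath.rect(r, theta)  # polar form back to rectangular form
print(cmath.isclose(back, z))  # True
```

The naive angle is off by 180 degrees because $b/a$ is the same for $-1 + i$ and $1 - i$.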

Euler's Formula

One of the most beautiful results in mathematics connects the exponential function to trigonometry:

$$e^{i\theta} = \cos\theta + i\sin\theta$$

This is Euler's formula. It can be proved using the Taylor series expansions of $e^x$, $\cos x$, and $\sin x$. The key insight: $e^{i\theta}$ traces out the unit circle in the complex plane as $\theta$ varies from $0$ to $2\pi$.

Using Euler's formula, the polar form becomes wonderfully compact:

$$z = r e^{i\theta}$$

Multiplication in polar form is especially elegant - you multiply magnitudes and add angles:

$$z_1 z_2 = r_1 r_2 \, e^{i(\theta_1 + \theta_2)}$$

This means multiplying by $e^{i\theta}$ is a pure rotation by angle $\theta$ - the magnitude stays the same, but the direction changes. When $\theta = \pi$, Euler's formula gives us the famous identity:

$$e^{i\pi} + 1 = 0$$

This single equation links five of the most important constants in mathematics: $e$, $i$, $\pi$, $1$, and $0$.
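These claims can be spot-checked numerically. The following sketch compares both sides of Euler's formula at a few angles and confirms the "multiply magnitudes, add angles" rule; comparisons use a small tolerance because floating-point arithmetic is not exact. The name `euler_rhs` and the sample values are ours.

```python
import cmath
import math

def euler_rhs(theta):
    """Right-hand side of Euler's formula: cos(theta) + i sin(theta)."""
    return complex(math.cos(theta), math.sin(theta))

for theta in (0.0, math.pi / 3, math.pi / 2, math.pi):
    lhs = cmath.exp(1j * theta)
    print(cmath.isclose(lhs, euler_rhs(theta), abs_tol=1e-12))  # True each time

# Multiplying magnitudes and adding angles
z1 = cmath.rect(2, 0.5)
z2 = cmath.rect(3, 0.25)
r, theta = cmath.polar(z1 * z2)
print(round(r, 6), round(theta, 6))  # 6.0 0.75

# Euler's identity, up to rounding error
print(abs(cmath.exp(1j * math.pi) + 1) < 1e-12)  # True
```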

Why Quantum Mechanics Needs Complex Numbers

You might wonder: could we do quantum computing with just real numbers? In the standard formulation of quantum mechanics, the answer is no, and here is the intuition. Quantum mechanics is fundamentally a theory of interference. Two quantum states can combine and either reinforce each other (constructive interference) or cancel each other out (destructive interference), depending on their relative phase.

Real amplitudes can differ only in sign, giving you two options: add up or cancel. Complex numbers, with their full 360 degrees of phase, allow for a much richer set of interference patterns. The phase $\theta$ in $e^{i\theta}$ is the invisible "clock hand" that determines how quantum states interact. Quantum algorithms rely on carefully engineering interference to amplify correct answers and suppress wrong ones: Grover's search algorithm does this with sign flips alone, while the quantum Fourier transform at the heart of Shor's factoring algorithm uses phases spread around the whole unit circle.
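A toy calculation shows how relative phase controls interference. In this sketch, two contributions of equal size, $\frac{1}{2}$ and $\frac{1}{2}e^{i\theta}$, are added and the squared magnitude of the sum is taken. This is a simplified illustration of our own, not a model of any particular quantum circuit.

```python
import cmath
import math

def combined_probability(theta):
    """Squared magnitude of the sum of two equal contributions with relative phase theta."""
    total = 0.5 + 0.5 * cmath.exp(1j * theta)
    return abs(total) ** 2

print(round(combined_probability(0.0), 6))          # 1.0  (fully constructive)
print(round(combined_probability(math.pi / 2), 6))  # 0.5  (partial)
print(round(combined_probability(math.pi), 6))      # 0.0  (fully destructive)
```

With only the signs $+1$ and $-1$ available, you get the first and last lines; intermediate phases fill in everything between.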

Note.

When we write a qubit state as $|\psi\rangle = \alpha|0\rangle + \beta|1\rangle$, the amplitudes $\alpha$ and $\beta$ are complex numbers. The probabilities of measuring 0 and 1 are $|\alpha|^2$ and $|\beta|^2$, but the relative phase between $\alpha$ and $\beta$ is equally important - it determines how the amplitudes interfere after quantum gate operations.
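As a preview, the sketch below computes measurement probabilities for a few example states of the form $\alpha = \frac{1}{\sqrt{2}}$, $\beta = \frac{e^{i\varphi}}{\sqrt{2}}$. These amplitudes are illustrative values we picked; the point is that changing the phase $\varphi$ leaves both probabilities untouched.

```python
import cmath
import math

def measurement_probabilities(alpha, beta):
    """Born rule: probabilities are the squared magnitudes of the amplitudes."""
    return abs(alpha) ** 2, abs(beta) ** 2

alpha = 1 / math.sqrt(2)
for phi in (0.0, math.pi / 2, math.pi):
    beta = cmath.exp(1j * phi) / math.sqrt(2)
    p0, p1 = measurement_probabilities(alpha, beta)
    print(round(p0, 6), round(p1, 6), round(p0 + p1, 6))  # 0.5 0.5 1.0 each time
```

The three states give identical measurement statistics here, yet they differ in phase, and that difference becomes visible once gates make the amplitudes interfere.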

Explore complex numbers on the complex plane below. Enter a number and see it plotted, along with its conjugate, a rotation, or its square.

2.6 Putting It All Together

Let us take a moment to see how the five topics of this chapter form a unified toolkit for quantum computing.

Number systems taught us that information can be encoded in different bases, and that binary encoding is the foundation of classical computing. With $n$ bits, we can represent $2^n$ states.

Probability taught us how to reason about uncertain outcomes and how probabilities must sum to 1. Quantum measurement is inherently probabilistic, so this framework carries directly into the quantum world.

Vectors gave us a way to encode information as arrows in space. Quantum states are unit vectors, and the components of those vectors (the amplitudes) encode the probabilities of different measurement outcomes.

Matrices showed us how to transform vectors. Every quantum gate is a matrix that transforms quantum states while preserving their unit length.

Complex numbers gave us the final ingredient: phase. Quantum amplitudes are complex, and the phases they carry determine interference patterns that make quantum algorithms powerful.

Together, these form the complete mathematical vocabulary you need. In the next chapter, we will see how classical computing can be made reversible - the crucial bridge between the classical world and the quantum world. And in Part II, we will put every tool from this chapter to immediate use when we write down the state of our first qubit.