Chapter 2: The Mathematics of Information

Quantum computing lives at the intersection of physics and mathematics. Before we can understand qubits, superposition, or entanglement, we need a small but powerful toolkit of mathematical ideas. None of them are individually difficult, but together they form the language in which quantum computing is written.

This chapter introduces five mathematical pillars: number systems, probability, vectors, matrices, and complex numbers. If you have basic algebra under your belt, you have everything you need to follow along. We will build each idea from scratch, connect it to concrete intuition, and show you why quantum computing demands it. By the end, you will have every mathematical tool required for the rest of this textbook.

2.1 Counting and Number Systems

We grow up counting in base 10 - the decimal system - because we happen to have ten fingers. But there is nothing sacred about the number ten. A number system is just a way of representing quantities using a fixed set of symbols, and different bases turn out to be useful for different purposes.

Decimal (Base 10)

In decimal, we have ten digits: 0 through 9. The position of each digit tells you which power of 10 it represents. For example, the number 347 means:

$$347 = 3 \times 10^2 + 4 \times 10^1 + 7 \times 10^0 = 300 + 40 + 7$$

Each position is worth ten times more than the position to its right. This is so familiar that we rarely think about it, but the same principle works for any base.

Binary (Base 2)

Computers use base 2 because their fundamental building blocks - transistors - have two states: on and off, high voltage and low voltage, 1 and 0. In binary, we have only two digits (called bits), and each position represents a power of 2.

$$1011_2 = 1 \times 2^3 + 0 \times 2^2 + 1 \times 2^1 + 1 \times 2^0 = 8 + 0 + 2 + 1 = 11_{10}$$

The powers of 2 that you will see constantly are: 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024. Memorizing this sequence up to $2^{10} = 1024$ will save you time throughout this textbook.

Key Idea. A single bit stores exactly one binary digit: 0 or 1. With $n$ bits, you can represent $2^n$ distinct values. This exponential relationship between bits and states is the seed from which quantum computing's power will grow. A qubit, as we will see, exploits this same exponential scaling in a fundamentally richer way.

Hexadecimal (Base 16)

Hexadecimal (hex) uses sixteen symbols: 0-9 and A-F (where A = 10, B = 11, ..., F = 15). Hex is popular in computing because each hex digit corresponds to exactly four binary digits, making it a compact way to write binary data.

$$\text{1F}_{16} = 1 \times 16^1 + 15 \times 16^0 = 16 + 15 = 31_{10} = 11111_2$$

The conversion between hex and binary is mechanical: replace each hex digit with its four-bit binary equivalent. For instance, $\text{A3}_{16}$ becomes $1010\;0011_2$ because A = 1010 and 3 = 0011.
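
The digit-by-digit replacement can be sketched in a few lines of Python. The function name `hex_to_binary` is our own, chosen for illustration; it relies only on Python's built-in `int` and `format`.

```python
def hex_to_binary(hex_string: str) -> str:
    """Replace each hex digit with its four-bit binary group."""
    groups = [format(int(digit, 16), "04b") for digit in hex_string]
    return " ".join(groups)


print(hex_to_binary("A3"))  # 1010 0011
print(hex_to_binary("1F"))  # 0001 1111
```

Note that the second result keeps the leading zeros of the first group; dropping them gives the $11111_2$ shown earlier.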

Converting Between Bases

To convert from any base to decimal, expand the positional notation and add up the terms, as shown above. To convert from decimal to another base, repeatedly divide by the target base and collect the remainders from bottom to top. For example, to convert $25_{10}$ to binary:

  • $25 \div 2 = 12$ remainder $1$
  • $12 \div 2 = 6$ remainder $0$
  • $6 \div 2 = 3$ remainder $0$
  • $3 \div 2 = 1$ remainder $1$
  • $1 \div 2 = 0$ remainder $1$

Reading the remainders from bottom to top: $25_{10} = 11001_2$.
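
Both directions of conversion can be sketched in Python. The helper names `to_base` and `from_base` are hypothetical, and the sketch assumes non-negative integers and bases from 2 to 16. `from_base` accumulates the positional expansion one digit at a time, which gives the same total as adding up the powers.

```python
DIGITS = "0123456789ABCDEF"


def to_base(n: int, base: int) -> str:
    """Convert a non-negative integer to `base` by repeated division."""
    if n < 0 or not 2 <= base <= 16:
        raise ValueError("expected n >= 0 and 2 <= base <= 16")
    if n == 0:
        return "0"
    remainders = []
    while n > 0:
        n, r = divmod(n, base)  # quotient carries on, remainder is a digit
        remainders.append(DIGITS[r])
    # The first remainder is the least significant digit, so reverse the list.
    return "".join(reversed(remainders))


def from_base(text: str, base: int) -> int:
    """Convert a digit string in `base` back to an integer."""
    value = 0
    for ch in text:
        value = value * base + DIGITS.index(ch.upper())
    return value


print(to_base(25, 2))        # 11001
print(to_base(31, 16))       # 1F
print(from_base("1011", 2))  # 11
```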

2.2 Probability: Quantifying Uncertainty

In everyday life, we deal with uncertainty constantly: will it rain tomorrow? What are the odds of rolling a six? Probability gives us a precise language for reasoning about uncertain outcomes, and it turns out to be inseparable from quantum mechanics. When you measure a qubit, you do not always get a deterministic answer - you get a probabilistic one. The math of probability is therefore not optional; it is woven into the very fabric of quantum computing.

Sample Spaces and Events

A sample space $S$ is the set of all possible outcomes of an experiment. For a coin flip, $S = \{H, T\}$. For a six-sided die, $S = \{1, 2, 3, 4, 5, 6\}$. An event is any subset of the sample space. For example, "rolling an even number" is the event $\{2, 4, 6\}$.

The probability of an event $A$, written $P(A)$, is a number between 0 and 1:

$$0 \leq P(A) \leq 1$$

$P(A) = 0$ means the event is impossible. $P(A) = 1$ means it is certain. For a fair die, each face has probability $\frac{1}{6}$, and the probability of rolling an even number is:

$$P(\text{even}) = P(2) + P(4) + P(6) = \frac{1}{6} + \frac{1}{6} + \frac{1}{6} = \frac{1}{2}$$

A fundamental requirement is that the probabilities of all outcomes in the sample space must add up to 1:

$$\sum_{x \in S} P(x) = 1$$

Key Idea. The rule that probabilities sum to 1 has a direct quantum analogue. When we describe a qubit's state, the squared magnitudes of its amplitudes must also sum to 1. This is called the normalization condition, and it ensures that when you measure, you get exactly one outcome with certainty.
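
As a quick check of these rules, here is a minimal sketch that models the fair die with exact fractions. The names `die` and `probability` are our own; an event is simply a set of outcomes.

```python
from fractions import Fraction

# A fair six-sided die: every face has probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}


def probability(event, distribution):
    """Add up the probabilities of the outcomes that make up an event."""
    return sum((distribution[outcome] for outcome in event), Fraction(0))


even = {2, 4, 6}
print(probability(even, die))        # 1/2
print(probability(die.keys(), die))  # 1
```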

Independent Events

Two events $A$ and $B$ are independent if the occurrence of one does not affect the probability of the other. Flipping a coin twice gives independent outcomes. For independent events, the probability of both occurring is the product of their individual probabilities:

$$P(A \text{ and } B) = P(A) \times P(B)$$

For example, the probability of flipping two heads in a row with a fair coin is $\frac{1}{2} \times \frac{1}{2} = \frac{1}{4}$.

Conditional Probability

Sometimes events are not independent. The conditional probability of $A$ given $B$, written $P(A | B)$, is the probability that $A$ occurs if we know that $B$ has already occurred:

$$P(A | B) = \frac{P(A \text{ and } B)}{P(B)}$$

Consider a bag with 3 red and 2 blue marbles. You draw one marble without looking and set it aside, then draw a second. What is the probability the second marble is red, given the first was red?

After removing one red marble, the bag has 2 red and 2 blue marbles left:

$$P(\text{2nd red} | \text{1st red}) = \frac{2}{4} = \frac{1}{2}$$
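
The same answer falls out of the definition $P(A | B) = P(A \text{ and } B) / P(B)$ if we list every ordered pair of draws. This is a brute-force sketch, assuming each ordered pair of marbles is equally likely; the variable names are ours.

```python
from fractions import Fraction
from itertools import permutations

bag = ["red", "red", "red", "blue", "blue"]

# Every ordered way to draw two different marbles: 5 * 4 = 20 pairs.
draws = list(permutations(bag, 2))

first_red = [d for d in draws if d[0] == "red"]
both_red = [d for d in first_red if d[1] == "red"]

p_first_red = Fraction(len(first_red), len(draws))  # P(B)
p_both_red = Fraction(len(both_red), len(draws))    # P(A and B)
p_second_given_first = p_both_red / p_first_red     # P(A | B)

print(p_first_red, p_both_red, p_second_given_first)  # 3/5 3/10 1/2
```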

Bayes' Rule

Bayes' rule lets you "reverse" conditional probabilities. If you know $P(B | A)$ and want $P(A | B)$, Bayes' rule says:

$$P(A | B) = \frac{P(B | A) \, P(A)}{P(B)}$$

Here is a concrete example. Suppose a medical test for a rare disease is 99% accurate (it correctly identifies someone with the disease 99% of the time and correctly clears someone without it 99% of the time). The disease affects 1 in 1000 people. If you test positive, what is the probability you actually have the disease?

Let $D$ = having the disease and $+$ = testing positive. Then:

  • $P(D) = 0.001$ (prevalence)
  • $P(+ | D) = 0.99$ (true positive rate)
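  • $P(+ | \text{no } D) = 0.01$ (false positive rate)

The denominator $P(+)$ adds up the two ways of testing positive: $P(+) = P(+ | D) \, P(D) + P(+ | \text{no } D) \, P(\text{no } D)$, where $P(\text{no } D) = 0.999$. Substituting:

$$P(D | +) = \frac{P(+ | D) \, P(D)}{P(+)} = \frac{0.99 \times 0.001}{0.99 \times 0.001 + 0.01 \times 0.999} = \frac{0.00099}{0.01098} \approx 0.09$$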

Despite the test being 99% accurate, a positive result only means about a 9% chance of having the disease. The low prevalence of the disease overwhelms the test's accuracy. This kind of counterintuitive reasoning shows up frequently in quantum measurement theory, where prior knowledge about a quantum state affects how we interpret measurement results.
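
The calculation is easy to reproduce, and to rerun with other prevalences. The function name `posterior` and its parameter names are our own; the sketch assumes the test has only two outcomes, positive and negative.

```python
def posterior(prior, true_positive_rate, false_positive_rate):
    """Bayes' rule for P(disease | positive test)."""
    # P(+): a positive comes either from a sick person or a healthy one.
    p_positive = (true_positive_rate * prior
                  + false_positive_rate * (1 - prior))
    return true_positive_rate * prior / p_positive


p = posterior(prior=0.001, true_positive_rate=0.99, false_positive_rate=0.01)
print(round(p, 4))  # 0.0902

# A more common disease (1 in 10) makes the same test far more convincing.
print(round(posterior(0.1, 0.99, 0.01), 4))  # 0.9167
```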

Looking ahead. In quantum computing, measurement outcomes are inherently probabilistic. The Born rule - the quantum analogue of classical probability - tells us that the probability of measuring a particular outcome equals the squared magnitude of the corresponding amplitude. The mathematical structure of probability that you have just learned carries directly into the quantum world.

2.3 Vectors: Arrows That Encode Information

A vector is one of the most useful objects in all of mathematics. At its simplest, a vector is an arrow - it has a direction and a length (magnitude). But vectors are much more than geometric arrows. They are the fundamental language for describing quantum states. When we write the state of a qubit, we write it as a vector. Understanding vectors is therefore not just helpful for quantum computing - it is essential.

Vectors in Two Dimensions

A 2D vector $\vec{v}$ can be written as a pair of numbers:

$$\vec{v} = \begin{pmatrix} v_x \\ v_y \end{pmatrix}$$

The number $v_x$ is the horizontal component and $v_y$ is the vertical component. Geometrically, $\vec{v}$ is an arrow from the origin $(0,0)$ to the point $(v_x, v_y)$. For example, $\vec{v} = \begin{pmatrix} 3 \\ 2 \end{pmatrix}$ is an arrow pointing 3 units right and 2 units up.

Vector Addition

To add two vectors, add their corresponding components:

$$\begin{pmatrix} a_x \\ a_y \end{pmatrix} + \begin{pmatrix} b_x \\ b_y \end{pmatrix} = \begin{pmatrix} a_x + b_x \\ a_y + b_y \end{pmatrix}$$

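Geometrically, this is the "tip-to-tail" rule: place the tail of $\vec{b}$ at the tip of $\vec{a}$, and the sum $\vec{a} + \vec{b}$ is the arrow from the tail of $\vec{a}$ to the tip of $\vec{b}$. The three arrows form a triangle. Equivalently, if both vectors start at the origin, the sum is the diagonal of the parallelogram they span.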

Scalar Multiplication

Multiplying a vector by a number (a scalar) $c$ scales every component:

$$c \begin{pmatrix} v_x \\ v_y \end{pmatrix} = \begin{pmatrix} c \, v_x \\ c \, v_y \end{pmatrix}$$

If $c > 1$, the arrow gets longer. If $0 < c < 1$, it gets shorter. If $c < 0$, it flips direction. If $c = 0$, it collapses to the zero vector.
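
Addition and scalar multiplication are both component-by-component operations, so they translate directly into code. In this sketch vectors are plain Python tuples, and the helper names `add` and `scale` are ours.

```python
def add(a, b):
    """Add two vectors component by component."""
    return tuple(x + y for x, y in zip(a, b))


def scale(c, v):
    """Multiply every component of v by the scalar c."""
    return tuple(c * x for x in v)


a = (3, 2)
b = (1, -4)

print(add(a, b))     # (4, -2)
print(scale(2, a))   # (6, 4)    longer
print(scale(-1, a))  # (-3, -2)  flipped
print(scale(0, a))   # (0, 0)    zero vector
```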

Magnitude and Unit Vectors

The magnitude (length) of a vector $\vec{v} = \begin{pmatrix} v_x \\ v_y \end{pmatrix}$ is given by the Pythagorean theorem:

$$\|\vec{v}\| = \sqrt{v_x^2 + v_y^2}$$

A unit vector is a vector with magnitude 1. You can turn any nonzero vector into a unit vector by dividing by its magnitude: $\hat{v} = \frac{\vec{v}}{\|\vec{v}\|}$.

Key Idea. In quantum computing, the state of a qubit is always represented by a unit vector. The requirement that the state vector has length 1 is precisely the normalization condition from probability: it ensures measurement probabilities sum to 1. This is why unit vectors are so important - they are the only physically valid quantum states.
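
Here is a small sketch of magnitude and normalization for real vectors. The function names are our own, and the example vector $(3, 4)$ is chosen only because its length is a whole number.

```python
import math


def magnitude(v):
    """Length of a real vector: square root of the sum of squares."""
    return math.sqrt(sum(x * x for x in v))


def normalize(v):
    """Return the unit vector pointing in the same direction as v."""
    length = magnitude(v)
    if length == 0:
        raise ValueError("the zero vector cannot be normalized")
    return tuple(x / length for x in v)


v = (3.0, 4.0)
u = normalize(v)

print(magnitude(v))  # 5.0
print(u)             # (0.6, 0.8)

# The squared components of a unit vector sum to 1 (up to rounding),
# which is the property a qubit state needs.
print(math.isclose(sum(x * x for x in u), 1.0))  # True
```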

The Dot Product

The dot product of two vectors measures how much they point in the same direction:

$$\vec{a} \cdot \vec{b} = a_x b_x + a_y b_y = \|\vec{a}\| \, \|\vec{b}\| \cos\theta$$

where $\theta$ is the angle between them. If the dot product of two nonzero vectors is zero, the vectors are orthogonal (perpendicular). In two dimensions, two orthogonal unit vectors form a basis - a coordinate system in which any vector can be expressed as a combination of them.

For example, the standard basis vectors $\vec{e}_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $\vec{e}_2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$ are orthogonal unit vectors, and any 2D vector can be written as $\vec{v} = v_x \vec{e}_1 + v_y \vec{e}_2$.
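
The sketch below computes dot products and recovers the angle between two vectors from the cosine formula. The helper names are ours. The clamping step is a defensive choice of our own, because rounding can push the computed cosine slightly outside the range from -1 to 1.

```python
import math


def dot(a, b):
    """Sum of the products of corresponding components."""
    return sum(x * y for x, y in zip(a, b))


def angle_degrees(a, b):
    """Angle between two nonzero 2D vectors, from a.b = |a||b|cos(theta)."""
    cos_theta = dot(a, b) / (math.hypot(*a) * math.hypot(*b))
    cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against rounding
    return math.degrees(math.acos(cos_theta))


e1, e2 = (1, 0), (0, 1)
v = (3, 2)

print(dot(e1, e2))               # 0, so e1 and e2 are orthogonal
print(dot(v, e1), dot(v, e2))    # 3 2, the components of v in this basis
print(angle_degrees(e1, e2))     # 90 degrees, up to rounding
print(angle_degrees(v, (6, 4)))  # close to 0: the vectors are parallel
```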

Extending to Three Dimensions and Beyond

Everything above extends naturally. A 3D vector has three components: $\vec{v} = \begin{pmatrix} v_x \\ v_y \\ v_z \end{pmatrix}$, with magnitude $\|\vec{v}\| = \sqrt{v_x^2 + v_y^2 + v_z^2}$. In quantum computing, we will work with vectors that have 2, 4, 8, or even $2^n$ components - one for each possible state of $n$ qubits. The algebra stays the same; only the number of components grows.

2.4 Matrices: Transformations on Vectors

If vectors are the nouns of linear algebra, matrices are the verbs. A matrix is a rectangular grid of numbers that describes a transformation - a way of turning one vector into another. In quantum computing, every operation on a qubit (every quantum gate) is a matrix. When we apply a gate to a qubit, we are multiplying the qubit's state vector by a matrix. This is why understanding matrices is absolutely essential.

What Is a Matrix?

A matrix is a rectangular array of numbers arranged in rows and columns. A matrix with $m$ rows and $n$ columns is called an $m \times n$ matrix. For example, a $2 \times 2$ matrix looks like:

$$M = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$$

The entry in row $i$, column $j$ is denoted $M_{ij}$. In the matrix above, $M_{11} = a$, $M_{12} = b$, $M_{21} = c$, and $M_{22} = d$.

Matrix-Vector Multiplication

The most important matrix operation for us is multiplying a matrix by a vector. For a $2 \times 2$ matrix and a 2D vector:

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} ax + by \\ cx + dy \end{pmatrix}$$

Each entry of the result is a dot product of one row of the matrix with the input vector. This is how a matrix transforms a vector: it takes in one vector and produces a new one.

For example, consider the matrix that rotates vectors by 90 degrees counterclockwise:

$$R = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$$

Applying it to $\vec{v} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$:

$$R\vec{v} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$$

The vector $(1, 0)$ pointing right has been rotated to $(0, 1)$ pointing up - exactly 90 degrees.
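
The row-by-row recipe can be written as a one-line function. In this sketch a matrix is a tuple of rows, and `matvec` is a name of our own choosing.

```python
def matvec(m, v):
    """Multiply matrix m by vector v: one dot product per row of m."""
    return tuple(sum(entry * x for entry, x in zip(row, v)) for row in m)


# Rotation by 90 degrees counterclockwise.
R = ((0, -1),
     (1, 0))

print(matvec(R, (1, 0)))  # (0, 1)   right becomes up
print(matvec(R, (0, 1)))  # (-1, 0)  up becomes left
```

Applying `R` four times in a row returns any vector to where it started, as four quarter turns should.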

Matrix-Matrix Multiplication

You can also multiply two matrices together. The result is a new matrix that represents doing both transformations in sequence. For $2 \times 2$ matrices:

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} e & f \\ g & h \end{pmatrix} = \begin{pmatrix} ae+bg & af+bh \\ ce+dg & cf+dh \end{pmatrix}$$

Each entry of the result is a dot product of a row from the first matrix with a column from the second. A critical fact: matrix multiplication is not commutative. In general, $AB \neq BA$. The order in which you apply transformations matters - rotating and then stretching along one axis is different from stretching along that axis and then rotating.

Key Idea. In quantum computing, applying gate $A$ followed by gate $B$ is represented by the matrix product $BA$ (note the reversed order - the gate applied first goes on the right, closest to the vector). This non-commutativity is not a quirk; it reflects a deep truth about quantum operations. The order in which you manipulate a qubit matters.
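
To see both facts at once, the sketch below multiplies a rotation `R` by a horizontal stretch `S`. These two matrices are our own illustrative choice and are not quantum gates. The helper names `matmul` and `matvec` are also ours.

```python
def matmul(a, b):
    """Matrix product: rows of a dotted with columns of b."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(len(b)))
              for j in range(len(b[0])))
        for i in range(len(a))
    )


def matvec(m, v):
    """Multiply matrix m by vector v."""
    return tuple(sum(entry * x for entry, x in zip(row, v)) for row in m)


R = ((0, -1), (1, 0))  # rotate 90 degrees counterclockwise
S = ((2, 0), (0, 1))   # stretch by 2 along the horizontal axis

print(matmul(R, S))  # ((0, -1), (2, 0))
print(matmul(S, R))  # ((0, -2), (1, 0))  a different matrix

# "R first, then S" corresponds to the product S R, with R on the right.
v = (1, 0)
print(matvec(S, matvec(R, v)))  # (0, 1)
print(matvec(matmul(S, R), v))  # (0, 1)  same result
print(matvec(matmul(R, S), v))  # (0, 2)  the other order differs
```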

The Identity Matrix

The identity matrix $I$ is the "do nothing" transformation:

$$I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$

For any vector $\vec{v}$, $I\vec{v} = \vec{v}$. For any matrix $M$, $MI = IM = M$. It is the matrix equivalent of multiplying by 1.

Matrix Inverses

A matrix $M$ has an inverse $M^{-1}$ if $M M^{-1} = M^{-1} M = I$. The inverse "undoes" the transformation. If $M$ rotates by 90 degrees, $M^{-1}$ rotates by -90 degrees.

For a $2 \times 2$ matrix $M = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$, the inverse is:

$$M^{-1} = \frac{1}{ad - bc} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$$

The quantity $ad - bc$ is the determinant. If the determinant is zero, the matrix has no inverse - it squashes space into a lower dimension and the transformation cannot be reversed.
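
The inverse formula is short enough to implement directly. This sketch assumes integer entries so that exact fractions can be used, and the function names are our own.

```python
from fractions import Fraction


def determinant(m):
    (a, b), (c, d) = m
    return a * d - b * c


def inverse(m):
    """Inverse of a 2x2 integer matrix, using the formula from the text."""
    (a, b), (c, d) = m
    det = determinant(m)
    if det == 0:
        raise ValueError("matrix is singular: determinant is zero")
    return ((Fraction(d, det), Fraction(-b, det)),
            (Fraction(-c, det), Fraction(a, det)))


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


R = ((0, -1), (1, 0))           # rotate by 90 degrees
R_inv = inverse(R)              # rotates by -90 degrees
print(R_inv == ((0, 1), (-1, 0)))             # True
print(matmul(R, R_inv) == ((1, 0), (0, 1)))   # True: the identity

print(determinant(((1, 2), (2, 4))))  # 0, so this matrix has no inverse
```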

Looking ahead. Quantum gates are always reversible - every quantum gate matrix has an inverse. In fact, quantum gate matrices are unitary, which means their inverse equals their conjugate transpose ($M^{-1} = M^\dagger$). This is a stronger condition than merely being invertible, and it is what preserves the normalization of quantum states. We will explore unitary matrices in detail when we meet our first quantum gates.

2.5 Complex Numbers: The Missing Piece

If you have worked only with real numbers, you might wonder: why would we ever need anything else? The answer comes from a simple equation that has no real solution:

$$x^2 = -1$$

No real number, when squared, gives a negative result. Mathematicians resolved this by inventing a new number, $i$, defined by the property:

$$i^2 = -1$$

The number $i$ is called the imaginary unit. This is an unfortunate name - there is nothing imaginary about it. Complex numbers are as real and useful as any other kind of number. They show up whenever you need to describe phenomena involving rotation, oscillation, or waves - which is to say, they show up everywhere in physics, and especially in quantum mechanics.

The Anatomy of a Complex Number

A complex number $z$ has the form:

$$z = a + bi$$

where $a$ is the real part and $b$ is the imaginary part. Both $a$ and $b$ are ordinary real numbers. We write $\text{Re}(z) = a$ and $\text{Im}(z) = b$.

Arithmetic with complex numbers follows the rules you already know, with the extra rule that $i^2 = -1$:

  • Addition: $(a + bi) + (c + di) = (a+c) + (b+d)i$
  • Multiplication: $(a + bi)(c + di) = ac + adi + bci + bdi^2 = (ac - bd) + (ad + bc)i$
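
Python has complex numbers built in, writing the imaginary unit as `j` rather than $i$. The sketch below compares the built-in arithmetic with the multiplication rule above; the helper `multiply` is our own and works on the real and imaginary parts directly.

```python
def multiply(a, b, c, d):
    """(a + bi)(c + di) as a (real part, imaginary part) pair."""
    return (a * c - b * d, a * d + b * c)


z = 3 + 2j
w = 1 - 4j

print(z + w)                  # (4-2j)
print(z * w)                  # (11-10j)
print(multiply(3, 2, 1, -4))  # (11, -10), matching the line above
print(1j * 1j)                # (-1+0j), the defining property of i
```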

The Complex Plane

Every complex number can be plotted on a 2D plane, with the real part on the horizontal axis and the imaginary part on the vertical axis. This is the complex plane (also called the Argand diagram). The complex number $3 + 2i$ sits at the point $(3, 2)$.

Magnitude and Conjugate

The magnitude (or absolute value) of $z = a + bi$ is the distance from the origin to the point $(a, b)$:

$$|z| = \sqrt{a^2 + b^2}$$

The complex conjugate of $z = a + bi$ is:

$$\bar{z} = z^* = a - bi$$

It reflects $z$ across the real axis. A crucial identity links the conjugate to the magnitude:

$$z \bar{z} = (a + bi)(a - bi) = a^2 + b^2 = |z|^2$$

Key Idea. In quantum mechanics, the probability of a measurement outcome is given by $|z|^2$, where $z$ is a complex amplitude. This is the Born rule. The complex conjugate is essential because $|z|^2 = z \bar{z}$ is how we extract real-valued probabilities from complex-valued amplitudes. This is why quantum states use complex numbers: the amplitudes carry both magnitude (probability information) and phase (interference information).
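
A quick numerical check of the identity $z \bar{z} = |z|^2$, using Python's built-in `abs` and the `conjugate` method. The value $3 + 4i$ is an arbitrary example of ours.

```python
z = 3 + 4j

print(z.conjugate())      # (3-4j)
print(abs(z))             # 5.0
print(z * z.conjugate())  # (25+0j), a real number equal to |z|^2
```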

Polar Form

Instead of specifying a complex number by its real and imaginary parts, we can use its magnitude $r = |z|$ and its angle $\theta$ (measured counterclockwise from the positive real axis):

$$z = r(\cos\theta + i\sin\theta)$$

The angle $\theta$ is called the phase or argument of $z$. Two complex numbers with the same magnitude but different phases sit on the same circle centered at the origin, but at different positions on that circle.

Euler's Formula

One of the most beautiful results in mathematics connects the exponential function to trigonometry:

$$e^{i\theta} = \cos\theta + i\sin\theta$$

This is Euler's formula. It tells us that the complex exponential $e^{i\theta}$ traces out the unit circle in the complex plane as $\theta$ varies. Using Euler's formula, the polar form becomes wonderfully compact:

$$z = r e^{i\theta}$$

Multiplication in polar form is especially elegant: you multiply magnitudes and add angles:

$$z_1 z_2 = r_1 r_2 \, e^{i(\theta_1 + \theta_2)}$$

This means multiplying by $e^{i\theta}$ is a pure rotation by angle $\theta$ - the magnitude stays the same, but the direction changes. When $\theta = \pi$, Euler's formula gives us the famous identity:

$$e^{i\pi} + 1 = 0$$
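
Euler's formula and the multiplication rule can be checked numerically with Python's `cmath` module. The angles and magnitudes below are arbitrary choices of ours. Because floating-point arithmetic is approximate, the comparisons use `isclose` rather than exact equality.

```python
import cmath
import math

theta = math.pi / 3

# Euler's formula: both sides agree up to rounding.
lhs = cmath.exp(1j * theta)
rhs = complex(math.cos(theta), math.sin(theta))
print(cmath.isclose(lhs, rhs))  # True

# Multiplying in polar form: magnitudes multiply, angles add.
z1 = cmath.rect(2, math.pi / 6)  # r = 2, theta = 30 degrees
z2 = cmath.rect(3, math.pi / 3)  # r = 3, theta = 60 degrees
r, phi = cmath.polar(z1 * z2)
print(math.isclose(r, 6.0), math.isclose(phi, math.pi / 2))  # True True

# Euler's identity holds up to a tiny rounding error.
print(abs(cmath.exp(1j * math.pi) + 1) < 1e-12)  # True
```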

Why Quantum Mechanics Needs Complex Numbers

You might wonder: could we do quantum computing with just real numbers? The answer is no, and here is the intuition. Quantum mechanics is fundamentally a theory of interference. Two quantum states can combine and either reinforce each other (constructive interference) or cancel each other out (destructive interference), depending on their relative phase.

A real amplitude can only be positive or negative, which corresponds to a phase of 0 or 180 degrees and gives you two options: add up or cancel. Complex numbers, with their full 360 degrees of phase, allow for a much richer set of interference patterns. The phase $\theta$ in $e^{i\theta}$ is the invisible "clock hand" that determines how quantum states interact. Without it, quantum algorithms like Shor's factoring algorithm or Grover's search algorithm would not work, because they rely on carefully engineering interference to amplify correct answers and suppress wrong ones.
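
A toy calculation makes the role of phase concrete. Assume, purely for illustration, that two contributions of magnitude 0.5 arrive at the same outcome and differ only by a relative phase. The function name and the numbers are ours and do not model any particular algorithm.

```python
import cmath
import math


def combined_probability(phase):
    """Squared magnitude of the sum of two amplitudes of size 0.5."""
    a1 = 0.5
    a2 = 0.5 * cmath.exp(1j * phase)  # same size, rotated by `phase`
    return abs(a1 + a2) ** 2


for label, phase in [("0", 0.0), ("pi/2", math.pi / 2), ("pi", math.pi)]:
    print(label, round(combined_probability(phase), 6))
# 0 1.0     constructive interference
# pi/2 0.5  partial interference
# pi 0.0    destructive interference
```

Only the first and last cases are available with real amplitudes; every intermediate value requires a complex phase.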

Looking ahead. When we write a qubit state as $|\psi\rangle = \alpha|0\rangle + \beta|1\rangle$, the amplitudes $\alpha$ and $\beta$ are complex numbers. The probabilities of measuring 0 and 1 are $|\alpha|^2$ and $|\beta|^2$, but the relative phase between $\alpha$ and $\beta$ is equally important - it determines how the qubit interferes with other qubits and with itself after quantum gate operations.
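
As a preview, the sketch below computes $|\alpha|^2$ and $|\beta|^2$ for three states that differ only in the phase of $\beta$. The function name `measurement_probabilities` and the chosen amplitudes are our own. The probabilities come out the same in all three cases, so the phase is invisible to a direct measurement and shows up only once gates let the amplitudes interfere.

```python
import cmath
import math


def measurement_probabilities(alpha, beta):
    """Probabilities of measuring 0 and 1 for the state alpha|0> + beta|1>."""
    p0, p1 = abs(alpha) ** 2, abs(beta) ** 2
    if not math.isclose(p0 + p1, 1.0):
        raise ValueError("state is not normalized")
    return p0, p1


alpha = 1 / math.sqrt(2)
for phase in (0.0, math.pi / 2, math.pi):
    beta = cmath.exp(1j * phase) / math.sqrt(2)
    p0, p1 = measurement_probabilities(alpha, beta)
    print(round(p0, 6), round(p1, 6))  # 0.5 0.5 each time
```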

Putting It All Together

Let us take a moment to see how the five topics of this chapter form a unified toolkit.

Number systems taught us that information can be encoded in different bases, and that binary encoding is the foundation of classical computing. With $n$ bits, we can represent $2^n$ states.

Probability taught us how to reason about uncertain outcomes and how probabilities must sum to 1. Quantum measurement is inherently probabilistic, so this framework carries directly into the quantum world.

Vectors gave us a way to encode information as arrows in space. Quantum states are unit vectors, and the components of those vectors (the amplitudes) encode the probabilities of different measurement outcomes.

Matrices showed us how to transform vectors. Every quantum gate is a matrix that transforms quantum states while preserving their unit length.

Complex numbers gave us the final ingredient: phase. Quantum amplitudes are complex, and the phases they carry determine interference patterns that make quantum algorithms powerful.

Together, these form the complete mathematical vocabulary you need. In the next chapter, we will see how classical computing can be made reversible - the crucial bridge between the classical world and the quantum world. And in Part II, we will put every tool from this chapter to immediate use when we write down the state of our first qubit.

Review Cards
How many distinct values can $n$ bits represent?
$2^n$ distinct values. For example, 3 bits can represent $2^3 = 8$ values (000 through 111). This exponential growth is central to quantum computing, where $n$ qubits can exist in a superposition of all $2^n$ states simultaneously.
What is the normalization condition for probabilities?
The probabilities of all outcomes in a sample space must sum to 1: $\sum_{x \in S} P(x) = 1$. In quantum computing, this becomes $|\alpha|^2 + |\beta|^2 = 1$ for a qubit with amplitudes $\alpha$ and $\beta$.
What is Bayes' rule?
$P(A | B) = \frac{P(B | A) \, P(A)}{P(B)}$. It lets you reverse conditional probabilities - computing the probability of a cause given an observed effect. It requires knowing the prior probability $P(A)$ and the likelihood $P(B | A)$.
Why must quantum states be unit vectors?
A unit vector has magnitude 1, which ensures that the measurement probabilities (the squared magnitudes of the vector's components) sum to 1. This is the normalization condition: $\|\vec{v}\| = 1$ guarantees a valid probability distribution over measurement outcomes.
If you apply gate $A$ and then gate $B$, what matrix represents the combined operation?
The matrix product $BA$ (not $AB$). The gate applied first goes on the right, closest to the state vector. This is because we evaluate $B(A|\psi\rangle)$, which equals $(BA)|\psi\rangle$. Matrix multiplication is not commutative, so order matters.
What is the complex conjugate of $z = a + bi$, and how does it relate to $|z|^2$?
The conjugate is $\bar{z} = a - bi$ (negate the imaginary part). The magnitude squared is $|z|^2 = z\bar{z} = a^2 + b^2$. In quantum mechanics, $|z|^2$ converts complex amplitudes into real probabilities via the Born rule.
State Euler's formula and explain what $e^{i\theta}$ represents geometrically.
$e^{i\theta} = \cos\theta + i\sin\theta$. Geometrically, $e^{i\theta}$ is the point on the unit circle in the complex plane at angle $\theta$ from the positive real axis. Multiplying a complex number by $e^{i\theta}$ rotates it by $\theta$ without changing its magnitude.
Why does quantum mechanics require complex numbers rather than just real numbers?
Quantum mechanics relies on interference, where amplitudes combine constructively or destructively. Real numbers only allow in-phase or out-of-phase combination. Complex numbers provide a full 360 degrees of phase, enabling the rich interference patterns that quantum algorithms exploit to amplify correct answers and suppress wrong ones.
What does it mean when the dot product of two vectors equals zero?
The vectors are orthogonal (perpendicular). Since $\vec{a} \cdot \vec{b} = \|\vec{a}\|\|\vec{b}\|\cos\theta$, a zero dot product of two nonzero vectors means $\cos\theta = 0$, so $\theta = 90^\circ$. Orthogonal vectors represent completely distinguishable states in quantum computing.
What happens when a $2 \times 2$ matrix has a determinant of zero?
The matrix has no inverse - it is singular. Geometrically, it collapses 2D space into a line or a point, losing information irreversibly. Quantum gates always have nonzero determinants (they are unitary), ensuring that quantum operations are always reversible.