Chapter 2: The Mathematics of Information
Quantum computing lives at the intersection of physics and mathematics. Before we can understand qubits, superposition, or entanglement, we need a small but powerful toolkit of mathematical ideas. None of them are individually difficult, but together they form the language in which quantum computing is written.
This chapter introduces five mathematical pillars: number systems, probability, vectors, matrices, and complex numbers. If you have basic algebra under your belt, you have everything you need to follow along. We will build each idea from scratch, connect it to concrete intuition, and show you why quantum computing demands it. By the end, you will have every mathematical tool required for the rest of this textbook.
2.1 Counting and Number Systems
We grow up counting in base 10 - the decimal system - because we happen to have ten fingers. But there is nothing sacred about the number ten. A number system is just a way of representing quantities using a fixed set of symbols, and different bases turn out to be useful for different purposes.
Decimal (Base 10)
In decimal, we have ten digits: 0 through 9. The position of each digit tells you which power of 10 it represents. For example, the number 347 means:
$$347 = 3 \times 10^2 + 4 \times 10^1 + 7 \times 10^0 = 300 + 40 + 7$$
Each position is worth ten times more than the position to its right. This is so familiar that we rarely think about it, but the same principle works for any base.
Binary (Base 2)
Computers use base 2 because their fundamental building blocks - transistors - have two states: on and off, high voltage and low voltage, 1 and 0. In binary, we have only two digits (called bits), and each position represents a power of 2.
$$1011_2 = 1 \times 2^3 + 0 \times 2^2 + 1 \times 2^1 + 1 \times 2^0 = 8 + 0 + 2 + 1 = 11_{10}$$
The powers of 2 that you will see constantly are: 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024. Memorizing this sequence up to $2^{10} = 1024$ will save you time throughout this textbook.
Hexadecimal (Base 16)
Hexadecimal (hex) uses sixteen symbols: 0-9 and A-F (where A = 10, B = 11, ..., F = 15). Hex is popular in computing because each hex digit corresponds to exactly four binary digits, making it a compact way to write binary data.
$$\text{1F}_{16} = 1 \times 16^1 + 15 \times 16^0 = 16 + 15 = 31_{10} = 11111_2$$
The conversion between hex and binary is mechanical: replace each hex digit with its four-bit binary equivalent. For instance, $\text{A3}_{16}$ becomes $1010\;0011_2$ because A = 1010 and 3 = 0011.
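The digit-by-digit substitution described above can be sketched in a few lines of Python. This is a minimal illustration rather than a definitive implementation; the names `HEX_TO_BITS` and `hex_to_binary` are our own choices for this example.

```python
# Lookup table: each hex digit (0-9, A-F) mapped to its four-bit binary group.
HEX_TO_BITS = {format(d, "X"): format(d, "04b") for d in range(16)}


def hex_to_binary(hex_string: str) -> str:
    """Replace every hex digit with its 4-bit group, separated by spaces."""
    return " ".join(HEX_TO_BITS[ch] for ch in hex_string.upper())


print(hex_to_binary("A3"))  # 1010 0011
print(hex_to_binary("1F"))  # 0001 1111
print(int("1F", 16))        # 31
```

Note that `1F` comes out as `0001 1111`, which is the same number as $11111_2$ once the leading zeros are dropped.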
Converting Between Bases
To convert from any base to decimal, expand the positional notation and add up the terms, as shown above. To convert from decimal to another base, repeatedly divide by the target base until the quotient reaches 0, recording the remainder at each step; the remainders, read from last to first, are the digits. For example, to convert $25_{10}$ to binary:
- $25 \div 2 = 12$ remainder $1$
- $12 \div 2 = 6$ remainder $0$
- $6 \div 2 = 3$ remainder $0$
- $3 \div 2 = 1$ remainder $1$
- $1 \div 2 = 0$ remainder $1$
Reading the remainders from bottom to top: $25_{10} = 11001_2$.
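The repeated-division procedure can be written as a short Python function. The sketch below assumes non-negative integers and bases from 2 to 16; `to_base`, `from_base`, and `DIGITS` are illustrative names, not part of any library.

```python
DIGITS = "0123456789ABCDEF"


def to_base(n: int, base: int) -> str:
    """Convert a non-negative integer to the given base by repeated division."""
    if not 2 <= base <= 16:
        raise ValueError("base must be between 2 and 16")
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return "0"
    remainders = []
    while n > 0:
        n, r = divmod(n, base)       # quotient carries on, remainder is a digit
        remainders.append(DIGITS[r])
    # The first remainder is the least significant digit, so read them in reverse.
    return "".join(reversed(remainders))


def from_base(digits: str, base: int) -> int:
    """Evaluate positional notation: each step multiplies by the base and adds a digit."""
    value = 0
    for ch in digits.upper():
        d = DIGITS.index(ch)
        if d >= base:
            raise ValueError(f"digit {ch!r} is not valid in base {base}")
        value = value * base + d
    return value


print(to_base(25, 2))        # 11001
print(to_base(31, 16))       # 1F
print(from_base("1011", 2))  # 11
```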
Try the interactive converter below to build intuition for how the same number looks in different bases.
2.2 Probability: Quantifying Uncertainty
In everyday life, we deal with uncertainty constantly: will it rain tomorrow? What are the odds of rolling a six? Probability gives us a precise language for reasoning about uncertain outcomes, and it turns out to be inseparable from quantum mechanics. When you measure a qubit, you do not always get a deterministic answer - you get a probabilistic one. The math of probability is therefore not optional; it is woven into the very fabric of quantum computing.
Sample Spaces and Events
A sample space $S$ is the set of all possible outcomes of an experiment. For a coin flip, $S = \{H, T\}$. For a six-sided die, $S = \{1, 2, 3, 4, 5, 6\}$. An event is any subset of the sample space. For example, "rolling an even number" is the event $\{2, 4, 6\}$.
The probability of an event $A$, written $P(A)$, is a number between 0 and 1:
$$0 \leq P(A) \leq 1$$
$P(A) = 0$ means the event is impossible. $P(A) = 1$ means it is certain. For a fair die, each face has probability $\frac{1}{6}$, and the probability of rolling an even number is:
$$P(\text{even}) = P(2) + P(4) + P(6) = \frac{1}{6} + \frac{1}{6} + \frac{1}{6} = \frac{1}{2}$$
A fundamental requirement is that the probabilities of all outcomes in the sample space must add up to 1:
$$\sum_{x \in S} P(x) = 1$$
Independent Events
Two events $A$ and $B$ are independent if the occurrence of one does not affect the probability of the other. Flipping a coin twice gives independent outcomes. For independent events, the probability of both occurring is the product of their individual probabilities:
$$P(A \text{ and } B) = P(A) \times P(B)$$
For example, the probability of flipping two heads in a row with a fair coin is $\frac{1}{2} \times \frac{1}{2} = \frac{1}{4}$.
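The die and coin calculations above can be reproduced with exact fractions. The sketch below is one way to do it; the helper `probability` and the dictionaries `die`, `coin`, and `two_flips` are names introduced only for this example, and the two-flip table is built under the assumption that the flips are independent.

```python
from fractions import Fraction
from itertools import product

# A fair six-sided die: every outcome in the sample space has probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}


def probability(event, distribution):
    """Add up the probabilities of the outcomes that make up an event."""
    return sum((distribution[outcome] for outcome in event), Fraction(0))


p_even = probability({2, 4, 6}, die)
total = probability(die.keys(), die)
print(p_even)  # 1/2
print(total)   # 1

# Two fair coin flips, assumed independent: joint probability = product.
coin = {"H": Fraction(1, 2), "T": Fraction(1, 2)}
two_flips = {(a, b): coin[a] * coin[b] for a, b in product(coin, repeat=2)}
p_two_heads = two_flips[("H", "H")]
print(p_two_heads)  # 1/4
```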
Conditional Probability
Sometimes events are not independent. The conditional probability of $A$ given $B$, written $P(A | B)$, is the probability that $A$ occurs if we know that $B$ has already occurred:
$$P(A | B) = \frac{P(A \text{ and } B)}{P(B)}$$
Consider a bag with 3 red and 2 blue marbles. You draw one marble without looking and do not put it back, then draw a second. What is the probability the second marble is red, given the first was red?
After removing one red marble, the bag has 2 red and 2 blue marbles left:
$$P(\text{2nd red} | \text{1st red}) = \frac{2}{4} = \frac{1}{2}$$
Bayes' Rule
Bayes' rule lets you "reverse" conditional probabilities. If you know $P(B | A)$ and want $P(A | B)$, Bayes' rule says:
$$P(A | B) = \frac{P(B | A) \, P(A)}{P(B)}$$
Here is a concrete example. Suppose a medical test for a rare disease is 99% accurate (it correctly identifies someone with the disease 99% of the time and correctly clears someone without it 99% of the time). The disease affects 1 in 1000 people. If you test positive, what is the probability you actually have the disease?
Let $D$ = having the disease and $+$ = testing positive. Then:
- $P(D) = 0.001$ (prevalence)
- $P(+ | D) = 0.99$ (true positive rate)
- $P(+ | \text{no } D) = 0.01$ (false positive rate)
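The step from these three numbers to the final answer can be made explicit. One way to do it is to expand $P(+)$ over the two cases "has the disease" and "does not have the disease" (the law of total probability, a standard identity not stated above), using $P(\text{no } D) = 1 - P(D) = 0.999$:

$$P(+) = P(+ | D) \, P(D) + P(+ | \text{no } D) \, P(\text{no } D) = 0.99 \times 0.001 + 0.01 \times 0.999 = 0.01098$$

$$P(D | +) = \frac{P(+ | D) \, P(D)}{P(+)} = \frac{0.00099}{0.01098} \approx 0.090$$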
Despite the test being 99% accurate, a positive result only means about a 9% chance of having the disease. The low prevalence of the disease overwhelms the test's accuracy. This kind of counterintuitive reasoning shows up frequently in quantum measurement theory, where prior knowledge about a quantum state affects how we interpret measurement results.
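Both worked examples in this section can be checked numerically. The sketch below enumerates every ordered draw of two marbles to verify the conditional probability from its definition, and then evaluates Bayes' rule for the medical test with exact fractions. The marble labels and the function names `prob`, `is_red`, and `posterior` are invented for this illustration.

```python
from fractions import Fraction
from itertools import permutations

# Bag of 3 red and 2 blue marbles. Drawing two without replacement gives
# 5 * 4 = 20 equally likely ordered pairs.
bag = ["R1", "R2", "R3", "B1", "B2"]
draws = list(permutations(bag, 2))


def is_red(marble: str) -> bool:
    return marble.startswith("R")


def prob(condition) -> Fraction:
    """Fraction of equally likely draws that satisfy the condition."""
    favourable = sum(1 for d in draws if condition(d))
    return Fraction(favourable, len(draws))


p_first_red = prob(lambda d: is_red(d[0]))
p_both_red = prob(lambda d: is_red(d[0]) and is_red(d[1]))
# Definition of conditional probability: P(A | B) = P(A and B) / P(B).
p_second_red_given_first_red = p_both_red / p_first_red
print(p_second_red_given_first_red)  # 1/2


def posterior(prior, true_positive_rate, false_positive_rate):
    """Bayes' rule for a yes/no test: P(condition | positive result)."""
    evidence = true_positive_rate * prior + false_positive_rate * (1 - prior)
    return true_positive_rate * prior / evidence


p_disease_given_positive = posterior(
    Fraction(1, 1000), Fraction(99, 100), Fraction(1, 100)
)
print(p_disease_given_positive)  # 11/122, about 0.09
```

Calling `posterior` with a larger prior is a quick way to see how strongly the answer depends on prevalence.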
2.3 Vectors: Arrows That Encode Information
A vector is one of the most useful objects in all of mathematics. At its simplest, a vector is an arrow - it has a direction and a length (magnitude). But vectors are much more than geometric arrows. They are the fundamental language for describing quantum states. When we write the state of a qubit, we write it as a vector. Understanding vectors is therefore not just helpful for quantum computing - it is essential.
Vectors in Two Dimensions
A 2D vector $\vec{v}$ can be written as a pair of numbers:
$$\vec{v} = \begin{pmatrix} v_x \\ v_y \end{pmatrix}$$
The number $v_x$ is the horizontal component and $v_y$ is the vertical component. Geometrically, $\vec{v}$ is an arrow from the origin $(0,0)$ to the point $(v_x, v_y)$. For example, $\vec{v} = \begin{pmatrix} 3 \\ 2 \end{pmatrix}$ is an arrow pointing 3 units right and 2 units up.
Vector Addition
To add two vectors, add their corresponding components:
$$\begin{pmatrix} a_x \\ a_y \end{pmatrix} + \begin{pmatrix} b_x \\ b_y \end{pmatrix} = \begin{pmatrix} a_x + b_x \\ a_y + b_y \end{pmatrix}$$
Geometrically, this is the "tip-to-tail" rule: place the tail of $\vec{b}$ at the tip of $\vec{a}$, and the sum $\vec{a} + \vec{b}$ is the arrow from the tail of $\vec{a}$ to the tip of $\vec{b}$. Adding in the opposite order ($\vec{b}$ first, then $\vec{a}$) reaches the same point, and the two paths together form a parallelogram whose diagonal is the sum.
Scalar Multiplication
Multiplying a vector by a number (a scalar) $c$ scales every component:
$$c \begin{pmatrix} v_x \\ v_y \end{pmatrix} = \begin{pmatrix} c \, v_x \\ c \, v_y \end{pmatrix}$$
If $c > 1$, the arrow gets longer. If $0 < c < 1$, it gets shorter. If $c < 0$, it flips direction. If $c = 0$, it collapses to the zero vector.
Magnitude and Unit Vectors
The magnitude (length) of a vector $\vec{v} = \begin{pmatrix} v_x \\ v_y \end{pmatrix}$ is given by the Pythagorean theorem:
$$\|\vec{v}\| = \sqrt{v_x^2 + v_y^2}$$
A unit vector is a vector with magnitude 1. You can turn any nonzero vector into a unit vector by dividing by its magnitude: $\hat{v} = \frac{\vec{v}}{\|\vec{v}\|}$.
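The vector operations introduced so far (addition, scalar multiplication, magnitude, and normalization) translate almost directly into code. The following is a minimal sketch that represents vectors as plain Python tuples; the function names are our own, and a numerical library would normally be used for larger vectors.

```python
import math


def add(a, b):
    """Component-wise sum of two vectors of equal length."""
    return tuple(x + y for x, y in zip(a, b))


def scale(c, v):
    """Multiply every component of v by the scalar c."""
    return tuple(c * x for x in v)


def magnitude(v):
    """Length of v from the Pythagorean theorem."""
    return math.sqrt(sum(x * x for x in v))


def normalize(v):
    """Return the unit vector pointing in the same direction as v."""
    length = magnitude(v)
    if length == 0:
        raise ValueError("the zero vector has no direction")
    return tuple(x / length for x in v)


print(add((3, 2), (1, -1)))    # (4, 1)
print(scale(2, (3, 2)))        # (6, 4)
print(magnitude((3.0, 4.0)))   # 5.0
unit = normalize((3.0, 4.0))   # a vector of length 1, up to rounding
print(magnitude(unit))
```

Because the functions loop over components rather than assuming exactly two, the same code works unchanged for vectors with more components.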
The Dot Product
The dot product of two vectors measures how much they point in the same direction:
$$\vec{a} \cdot \vec{b} = a_x b_x + a_y b_y = \|\vec{a}\| \, \|\vec{b}\| \cos\theta$$
where $\theta$ is the angle between them. If the dot product of two nonzero vectors is zero, the vectors are orthogonal (perpendicular). In two dimensions, any two orthogonal unit vectors form a basis - a coordinate system in which every vector can be expressed as a combination of them.
For example, the standard basis vectors $\vec{e}_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $\vec{e}_2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$ are orthogonal unit vectors, and any 2D vector can be written as $\vec{v} = v_x \vec{e}_1 + v_y \vec{e}_2$.
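A short numerical check can make the dot product and the idea of a basis more concrete. In the sketch below, `dot` and `angle_between` are illustrative helper names. It relies on a standard fact that goes slightly beyond the text: for an orthonormal basis, the component of a vector along a basis vector is its dot product with that basis vector.

```python
import math


def dot(a, b):
    """Sum of the products of corresponding components."""
    return sum(x * y for x, y in zip(a, b))


def angle_between(a, b):
    """Angle in radians, from a . b = |a| |b| cos(theta). Both must be nonzero."""
    cos_theta = dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    return math.acos(max(-1.0, min(1.0, cos_theta)))


e1 = (1.0, 0.0)
e2 = (0.0, 1.0)
v = (3.0, 2.0)

print(dot(e1, e2))                # 0.0, so e1 and e2 are orthogonal
print(dot(v, e1), dot(v, e2))     # 3.0 2.0, the components of v
print(angle_between(e1, e2))      # pi / 2 radians
```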
Extending to Three Dimensions and Beyond
Everything above extends naturally. A 3D vector has three components: $\vec{v} = \begin{pmatrix} v_x \\ v_y \\ v_z \end{pmatrix}$, with magnitude $\|\vec{v}\| = \sqrt{v_x^2 + v_y^2 + v_z^2}$. In quantum computing, we will work with vectors that have 2, 4, 8, or even $2^n$ components - one for each possible state of $n$ qubits. The algebra stays the same; only the number of components grows.
Use the visualizer below to explore how vectors add and scale in 2D.
2.4 Matrices: Transformations on Vectors
If vectors are the nouns of linear algebra, matrices are the verbs. A matrix is a rectangular grid of numbers that describes a transformation - a way of turning one vector into another. In quantum computing, every operation on a qubit (every quantum gate) is a matrix. When we apply a gate to a qubit, we are multiplying the qubit's state vector by a matrix. This is why understanding matrices is absolutely essential.
What Is a Matrix?
A matrix is a rectangular array of numbers arranged in rows and columns. A matrix with $m$ rows and $n$ columns is called an $m \times n$ matrix. For example, a $2 \times 2$ matrix looks like:
$$M = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$$
The entry in row $i$, column $j$ is denoted $M_{ij}$. In the matrix above, $M_{11} = a$, $M_{12} = b$, $M_{21} = c$, and $M_{22} = d$.
Matrix-Vector Multiplication
The most important matrix operation for us is multiplying a matrix by a vector. For a $2 \times 2$ matrix and a 2D vector:
$$\begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} ax + by \\ cx + dy \end{pmatrix}$$
Each entry of the result is a dot product of one row of the matrix with the input vector. This is how a matrix transforms a vector: it takes in one vector and produces a new one.
For example, consider the matrix that rotates vectors by 90 degrees counterclockwise:
$$R = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$$
Applying it to $\vec{v} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$:
$$R\vec{v} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$$
The vector $(1, 0)$ pointing right has been rotated to $(0, 1)$ pointing up - exactly 90 degrees.
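The row-by-row dot product rule is easy to express in code. Here is a minimal sketch, with matrices stored as lists of rows; `mat_vec` is a name chosen for this example only.

```python
def mat_vec(m, v):
    """Multiply matrix m (a list of rows) by vector v: one dot product per row."""
    return [sum(entry * x for entry, x in zip(row, v)) for row in m]


# The 90-degree counterclockwise rotation matrix from the text.
R = [[0, -1],
     [1, 0]]

print(mat_vec(R, [1, 0]))  # [0, 1]   right -> up
print(mat_vec(R, [0, 1]))  # [-1, 0]  up -> left
```

Applying `R` a second time to `[0, 1]` shows the arrow continuing to turn counterclockwise, from up to left.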
Matrix-Matrix Multiplication
You can also multiply two matrices together. The result is a new matrix that represents doing both transformations in sequence. For $2 \times 2$ matrices:
$$\begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} e & f \\ g & h \end{pmatrix} = \begin{pmatrix} ae+bg & af+bh \\ ce+dg & cf+dh \end{pmatrix}$$
Each entry of the result is a dot product of a row from the first matrix with a column from the second. A critical fact: matrix multiplication is not commutative. In general, $AB \neq BA$. The order in which you apply transformations matters - rotating and then stretching along one axis is different from stretching along that axis and then rotating.
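Non-commutativity is easiest to believe after seeing it happen. The sketch below multiplies the 90-degree rotation matrix from earlier with a matrix `S` that stretches the horizontal axis by a factor of 2. Both `S` and the helper `mat_mul` are introduced here purely for illustration. Recall that in a product such as $SR$ applied to a vector, the right-hand matrix acts first.

```python
def mat_mul(a, b):
    """Matrix product: entry (i, j) is row i of a dotted with column j of b."""
    columns_of_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in columns_of_b]
            for row in a]


R = [[0, -1],
     [1, 0]]   # rotate 90 degrees counterclockwise
S = [[2, 0],
     [0, 1]]   # stretch the horizontal axis by 2 (example matrix)

print(mat_mul(S, R))  # [[0, -2], [1, 0]]  rotate first, then stretch
print(mat_mul(R, S))  # [[0, -1], [2, 0]]  stretch first, then rotate
```

The two products differ, so the order of the two transformations matters.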
The Identity Matrix
The identity matrix $I$ is the "do nothing" transformation:
$$I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$
For any vector $\vec{v}$, $I\vec{v} = \vec{v}$. For any matrix $M$, $MI = IM = M$. It is the matrix equivalent of multiplying by 1.
Matrix Inverses
A matrix $M$ has an inverse $M^{-1}$ if $M M^{-1} = M^{-1} M = I$. The inverse "undoes" the transformation. If $M$ rotates by 90 degrees, $M^{-1}$ rotates by -90 degrees.
For a $2 \times 2$ matrix $M = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$, the inverse is:
$$M^{-1} = \frac{1}{ad - bc} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$$
The quantity $ad - bc$ is the determinant. If the determinant is zero, the matrix has no inverse - it squashes space into a lower dimension and the transformation cannot be reversed.
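The inverse formula can be tried out directly. The following sketch assumes integer matrix entries so that exact fractions can be used and no rounding error creeps in; `inverse_2x2` and `mat_mul` are illustrative names.

```python
from fractions import Fraction


def inverse_2x2(m):
    """Inverse of a 2x2 integer matrix via the determinant formula."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("determinant is zero, so the matrix has no inverse")
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]


def mat_mul(a, b):
    """Matrix product, used here only to check that M times its inverse is I."""
    columns_of_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in columns_of_b]
            for row in a]


R = [[0, -1],
     [1, 0]]                # rotate by 90 degrees
R_inv = inverse_2x2(R)      # equals [[0, 1], [-1, 0]]: rotate by -90 degrees

M = [[4, 7],
     [2, 6]]                # determinant 4*6 - 7*2 = 10
check = mat_mul(M, inverse_2x2(M))   # equals the identity matrix
```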
Use the calculator below to experiment with $2 \times 2$ matrix operations.
2.5 Complex Numbers: The Missing Piece
If you have worked only with real numbers, you might wonder: why would we ever need anything else? The answer comes from a simple equation that has no real solution:
$$x^2 = -1$$
No real number, when squared, gives a negative result. Mathematicians resolved this by inventing a new number, $i$, defined by the property:
$$i^2 = -1$$
The number $i$ is called the imaginary unit. This is an unfortunate name - there is nothing imaginary about it. Complex numbers are as real and useful as any other kind of number. They show up whenever you need to describe phenomena involving rotation, oscillation, or waves - which is to say, they show up everywhere in physics, and especially in quantum mechanics.
The Anatomy of a Complex Number
A complex number $z$ has the form:
$$z = a + bi$$
where $a$ is the real part and $b$ is the imaginary part. Both $a$ and $b$ are ordinary real numbers. We write $\text{Re}(z) = a$ and $\text{Im}(z) = b$.
Arithmetic with complex numbers follows the rules you already know, with the extra rule that $i^2 = -1$:
- Addition: $(a + bi) + (c + di) = (a+c) + (b+d)i$
- Multiplication: $(a + bi)(c + di) = ac + adi + bci + bdi^2 = (ac - bd) + (ad + bc)i$
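The two rules above can be implemented by hand and compared against Python's built-in complex type, which writes the imaginary unit as `j`. In this sketch a complex number $a + bi$ is stored as the pair `(a, b)`; the function names `c_add` and `c_mul` are our own.

```python
def c_add(z, w):
    """(a + bi) + (c + di) = (a + c) + (b + d)i"""
    (a, b), (c, d) = z, w
    return (a + c, b + d)


def c_mul(z, w):
    """(a + bi)(c + di) = (ac - bd) + (ad + bc)i"""
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)


z, w = (3, 2), (1, -4)            # 3 + 2i and 1 - 4i
print(c_add(z, w))                # (4, -2)
print(c_mul(z, w))                # (11, -10)
print(c_mul((0, 1), (0, 1)))      # (-1, 0), that is, i * i = -1

# The built-in type agrees with the hand-written rules.
print(complex(3, 2) * complex(1, -4))  # (11-10j)
```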
The Complex Plane
Every complex number can be plotted on a 2D plane, with the real part on the horizontal axis and the imaginary part on the vertical axis. This is the complex plane (also called the Argand diagram). The complex number $3 + 2i$ sits at the point $(3, 2)$.
Magnitude and Conjugate
The magnitude (or absolute value) of $z = a + bi$ is the distance from the origin to the point $(a, b)$:
$$|z| = \sqrt{a^2 + b^2}$$
The complex conjugate of $z = a + bi$ is:
$$\bar{z} = z^* = a - bi$$
It reflects $z$ across the real axis. A crucial identity links the conjugate to the magnitude:
$$z \bar{z} = (a + bi)(a - bi) = a^2 + b^2 = |z|^2$$
Polar Form
Instead of specifying a complex number by its real and imaginary parts, we can use its magnitude $r = |z|$ and its angle $\theta$ (measured counterclockwise from the positive real axis):
$$z = r(\cos\theta + i\sin\theta)$$
The angle $\theta$ is called the phase or argument of $z$. Two complex numbers with the same magnitude but different phases sit on the same circle centered at the origin, but at different positions on that circle.
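Magnitude, conjugate, and polar form are all available in Python's standard library, which makes them easy to experiment with. The sketch below uses the example value $z = 3 + 4i$, chosen only because its magnitude comes out to a whole number.

```python
import cmath
import math

z = 3 + 4j

magnitude = abs(z)             # distance from the origin: 5.0
conjugate = z.conjugate()      # reflection across the real axis: (3-4j)
product = z * conjugate        # equals |z|^2 = 25, with zero imaginary part

# Polar form: cmath.polar returns the pair (r, theta), theta in radians.
r, theta = cmath.polar(z)

# Rebuild z from r (cos(theta) + i sin(theta)) to confirm the two forms agree.
rebuilt = complex(r * math.cos(theta), r * math.sin(theta))

print(magnitude, conjugate, product)
print(r, theta)
print(rebuilt)
```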
Euler's Formula
One of the most beautiful results in mathematics connects the exponential function to trigonometry:
$$e^{i\theta} = \cos\theta + i\sin\theta$$
This is Euler's formula. It tells us that the complex exponential $e^{i\theta}$ traces out the unit circle in the complex plane as $\theta$ varies. Using Euler's formula, the polar form becomes wonderfully compact:
$$z = r e^{i\theta}$$
Multiplication in polar form is especially elegant: you multiply magnitudes and add angles:
$$z_1 z_2 = r_1 r_2 \, e^{i(\theta_1 + \theta_2)}$$
This means multiplying by $e^{i\theta}$ is a pure rotation by angle $\theta$ - the magnitude stays the same, but the direction changes. When $\theta = \pi$, Euler's formula gives us the famous identity:
$$e^{i\pi} + 1 = 0$$
Why Quantum Mechanics Needs Complex Numbers
You might wonder: could we describe quantum computing with just real numbers? Not in the standard formulation, and here is the intuition. Quantum mechanics is fundamentally a theory of interference. Two quantum states can combine and either reinforce each other (constructive interference) or cancel each other out (destructive interference), depending on their relative phase.
Real numbers can only be positive or negative, giving you two options: add up or cancel. Complex numbers, with their full 360 degrees of phase, allow for a much richer set of interference patterns. The phase $\theta$ in $e^{i\theta}$ is the invisible "clock hand" that determines how quantum states interact. Quantum algorithms like Shor's factoring algorithm and Grover's search algorithm depend on it, because they rely on carefully engineering interference to amplify correct answers and suppress wrong ones.
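The role of phase in interference can be illustrated with a deliberately simple toy model, which is a sketch under stated assumptions rather than a simulation of any real quantum algorithm. Two contributions of equal magnitude $\frac{1}{2}$ are added, one of them multiplied by $e^{i\theta}$, and the squared magnitude of the sum is taken as the resulting probability (the standard rule connecting amplitudes to probabilities, which later chapters develop). The function names and the choice of magnitude $\frac{1}{2}$ are ours.

```python
import cmath
import math


def phase_factor(theta):
    """e^{i theta}: a point on the unit circle at angle theta."""
    return cmath.exp(1j * theta)


def combined_probability(relative_phase):
    """|a + b|^2 for a = 1/2 and b = (1/2) e^{i * relative_phase} (toy model)."""
    a = 0.5
    b = 0.5 * phase_factor(relative_phase)
    return abs(a + b) ** 2


# Euler's identity: e^{i pi} + 1 is zero up to floating-point rounding.
print(abs(phase_factor(math.pi) + 1))

# Multiplying by a phase factor rotates a number without changing its length.
z = 3 + 4j
print(abs(z), abs(z * phase_factor(1.0)))

# Interference as the relative phase varies from 0 to pi.
for theta in (0.0, math.pi / 2, math.pi):
    print(theta, combined_probability(theta))
```

At relative phase 0 the contributions reinforce each other fully, at $\pi$ they cancel, and in between the result varies smoothly, which is exactly the extra freedom that a sign alone cannot provide.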
Explore complex numbers on the complex plane below. Enter a number in either rectangular ($a + bi$) or polar ($r e^{i\theta}$) form and see it plotted.
Putting It All Together
Let us take a moment to see how the five topics of this chapter form a unified toolkit.
Number systems taught us that information can be encoded in different bases, and that binary encoding is the foundation of classical computing. With $n$ bits, we can represent $2^n$ states.
Probability taught us how to reason about uncertain outcomes and how probabilities must sum to 1. Quantum measurement is inherently probabilistic, so this framework carries directly into the quantum world.
Vectors gave us a way to encode information as arrows in space. Quantum states are unit vectors, and the components of those vectors (the amplitudes) encode the probabilities of different measurement outcomes.
Matrices showed us how to transform vectors. Every quantum gate is a matrix that transforms quantum states while preserving their unit length.
Complex numbers gave us the final ingredient: phase. Quantum amplitudes are complex, and the phases they carry determine interference patterns that make quantum algorithms powerful.
Together, these form the complete mathematical vocabulary you need. In the next chapter, we will see how classical computing can be made reversible - the crucial bridge between the classical world and the quantum world. And in Part II, we will put every tool from this chapter to immediate use when we write down the state of our first qubit.