Chapter 26: Quantum States and Measurements, Revisited

In earlier chapters we described quantum states as ket vectors $|\psi\rangle$ living in a Hilbert space. That description works beautifully for isolated systems in perfectly known states, but much of quantum information theory requires something more general. What if we have incomplete knowledge about which state was prepared? What if we only have access to part of a larger entangled system? What if our measurement apparatus cannot distinguish certain outcomes? The density operator, the partial trace, and the POVM framework answer all three questions. This chapter revisits quantum states and measurements with the mature mathematical tools that underpin quantum information theory.

26.1 Pure States, Mixed States, and the Density Operator

A pure state $|\psi\rangle$ encodes complete knowledge of a quantum system. Its physics is fully captured by the state vector (up to a global phase). But suppose Alice prepares a qubit in the state $|0\rangle$ with probability $p$ and in the state $|1\rangle$ with probability $1 - p$. Bob, who does not know Alice's choice, cannot describe his qubit with a single ket vector. He needs a mixed state.

The mathematical object that handles both pure and mixed states uniformly is the density operator (or density matrix):

$$\rho = \sum_i p_i |\psi_i\rangle\langle\psi_i|$$

where $\{p_i\}$ is a probability distribution ($p_i \geq 0$, $\sum_i p_i = 1$) and $|\psi_i\rangle$ are normalized (but not necessarily orthogonal) state vectors. Each term $|\psi_i\rangle\langle\psi_i|$ is the outer product or projector onto the state $|\psi_i\rangle$.

Key Concept.

A density operator $\rho$ is a positive semi-definite, Hermitian operator with unit trace: (1) $\rho = \rho^\dagger$, (2) $\langle\phi|\rho|\phi\rangle \geq 0$ for all $|\phi\rangle$, and (3) $\text{Tr}(\rho) = 1$. Conversely, every operator satisfying these three properties is a valid density operator.

Pure vs. Mixed: The Purity Test

A state $\rho$ is pure if and only if $\rho^2 = \rho$ (i.e., $\rho$ is a projector), or equivalently $\text{Tr}(\rho^2) = 1$. For a mixed state, $\text{Tr}(\rho^2) < 1$. The quantity $\text{Tr}(\rho^2)$ is called the purity of the state. For a $d$-dimensional system it ranges from $1/d$ (the maximally mixed state $\rho = I/d$) to $1$ (a pure state).
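The three defining properties and the purity test translate directly into a few lines of NumPy. The following is a minimal sketch; the function names `is_density_matrix` and `purity` and the tolerance `tol` are illustrative choices, not part of any library.

```python
import numpy as np

def is_density_matrix(rho, tol=1e-10):
    """Check Hermiticity, positive semi-definiteness, and unit trace."""
    rho = np.asarray(rho, dtype=complex)
    hermitian = np.allclose(rho, rho.conj().T, atol=tol)
    # eigvalsh assumes a Hermitian input, so only use it if that check passed
    positive = hermitian and np.all(np.linalg.eigvalsh(rho) >= -tol)
    unit_trace = abs(np.trace(rho) - 1) < tol
    return bool(hermitian and positive and unit_trace)

def purity(rho):
    """Return Tr(rho^2), which is real for a Hermitian rho."""
    rho = np.asarray(rho, dtype=complex)
    return float(np.real(np.trace(rho @ rho)))

pure = np.array([[1, 0], [0, 0]], dtype=complex)   # |0><0|
mixed = np.eye(2, dtype=complex) / 2               # maximally mixed qubit I/2

print(is_density_matrix(pure), purity(pure))
print(is_density_matrix(mixed), purity(mixed))
```

For the qubit ($d = 2$) the two printed purities sit at the ends of the allowed range, $1$ and $1/d = 1/2$.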

Single-Qubit Density Matrices

For a single qubit, any density matrix can be written in terms of the Pauli matrices:

$$\rho = \frac{1}{2}(I + \vec{r} \cdot \vec{\sigma}) = \frac{1}{2}\begin{pmatrix} 1 + r_z & r_x - i r_y \\ r_x + i r_y & 1 - r_z \end{pmatrix}$$

where $\vec{r} = (r_x, r_y, r_z)$ is the Bloch vector and $\vec{\sigma} = (\sigma_x, \sigma_y, \sigma_z)$ are the Pauli matrices. Pure states live on the surface of the Bloch sphere ($|\vec{r}| = 1$), mixed states occupy the interior ($|\vec{r}| < 1$), and the maximally mixed state sits at the origin ($\vec{r} = \vec{0}$).
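As a quick numerical illustration, the sketch below converts between a Bloch vector and its density matrix and checks the relation $\text{Tr}(\rho^2) = \frac{1}{2}(1 + |\vec{r}|^2)$, which follows from the Pauli expansion above. The helper names `bloch_to_rho` and `rho_to_bloch` and the sample vector are illustrative.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def bloch_to_rho(r):
    """rho = (I + r . sigma) / 2 for a Bloch vector r = (rx, ry, rz)."""
    rx, ry, rz = r
    return 0.5 * (I2 + rx * X + ry * Y + rz * Z)

def rho_to_bloch(rho):
    """Recover the Bloch vector from r_k = Tr(rho sigma_k)."""
    return np.array([np.real(np.trace(rho @ P)) for P in (X, Y, Z)])

r = np.array([0.3, -0.2, 0.5])          # |r| < 1, so this is a mixed state
rho = bloch_to_rho(r)
purity = np.real(np.trace(rho @ rho))

print(rho_to_bloch(rho))                # round trip returns r
print(purity, 0.5 * (1 + r @ r))        # the two values agree
```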

Expectation Values and the Born Rule

The density operator encapsulates all statistical predictions. For any observable $A$, the expectation value is:

$$\langle A \rangle = \text{Tr}(\rho A)$$

The probability of measurement outcome $m$ associated with projector $P_m$ is:

$$p(m) = \text{Tr}(\rho P_m)$$

After obtaining outcome $m$, the post-measurement state is:

$$\rho' = \frac{P_m \rho P_m}{\text{Tr}(\rho P_m)}$$

This generalizes the Born rule and the projection postulate to mixed states. The density operator formalism lets us compute everything we could compute with kets, but it also handles statistical mixtures and subsystems of entangled states - situations that kets alone cannot describe.
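The three formulas above can be exercised on a small example. The sketch below uses a hypothetical mixed qubit state and a $Z$-basis measurement; the function name `projective_measurement` is an illustrative choice.

```python
import numpy as np

def projective_measurement(rho, projectors):
    """Return a list of (p(m), post-measurement state) for each projector P_m."""
    outcomes = []
    for P in projectors:
        unnormalized = P @ rho @ P
        p = float(np.real(np.trace(unnormalized)))      # equals Tr(rho P_m)
        post = unnormalized / p if p > 1e-12 else None  # undefined when p(m) = 0
        outcomes.append((p, post))
    return outcomes

# Hypothetical example state (Hermitian, positive, unit trace)
rho = np.array([[0.75, 0.25], [0.25, 0.25]], dtype=complex)
P0 = np.array([[1, 0], [0, 0]], dtype=complex)   # projector onto |0>
P1 = np.array([[0, 0], [0, 1]], dtype=complex)   # projector onto |1>
Z = P0 - P1                                      # observable with eigenvalues +1, -1

outcomes = projective_measurement(rho, [P0, P1])
expectation_Z = float(np.real(np.trace(rho @ Z)))

print([p for p, _ in outcomes])   # outcome probabilities
print(expectation_Z)              # equals p(0) - p(1)
```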

Non-Uniqueness of Ensembles

An important subtlety: different ensembles can produce the same density operator. For example, the maximally mixed qubit state $\rho = I/2$ can arise from a 50/50 mixture of $|0\rangle$ and $|1\rangle$, or from a 50/50 mixture of $|+\rangle$ and $|-\rangle$, or indeed from any ensemble $\{p_i, |\psi_i\rangle\}$ whose weighted sum of projectors equals $I/2$. No measurement can distinguish these preparations. The density operator captures exactly the physically accessible information - nothing more, nothing less.
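The two ensembles named above can be compared numerically. This is a small sketch; `ensemble_to_rho` is an illustrative helper name.

```python
import numpy as np

def ensemble_to_rho(probs, kets):
    """Density operator sum_i p_i |psi_i><psi_i| of an ensemble."""
    return sum(p * np.outer(k, k.conj()) for p, k in zip(probs, kets))

ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)
plus = (ket0 + ket1) / np.sqrt(2)
minus = (ket0 - ket1) / np.sqrt(2)

rho_z = ensemble_to_rho([0.5, 0.5], [ket0, ket1])    # 50/50 mixture of |0>, |1>
rho_x = ensemble_to_rho([0.5, 0.5], [plus, minus])   # 50/50 mixture of |+>, |->

print(np.allclose(rho_z, rho_x))   # both equal I/2
```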

Note.

The non-uniqueness of ensemble decompositions is governed by a deep result: two ensembles $\{p_i, |\psi_i\rangle\}$ and $\{q_j, |\phi_j\rangle\}$ give the same density operator if and only if $\sqrt{p_i}|\psi_i\rangle = \sum_j u_{ij}\sqrt{q_j}|\phi_j\rangle$ for some unitary matrix $u$, where the smaller ensemble is padded with zero vectors so that both have the same number of elements. This is the Hughston-Jozsa-Wootters (HJW) theorem, sometimes called the GHJW theorem.

Worked Example: Computing a Density Matrix

Suppose Alice prepares $|+\rangle = \frac{1}{\sqrt{2}}(|0\rangle + |1\rangle)$ with probability $1/3$ and $|0\rangle$ with probability $2/3$. The density matrix is:

$$\rho = \frac{1}{3}|+\rangle\langle+| + \frac{2}{3}|0\rangle\langle 0|$$

Computing each term:

$$|+\rangle\langle+| = \frac{1}{2}\begin{pmatrix}1 & 1 \\ 1 & 1\end{pmatrix}, \quad |0\rangle\langle 0| = \begin{pmatrix}1 & 0 \\ 0 & 0\end{pmatrix}$$

$$\rho = \frac{1}{3}\cdot\frac{1}{2}\begin{pmatrix}1 & 1 \\ 1 & 1\end{pmatrix} + \frac{2}{3}\begin{pmatrix}1 & 0 \\ 0 & 0\end{pmatrix} = \begin{pmatrix}5/6 & 1/6 \\ 1/6 & 1/6\end{pmatrix}$$

We can verify: $\text{Tr}(\rho) = 5/6 + 1/6 = 1$. The eigenvalues are $\lambda_\pm = \frac{1}{2}(1 \pm \sqrt{1 - 4 \cdot 4/36}) = \frac{1}{2}(1 \pm \frac{\sqrt{5}}{3})$, giving $\lambda_+ \approx 0.873$ and $\lambda_- \approx 0.127$. The purity is $\text{Tr}(\rho^2) = \lambda_+^2 + \lambda_-^2 \approx 0.778$, confirming the state is mixed. The Bloch vector has components $r_x = 2\text{Re}(\rho_{01}) = 1/3$, $r_y = 2\text{Im}(\rho_{10}) = 0$, and $r_z = \rho_{00} - \rho_{11} = 2/3$, with $|\vec{r}| = \sqrt{1/9 + 4/9} = \sqrt{5}/3 \approx 0.745 < 1$ - inside the Bloch sphere.
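The numbers in this worked example are easy to double-check. The sketch below rebuilds $\rho$ from the ensemble and recomputes the eigenvalues, purity, and Bloch vector; variable names are illustrative.

```python
import numpy as np

ket0 = np.array([1, 0], dtype=complex)
plus = np.array([1, 1], dtype=complex) / np.sqrt(2)

# rho = (1/3)|+><+| + (2/3)|0><0|
rho = (1 / 3) * np.outer(plus, plus.conj()) + (2 / 3) * np.outer(ket0, ket0.conj())

eigenvalues = np.linalg.eigvalsh(rho)              # returned in ascending order
purity = float(np.real(np.trace(rho @ rho)))
bloch = np.array([2 * rho[0, 1].real,              # r_x = 2 Re(rho_01)
                  2 * rho[1, 0].imag,              # r_y = 2 Im(rho_10)
                  (rho[0, 0] - rho[1, 1]).real])   # r_z = rho_00 - rho_11

print(np.round(rho.real, 4))
print(np.round(eigenvalues, 3), round(purity, 3), round(float(np.linalg.norm(bloch)), 3))
```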

Time Evolution of Density Matrices

Under unitary evolution with Hamiltonian $H$, the density operator evolves according to the von Neumann equation:

$$i\hbar \frac{d\rho}{dt} = [H, \rho]$$

This is the density matrix analog of the Schrödinger equation. The solution is $\rho(t) = U(t)\rho(0)U^\dagger(t)$ where $U(t) = e^{-iHt/\hbar}$. The von Neumann equation preserves the eigenvalues of $\rho$ (and hence its entropy), reflecting the fact that unitary evolution is reversible and cannot change the "mixedness" of a state.
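The invariance of the spectrum can be seen in a short numerical sketch. The Hamiltonian $H = \frac{1}{2}\sigma_x$, the units with $\hbar = 1$, the evolution time, and the initial state are all assumptions chosen for illustration; `evolve` is a hypothetical helper that builds $U(t)$ from the eigendecomposition of $H$.

```python
import numpy as np

def evolve(rho0, H, t, hbar=1.0):
    """Return U rho0 U^dagger with U = exp(-i H t / hbar), for Hermitian H."""
    energies, V = np.linalg.eigh(H)
    U = V @ np.diag(np.exp(-1j * energies * t / hbar)) @ V.conj().T
    return U @ rho0 @ U.conj().T

H = 0.5 * np.array([[0, 1], [1, 0]], dtype=complex)            # assumed Hamiltonian
rho0 = np.array([[5/6, 1/6], [1/6, 1/6]], dtype=complex)       # a mixed qubit state
rho_t = evolve(rho0, H, t=1.3)

print(np.round(np.linalg.eigvalsh(rho0), 6))
print(np.round(np.linalg.eigvalsh(rho_t), 6))   # same spectrum, different matrix
```

The matrix entries of $\rho$ change with time, but the eigenvalues, and therefore the purity and entropy, do not.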

When the system interacts with an environment, the evolution is no longer unitary on the system alone, and the density matrix can become more mixed over time. This is the process of decoherence, which we formalize using quantum channels in Chapter 27.

26.2 The Partial Trace: Describing Subsystems

Suppose we have a bipartite quantum system $AB$ in state $\rho_{AB}$. How do we describe the state of subsystem $A$ alone? The answer is the partial trace over $B$:

$$\rho_A = \text{Tr}_B(\rho_{AB}) = \sum_j (I_A \otimes \langle j|_B) \rho_{AB} (I_A \otimes |j\rangle_B)$$

where $\{|j\rangle_B\}$ is any orthonormal basis for system $B$. The result is independent of which basis we choose - a fact that follows from the linearity and cyclic property of the trace.

Key Concept.

The partial trace is the unique operation that correctly describes local measurements on a subsystem. If we measure only observables on $A$ (of the form $M_A \otimes I_B$), then $\text{Tr}(\rho_{AB}(M_A \otimes I_B)) = \text{Tr}(\rho_A M_A)$. The partial trace is not just a convenient tool - it is the mathematically necessary prescription for extracting subsystem descriptions from composite states.

Example: Bell State

Consider the Bell state $|\Phi^+\rangle = \frac{1}{\sqrt{2}}(|00\rangle + |11\rangle)$. The density matrix of the full system is:

$$\rho_{AB} = |\Phi^+\rangle\langle\Phi^+| = \frac{1}{2}(|00\rangle\langle 00| + |00\rangle\langle 11| + |11\rangle\langle 00| + |11\rangle\langle 11|)$$

Tracing out $B$:

$$\rho_A = \text{Tr}_B(\rho_{AB}) = \frac{1}{2}(|0\rangle\langle 0| + |1\rangle\langle 1|) = \frac{I}{2}$$

Although the joint system is in a pure entangled state, subsystem $A$ is maximally mixed. This is a hallmark of entanglement: a pure state of the whole can produce a mixed state of a part. This never happens classically - classically, if you know the state of the whole, you know the state of every part.
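A compact way to compute a partial trace numerically is to reshape the $d_A d_B \times d_A d_B$ matrix into a four-index tensor and sum over the two $B$ indices. The sketch below does this for the Bell state and also checks the identity $\text{Tr}(\rho_{AB}(M_A \otimes I_B)) = \text{Tr}(\rho_A M_A)$ for an arbitrary Hermitian $M_A$; the function name and the test observable are illustrative.

```python
import numpy as np

def partial_trace_B(rho_AB, dA, dB):
    """Trace out B: (rho_A)_{a a'} = sum_b rho_{(a b), (a' b)}."""
    tensor = rho_AB.reshape(dA, dB, dA, dB)   # indices (a, b, a', b')
    return np.einsum('abcb->ac', tensor)

bell = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)   # (|00> + |11>)/sqrt(2)
rho_AB = np.outer(bell, bell.conj())
rho_A = partial_trace_B(rho_AB, 2, 2)

M_A = np.array([[1.0, 0.5], [0.5, -0.3]], dtype=complex)    # arbitrary observable on A
joint = np.real(np.trace(rho_AB @ np.kron(M_A, np.eye(2))))
local = np.real(np.trace(rho_A @ M_A))

print(np.round(rho_A.real, 3))   # I/2
print(joint, local)              # equal
```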

The Schmidt Decomposition

For any pure bipartite state $|\psi\rangle_{AB}$, there exist orthonormal bases $\{|a_i\rangle\}$ for $A$ and $\{|b_i\rangle\}$ for $B$ such that:

$$|\psi\rangle_{AB} = \sum_{i=1}^{r} \lambda_i |a_i\rangle|b_i\rangle$$

where $\lambda_i > 0$, $\sum_i \lambda_i^2 = 1$, and $r \leq \min(d_A, d_B)$. The $\lambda_i$ are the Schmidt coefficients and $r$ is the Schmidt rank. This decomposition always exists and the coefficients $\lambda_i$ are unique (though the bases may not be when coefficients are degenerate).

The Schmidt decomposition reveals the entanglement structure at a glance:

$r = 1$: the state is a product state $|\psi\rangle_A \otimes |\phi\rangle_B$ - no entanglement.
$r > 1$: the state is entangled. The larger $r$ is, and the more uniform the Schmidt coefficients are, the more entangled the state.
$r = \min(d_A, d_B)$ with all $\lambda_i$ equal: maximally entangled.

The reduced density matrices are immediately given by the Schmidt coefficients:

$$\rho_A = \sum_i \lambda_i^2 |a_i\rangle\langle a_i|, \quad \rho_B = \sum_i \lambda_i^2 |b_i\rangle\langle b_i|$$

So $\rho_A$ and $\rho_B$ have the same nonzero eigenvalues $\lambda_i^2$, which means they share the same von Neumann entropy. This is a powerful constraint: for a pure bipartite state, $S(\rho_A) = S(\rho_B)$, even though systems $A$ and $B$ may have very different dimensions.

Note.

The Schmidt decomposition is a special case of the singular value decomposition (SVD) from linear algebra. If we write $|\psi\rangle_{AB} = \sum_{ij} c_{ij}|i\rangle|j\rangle$, then the matrix $C$ with entries $c_{ij}$ has SVD $C = U \Lambda V^\dagger$, and the Schmidt coefficients are the singular values.
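The SVD connection gives a direct recipe for computing a Schmidt decomposition. The sketch below applies it to a randomly chosen state with $d_A = 2$ and $d_B = 3$ (the seed and dimensions are arbitrary) and checks that the decomposition rebuilds the state; `schmidt` is an illustrative helper name.

```python
import numpy as np

def schmidt(psi, dA, dB, tol=1e-12):
    """Return Schmidt coefficients and bases of a bipartite pure state."""
    C = psi.reshape(dA, dB)                      # coefficient matrix c_ij
    U, s, Vh = np.linalg.svd(C, full_matrices=False)
    keep = s > tol                               # drop zero coefficients
    # Columns of U are the |a_k>, rows of Vh are the |b_k>
    return s[keep], U[:, keep], Vh[keep, :]

rng = np.random.default_rng(0)
psi = rng.normal(size=6) + 1j * rng.normal(size=6)
psi /= np.linalg.norm(psi)

lam, a_vecs, b_vecs = schmidt(psi, 2, 3)
rebuilt = sum(l * np.kron(a_vecs[:, k], b_vecs[k, :]) for k, l in enumerate(lam))

print(len(lam), np.round(lam ** 2, 4))   # Schmidt rank and squared coefficients
print(np.allclose(rebuilt, psi))
```

The squared coefficients `lam ** 2` are the shared nonzero eigenvalues of $\rho_A$ and $\rho_B$, even though here the two subsystems have different dimensions.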

Purification

The partial trace destroys information. Can we run it in reverse? Given any mixed state $\rho_A$ on system $A$, can we find a pure state on a larger system $AR$ such that $\text{Tr}_R(|\psi\rangle\langle\psi|_{AR}) = \rho_A$? Yes, always. If $\rho_A = \sum_i p_i |a_i\rangle\langle a_i|$, then $|\psi\rangle_{AR} = \sum_i \sqrt{p_i}|a_i\rangle|r_i\rangle$ is a purification of $\rho_A$, where $\{|r_i\rangle\}$ is any orthonormal set for the reference system $R$.

Purification is not unique: any unitary on $R$ gives a different purification of the same $\rho_A$. This freedom is exactly the HJW theorem in disguise and plays a central role in quantum channel theory (Chapter 27) and quantum Shannon theory (Chapter 29).
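The construction can be sketched in a few lines. Here the reference system $R$ is assumed to have the same dimension as $A$ and $\{|r_i\rangle\}$ is taken to be the computational basis; `purify` and `reduced_A` are illustrative helper names, and the Hadamard matrix stands in for an arbitrary unitary on $R$.

```python
import numpy as np

def purify(rho):
    """Build |psi>_AR = sum_i sqrt(p_i) |a_i>|r_i> from the eigendecomposition of rho."""
    p, vecs = np.linalg.eigh(rho)
    p = np.clip(p, 0, None)              # remove tiny negative rounding errors
    d = len(p)
    psi = np.zeros(d * d, dtype=complex)
    for i in range(d):
        r_i = np.zeros(d)
        r_i[i] = 1                       # |r_i> = computational basis state of R
        psi += np.sqrt(p[i]) * np.kron(vecs[:, i], r_i)
    return psi

def reduced_A(psi, d):
    """Tr_R |psi><psi| for a state on A (dim d) and R (dim d)."""
    C = psi.reshape(d, d)
    return C @ C.conj().T

rho = np.array([[5/6, 1/6], [1/6, 1/6]], dtype=complex)
psi = purify(rho)

W = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)   # a unitary on R
psi_rotated = np.kron(np.eye(2), W) @ psi

print(np.allclose(reduced_A(psi, 2), rho))
print(np.allclose(reduced_A(psi_rotated, 2), rho))   # a different purification, same rho_A
```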

Worked Example: Partial Trace of a Product State

Not every bipartite state is entangled. Consider the product state $|\psi\rangle_{AB} = |+\rangle_A \otimes |0\rangle_B$. The density matrix is:

$$\rho_{AB} = |+\rangle\langle+| \otimes |0\rangle\langle 0| = \frac{1}{2}\begin{pmatrix}1\\1\end{pmatrix}\begin{pmatrix}1&1\end{pmatrix} \otimes \begin{pmatrix}1&0\\0&0\end{pmatrix}$$

Tracing out $B$: $\rho_A = \text{Tr}_B(\rho_{AB}) = |+\rangle\langle+| \cdot \text{Tr}(|0\rangle\langle 0|) = |+\rangle\langle+|$. The reduced state is pure - as it must be for a product state. The partial trace of a product state always returns the factor on the kept system. Contrast this with the Bell state example above, where the partial trace of a pure entangled state yields a mixed state. This distinction - pure marginal implies product, mixed marginal implies entangled - is a clean test for entanglement of pure bipartite states.

Entanglement and the Partial Trace: A Summary

For a pure bipartite state $|\psi\rangle_{AB}$, the following are equivalent:

$|\psi\rangle_{AB}$ is entangled (not a product state).
The Schmidt rank is greater than 1.
$\rho_A = \text{Tr}_B(|\psi\rangle\langle\psi|)$ is mixed ($\text{Tr}(\rho_A^2) < 1$).
$S(\rho_A) > 0$.

This equivalence is specific to pure states. For mixed bipartite states, a mixed marginal does not necessarily imply entanglement - classical correlations can also produce mixed marginals. Detecting entanglement in mixed states requires more sophisticated tools like the PPT criterion (Chapter 28).
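For pure states, the equivalent criteria listed above can be evaluated side by side. The sketch below computes the Schmidt rank, the purity of $\rho_A$, and $S(\rho_A)$ in bits from the singular values of the coefficient matrix; `entanglement_report` is an illustrative name, and the function should not be applied to mixed states.

```python
import numpy as np

def entanglement_report(psi, dA, dB, tol=1e-10):
    """Schmidt rank, marginal purity, and marginal entropy of a PURE bipartite state."""
    lam = np.linalg.svd(psi.reshape(dA, dB), compute_uv=False)
    lam = lam[lam > tol]
    probs = lam ** 2                                  # eigenvalues of rho_A
    entropy = float(-np.sum(probs * np.log2(probs)))
    return {
        "schmidt_rank": int(len(lam)),
        "purity_A": float(np.sum(probs ** 2)),
        "entropy_A_bits": max(0.0, entropy),          # clip rounding noise below zero
    }

bell = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
product = np.kron(np.array([1, 1]) / np.sqrt(2), np.array([1, 0])).astype(complex)  # |+>|0>

print(entanglement_report(bell, 2, 2))
print(entanglement_report(product, 2, 2))
```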

26.3 Generalized Measurements: POVMs

Projective (von Neumann) measurements are defined by a set of orthogonal projectors $\{P_m\}$ with $\sum_m P_m = I$. They are the standard measurement model, but they are not the most general type of measurement allowed by quantum mechanics. The general framework is the Positive Operator-Valued Measure (POVM).

Definition

A POVM is a set of positive semi-definite operators $\{E_m\}$ satisfying:

$$E_m \geq 0 \quad \text{for all } m, \quad \sum_m E_m = I$$

The probability of outcome $m$ when measuring state $\rho$ is:

$$p(m) = \text{Tr}(\rho E_m)$$

The operators $E_m$ are called POVM elements or effects. Unlike projective measurements, POVM elements need not be orthogonal, need not be projectors, and there can be more outcomes than the dimension of the Hilbert space.

Key Concept.

Every POVM can be realized as a projective measurement on a larger system. This is Naimark's dilation theorem: given a POVM $\{E_m\}$ on system $A$, there exists an ancilla system $B$, an initial ancilla state $|0\rangle_B$, a unitary $U$ on $AB$, and projectors $\{P_m\}$ on $B$ such that $\text{Tr}(\rho_A E_m) = \text{Tr}(U(\rho_A \otimes |0\rangle\langle 0|_B)U^\dagger (I_A \otimes P_m))$. POVMs do not add new physics beyond projective measurements plus ancillas, but they provide a cleaner mathematical framework.

Why POVMs?

The power of POVMs lies in their flexibility. A qubit has a two-dimensional Hilbert space, so a projective measurement can have at most two outcomes. But a POVM can have three, four, or any number of outcomes. This extra freedom is crucial for:

Optimal state discrimination: Distinguishing non-orthogonal states as well as quantum mechanics allows (see Section 26.4).
Quantum tomography: Extracting maximal information about an unknown state from a limited number of copies.
Quantum key distribution: Security proofs often require analyzing the most general possible measurement an eavesdropper could perform.

Projective Measurements as Special POVMs

Every projective measurement is automatically a POVM: if $\{P_m\}$ are orthogonal projectors with $\sum_m P_m = I$, then each $P_m$ is positive semi-definite and $P_m^2 = P_m$. The converse is not true: a general POVM element $E_m$ need not satisfy $E_m^2 = E_m$. The distinction matters because projective measurements are repeatable (measuring twice gives the same result) while general POVM measurements are not.

Example: Trine POVM

Consider three qubit states equally spaced in the $xz$-plane of the Bloch sphere, forming an equilateral triangle on a great circle:

$$|e_1\rangle = |0\rangle, \quad |e_2\rangle = \frac{-1}{2}|0\rangle + \frac{\sqrt{3}}{2}|1\rangle, \quad |e_3\rangle = \frac{-1}{2}|0\rangle - \frac{\sqrt{3}}{2}|1\rangle$$

The three operators $E_k = \frac{2}{3}|e_k\rangle\langle e_k|$ satisfy $\sum_k E_k = I$ and form a valid POVM with three outcomes on a two-dimensional space - impossible with projective measurements. This trine POVM is optimal for certain state estimation tasks.

To verify completeness, note that the three Bloch vectors of $|e_k\rangle$ point to the vertices of an equilateral triangle in the $xz$-plane. The sum $\sum_k |e_k\rangle\langle e_k|$ yields $\frac{3}{2}I$ by symmetry (the three rank-1 projectors sum to $3/2$ times the identity because each direction is equally represented). Multiplying by $2/3$ gives the identity.
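The symmetry argument can be confirmed by direct computation. The sketch below builds the three trine effects, checks positivity and completeness, and evaluates the outcome probabilities for the input state $|0\rangle$; variable names are illustrative.

```python
import numpy as np

s = np.sqrt(3) / 2
trine_kets = [np.array([1, 0], dtype=complex),
              np.array([-0.5, s], dtype=complex),
              np.array([-0.5, -s], dtype=complex)]

# E_k = (2/3)|e_k><e_k|
povm = [(2 / 3) * np.outer(k, k.conj()) for k in trine_kets]

completeness = sum(povm)                                   # should equal I
min_eigs = [np.linalg.eigvalsh(E).min() for E in povm]     # all >= 0 for a valid POVM

rho = np.array([[1, 0], [0, 0]], dtype=complex)            # input state |0><0|
probs = [float(np.real(np.trace(rho @ E))) for E in povm]  # p(k) = Tr(rho E_k)

print(np.round(completeness.real, 6))
print(np.round(probs, 4), sum(probs))
```

For the input $|0\rangle = |e_1\rangle$ the first outcome occurs with probability $2/3$, not $1$: the effects are not projectors, so even an input aligned with $|e_1\rangle$ can trigger the other two outcomes.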

General Quantum Instruments

A POVM tells us the probability of each outcome but does not specify the post-measurement state. The full description of a measurement that includes both outcome probabilities and state update rules is a quantum instrument: a collection of completely positive maps $\{\mathcal{E}_m\}$ such that $\sum_m \mathcal{E}_m$ is trace-preserving. The probability of outcome $m$ is $p(m) = \text{Tr}(\mathcal{E}_m(\rho))$, and the post-measurement state (conditioned on outcome $m$) is $\rho_m = \mathcal{E}_m(\rho)/p(m)$. The associated POVM elements are recovered by $E_m = \mathcal{E}_m^\dagger(I)$.
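As a concrete illustration, the sketch below builds one possible instrument for the trine POVM of the previous example. It assumes a single Kraus operator per outcome, $\mathcal{E}_m(\rho) = M_m \rho M_m^\dagger$ with $M_m = \sqrt{E_m}$; this is only one of many instruments compatible with the same POVM, and the helper name `apply_instrument` is illustrative.

```python
import numpy as np

s = np.sqrt(3) / 2
trine_kets = [np.array([1, 0], dtype=complex),
              np.array([-0.5, s], dtype=complex),
              np.array([-0.5, -s], dtype=complex)]

# Assumed Kraus operators M_m = sqrt(E_m) = sqrt(2/3)|e_m><e_m|
kraus = [np.sqrt(2 / 3) * np.outer(k, k.conj()) for k in trine_kets]

def apply_instrument(rho, kraus_ops):
    """Return (p(m), rho_m) for each outcome of the single-Kraus instrument."""
    results = []
    for M in kraus_ops:
        unnormalized = M @ rho @ M.conj().T            # E_m(rho)
        p = float(np.real(np.trace(unnormalized)))     # p(m) = Tr(E_m(rho))
        results.append((p, unnormalized / p if p > 1e-12 else None))
    return results

effects = [M.conj().T @ M for M in kraus]              # POVM elements M_m^dagger M_m

rho = np.array([[1, 0], [0, 0]], dtype=complex)        # input state |0><0|
results = apply_instrument(rho, kraus)
print([round(p, 4) for p, _ in results])
```

With this choice, each conditional state $\rho_m$ is the pure state $|e_m\rangle\langle e_m|$, regardless of the input.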

26.4 Quantum State Discrimination

One of the most fundamental tasks in quantum information is state discrimination: given a quantum system prepared in one of several known states, determine which state it is. This task reveals a profound difference between classical and quantum information. Classically, distinct states can always be perfectly distinguished. Quantumly, non-orthogonal states cannot be perfectly distinguished - a fundamental limit imposed by the laws of physics.

Minimum-Error Discrimination

Suppose Alice prepares one of two states $\rho_0$ or $\rho_1$ with prior probabilities $p_0$ and $p_1 = 1 - p_0$. Bob performs a measurement to guess which state was sent. His measurement is described by a POVM $\{E_0, E_1\}$, and his success probability is:

$$p_{\text{succ}} = p_0 \text{Tr}(\rho_0 E_0) + p_1 \text{Tr}(\rho_1 E_1)$$

The Helstrom bound gives the maximum success probability:

$$p_{\text{succ}}^{\max} = \frac{1}{2}\left(1 + \|\,p_0 \rho_0 - p_1 \rho_1\,\|_1\right)$$

where $\|A\|_1 = \text{Tr}\sqrt{A^\dagger A}$ is the trace norm. The optimal POVM is to project onto the positive and negative eigenspaces of the operator $p_0 \rho_0 - p_1 \rho_1$. For two pure states $|\psi_0\rangle$ and $|\psi_1\rangle$ with equal priors, this simplifies to:

$$p_{\text{succ}}^{\max} = \frac{1}{2}\left(1 + \sqrt{1 - |\langle\psi_0|\psi_1\rangle|^2}\right)$$

When $\langle\psi_0|\psi_1\rangle = 0$, we get $p_{\text{succ}} = 1$ (perfect discrimination). When $|\langle\psi_0|\psi_1\rangle| = 1$, the states are identical and $p_{\text{succ}} = 1/2$ (pure guessing).
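The pure-state formula follows from the general bound in two short steps. The symbols $\Gamma$, $c$, and $\lambda$ are introduced here only for this sketch. With equal priors, let $\Gamma = \frac{1}{2}(|\psi_0\rangle\langle\psi_0| - |\psi_1\rangle\langle\psi_1|)$ and $c = \langle\psi_0|\psi_1\rangle$. The operator $\Gamma$ is Hermitian, traceless, and has rank at most 2, so its nonzero eigenvalues are $\pm\lambda$. Computing $\text{Tr}(\Gamma^2)$ in two ways gives:

$$\text{Tr}(\Gamma^2) = \frac{1}{4}\left(1 + 1 - 2|c|^2\right) = 2\lambda^2 \quad\Longrightarrow\quad \lambda = \frac{1}{2}\sqrt{1 - |c|^2}$$

$$\|\Gamma\|_1 = 2\lambda = \sqrt{1 - |\langle\psi_0|\psi_1\rangle|^2}$$

Substituting this trace norm into the Helstrom bound reproduces the pure-state expression.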

Worked Example: Discriminating $|0\rangle$ and $|+\rangle$

Consider the concrete problem of distinguishing $|0\rangle$ and $|+\rangle = \frac{1}{\sqrt{2}}(|0\rangle + |1\rangle)$ with equal priors. The overlap is $|\langle 0|+\rangle|^2 = 1/2$, so perfect discrimination is impossible. The Helstrom bound gives:

$$p_{\text{succ}}^{\max} = \frac{1}{2}\left(1 + \sqrt{1 - 1/2}\right) = \frac{1}{2}\left(1 + \frac{1}{\sqrt{2}}\right) \approx 0.854$$

The optimal measurement projects onto the eigenstates of $\frac{1}{2}|0\rangle\langle 0| - \frac{1}{2}|+\rangle\langle +|$. This measurement is tilted at an intermediate angle between the $Z$ basis and the $X$ basis - it is not aligned with either state, but rather balances the errors optimally.
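The bound and the optimal measurement for this example can be computed directly from the eigendecomposition of $p_0\rho_0 - p_1\rho_1$. The following is a minimal sketch; the function name `helstrom` and its return convention are illustrative.

```python
import numpy as np

def helstrom(rho0, rho1, p0=0.5):
    """Return (max success probability, E0, E1) for two states with priors p0, 1 - p0."""
    gamma = p0 * rho0 - (1 - p0) * rho1
    evals, evecs = np.linalg.eigh(gamma)
    p_max = 0.5 * (1 + np.sum(np.abs(evals)))     # trace norm = sum of |eigenvalues|
    positive = evecs[:, evals > 0]
    E0 = positive @ positive.conj().T             # projector onto the positive eigenspace
    E1 = np.eye(rho0.shape[0]) - E0
    return float(p_max), E0, E1

ket0 = np.array([1, 0], dtype=complex)
plus = np.array([1, 1], dtype=complex) / np.sqrt(2)
rho0 = np.outer(ket0, ket0.conj())
rho1 = np.outer(plus, plus.conj())

p_max, E0, E1 = helstrom(rho0, rho1)
achieved = 0.5 * np.real(np.trace(rho0 @ E0)) + 0.5 * np.real(np.trace(rho1 @ E1))

print(round(p_max, 4), round(float(achieved), 4))   # the measurement attains the bound
```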

Unambiguous Discrimination

An alternative strategy allows an inconclusive outcome ("I don't know") but demands that whenever a definite answer is given, it is always correct. For two pure states with equal priors, the optimal failure probability is:

$$p_{\text{fail}} = |\langle\psi_0|\psi_1\rangle|$$

This is the Ivanovic-Dieks-Peres (IDP) bound. Unambiguous discrimination requires a three-outcome POVM $\{E_0, E_1, E_?\}$ - precisely the kind of measurement that projective measurements cannot implement on a qubit.

For $|\psi_0\rangle = |0\rangle$ and $|\psi_1\rangle = |+\rangle$: $p_{\text{fail}} = 1/\sqrt{2} \approx 0.707$. The POVM element $E_0$ (detecting $|0\rangle$) must be proportional to the projector onto $|-\rangle$ (orthogonal to $|+\rangle$), so that it never fires when the state is $|+\rangle$. Likewise, $E_1$ (detecting $|+\rangle$) must be proportional to $|1\rangle\langle 1|$ (orthogonal to $|0\rangle$). When the measurement gives outcome "?", the result is inconclusive, but when it gives 0 or 1, the identification is certain.
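The three-outcome measurement for this pair can be written out explicitly. The text fixes only the directions of the two conclusive elements; the common scale factor $1/(1 + |\langle 0|+\rangle|)$ used below is the standard optimal choice for equal priors and is an addition of this sketch. To avoid any ambiguity in labels, the elements are named by the state they identify.

```python
import numpy as np

ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)
plus = (ket0 + ket1) / np.sqrt(2)
minus = (ket0 - ket1) / np.sqrt(2)

overlap = abs(np.vdot(ket0, plus))      # |<0|+>| = 1/sqrt(2)
scale = 1 / (1 + overlap)               # assumed optimal scaling for equal priors

E_zero = scale * np.outer(minus, minus.conj())   # can fire only if the state was |0>
E_plus = scale * np.outer(ket1, ket1.conj())     # can fire only if the state was |+>
E_inconclusive = np.eye(2) - E_zero - E_plus     # the "?" outcome

def prob(ket, E):
    """Outcome probability <psi|E|psi> for a pure state."""
    return float(np.real(np.vdot(ket, E @ ket)))

p_fail = 0.5 * prob(ket0, E_inconclusive) + 0.5 * prob(plus, E_inconclusive)

print(prob(plus, E_zero), prob(ket0, E_plus))   # both zero: conclusive answers are never wrong
print(round(p_fail, 4))                         # matches |<0|+>|
```

With this scaling the smallest eigenvalue of the inconclusive element is zero, so the conclusive elements cannot be made any larger without violating positivity.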

Multiple State Discrimination

For $M > 2$ states, no closed-form optimal measurement is known in general. A widely used, explicit choice is the pretty good measurement (PGM), also known as the square root measurement. For an ensemble $\{p_x, \rho_x\}$, the PGM is the POVM with elements:

$$E_x = \rho^{-1/2} (p_x \rho_x) \rho^{-1/2}$$

where $\rho = \sum_x p_x \rho_x$ is the average state (and the inverse is taken on the support of $\rho$). The PGM is not always optimal, but it is simple, explicit, and often near-optimal. For symmetric state sets (like the trine states), it is exactly optimal.
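The PGM formula is straightforward to implement. The sketch below applies it to the trine states with equal priors (the priors are an assumption of this example) and shows that it reproduces the trine POVM $E_k = \frac{2}{3}|e_k\rangle\langle e_k|$; the function name `pgm` is illustrative.

```python
import numpy as np

def pgm(priors, states, tol=1e-12):
    """Pretty good measurement: E_x = rho^{-1/2} (p_x rho_x) rho^{-1/2}."""
    rho = sum(p * s for p, s in zip(priors, states))
    evals, evecs = np.linalg.eigh(rho)
    inv_sqrt = np.zeros_like(evals)
    support = evals > tol                       # invert only on the support of rho
    inv_sqrt[support] = evals[support] ** -0.5
    R = evecs @ np.diag(inv_sqrt) @ evecs.conj().T
    return [R @ (p * s) @ R for p, s in zip(priors, states)]

s3 = np.sqrt(3) / 2
kets = [np.array([1, 0], dtype=complex),
        np.array([-0.5, s3], dtype=complex),
        np.array([-0.5, -s3], dtype=complex)]
states = [np.outer(k, k.conj()) for k in kets]
priors = [1 / 3, 1 / 3, 1 / 3]

effects = pgm(priors, states)
p_success = sum(p * np.real(np.trace(s @ E)) for p, s, E in zip(priors, states, effects))

print(np.round(sum(effects).real, 6))   # identity
print(round(float(p_success), 4))
```

Here the average state is $I/2$, so $\rho^{-1/2} = \sqrt{2}\,I$ and each PGM element is $\frac{2}{3}|e_k\rangle\langle e_k|$, giving a success probability of $2/3$.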

The Holevo Bound

State discrimination is intimately connected to the amount of classical information that can be extracted from quantum states. Suppose Alice encodes a classical random variable $X$ by preparing state $\rho_x$ with probability $p_x$. Bob performs a measurement, obtaining outcome $Y$. How much classical mutual information $I(X:Y)$ can Bob extract?

Key Concept.

The Holevo bound (1973) states that for any measurement Bob performs: $$I(X:Y) \leq \chi \equiv S\!\left(\sum_x p_x \rho_x\right) - \sum_x p_x S(\rho_x)$$ where $S(\rho) = -\text{Tr}(\rho \log \rho)$ is the von Neumann entropy. The quantity $\chi$ is called the Holevo information or Holevo $\chi$-quantity. For a $d$-dimensional system, $\chi \leq \log d$, which implies that a qubit can carry at most one classical bit of accessible information - even though its state space is continuous.
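The Holevo quantity is easy to evaluate for small ensembles. The sketch below computes $\chi$ in bits (logarithm base 2) for two equal-prior ensembles chosen for illustration: the orthogonal pair $\{|0\rangle, |1\rangle\}$ and the non-orthogonal pair $\{|0\rangle, |+\rangle\}$. The function names are illustrative.

```python
import numpy as np

def von_neumann_entropy(rho):
    """S(rho) = -Tr(rho log2 rho), in bits."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]                 # convention: 0 log 0 = 0
    return max(0.0, float(-np.sum(evals * np.log2(evals))))

def holevo_chi(priors, states):
    """chi = S(sum_x p_x rho_x) - sum_x p_x S(rho_x)."""
    average = sum(p * s for p, s in zip(priors, states))
    return von_neumann_entropy(average) - sum(
        p * von_neumann_entropy(s) for p, s in zip(priors, states))

ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)
plus = (ket0 + ket1) / np.sqrt(2)
proj = lambda k: np.outer(k, k.conj())

chi_orthogonal = holevo_chi([0.5, 0.5], [proj(ket0), proj(ket1)])
chi_nonorthogonal = holevo_chi([0.5, 0.5], [proj(ket0), proj(plus)])

print(round(chi_orthogonal, 4), round(chi_nonorthogonal, 4))
```

The orthogonal ensemble saturates $\chi = \log 2 = 1$ bit, while the non-orthogonal ensemble gives roughly $0.60$ bits, so no measurement can extract a full bit from it.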

The Holevo bound has several important consequences:

No superdense coding without entanglement: Without pre-shared entanglement, a single qubit cannot convey more than one bit.
Quantum states are not infinitely informative: Despite the continuous parameters needed to describe a quantum state, the extractable classical information per system is bounded by $\log d$.
Achievability: The Holevo bound is asymptotically tight. The Holevo-Schumacher-Westmoreland (HSW) theorem (Chapter 29) shows that the rate $\chi$ can be achieved using block coding over many channel uses.

Common Misconception.

The Holevo bound does not say that $n$ qubits can carry at most $n$ classical bits in all circumstances. With pre-shared entanglement, superdense coding lets one qubit carry two classical bits. The Holevo bound applies to the scenario without pre-shared resources. Similarly, quantum random access codes can achieve advantages in certain communication tasks, but they do not violate the Holevo bound - they operate in a setting where the success criterion is different from maximizing mutual information.

Interactive: Bloch Ball Interior

Pure states live on the surface of the Bloch sphere. Mixed states live inside. Adjust the purity parameter to move from a pure state ($r = 1$) toward the maximally mixed state at the center ($r = 0$). The density matrix, purity, and von Neumann entropy update in real time.

Parameters: $r$ (Bloch radius) = 1.00, $\phi$ = 0.00

OPENQASM 2.0;
include "qelib1.inc";
qreg q[1];
creg c[1];
ry({r}) q[0];
rz({phi}) q[0];
measure q[0] -> c[0];

Interactive: Partial Trace Step-Through

Watch the partial trace in action. Starting from the Bell state density matrix $\rho_{AB} = |\Phi^+\rangle\langle\Phi^+|$, trace out qubit B to obtain the reduced state $\rho_A = I/2$.

The Bell State Density Matrix

$|\Phi^+\rangle = \frac{1}{\sqrt{2}}(|00\rangle + |11\rangle)$. The $4 \times 4$ density matrix has entries only in the corners of the matrix.

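$$\rho_{AB} = \frac{1}{2}\begin{pmatrix}1 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 1\end{pmatrix}$$

Rows and columns are ordered $|00\rangle, |01\rangle, |10\rangle, |11\rangle$.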

Partition into 2x2 Blocks

To trace out B, view the 4x4 matrix as a 2x2 grid of 2x2 blocks. Each block corresponds to a pair of values of qubit A.

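$$A=0,\ A'=0:\ \begin{pmatrix}1/2 & 0 \\ 0 & 0\end{pmatrix} \qquad A=0,\ A'=1:\ \begin{pmatrix}0 & 1/2 \\ 0 & 0\end{pmatrix}$$

$$A=1,\ A'=0:\ \begin{pmatrix}0 & 0 \\ 1/2 & 0\end{pmatrix} \qquad A=1,\ A'=1:\ \begin{pmatrix}0 & 0 \\ 0 & 1/2\end{pmatrix}$$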

Trace Each Block

The partial trace sums the diagonal of each 2x2 block: $(\rho_A)_{ij} = \text{Tr}(\text{block}_{ij})$. The diagonal-block traces give the diagonal entries of $\rho_A$, and the off-diagonal-block traces give the off-diagonal entries.

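$$\rho_A = \begin{pmatrix}\text{Tr}(\text{block}_{00}) & \text{Tr}(\text{block}_{01}) \\ \text{Tr}(\text{block}_{10}) & \text{Tr}(\text{block}_{11})\end{pmatrix} = \begin{pmatrix}1/2 & 0 \\ 0 & 1/2\end{pmatrix}$$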

Result: $\rho_A = I/2$

The reduced state of qubit A is the maximally mixed state $I/2$. Despite the joint state being pure, the subsystem is maximally uncertain. This is the hallmark of maximal entanglement: $S(\rho_A) = 1$ bit.

Purity $\text{Tr}(\rho_A^2) = 0.5$; entropy $S(\rho_A) = 1.0$ bit; maximally mixed.

Sandbox: Density Matrix Explorer

The sandbox below prepares different quantum states and measures them. Observe how a pure state like $|0\rangle$ always gives the same outcome, while creating a Bell state and tracing out one qubit yields a maximally mixed reduced state with 50/50 outcomes. Try modifying the circuit: add an $H$ gate before measurement to see how the Bloch vector direction affects outcome statistics.

Remove the CNOT line (cx q[0], q[1];) - now qubit 0 is a pure state $|+\rangle$ and you should still see 50/50, but the state is pure rather than mixed.
Add c[1] = measure q[1]; to measure both qubits of the Bell state. You should see correlations: 00 and 11 appear, but never 01 or 10.
Replace the Hadamard with a rotation rx(0.5) q[0]; to create a partially mixed reduced state when combined with the CNOT.

Interactive: Density Matrix Under Noise

The density matrix formalism comes into its own when describing noisy quantum states. The simulation below prepares a Bell state $\frac{1}{\sqrt{2}}(|00\rangle + |11\rangle)$ and subjects it to a noise channel. The ideal panel shows the pure Bell state outcomes (only 00 and 11), while the noisy panel shows how noise populates the forbidden outcomes 01 and 10. The purity drops from 1.0 (pure state) toward 0.25 (maximally mixed on 2 qubits) as noise increases, and the StateCity visualization reveals the density matrix structure directly.

Things to observe:

Phase damping destroys the off-diagonal elements of the density matrix (the $|00\rangle\langle 11|$ and $|11\rangle\langle 00|$ coherences) while leaving the diagonal populations unchanged. The histogram stays close to 50/50 between 00 and 11, but the state becomes mixed - it transitions from a coherent superposition to a classical mixture.
Amplitude damping preferentially drives population toward $|00\rangle$. At high noise, the histogram is dominated by the 00 outcome.
Depolarizing noise at strength $p = 0.5$ reduces the purity to near its minimum value of 0.25. The density matrix approaches $I/4$ and all four outcomes become equally likely.
