Chapter 28: Quantum Entropy and Information
Information theory begins with a single question: how do we quantify uncertainty? In the classical world, Claude Shannon answered this in 1948 with the entropy function. In the quantum world, John von Neumann answered it even earlier - in 1927 - with what we now call the von Neumann entropy. This chapter develops quantum entropy from its classical roots, introduces the key information-theoretic quantities (relative entropy, mutual information, conditional entropy), proves the fundamental inequalities that constrain them, and connects entropy to entanglement. These tools are the foundation of quantum Shannon theory in Chapter 29.
28.1 Classical Entropy: Shannon's Foundation
Before diving into quantum entropy, we review the classical theory that it generalizes. Let $X$ be a discrete random variable taking values $x$ with probabilities $p(x)$. The Shannon entropy is:
$$H(X) = -\sum_x p(x) \log p(x)$$where logarithms are base 2 (so entropy is measured in bits) and we adopt the convention $0 \log 0 = 0$. The Shannon entropy measures the average surprise, or equivalently the minimum average number of bits needed to describe an outcome of $X$.
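To make the definition concrete, here is a minimal sketch in Python (assuming only NumPy; the function name `shannon_entropy` is ours) that evaluates $H(X)$ from a probability vector, with the $0 \log 0 = 0$ convention built in:

```python
import numpy as np

def shannon_entropy(p):
    """Shannon entropy in bits of a probability vector p."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]                       # convention: 0 log 0 = 0
    return float(-np.sum(p * np.log2(p)))

print(shannon_entropy([0.5, 0.5]))    # 1.0 bit (fair coin)
print(shannon_entropy([0.1, 0.9]))    # ~0.469 bits (biased coin)
print(shannon_entropy([1.0, 0.0]))    # 0 bits (deterministic outcome)
```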
Properties of Shannon Entropy
The Shannon entropy satisfies several fundamental properties:
- Non-negativity: $H(X) \geq 0$, with equality iff $X$ is deterministic.
- Maximum: $H(X) \leq \log |X|$, with equality iff $X$ is uniformly distributed over its alphabet.
- Concavity: $H(\lambda p + (1-\lambda) q) \geq \lambda H(p) + (1-\lambda) H(q)$ for $0 \leq \lambda \leq 1$.
- Additivity: $H(X, Y) = H(X) + H(Y)$ when $X$ and $Y$ are independent.
- Chain rule: $H(X, Y) = H(X) + H(Y|X)$.
Joint, Conditional, and Mutual Information
For a pair of random variables $(X, Y)$:
- Joint entropy: $H(X, Y) = -\sum_{x,y} p(x,y) \log p(x,y)$
- Conditional entropy: $H(Y|X) = H(X, Y) - H(X) = -\sum_{x,y} p(x,y) \log p(y|x)$
- Mutual information: $I(X:Y) = H(X) + H(Y) - H(X,Y) = H(X) - H(X|Y)$
Mutual information $I(X:Y) \geq 0$ measures the total correlation between $X$ and $Y$. It equals zero when $X$ and $Y$ are independent and equals $H(X)$ when $Y$ determines $X$ completely.
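All of these quantities are marginalizations and differences of a single object, the joint distribution $p(x,y)$. A short sketch makes that explicit (NumPy assumed; the example distribution, a fair bit copied through a 10% error channel, is our choice):

```python
import numpy as np

def H(p):
    """Shannon entropy in bits of a (possibly multidimensional) distribution."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

# Joint distribution p(x, y): rows index x, columns index y.
# X is a fair bit; Y is a copy of X flipped with probability 0.1.
p_xy = np.array([[0.45, 0.05],
                 [0.05, 0.45]])

H_XY = H(p_xy)                        # joint entropy H(X, Y)
H_X  = H(p_xy.sum(axis=1))            # marginal H(X) = 1.0
H_Y  = H(p_xy.sum(axis=0))            # marginal H(Y) = 1.0

print("H(Y|X) =", H_XY - H_X)         # ~0.469 bits, the channel's noise
print("I(X:Y) =", H_X + H_Y - H_XY)   # ~0.531 bits of shared information
```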
Worked Example: Entropy of a Binary Source
Consider a binary source $X$ that produces 0 with probability $p$ and 1 with probability $1 - p$. The Shannon entropy is the binary entropy function:
$$H(X) = h(p) = -p \log p - (1-p)\log(1-p)$$At $p = 0$ or $p = 1$: $H = 0$ (the outcome is certain, no surprise). At $p = 1/2$: $H = 1$ bit (maximum uncertainty - each outcome is equally likely). Between these extremes, $H$ interpolates smoothly. By Shannon's source coding theorem, a stream of i.i.d. samples from this source can be compressed to an average of $h(p)$ bits per sample and no fewer. For $p = 0.1$, this is $h(0.1) \approx 0.469$ bits - less than half a bit per symbol, because the source is highly predictable.
Classical Relative Entropy
The relative entropy (Kullback-Leibler divergence) between two distributions $p$ and $q$ is:
$$D(p \| q) = \sum_x p(x) \log \frac{p(x)}{q(x)}$$It satisfies $D(p \| q) \geq 0$ with equality iff $p = q$ (Gibbs' inequality). Relative entropy is not symmetric and not a distance in the metric sense, but it measures a kind of statistical distinguishability. Many information-theoretic quantities can be expressed as special cases of relative entropy: for instance, $I(X:Y) = D(p_{XY} \| p_X \otimes p_Y)$.
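A minimal sketch (NumPy assumed; `kl_divergence` is our name) computes $D(p \| q)$ and illustrates Gibbs' inequality:

```python
import numpy as np

def kl_divergence(p, q):
    """D(p || q) in bits; assumes supp(p) is contained in supp(q)."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log2(p[mask] / q[mask])))

p = [0.1, 0.9]
q = [0.5, 0.5]
print(kl_divergence(p, q))   # ~0.531: the biased coin is distinguishable from fair
print(kl_divergence(q, p))   # different value: relative entropy is not symmetric
print(kl_divergence(p, p))   # 0.0 (Gibbs' inequality, equality iff p = q)
```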
Classical entropy is the unique function (up to a constant) satisfying continuity, maximality for the uniform distribution, and the chain rule. Shannon's 1948 theorem shows that $H(X)$ is both the minimum compression rate for the source (source coding theorem) and a building block for the capacity of noisy channels (channel coding theorem). The quantum generalizations in this chapter follow the same program.
28.2 Von Neumann Entropy
The quantum generalization of Shannon entropy is the von Neumann entropy, defined for a density operator $\rho$ as:
$$S(\rho) = -\text{Tr}(\rho \log \rho)$$If $\rho$ has eigenvalues $\{\lambda_i\}$, this reduces to $S(\rho) = -\sum_i \lambda_i \log \lambda_i$, which is just the Shannon entropy of the eigenvalue distribution. The von Neumann entropy measures the uncertainty associated with the quantum state $\rho$ - how mixed it is.
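Because $S(\rho)$ depends only on the spectrum, computing it numerically amounts to diagonalizing the density matrix. A minimal sketch, assuming NumPy (the helper name `von_neumann_entropy` is ours):

```python
import numpy as np

def von_neumann_entropy(rho):
    """S(rho) in bits, computed from the eigenvalue spectrum."""
    evals = np.linalg.eigvalsh(rho)    # rho is Hermitian
    evals = evals[evals > 1e-12]        # drop zero eigenvalues (0 log 0 = 0)
    return float(-np.sum(evals * np.log2(evals)))

pure  = np.array([[1, 0], [0, 0]], dtype=complex)   # |0><0|
mixed = np.eye(2) / 2                               # I/2, maximally mixed
print(von_neumann_entropy(pure))    # 0 bits: pure state
print(von_neumann_entropy(mixed))   # 1.0 bit: maximum for a qubit
```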
Basic Properties
The von Neumann entropy inherits and extends many properties of Shannon entropy:
- Non-negativity: $S(\rho) \geq 0$, with equality iff $\rho$ is pure.
- Maximum: $S(\rho) \leq \log d$ for a $d$-dimensional system, with equality iff $\rho = I/d$ (maximally mixed).
- Concavity: $S(\sum_i p_i \rho_i) \geq \sum_i p_i S(\rho_i)$. Mixing states never decreases entropy.
- Unitary invariance: $S(U\rho U^\dagger) = S(\rho)$. Unitary evolution does not change entropy.
- Additivity on product states: $S(\rho_A \otimes \rho_B) = S(\rho_A) + S(\rho_B)$.
The von Neumann entropy is the quantum analog of Shannon entropy. It equals zero for pure states (complete knowledge) and reaches its maximum $\log d$ for the maximally mixed state (maximum ignorance). It is the fundamental quantity in quantum information theory, playing the role of information content, compression rate, and capacity building block.
Where Classical Intuition Breaks Down
One property of Shannon entropy that does not carry over to von Neumann entropy is monotonicity. Classically, $H(X, Y) \geq H(X)$: the joint entropy is always at least as large as the marginal entropy. Knowing the whole system cannot be less uncertain than knowing a part. In quantum mechanics, this fails spectacularly.
Consider a Bell state $|\Phi^+\rangle = \frac{1}{\sqrt{2}}(|00\rangle + |11\rangle)$. The joint system is pure, so $S(\rho_{AB}) = 0$. But the reduced state of each qubit is maximally mixed: $\rho_A = I/2$, so $S(\rho_A) = 1$ bit. We have $S(\rho_{AB}) = 0 < 1 = S(\rho_A)$ - the whole has less entropy than the part. This is a uniquely quantum phenomenon, a direct consequence of entanglement.
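This computation is easy to reproduce numerically. The sketch below (NumPy assumed; the einsum-based partial trace is our construction) builds the Bell state, traces out $B$, and confirms $S(\rho_{AB}) = 0$ while $S(\rho_A) = 1$:

```python
import numpy as np

def S(rho):
    ev = np.linalg.eigvalsh(rho)
    ev = ev[ev > 1e-12]
    return float(-np.sum(ev * np.log2(ev)))

# Bell state |Phi+> = (|00> + |11>)/sqrt(2)
phi = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
rho_AB = np.outer(phi, phi.conj())

# Reduced state of A: reshape to indices (a, b, a', b') and contract b with b'
rho_A = np.einsum('ijkj->ik', rho_AB.reshape(2, 2, 2, 2))

print(S(rho_AB))  # 0.0: the joint state is pure
print(S(rho_A))   # 1.0: the marginal is maximally mixed
```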
It is tempting to think that "less entropy means less information," so a pure entangled state has no information. This is wrong. The pure entangled state contains maximal information about the joint system (we know the state perfectly). What it lacks is information about the individual subsystems. The entropy of a subsystem reflects our ignorance about that subsystem when we disregard the other subsystem. Entanglement distributes information globally in a way that leaves local subsystems highly uncertain.
Worked Example: Entropy of a Werner State
A Werner state on two qubits is a mixture of the maximally entangled Bell state and the maximally mixed state:
$$\rho_W = p |\Phi^+\rangle\langle\Phi^+| + (1-p)\frac{I}{4}$$The eigenvalues are $\frac{1+3p}{4}$ (with multiplicity 1, corresponding to the Bell state subspace) and $\frac{1-p}{4}$ (with multiplicity 3). The von Neumann entropy is:
$$S(\rho_W) = -\frac{1+3p}{4}\log\frac{1+3p}{4} - 3 \cdot \frac{1-p}{4}\log\frac{1-p}{4}$$At $p = 1$: $S = 0$ (pure Bell state). At $p = 0$: $S = 2$ bits (maximally mixed on 4 dimensions). The reduced state of each qubit is always $I/2$ regardless of $p$ (because the Bell state and the maximally mixed state both have maximally mixed marginals), so $S(\rho_A) = 1$ for all $p$. For $p > 0$, we have $S(\rho_W) < S(\rho_A) + S(\rho_B) = 2$, confirming correlations, and for $p$ large enough, $S(\rho_W) < S(\rho_A)$, confirming entanglement.
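A short numerical sweep (NumPy assumed; printed values quoted to three decimals) reproduces these entropies directly from the density matrix, without using the closed-form eigenvalues:

```python
import numpy as np

def S(rho):
    ev = np.linalg.eigvalsh(rho)
    ev = ev[ev > 1e-12]
    return float(-np.sum(ev * np.log2(ev)))

phi = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
bell = np.outer(phi, phi.conj())

for p in [0.0, 0.5, 0.8, 1.0]:
    rho_W = p * bell + (1 - p) * np.eye(4) / 4
    print(f"p = {p:.1f}:  S(rho_W) = {S(rho_W):.3f} bits")
# Prints 2.000, 1.549, 0.848, 0.000: note S(rho_W) < 1 = S(rho_A) at p = 0.8
```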
The Binary Entropy Function
For a qubit state with eigenvalues $\lambda$ and $1 - \lambda$, the von Neumann entropy reduces to the binary entropy:
$$h(\lambda) = -\lambda \log \lambda - (1 - \lambda) \log(1 - \lambda)$$This function rises from $h(0) = 0$ to $h(1/2) = 1$ and back to $h(1) = 0$. It appears throughout quantum information theory whenever a two-outcome situation arises - channel capacities, error correction thresholds, and entanglement measures.
Continuity: The Fannes-Audenaert Inequality
How much can the entropy change when the state changes by a small amount? The Fannes-Audenaert inequality provides a tight bound: if $\frac{1}{2}\|\rho - \sigma\|_1 \leq \epsilon \leq 1 - 1/d$, then:
$$|S(\rho) - S(\sigma)| \leq \epsilon \log(d - 1) + h(\epsilon)$$where $h$ is the binary entropy. This continuity bound is essential for proving robustness results in quantum information theory: small errors in state preparation lead to only small errors in entropy, and hence in capacity calculations.
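The bound is straightforward to check numerically. The sketch below (NumPy assumed; the random state, the perturbation, and the seed are arbitrary choices of ours) compares $|S(\rho) - S(\sigma)|$ against the Fannes-Audenaert right-hand side for a $d = 4$ example:

```python
import numpy as np

def S(rho):
    ev = np.linalg.eigvalsh(rho)
    ev = ev[ev > 1e-12]
    return float(-np.sum(ev * np.log2(ev)))

def h(x):
    if x <= 0 or x >= 1:
        return 0.0
    return float(-x * np.log2(x) - (1 - x) * np.log2(1 - x))

d = 4
rng = np.random.default_rng(0)
A = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
rho = A @ A.conj().T
rho /= np.trace(rho).real                 # random density matrix
sigma = 0.95 * rho + 0.05 * np.eye(d) / d  # slightly depolarized version

# Trace distance: half the sum of |eigenvalues| of the Hermitian difference
eps = 0.5 * np.sum(np.abs(np.linalg.eigvalsh(rho - sigma)))
lhs = abs(S(rho) - S(sigma))
rhs = eps * np.log2(d - 1) + h(eps)
print(f"|S(rho) - S(sigma)| = {lhs:.4f} <= {rhs:.4f}")  # the bound holds
```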
28.3 Quantum Relative Entropy and Mutual Information
The quantum relative entropy generalizes the Kullback-Leibler divergence to density operators:
$$D(\rho \| \sigma) = \text{Tr}(\rho \log \rho) - \text{Tr}(\rho \log \sigma)$$defined when the support of $\rho$ is contained in the support of $\sigma$ (otherwise $D(\rho \| \sigma) = +\infty$). Quantum relative entropy is the single most important quantity in quantum information theory because virtually every other entropy quantity can be expressed in terms of it.
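Numerically, $D(\rho \| \sigma)$ requires a matrix logarithm, which we obtain by diagonalizing $\sigma$. A minimal sketch follows (NumPy assumed; `relative_entropy` is our name). For commuting $\rho$ and $\sigma$ the quantity reduces to the classical KL divergence, as the example shows:

```python
import numpy as np

def relative_entropy(rho, sigma):
    """D(rho || sigma) in bits; assumes supp(rho) is contained in supp(sigma)."""
    p = np.linalg.eigvalsh(rho)
    p = p[p > 1e-12]
    tr_rho_log_rho = np.sum(p * np.log2(p))
    # Tr(rho log sigma) via sigma's eigendecomposition; the floor on w only
    # guards log(0) for eigenvectors that carry no weight in rho.
    w, V = np.linalg.eigh(sigma)
    log_sigma = V @ np.diag(np.log2(np.maximum(w, 1e-300))) @ V.conj().T
    tr_rho_log_sigma = np.trace(rho @ log_sigma).real
    return float(tr_rho_log_rho - tr_rho_log_sigma)

rho   = np.array([[0.9, 0.0], [0.0, 0.1]])
sigma = np.eye(2) / 2
print(relative_entropy(rho, sigma))   # ~0.531, same as the classical D(p || q)
print(relative_entropy(rho, rho))     # 0.0 (Klein's inequality, equality case)
```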
Klein's Inequality
The fundamental property of quantum relative entropy is Klein's inequality:
$$D(\rho \| \sigma) \geq 0$$with equality if and only if $\rho = \sigma$. This is the quantum analog of Gibbs' inequality. The standard proof combines the convexity of $x \mapsto x \log x$ with the spectral theorem. Klein's inequality is the source of almost all entropy inequalities in quantum information theory.
Monotonicity (Data Processing Inequality)
A deeper property is the monotonicity of quantum relative entropy: for any quantum channel $\mathcal{E}$,
$$D(\mathcal{E}(\rho) \| \mathcal{E}(\sigma)) \leq D(\rho \| \sigma)$$Processing (applying a channel) can never increase the distinguishability of two states. This is also called the quantum data processing inequality. It was proved by Lindblad (1975) and is equivalent to strong subadditivity (Section 28.4).
The monotonicity of relative entropy $D(\mathcal{E}(\rho) \| \mathcal{E}(\sigma)) \leq D(\rho \| \sigma)$ is perhaps the most powerful single inequality in quantum information theory. It encodes the irreversibility of quantum channels: physical processes can only destroy, never create, the ability to distinguish quantum states.
Quantum Mutual Information
For a bipartite state $\rho_{AB}$, the quantum mutual information is:
$$I(A:B) = S(\rho_A) + S(\rho_B) - S(\rho_{AB})$$This can be written as a relative entropy:
$$I(A:B) = D(\rho_{AB} \| \rho_A \otimes \rho_B)$$which makes non-negativity immediate from Klein's inequality. The quantum mutual information measures the total correlations (both classical and quantum) between $A$ and $B$. It equals zero iff $\rho_{AB} = \rho_A \otimes \rho_B$ (product state, no correlations).
For a pure bipartite state, $I(A:B) = 2S(\rho_A) = 2S(\rho_B)$. The factor of two reflects the fact that entangled states contain quantum correlations above and beyond classical correlations.
Quantum Conditional Entropy
The quantum conditional entropy is defined as:
$$S(A|B) = S(\rho_{AB}) - S(\rho_B)$$Unlike its classical counterpart $H(X|Y) \geq 0$, the quantum conditional entropy can be negative. This happens precisely for entangled states. For a maximally entangled state of two qubits: $S(A|B) = S(\rho_{AB}) - S(\rho_B) = 0 - 1 = -1$. Negative conditional entropy is a signature of entanglement and has an operational interpretation: it quantifies the potential for quantum state merging, where the "negative information" represents the entanglement that can be gained in the merging protocol.
The chain rule still holds: $S(A, B) = S(A|B) + S(B)$. But because $S(A|B)$ can be negative, the chain rule does not imply $S(A, B) \geq S(B)$ - and indeed this inequality fails for entangled states. This is consistent with our earlier observation that the von Neumann entropy is not monotone.
Conditional Mutual Information
For a tripartite state $\rho_{ABC}$, the conditional mutual information is:
$$I(A:B|C) = S(A|C) + S(B|C) - S(A,B|C) = S(\rho_{AC}) + S(\rho_{BC}) - S(\rho_{ABC}) - S(\rho_C)$$Strong subadditivity (Section 28.4) is equivalent to the statement $I(A:B|C) \geq 0$ - conditioning on a third system cannot make correlations negative.
Worked Example: Mutual Information of a Bell State
For the Bell state $|\Phi^+\rangle$: $S(\rho_{AB}) = 0$ (pure state), $S(\rho_A) = 1$ bit, $S(\rho_B) = 1$ bit. The quantum mutual information is:
$$I(A:B) = S(\rho_A) + S(\rho_B) - S(\rho_{AB}) = 1 + 1 - 0 = 2 \text{ bits}$$This is the maximum possible for two qubits (mutual information is bounded by $2\log d = 2$). Compare with a classically correlated state $\rho_{AB} = \frac{1}{2}(|00\rangle\langle 00| + |11\rangle\langle 11|)$: the joint state has $S(\rho_{AB}) = 1$, and the marginals are still $S(\rho_A) = S(\rho_B) = 1$, giving $I(A:B) = 1$ bit. The Bell state has exactly twice the mutual information of its classical counterpart because it contains quantum correlations (entanglement) on top of classical correlations.
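The comparison is easy to verify numerically. The sketch below (NumPy assumed; the helper names are ours) computes both the mutual information and the conditional entropy for the Bell state and its classical counterpart:

```python
import numpy as np

def S(rho):
    ev = np.linalg.eigvalsh(rho)
    ev = ev[ev > 1e-12]
    return float(-np.sum(ev * np.log2(ev)))

def marginals(rho_AB):
    r = rho_AB.reshape(2, 2, 2, 2)    # indices (a, b, a', b')
    return np.einsum('ijkj->ik', r), np.einsum('ijik->jk', r)  # rho_A, rho_B

def mutual_info(rho_AB):
    rho_A, rho_B = marginals(rho_AB)
    return S(rho_A) + S(rho_B) - S(rho_AB)

def cond_entropy(rho_AB):             # S(A|B) = S(AB) - S(B)
    return S(rho_AB) - S(marginals(rho_AB)[1])

phi = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
bell = np.outer(phi, phi.conj())
classical = np.diag([0.5, 0, 0, 0.5]).astype(complex)

print(mutual_info(bell), cond_entropy(bell))            # 2.0 bits, -1.0 bits
print(mutual_info(classical), cond_entropy(classical))  # 1.0 bit,   0.0 bits
```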
Summary of Entropic Quantities
| Quantity | Definition | Range |
|---|---|---|
| $S(\rho)$ | $-\text{Tr}(\rho \log \rho)$ | $[0, \log d]$ |
| $D(\rho \| \sigma)$ | $\text{Tr}(\rho \log \rho - \rho \log \sigma)$ | $[0, +\infty]$ |
| $I(A:B)$ | $S(A) + S(B) - S(AB)$ | $[0, 2\log d]$ |
| $S(A|B)$ | $S(AB) - S(B)$ | $[-\log d, \log d]$ |
| $I(A:B|C)$ | $S(AC) + S(BC) - S(ABC) - S(C)$ | $[0, 2\log d]$ |
28.4 Strong Subadditivity
The most important inequality in quantum information theory is the strong subadditivity (SSA) of the von Neumann entropy, proved by Lieb and Ruskai in 1973:
Strong Subadditivity. For any tripartite quantum state $\rho_{ABC}$: $$S(\rho_{ABC}) + S(\rho_C) \leq S(\rho_{AC}) + S(\rho_{BC})$$ Equivalently, $I(A:B|C) \geq 0$: the quantum conditional mutual information is non-negative. It is widely regarded as the deepest known fact about the von Neumann entropy.
Strong subadditivity has an intuitive reading: the correlations between $A$ and $B$ cannot become negative when conditioned on $C$. Knowing $C$ might reduce the correlations between $A$ and $B$, but it cannot make them negative.
Equivalent Forms
Strong subadditivity can be restated in several useful ways by relabeling subsystems, taking special cases, or purifying:
- Discarding never increases mutual information: $I(A:BC) \geq I(A:B)$. Tracing out $C$ cannot increase the correlations between $A$ and the rest.
- Subadditivity: Setting $C$ to be trivial gives $S(\rho_{AB}) \leq S(\rho_A) + S(\rho_B)$.
- Triangle inequality (Araki-Lieb): $S(\rho_{AB}) \geq |S(\rho_A) - S(\rho_B)|$.
- Conditioning reduces entropy: $S(A|BC) \leq S(A|B)$. Having more information (system $C$) can only reduce conditional entropy.
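Since SSA holds for every state, a numerical spot check on random states is a useful sanity test. A minimal sketch (NumPy assumed; the `ptrace` helper and the random-state construction are ours):

```python
import numpy as np

def S(rho):
    ev = np.linalg.eigvalsh(rho)
    ev = ev[ev > 1e-12]
    return float(-np.sum(ev * np.log2(ev)))

def ptrace(rho, keep, dims):
    """Partial trace keeping the subsystems listed (sorted) in `keep`."""
    n = len(dims)
    r = rho.reshape(dims * 2)          # axes: (i_1..i_n, j_1..j_n)
    traced = 0
    for ax in range(n):
        if ax in keep:
            continue
        a = ax - traced                 # current row axis after earlier traces
        r = np.trace(r, axis1=a, axis2=a + (n - traced))
        traced += 1
    d = int(np.prod([dims[k] for k in keep]))
    return r.reshape(d, d)

rng = np.random.default_rng(1)
d = 8                                   # three qubits: A, B, C
A = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
rho_ABC = A @ A.conj().T
rho_ABC /= np.trace(rho_ABC).real       # random tripartite density matrix

dims = [2, 2, 2]                        # subsystem order: A=0, B=1, C=2
S_ABC = S(rho_ABC)
S_AC  = S(ptrace(rho_ABC, [0, 2], dims))
S_BC  = S(ptrace(rho_ABC, [1, 2], dims))
S_C   = S(ptrace(rho_ABC, [2], dims))

print(S_ABC + S_C <= S_AC + S_BC + 1e-9)        # True: SSA holds
print("I(A:B|C) =", S_AC + S_BC - S_ABC - S_C)  # non-negative
```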
Proof Sketch
The original proof by Lieb and Ruskai reduced SSA to Lieb's concavity theorem, a deep result in matrix analysis. The key technical step is proving the joint convexity of the quantum relative entropy: $D(\lambda \rho_1 + (1-\lambda)\rho_2 \| \lambda \sigma_1 + (1-\lambda)\sigma_2) \leq \lambda D(\rho_1 \| \sigma_1) + (1-\lambda)D(\rho_2 \| \sigma_2)$. From this, the monotonicity of relative entropy follows, and SSA is a direct consequence.
An elegant alternative proof, given by Nielsen and Petz, derives SSA from the monotonicity of relative entropy under partial trace. A direct computation gives $D(\rho_{ABC} \| I_A/d_A \otimes \rho_{BC}) = \log d_A + S(\rho_{BC}) - S(\rho_{ABC})$. Tracing out $B$ (a quantum channel) and applying monotonicity gives $D(\rho_{AC} \| I_A/d_A \otimes \rho_C) \leq D(\rho_{ABC} \| I_A/d_A \otimes \rho_{BC})$. Expanding both sides, the $\log d_A$ terms cancel, leaving $S(\rho_C) - S(\rho_{AC}) \leq S(\rho_{BC}) - S(\rho_{ABC})$, which rearranges to SSA.
Equality Conditions
SSA holds with equality ($I(A:B|C) = 0$) if and only if $\rho_{ABC}$ is a quantum Markov chain: there exists a recovery channel $\mathcal{R}: C \to AC$ such that $\mathcal{R}(\rho_{BC}) = \rho_{ABC}$. This was proved by Hayden, Jozsa, Petz, and Winter (2004). In other words, $A$ and $B$ are conditionally independent given $C$ in a precise quantum sense: all information that $A$ has about $B$ is already contained in $C$.
Applications
Strong subadditivity is not merely an abstract inequality. It is a workhorse:
- Quantum error correction: SSA constrains the information accessible to an eavesdropper, which is essential for proving the security of quantum key distribution.
- Channel capacity: The additivity (or non-additivity) of channel capacities is closely tied to entropic inequalities derived from SSA.
- Area laws in physics: Entropy bounds for ground states of local Hamiltonians rely on SSA to constrain how entanglement scales with subsystem size.
- Thermodynamics: The second law of thermodynamics in its quantum information-theoretic form is a consequence of monotonicity of relative entropy, which is equivalent to SSA.
"Subadditivity" and "strong subadditivity" sound similar but are very different in depth. Subadditivity ($S(AB) \leq S(A) + S(B)$) follows easily from the non-negativity of mutual information. Strong subadditivity ($S(ABC) + S(C) \leq S(AC) + S(BC)$) is far more powerful and far harder to prove. It took 5 years from when the conjecture was first posed (Lanford and Robinson, 1968) to when it was proved (Lieb and Ruskai, 1973). No simple proof is known even today.
28.5 Entanglement Measures
Entanglement is a resource: it enables teleportation, superdense coding, and quantum key distribution. But how much entanglement does a given state contain? The answer requires a way to quantify entanglement, and the entropy tools developed in this chapter provide exactly that.
Entanglement Entropy
For a pure bipartite state $|\psi\rangle_{AB}$, the entanglement entropy (or entropy of entanglement) is the von Neumann entropy of either reduced state:
$$E(|\psi\rangle) = S(\rho_A) = S(\rho_B)$$This equals $-\sum_i \lambda_i^2 \log \lambda_i^2$, where $\lambda_i$ are the Schmidt coefficients. For pure states, entanglement entropy is the unique measure of entanglement (up to normalization) satisfying a set of natural axioms: it is zero for product states, maximal for maximally entangled states, non-increasing under local operations and classical communication (LOCC), and additive on tensor products.
Entanglement of Formation
For mixed states, quantifying entanglement is much harder because a mixed state can have correlations that are purely classical. The entanglement of formation extends entanglement entropy to mixed states by minimizing over all possible pure-state decompositions:
$$E_f(\rho_{AB}) = \min_{\{p_i, |\psi_i\rangle\}} \sum_i p_i E(|\psi_i\rangle)$$where the minimum is over all ensembles $\{p_i, |\psi_i\rangle\}$ with $\rho_{AB} = \sum_i p_i |\psi_i\rangle\langle\psi_i|$. For two qubits, Wootters (1998) found an analytical formula in terms of the concurrence.
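Wootters' two-qubit formula is explicit: $E_f(\rho) = h\big((1 + \sqrt{1 - C^2})/2\big)$, where $C$ is the concurrence and $h$ is the binary entropy. A sketch of the computation (NumPy assumed; helper names ours):

```python
import numpy as np

def h(x):
    if x <= 0 or x >= 1:
        return 0.0
    return float(-x * np.log2(x) - (1 - x) * np.log2(1 - x))

def concurrence(rho):
    """Wootters concurrence of a two-qubit density matrix."""
    Y = np.array([[0, -1j], [1j, 0]])
    YY = np.kron(Y, Y)
    rho_tilde = YY @ rho.conj() @ YY
    # Square roots of the eigenvalues of rho * rho_tilde, decreasing order
    lam = np.sort(np.sqrt(np.abs(np.linalg.eigvals(rho @ rho_tilde))))[::-1]
    return max(0.0, lam[0] - lam[1] - lam[2] - lam[3])

def entanglement_of_formation(rho):
    C = concurrence(rho)
    return h((1 + np.sqrt(1 - C**2)) / 2)

phi = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
bell = np.outer(phi, phi.conj())
print(entanglement_of_formation(bell))           # 1.0: maximally entangled
print(entanglement_of_formation(np.eye(4) / 4))  # 0.0: separable
```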
Distillable Entanglement
An operationally motivated measure is the distillable entanglement $E_d(\rho)$: the maximum rate at which Bell pairs can be extracted from many copies of $\rho$ using LOCC. This is the entanglement that is actually usable for tasks like teleportation. For pure states, $E_d = E_f = S(\rho_A)$, but for mixed states $E_d \leq E_f$ in general.
The PPT Criterion and Bound Entanglement
The Peres-Horodecki criterion (PPT test) provides a necessary condition for separability: if $\rho_{AB}$ is separable, then its partial transpose $\rho_{AB}^{T_B}$ (transpose on system $B$ only) is positive semi-definite. For $2 \times 2$ and $2 \times 3$ systems, this condition is also sufficient. In higher dimensions, there exist entangled states with positive partial transpose - bound entangled states - from which no Bell pairs can be distilled ($E_d = 0$) even though $E_f > 0$.
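The partial transpose is a simple index permutation on the density matrix, so the PPT test and the negativity $(\|\rho^{T_B}\|_1 - 1)/2$ (tabulated at the end of this section) are directly computable. A minimal two-qubit sketch (NumPy assumed; helper names ours):

```python
import numpy as np

def partial_transpose_B(rho):
    """Transpose on the second qubit: (a, b, a', b') -> (a, b', a', b)."""
    r = rho.reshape(2, 2, 2, 2)
    return r.transpose(0, 3, 2, 1).reshape(4, 4)

def negativity(rho):
    ev = np.linalg.eigvalsh(partial_transpose_B(rho))
    return float((np.sum(np.abs(ev)) - 1) / 2)   # (||rho^T_B||_1 - 1)/2

phi = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
bell = np.outer(phi, phi.conj())
print(np.linalg.eigvalsh(partial_transpose_B(bell)))  # one eigenvalue is -1/2
print(negativity(bell))                               # 0.5: entangled

sep = np.diag([0.5, 0, 0, 0.5]).astype(complex)       # classically correlated
print(negativity(sep))                                # 0.0: PPT, separable
```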
The existence of bound entangled states shows that entanglement theory is fundamentally irreversible for mixed states: it can cost entanglement to create a state from which no entanglement can be recovered. This irreversibility is a consequence of the noisy nature of mixed-state entanglement and has deep connections to the irreversibility of thermodynamic processes.
Summary of Entanglement Measures
| Measure | Definition | Operational meaning |
|---|---|---|
| Entanglement entropy $E$ | $S(\rho_A)$ for pure $|\psi\rangle_{AB}$ | Rate of Bell pair interconversion |
| Entanglement of formation $E_f$ | $\min \sum_i p_i E(|\psi_i\rangle)$ | Cost to create $\rho$ from Bell pairs via LOCC |
| Distillable entanglement $E_d$ | Max Bell pair extraction rate | Usable entanglement for protocols |
| Relative entropy of entanglement | $\min_{\sigma \in \text{SEP}} D(\rho \| \sigma)$ | Distinguishability from separable states |
| Negativity | $(\|\rho^{T_B}\|_1 - 1)/2$ | Computable entanglement witness |
Interactive: Binary Entropy Function
The binary entropy $h(p) = -p\log_2 p - (1-p)\log_2(1-p)$ measures the uncertainty of a binary source. Drag the point along the curve to see how entropy varies with probability. At the extremes ($p = 0$ or $p = 1$), the outcome is certain and entropy is zero. At $p = 1/2$, entropy reaches its maximum of 1 bit.
Interactive: Von Neumann Entropy on the Bloch Ball
The von Neumann entropy of a qubit state depends only on its purity (distance from the center of the Bloch sphere). Pure states on the surface have $S = 0$; the maximally mixed state at the center has $S = 1$ bit. The simulation below shows how noise pushes a pure state inward, increasing its entropy.
Interactive: Entropy Venn Diagram
The quantum entropy relationships are often visualized as a Venn diagram. Select a state type to see how $S(A)$, $S(B)$, $S(AB)$, $I(A:B)$, and $S(A|B)$ relate. For entangled states, the conditional entropy $S(A|B)$ can be negative - shown in red - a uniquely quantum phenomenon.
Interactive: Entanglement Entropy
The state $\cos\theta|00\rangle + \sin\theta|11\rangle$ smoothly interpolates between a product state ($\theta = 0$, $S = 0$) and a maximally entangled Bell state ($\theta = \pi/4$, $S = 1$ bit). Sweep $\theta$ to watch the entanglement entropy and the measurement distribution change together.
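The curve in this interactive follows from the Schmidt decomposition: the reduced state has eigenvalues $\cos^2\theta$ and $\sin^2\theta$, so the entanglement entropy is just the binary entropy $h(\cos^2\theta)$. A few lines of Python (NumPy assumed) reproduce the sweep:

```python
import numpy as np

def h(x):
    if x <= 0 or x >= 1:
        return 0.0
    return float(-x * np.log2(x) - (1 - x) * np.log2(1 - x))

# Entanglement entropy of cos(theta)|00> + sin(theta)|11>: the Schmidt
# coefficients are cos(theta) and sin(theta), so S = h(cos^2 theta).
for theta in np.linspace(0, np.pi / 4, 6):
    print(f"theta = {theta:.3f}  S = {h(np.cos(theta)**2):.3f} bits")
# Rises from 0 at theta = 0 to 1 bit at theta = pi/4
```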
Sandbox: Entropy Explorer
Use this sandbox to explore how entanglement entropy arises. The circuit below creates a partially entangled state by applying a rotation followed by a CNOT. By changing the rotation angle, you control the Schmidt coefficients and hence the entanglement. When the rotation is $0$, the state is a product state. When it is $\pi/2$ (which acts like a Hadamard on $|0\rangle$), the state is maximally entangled.
Experiments to try:
- No entanglement: Change `ry(1.5708)` to `ry(0)`. The state is $|00\rangle$ - both qubits are deterministic. Entanglement entropy: 0 bits.
- Maximum entanglement: Keep `ry(1.5708)` (this is $\pi/2$). The state is $\frac{1}{\sqrt{2}}(|00\rangle + |11\rangle)$. You see 50/50 between 00 and 11. Entanglement entropy: 1 bit.
- Partial entanglement: Try `ry(0.5)`. The outcomes are no longer 50/50 - one is more likely. Entanglement entropy is between 0 and 1 bit.
- Product vs. mixed: With $\theta = 0$ (product state), measuring q[0] gives deterministic results. With $\theta = \pi/2$ (Bell state), measuring q[0] alone gives 50/50 - the reduced state is maximally mixed.
Interactive: Purity and Noise
The von Neumann entropy $S(\rho) = -\text{Tr}(\rho \log \rho)$ is closely related to purity $\text{Tr}(\rho^2)$: both measure how mixed a state is, and both are monotonically related for qubit systems. The simulation below shows a single qubit initially in the pure state $|0\rangle$ subjected to increasing noise. Watch the purity drop from 1.0 (pure, $S = 0$) toward 0.5 (maximally mixed, $S = 1$ bit) as noise increases.
Observations connecting purity to entropy:
- At zero noise, the state is pure $|0\rangle$ with purity 1.0 and entropy $S = 0$. The ideal and noisy histograms agree - outcome 0 with certainty.
- Under depolarizing noise at strength $p$ (replacement probability), the purity is $(1 + (1-p)^2)/2$ for a qubit starting in a pure state. At $p = 0.5$, purity is $0.625$; at $p = 1$ (fully depolarizing), purity reaches the minimum of $1/2$ and the entropy reaches 1 bit. (A short numerical check appears after this list.)
- Under amplitude damping, the state is already $|0\rangle$ (the ground state), so the channel has minimal effect - the purity stays near 1.0. Try modifying the circuit to start in $|1\rangle$ (add an $X$ gate before measurement) to see amplitude damping in action.
- Under phase damping, the state $|0\rangle$ is an eigenstate of $Z$, so dephasing has no effect. Try preparing $|+\rangle$ instead (add an $H$ gate) to see how phase damping destroys coherence and increases entropy.
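For reference, here is a short check of the depolarizing-noise purity formula quoted above (NumPy assumed; the channel is applied in its replacement form $\rho \mapsto (1-p)\rho + p\,I/2$):

```python
import numpy as np

rho0 = np.array([[1, 0], [0, 0]], dtype=complex)   # pure |0><0|

for p in [0.0, 0.25, 0.5, 1.0]:
    rho = (1 - p) * rho0 + p * np.eye(2) / 2       # depolarized state
    purity = np.trace(rho @ rho).real
    formula = (1 + (1 - p)**2) / 2
    print(f"p = {p:.2f}  purity = {purity:.4f}  formula = {formula:.4f}")
# Purity falls from 1.0 at p = 0 to the qubit minimum of 0.5 at p = 1
```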