Why This Concept Exists
Order statistics arise whenever we care not about individual observations, but about the smallest, largest, or k-th extreme value in a sample. This topic appears in roughly 30-40% of PTS2 exams and carries significant mark weight because it synthesises multiple univariate and bivariate concepts into a single calculation pipeline.
The practical motivation is clear: in engineering, the weakest link determines system survival; in insurance, the largest claim drives reserve requirements; in finance, Value-at-Risk is fundamentally an order statistic; in reliability engineering, the minimum lifetime of components in series determines failure. Examiners love order statistics because they test whether you can derive distributions from first principles rather than simply recognising named distributions.
This node covers the two extremes (minimum and maximum) separately as warm-ups, then delivers the general k-th order statistic formula (the single most important result), joint distributions of order statistics (which examiners love), and worked problems where you must build the entire apparatus from scratch given only a parent PDF.
Prerequisites
Before engaging with this node, you should be comfortable with:
- Cumulative Distribution Functions (CDFs): \(F(x) = P(X \leq x)\), the connection between PDF and CDF via differentiation/integration, and the property \(F(-\infty) = 0\), \(F(+\infty) = 1\).
- I.i.d. assumption: Independent and identically distributed random variables. The key consequence: \(P(X_1 \leq x, X_2 \leq x, \ldots, X_n \leq x) = [F(x)]^n\) under independence.
- I.i.d. random samples: Understanding that \(X_1, \ldots, X_n\) is a random sample from \(f(x)\) means each \(X_i\) has PDF \(f(x)\) and all are mutually independent.
- Binomial probability: \(P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}\) — this appears in the derivation of the general k-th order statistic.
- Integration and differentiation: Differentiating CDFs to get PDFs, integrating PDFs to get CDFs, computing expectations from PDFs.
- Transformations (basic): If \(Y = g(X)\) and you know the CDF of \(X\), you can find the CDF of \(Y\).
Core Exposition
3.1 Setup and Notation
Suppose \(X_1, X_2, \ldots, X_n\) is an i.i.d. sample from a continuous distribution with PDF \(f(x)\) and CDF \(F(x)\). Let \(X_{(1)} \leq X_{(2)} \leq \cdots \leq X_{(n)}\) denote the same values sorted into ascending order. Then \(X_{(1)}\) is the sample minimum, \(X_{(n)}\) is the sample maximum, and \(X_{(k)}\) is the k-th order statistic.
3.2 Distribution of the Maximum: \(X_{(n)}\)
The maximum is at most \(x\) if and only if every observation is at most \(x\):
\(F_{X_{(n)}}(x) = P(X_{(n)} \leq x) = P(X_1 \leq x, X_2 \leq x, \ldots, X_n \leq x)\)
By independence: \(= P(X_1 \leq x) \cdot P(X_2 \leq x) \cdots P(X_n \leq x) = [F(x)]^n\)
Differentiating gives the PDF:
\(f_{X_{(n)}}(x) = n\,[F(x)]^{n-1}\,f(x)\)
3.3 Distribution of the Minimum: \(X_{(1)}\)
The most powerful trick: use the complement. The minimum exceeds \(x\) if and only if every observation exceeds \(x\):
\(P(X_{(1)} > x) = P(X_1 > x, \ldots, X_n > x) = [1 - F(x)]^n\)
So: \(F_{X_{(1)}}(x) = 1 - [1 - F(x)]^n\)
Differentiating:
\(f_{X_{(1)}}(x) = n\,[1 - F(x)]^{n-1}\,f(x)\)
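Both formulas are easy to sanity-check by simulation. A minimal Monte Carlo sketch — the Exp(1) parent, sample size \(n = 5\), and test point \(x = 1\) are arbitrary illustrative choices:

```python
# Check P(max <= x) = F(x)^n and P(min <= x) = 1 - (1 - F(x))^n by simulation.
import numpy as np

rng = np.random.default_rng(0)
n, x = 5, 1.0
samples = rng.exponential(scale=1.0, size=(200_000, n))

F = 1 - np.exp(-x)                              # parent CDF of Exp(1) at x
emp_max = (samples.max(axis=1) <= x).mean()     # empirical P(X_(n) <= x)
emp_min = (samples.min(axis=1) <= x).mean()     # empirical P(X_(1) <= x)

assert abs(emp_max - F**n) < 0.01
assert abs(emp_min - (1 - (1 - F)**n)) < 0.01
```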
3.4 The General k-th Order Statistic
This is the single most important result. For \(X_{(k)}\) (the k-th smallest of \(n\)), the heuristic is multinomial: \(k-1\) observations fall below \(x\) (probability \(F(x)\) each), one falls at \(x\) (density \(f(x)\)), and \(n-k\) fall above \(x\) (probability \(1-F(x)\) each), with \(\frac{n!}{(k-1)!\,1!\,(n-k)!}\) ways to assign the observations to these roles:
\(f_{X_{(k)}}(x) = \dfrac{n!}{(k-1)!\,(n-k)!}\,[F(x)]^{k-1}\,[1 - F(x)]^{n-k}\,f(x)\)
More compactly: \(f_{X_{(k)}}(x) = n \binom{n-1}{k-1}\,[F(x)]^{k-1}\,[1 - F(x)]^{n-k}\, f(x)\)
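The formula can be checked numerically: integrate the claimed PDF up to a point and compare with a Monte Carlo estimate of \(P(X_{(k)} \leq x)\). A sketch, with an Exp(1) parent and the arbitrary choices \(n = 5\), \(k = 3\), \(x = 1\):

```python
# Integrate the claimed k-th order statistic PDF and compare with simulation.
from math import comb
import numpy as np

n, k, x = 5, 3, 1.0
F = lambda t: 1 - np.exp(-t)    # parent CDF (Exp(1))
f = lambda t: np.exp(-t)        # parent PDF
pdf_k = lambda t: n * comb(n - 1, k - 1) * F(t)**(k - 1) * (1 - F(t))**(n - k) * f(t)

dt = x / 10_000
t = (np.arange(10_000) + 0.5) * dt              # midpoint rule on (0, x)
cdf_by_integration = (pdf_k(t) * dt).sum()

rng = np.random.default_rng(1)
sim = np.sort(rng.exponential(size=(200_000, n)), axis=1)[:, k - 1]
assert abs(cdf_by_integration - (sim <= x).mean()) < 0.01
```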
3.5 Joint Distribution of Two Order Statistics
The joint PDF of \(X_{(i)}\) and \(X_{(j)}\) for \(i < j\):
\(f_{X_{(i)},X_{(j)}}(u,v) = \dfrac{n!}{(i-1)!\,(j-i-1)!\,(n-j)!}\,[F(u)]^{i-1}\,[F(v)-F(u)]^{j-i-1}\,[1-F(v)]^{n-j}\,f(u)\,f(v)\)
for \(-\infty < u < v < \infty\)
3.6 Joint Distribution of Min and Max
A very common special case with \(i=1\) and \(j = n\):
\(f_{X_{(1)},X_{(n)}}(u,v) = n(n-1)\,[F(v)-F(u)]^{n-2}\,f(u)\,f(v)\) for \(u < v\)
From this, the range \(R = X_{(n)} - X_{(1)}\) can be derived by transformation.
3.7 Expected Values of Order Statistics
The expectation formula is always:
\(E[X_{(k)}] = \displaystyle\int_{-\infty}^{\infty} x\, f_{X_{(k)}}(x)\,dx\)
For the uniform distribution on \([0,1]\), there is a beautiful closed form: \(U_{(k)} \sim \text{Beta}(k, n-k+1)\), so
\(E[U_{(k)}] = \dfrac{k}{n+1}\)
For general distributions, use the probability integral transform: if \(X \sim F\), then \(U = F(X) \sim U(0,1)\). So \(X_{(k)} = F^{-1}(U_{(k)})\) and \(E[X_{(k)}] = E[F^{-1}(U_{(k)})]\), where \(U_{(k)} \sim \text{Beta}(k, n-k+1)\).
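The Beta result is easy to verify by sorting uniform samples. A simulation sketch — the choices \(n = 7\), \(k = 2\) are arbitrary — comparing the empirical mean and variance of \(U_{(k)}\) with the Beta\((k, n-k+1)\) values:

```python
# Check E[U_(k)] = k/(n+1) and the Beta(k, n-k+1) variance by simulation.
import numpy as np

n, k = 7, 2
rng = np.random.default_rng(2)
u_k = np.sort(rng.random((200_000, n)), axis=1)[:, k - 1]   # k-th smallest per row

mean_theory = k / (n + 1)
var_theory = k * (n - k + 1) / ((n + 1)**2 * (n + 2))       # Beta(k, n-k+1) variance

assert abs(u_k.mean() - mean_theory) < 0.005
assert abs(u_k.var() - var_theory) < 0.005
```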
Worked Examples
Example 1: Min and Max from \(f(x) = 2/x^3\), \(x > 1\)
Let \(X_1, \ldots, X_n\) be i.i.d. with PDF \(f(x) = 2/x^3\) for \(x > 1\). Find the distributions of \(X_{(1)}\) and \(X_{(n)}\).
First find the parent CDF:
\(F(x) = \displaystyle\int_1^x \dfrac{2}{t^3}\,dt = 1 - \dfrac{1}{x^2}\) for \(x > 1\)
Check: \(F(1^+) = 0\), \(F(+\infty) = 1\). \(\checkmark\)
\(f_{X_{(n)}}(x) = n\left(1 - \dfrac{1}{x^2}\right)^{n-1} \cdot \dfrac{2}{x^3}\) for \(x > 1\)
\(f_{X_{(1)}}(x) = 2n \cdot x^{-2n-1}\) for \(x > 1\)
Notice that \(X_{(1)}\) follows a Pareto distribution with shape \(2n\) and scale 1.
For the expectation:
\(E[X_{(1)}] = \displaystyle\int_1^{\infty} x \cdot 2n\,x^{-2n-1}\,dx = 2n\int_1^{\infty} x^{-2n}\,dx\)
When does this converge? Need \(2n > 1\), i.e. \(n \geq 1\). \(\checkmark\)
\(= 2n \cdot \dfrac{1}{2n-1} = \dfrac{2n}{2n-1}\)
As \(n \to \infty\), \(E[X_{(1)}] \to 1\), which makes sense: the minimum of a large sample from a right-skewed distribution concentrates near the lower bound.
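To simulate this parent, invert the CDF: \(F(x) = 1 - 1/x^2\) gives \(F^{-1}(u) = (1-u)^{-1/2}\). A sketch checking \(E[X_{(1)}] = 2n/(2n-1)\) for the arbitrary choice \(n = 4\):

```python
# Simulate f(x) = 2/x^3 (x > 1) by inverse transform, then check E[X_(1)].
import numpy as np

n = 4
rng = np.random.default_rng(3)
u = rng.random((200_000, n))
x = (1 - u) ** -0.5             # inverse of F(x) = 1 - 1/x^2
sample_min = x.min(axis=1)

assert abs(sample_min.mean() - 2 * n / (2 * n - 1)) < 0.01   # 2n/(2n-1) = 8/7
```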
Example 2: Median of Uniform — Finding \(E[X_{(k)}]\)
Let \(X_1, \ldots, X_5\) be i.i.d. \(U(0, 2)\). Find the expected value of the sample median \(X_{(3)}\).
Apply the k-th order formula with \(n = 5\), \(k = 3\), \(F(x) = x/2\), \(f(x) = 1/2\):
\(f_{X_{(3)}}(x) = \dfrac{5!}{2!\,2!}\left(\dfrac{x}{2}\right)^2\left(1-\dfrac{x}{2}\right)^2 \cdot \dfrac{1}{2} = 5 \cdot 6 \cdot \dfrac{1}{2} \cdot \dfrac{x^2}{4} \cdot \dfrac{(2-x)^2}{4} = \dfrac{15}{16}\,x^2(2-x)^2\) for \(0 < x < 2\)
Expanding \((2-x)^2 = 4 - 4x + x^2\) and multiplying by \(x\):
\(E[X_{(3)}] = \displaystyle\int_0^2 \dfrac{15}{16}\,x^3(4-4x+x^2)\,dx\)
\(= \dfrac{15}{16}\left[4\cdot\dfrac{16}{4} - 4\cdot\dfrac{32}{5} + \dfrac{64}{6}\right] = \dfrac{15}{16}\left[16 - \dfrac{128}{5} + \dfrac{32}{3}\right]\)
\(= \dfrac{15}{16} \cdot \dfrac{240 - 384 + 160}{15} = \dfrac{15}{16} \cdot \dfrac{16}{15} = 1\)
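The answer also follows by symmetry: the median of a \(U(0,2)\) sample has mean at the midpoint, 1. A one-line simulation sketch confirms it:

```python
# Check that the median of five U(0,2) draws has mean 1.
import numpy as np

rng = np.random.default_rng(4)
medians = np.median(rng.uniform(0, 2, size=(200_000, 5)), axis=1)
assert abs(medians.mean() - 1.0) < 0.01
```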
Example 3: Range from Exponential Distribution
Let \(X_1, X_2, X_3\) be i.i.d. \(\text{Exp}(\lambda)\). Find the PDF of the range \(R = X_{(3)} - X_{(1)}\).
Start from the joint PDF of the min and max with \(n = 3\):
\(f_{X_{(1)},X_{(3)}}(u,v) = 3 \cdot 2\,[F(v) - F(u)]\,f(u)\,f(v) = 6\,[e^{-\lambda u} - e^{-\lambda v}]\,\lambda^2 e^{-\lambda(u+v)}\)
for \(0 < u < v < \infty\).
Change variables to \(S = X_{(1)}\), \(R = X_{(3)} - X_{(1)}\), i.e. \(u = s\), \(v = r + s\) (Jacobian 1):
\(f_{R,S}(r, s) = 6\,[e^{-\lambda s} - e^{-\lambda(r+s)}]\,\lambda^2 e^{-\lambda s}\,e^{-\lambda(r+s)} \)
\(= 6\lambda^2\,e^{-3\lambda s}\,e^{-\lambda r}\,[1 - e^{-\lambda r}]\) for \(r > 0\), \(s > 0\).
Integrating out \(s\):
\(f_R(r) = \displaystyle\int_0^{\infty} 6\lambda^2 e^{-3\lambda s}\,ds \cdot e^{-\lambda r}(1 - e^{-\lambda r}) = 2\lambda\, e^{-\lambda r}(1 - e^{-\lambda r})\) for \(r > 0\)
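Integrating this PDF gives the range CDF \(P(R \leq r) = (1 - e^{-\lambda r})^2\), which is easy to check by simulation — the choices \(\lambda = 2\), \(r = 0.5\) below are arbitrary:

```python
# Check P(R <= r) = (1 - e^{-lambda r})^2 for the range of three Exp(lambda) draws.
import numpy as np

lam, r = 2.0, 0.5
rng = np.random.default_rng(5)
x = rng.exponential(scale=1 / lam, size=(200_000, 3))
ranges = x.max(axis=1) - x.min(axis=1)

cdf_theory = (1 - np.exp(-lam * r)) ** 2
assert abs((ranges <= r).mean() - cdf_theory) < 0.01
```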
Pattern Recognition & Examiner Traps
- "Given \(f(x) = \ldots\), find the PDF of the sample maximum" — immediately signals: find CDF first, then raise to power \(n\), then differentiate.
- "Find the probability that the range exceeds \(r\)" — use joint PDF of min/max or work from the CDF approach.
- "Show that the sample median is an unbiased estimator of the population mean" (for a symmetric parent distribution) — compute \(E[X_{((n+1)/2)}]\) using the k-th order formula.
- The Pareto/power-law example (\(f(x) = c/x^k\)) appears repeatedly because the CDF has a clean closed form and the resulting order statistic distributions are tractable.
Connections
- ← N4 (Transformations): The change-of-variable technique from N4 is used to derive the distribution of the range and other functions of order statistics.
- → N6 (Sampling Distributions): Order statistics are themselves sampling statistics. The chi-squared, t, and F distributions covered in N6 can be connected to order statistics in multivariate settings.
- → N8-N12 (Inference): The sample maximum/minimum are often used as sufficient statistics (e.g., for the Uniform distribution). MLE in N7 sometimes involves order statistics (e.g., MLE of the upper bound of a Uniform distribution is the sample maximum).
Order statistics also connect to non-parametric methods (beyond the PTS2 syllabus), where the entire inference is based on ranks rather than raw values.
Summary Table
| Quantity | CDF | PDF | Key Idea |
|---|---|---|---|
| Maximum \(X_{(n)}\) | \([F(x)]^n\) | \(n[F(x)]^{n-1}f(x)\) | All must be \(\leq x\) |
| Minimum \(X_{(1)}\) | \(1 - [1-F(x)]^n\) | \(n[1-F(x)]^{n-1}f(x)\) | Complement: all \(> x\) |
| k-th order \(X_{(k)}\) | \(\sum_{j=k}^{n}\binom{n}{j}[F(x)]^j[1-F(x)]^{n-j}\) | \(n\binom{n-1}{k-1} F^{k-1}(1-F)^{n-k}f\) | Multinomial binning |
| Min and Max joint | — | \(n(n-1)[F(v)-F(u)]^{n-2}f(u)f(v)\) | One at \(u\), \(n-2\) between, one at \(v\) |
| Uniform \(X_{(k)}\) | Beta CDF | Beta\((k, n-k+1)\) shape | \(E[X_{(k)}] = \frac{k}{n+1}\) |
Self-Assessment
Test your understanding by working through these before moving to N6:
- Given any parent CDF, write down the CDF and PDF of the sample minimum and maximum.
- Derive the k-th order statistic PDF from first principles using the binomial argument.
- Given \(f(x) = 3x^2\) on \(0 < x < 1\), find the distribution of the sample median for \(n = 5\).
- Compute \(E[X_{(1)}]\) and \(E[X_{(n)}]\) for an i.i.d. exponential sample.
- Write the joint PDF of \(X_{(1)}\) and \(X_{(n)}\) for a given parent distribution.
- Use the probability integral transform to find \(E[X_{(k)}]\) for a given distribution.
- If \(X_1, \ldots, X_n\) are i.i.d. from \(f(x) = \theta x^{\theta - 1}\) on \((0,1)\), find the PDF of \(X_{(n)}\). [Answer: \(f_{X_{(n)}}(x) = n\theta x^{n\theta-1}\) on (0,1).]
- Show that for \(X_i \sim \text{Exp}(1)\), the minimum \(X_{(1)} \sim \text{Exp}(n)\).
- If \(X_1, X_2, X_3\) are i.i.d. \(U(0,1)\), find \(P(X_{(2)} > 1/2)\). [Answer: \(1 - F_{X_{(2)}}(1/2) = 1 - [3(1/2)^2(1/2) + (1/2)^3] = 1/2\).]
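The median of three \(U(0,1)\) draws is symmetric about \(1/2\), so the last answer should come out to exactly \(1/2\). A simulation sketch:

```python
# Check P(X_(2) > 1/2) = 1/2 for three i.i.d. U(0,1) draws.
import numpy as np

rng = np.random.default_rng(6)
med = np.median(rng.random((200_000, 3)), axis=1)   # X_(2) of each sample
assert abs((med > 0.5).mean() - 0.5) < 0.01
```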
HLQ: Exam-Style Question with Worked Solution
A random sample of size \(n = 4\) is drawn from the distribution with PDF \(f(x) = 2/x^3\) for \(x > 1\) (and 0 elsewhere). Let \(X_{(1)} < X_{(2)} < X_{(3)} < X_{(4)}\) be the order statistics.
(a) Find the CDF of the parent distribution. (2 marks)
(b) Find the PDF of the sample maximum \(X_{(4)}\). (3 marks)
(c) Find the expected value of the sample minimum \(X_{(1)}\). (3 marks)
(d) Find the joint PDF of \(X_{(1)}\) and \(X_{(4)}\). (3 marks)
(e) Find the probability that all four observations are less than 2. (2 marks)
(f) Find the probability that at least two observations exceed 1.5. (2 marks)
(a) \(F(x) = \displaystyle\int_1^x \dfrac{2}{t^3}\,dt = \left[-t^{-2}\right]_1^x = 1 - \dfrac{1}{x^2}\) for \(x > 1\).
\(F(1) = 0, \quad F(\infty) = 1. \quad \checkmark\)
(b) \(f_{X_{(4)}}(x) = 4\,[F(x)]^3\,f(x) = 4\left(1 - \dfrac{1}{x^2}\right)^3 \cdot \dfrac{2}{x^3} = \dfrac{8}{x^3}\left(1 - \dfrac{1}{x^2}\right)^3\) for \(x > 1\).
(c) \(F_{X_{(1)}}(x) = 1 - [1 - F(x)]^4 = 1 - \left(\dfrac{1}{x^2}\right)^4 = 1 - x^{-8}\)
\(f_{X_{(1)}}(x) = 8\,x^{-9}\) for \(x > 1\).
\(E[X_{(1)}] = \displaystyle\int_1^{\infty} x \cdot 8 x^{-9}\,dx = 8\int_1^{\infty} x^{-8}\,dx = 8 \cdot \dfrac{1}{7} = \dfrac{8}{7} \)
(d) \(f_{X_{(1)},X_{(4)}}(u,v) = 4 \cdot 3\,[F(v) - F(u)]^2\,f(u)\,f(v) = 12\left[\left(1 - \frac{1}{v^2}\right) - \left(1 - \frac{1}{u^2}\right)\right]^2 \cdot \frac{2}{u^3} \cdot \frac{2}{v^3}\)
\(= 12\left[\frac{1}{u^2} - \frac{1}{v^2}\right]^2 \cdot \frac{4}{u^3 v^3}\)
\(= \dfrac{48}{u^3 v^3}\left(\dfrac{v^2 - u^2}{u^2 v^2}\right)^2 = \dfrac{48(v^2 - u^2)^2}{u^7 v^7}\) for \(1 < u < v < \infty\).
(e) \(P(\text{all four} < 2) = [F(2)]^4 = \left(1 - \dfrac{1}{4}\right)^4 = \left(\dfrac{3}{4}\right)^4 = \dfrac{81}{256} \approx 0.316\).
(f) Each observation exceeds 1.5 with probability \(1 - F(1.5) = \dfrac{1}{1.5^2} = \dfrac{4}{9}\), so the number of observations exceeding 1.5 is \(Y \sim \text{Binomial}(4, 4/9)\).
\(P(Y \geq 2) = 1 - P(Y = 0) - P(Y = 1)\)
\(= 1 - \left(\frac{5}{9}\right)^4 - 4\left(\frac{4}{9}\right)\left(\frac{5}{9}\right)^3\)
\(= 1 - \dfrac{625}{6561} - 4 \cdot \dfrac{500}{6561} = 1 - \dfrac{625 + 2000}{6561} = 1 - \dfrac{2625}{6561}\)
\(= \dfrac{3936}{6561} \approx 0.600\)
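As a final check, the HLQ's numerical answers can be verified by simulating the parent via the inverse transform \(F^{-1}(u) = (1-u)^{-1/2}\). A sketch checking parts (c), (e) and (f):

```python
# Monte Carlo check of HLQ parts (c), (e), (f) for n = 4 draws from f(x) = 2/x^3.
import numpy as np

rng = np.random.default_rng(7)
x = (1 - rng.random((200_000, 4))) ** -0.5          # inverse-transform samples

assert abs(x.min(axis=1).mean() - 8 / 7) < 0.01                       # (c) E[X_(1)]
assert abs((x.max(axis=1) < 2).mean() - (3 / 4) ** 4) < 0.01          # (e) all < 2
assert abs(((x > 1.5).sum(axis=1) >= 2).mean() - 3936 / 6561) < 0.01  # (f) at least 2 > 1.5
```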