PRESTIGE ED
N5: Order Statistics
Node N5 — Section 1

Why This Concept Exists

Order statistics arise whenever we care not about individual observations, but about the smallest, largest, or k-th extreme value in a sample. This topic appears in roughly 30-40% of PTS2 exams and carries significant mark weight because it synthesises multiple univariate and bivariate concepts into a single calculation pipeline.

The practical motivation is clear: in engineering, the weakest link determines system survival; in insurance, the largest claim drives reserve requirements; in finance, Value-at-Risk is fundamentally an order statistic; in reliability engineering, the minimum lifetime of components in series determines failure. Examiners love order statistics because they test whether you can derive distributions from first principles rather than simply recognising named distributions.

Leverage: N5 alone is worth 8-12 marks in any exam where it appears. The machinery (min/max CDFs, k-th order statistic formula, PDF derivation) is compact, highly testable, and examiners rarely deviate from established patterns.

This node covers the two extremes (minimum and maximum) separately as warm-ups, then delivers the general k-th order statistic formula (the single most important result), joint distributions of order statistics (which examiners love), and worked problems where you must build the entire apparatus from scratch given only a parent PDF.


Node N5 — Section 2

Prerequisites

Before engaging with this node, you should be comfortable with:

  • Cumulative Distribution Functions (CDFs): \(F(x) = P(X \leq x)\), the connection between PDF and CDF via differentiation/integration, and the property \(F(-\infty) = 0\), \(F(+\infty) = 1\).
  • I.i.d. assumption: Independent and identically distributed random variables. The key consequence: \(P(X_1 \leq x, X_2 \leq x, \ldots, X_n \leq x) = [F(x)]^n\) under independence.
  • I.i.d. random samples: Understanding that \(X_1, \ldots, X_n\) is a random sample from \(f(x)\) means each \(X_i\) has PDF \(f(x)\) and all are mutually independent.
  • Binomial probability: \(P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}\) — this appears in the derivation of the general k-th order statistic.
  • Integration and differentiation: Differentiating CDFs to get PDFs, integrating PDFs to get CDFs, computing expectations from PDFs.
  • Transformations (basic): If \(Y = g(X)\) and you know the CDF of \(X\), you can find the CDF of \(Y\).
Key idea: The entire CDF approach to order statistics rests on a trick: instead of working directly with the PDF of the ordered variable, first compute its CDF using probability reasoning about the original sample, then differentiate to get the PDF.

Node N5 — Section 3

Core Exposition

3.1 Setup and Notation

Suppose \(X_1, X_2, \ldots, X_n\) is an i.i.d. sample from a continuous distribution with PDF \(f(x)\) and CDF \(F(x)\). Let \(X_{(1)} \leq X_{(2)} \leq \cdots \leq X_{(n)}\) denote the same values sorted into ascending order. Then \(X_{(1)}\) is the sample minimum, \(X_{(n)}\) is the sample maximum, and \(X_{(k)}\) is the k-th order statistic.

3.2 Distribution of the Maximum: \(X_{(n)}\)

The maximum is at most \(x\) if and only if every observation is at most \(x\):

\(F_{X_{(n)}}(x) = P(X_{(n)} \leq x) = P(X_1 \leq x, X_2 \leq x, \ldots, X_n \leq x)\)
By independence: \(= P(X_1 \leq x) \cdot P(X_2 \leq x) \cdots P(X_n \leq x) = [F(x)]^n\)

Differentiating gives the PDF:

\(f_{X_{(n)}}(x) = \dfrac{d}{dx}[F(x)]^n = n\,[F(x)]^{n-1}\, f(x)\)

3.3 Distribution of the Minimum: \(X_{(1)}\)

The most powerful trick: use the complement. The minimum exceeds \(x\) if and only if every observation exceeds \(x\):

\(P(X_{(1)} > x) = P(X_1 > x, \ldots, X_n > x) = [1 - F(x)]^n\)
So: \(F_{X_{(1)}}(x) = 1 - [1 - F(x)]^n\)

Differentiating:

\(f_{X_{(1)}}(x) = n\,[1 - F(x)]^{n-1}\, f(x)\)
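Both formulas are easy to sanity-check numerically. The Python sketch below (not part of the exam material; the Exp(1) parent with \(n = 4\) and \(x = 0.8\) is an arbitrary illustrative choice) compares Monte Carlo frequencies against \([F(x)]^n\) and \(1 - [1-F(x)]^n\):

```python
import math
import random

# Monte Carlo sanity check of the min/max CDF formulas.
# Illustrative assumptions: Exp(1) parent (F(x) = 1 - exp(-x)), n = 4, x = 0.8.
random.seed(0)
n, x, trials = 4, 0.8, 200_000

F = 1 - math.exp(-x)
count_max = count_min = 0
for _ in range(trials):
    sample = [random.expovariate(1.0) for _ in range(n)]
    count_max += max(sample) <= x   # event {X_(n) <= x}
    count_min += min(sample) <= x   # event {X_(1) <= x}

print(count_max / trials, F ** n)            # P(max <= x) vs [F(x)]^n
print(count_min / trials, 1 - (1 - F) ** n)  # P(min <= x) vs 1 - [1-F(x)]^n
```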

3.4 The General k-th Order Statistic

This is the single most important result. For \(X_{(k)}\) (the k-th smallest of \(n\)):

\(f_{X_{(k)}}(x) = \dfrac{n!}{(k-1)!\, 1!\, (n-k)!} \,[F(x)]^{k-1}\,[1 - F(x)]^{n-k}\, f(x)\)

More compactly: \(f_{X_{(k)}}(x) = n \binom{n-1}{k-1}\,[F(x)]^{k-1}\,[1 - F(x)]^{n-k}\, f(x)\)
Intuition behind the formula: Think of placing \(n\) balls into three bins around \(x\): \((k-1)\) balls must be below \(x\) (probability \([F(x)]^{k-1}\)), one ball at exactly \(x\) (density \(f(x)\,dx\)), and \((n-k)\) balls above \(x\) (probability \([1-F(x)]^{n-k}\)). The multinomial coefficient counts the ways to assign observations to these three bins.
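One way to trust the formula without re-deriving it: the CDF of \(X_{(k)}\) is also \(P(\text{at least } k \text{ of the } n \text{ observations} \leq x) = \sum_{j=k}^{n} \binom{n}{j} [F(x)]^j [1-F(x)]^{n-j}\), so integrating the density above must reproduce that binomial sum. A short Python sketch (the U(0,1) parent and the values \(n=5\), \(k=3\), \(x=0.4\) are illustrative assumptions):

```python
import math

# Check the k-th order statistic density against the binomial-sum CDF:
# P(X_(k) <= x) = sum_{j=k}^{n} C(n,j) F(x)^j (1-F(x))^(n-j).
# Illustrative assumptions: U(0,1) parent (F(t) = t, f(t) = 1), n = 5, k = 3.
n, k, x = 5, 3, 0.4

def density(t):
    # n * C(n-1, k-1) * F^(k-1) * (1-F)^(n-k) * f
    return n * math.comb(n - 1, k - 1) * t ** (k - 1) * (1 - t) ** (n - k)

steps = 10_000  # midpoint rule over [0, x]
cdf_from_density = sum(density((i + 0.5) * x / steps) for i in range(steps)) * (x / steps)

cdf_binomial = sum(math.comb(n, j) * x ** j * (1 - x) ** (n - j) for j in range(k, n + 1))
print(cdf_from_density, cdf_binomial)  # the two numbers should agree closely
```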

3.5 Joint Distribution of Two Order Statistics

The joint PDF of \(X_{(i)}\) and \(X_{(j)}\) for \(i < j\):

\(f_{X_{(i)}, X_{(j)}}(u, v) = \dfrac{n!}{(i-1)!\,(j-i-1)!\,(n-j)!}\, [F(u)]^{i-1}\, f(u)\, [F(v) - F(u)]^{j-i-1}\, f(v)\, [1-F(v)]^{n-j}\)
for \(-\infty < u < v < \infty\)
Key structural insight: Think of \(n\) observations partitioned into five bins: below \(u\), at \(u\), between \(u\) and \(v\), at \(v\), above \(v\). The counts are \((i-1)\), 1, \((j-i-1)\), 1, \((n-j)\) respectively.

3.6 Joint Distribution of Min and Max

A very common special case with \(i=1\) and \(j = n\):

\(f_{X_{(1)}, X_{(n)}}(u, v) = n(n-1)\,[F(v) - F(u)]^{n-2}\, f(u)\, f(v)\)   for \(u < v\)

From this, the range \(R = X_{(n)} - X_{(1)}\) can be derived by transformation.

3.7 Expected Values of Order Statistics

The expectation formula is always:

\(E[X_{(k)}] = \displaystyle\int_{-\infty}^{\infty} x \cdot f_{X_{(k)}}(x)\, dx\)

For the uniform distribution on \([0,1]\), there is a beautiful closed form:

\(X_i \sim U(0,1) \quad \Rightarrow \quad E[X_{(k)}] = \dfrac{k}{n+1}\)

For general distributions, use the probability integral transform: if \(X \sim F\), then \(U = F(X) \sim U(0,1)\). So \(X_{(k)} = F^{-1}(U_{(k)})\) and \(E[X_{(k)}] = E[F^{-1}(U_{(k)})]\), where \(U_{(k)} \sim \text{Beta}(k, n-k+1)\).
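A quick simulation check of \(E[X_{(k)}] = k/(n+1)\) (Python sketch; \(n = 4\) and the replication count are arbitrary choices):

```python
import random

# Simulation check of E[X_(k)] = k/(n+1) for U(0,1) order statistics.
# Illustrative assumptions: n = 4, 100k replications.
random.seed(1)
n, trials = 4, 100_000
sums = [0.0] * n
for _ in range(trials):
    for idx, value in enumerate(sorted(random.random() for _ in range(n))):
        sums[idx] += value   # idx = k - 1
means = [s / trials for s in sums]
print(means)  # should be close to 1/5, 2/5, 3/5, 4/5
```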


Node N5 — Section 4

Worked Examples

Example 1: Min and Max from \(f(x) = 2/x^3\), \(x > 1\)

Let \(X_1, \ldots, X_n\) be i.i.d. with PDF \(f(x) = 2/x^3\) for \(x > 1\). Find the distributions of \(X_{(1)}\) and \(X_{(n)}\).

Step 1: Find the CDF \(F(x)\) \(F(x) = \displaystyle\int_1^x \frac{2}{t^3}\,dt = \left[-\frac{1}{t^2}\right]_1^x = 1 - \frac{1}{x^2}\)   for \(x > 1\)
Check: \(F(1^+) = 0\), \(F(+\infty) = 1\). \(\checkmark\)
Step 2: CDF of the maximum \(F_{X_{(n)}}(x) = [F(x)]^n = \left(1 - \dfrac{1}{x^2}\right)^n\)   for \(x > 1\)
\(f_{X_{(n)}}(x) = n\left(1 - \dfrac{1}{x^2}\right)^{n-1} \cdot \dfrac{2}{x^3}\)   for \(x > 1\)
Step 3: CDF of the minimum \(F_{X_{(1)}}(x) = 1 - [1 - F(x)]^n = 1 - \left(\dfrac{1}{x^2}\right)^n = 1 - x^{-2n}\)   for \(x > 1\)
\(f_{X_{(1)}}(x) = 2n \cdot x^{-2n-1}\)   for \(x > 1\)
Notice that \(X_{(1)}\) follows a Pareto distribution with shape parameter \(2n\) and scale parameter 1.
Step 4: Expectation of the minimum \(E[X_{(1)}] = \displaystyle\int_1^{\infty} x \cdot 2n \cdot x^{-2n-1}\,dx = 2n\int_1^{\infty} x^{-2n}\,dx\)
When does this converge? We need \(2n > 1\), which holds for every \(n \geq 1\). \(\checkmark\)
\(= 2n \cdot \dfrac{1}{2n-1} = \dfrac{2n}{2n-1}\)
As \(n \to \infty\), \(E[X_{(1)}] \to 1\), which makes sense: the minimum of a large sample from a right-skewed distribution concentrates near the lower bound.
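Example 1 can be verified by simulation: since \(F(x) = 1 - 1/x^2\), inverse-transform sampling gives \(F^{-1}(u) = (1-u)^{-1/2}\). A Python sketch (with \(n = 4\) chosen for illustration) checks \(E[X_{(1)}] = 2n/(2n-1)\):

```python
import random

# Simulation check of E[X_(1)] = 2n/(2n-1) for the parent f(x) = 2/x^3, x > 1.
# Sampling via inverse transform: F^{-1}(u) = (1 - u)**-0.5.
# Illustrative assumptions: n = 4, 200k replications.
random.seed(2)
n, trials = 4, 200_000
total = 0.0
for _ in range(trials):
    total += min((1 - random.random()) ** -0.5 for _ in range(n))
print(total / trials, 2 * n / (2 * n - 1))  # empirical mean vs 8/7
```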

Example 2: Median of Uniform — Finding \(E[X_{(k)}]\)

Let \(X_1, \ldots, X_5\) be i.i.d. \(U(0, 2)\). Find the expected value of the sample median \(X_{(3)}\).

Step 1: Parent CDF and PDF \(F(x) = x/2\) for \(0 < x < 2\),   \(f(x) = 1/2\) for \(0 < x < 2\).
Step 2: Apply the k-th order statistic formula \(f_{X_{(3)}}(x) = 5\binom{4}{2}\,\left(\dfrac{x}{2}\right)^2 \left(1 - \dfrac{x}{2}\right)^2 \cdot \dfrac{1}{2}\)
\(= 5 \cdot 6 \cdot \dfrac{1}{2} \cdot \dfrac{x^2}{4} \cdot \dfrac{(2-x)^2}{4} = \dfrac{15}{16}\,x^2(2-x)^2\)   for \(0 < x < 2\)
Step 3: Compute expectation \(E[X_{(3)}] = \displaystyle\int_0^2 x \cdot \dfrac{15}{16}\,x^2(2-x)^2\,dx = \dfrac{15}{16} \int_0^{2} x^3(4 - 4x + x^2)\,dx\)
\(= \dfrac{15}{16}\left[x^4 - \dfrac{4x^5}{5} + \dfrac{x^6}{6}\right]_0^2 = \dfrac{15}{16}\left[16 - \dfrac{128}{5} + \dfrac{32}{3}\right]\)
\(= \dfrac{15}{16} \cdot \dfrac{240 - 384 + 160}{15} = \dfrac{15}{16} \cdot \dfrac{16}{15} = 1\)
Cross-check: We can also use the probability integral transform. \(X \sim U(0,2) \Rightarrow X = 2U\) where \(U \sim U(0,1)\). \(E[U_{(3)}] = 3/6 = 1/2\), so \(E[X_{(3)}] = 2 \cdot \frac{1}{2} = 1\). \(\checkmark\)
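The same answer falls out of a direct simulation (Python sketch, illustrative only):

```python
import random

# Simulation check of Example 2: the median of five U(0,2) draws has mean 1.
random.seed(3)
trials = 100_000
total = sum(sorted(random.uniform(0, 2) for _ in range(5))[2] for _ in range(trials))
print(total / trials)  # should be close to 1
```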

Example 3: Range from Exponential Distribution

Let \(X_1, X_2, X_3\) be i.i.d. \(\text{Exp}(\lambda)\). Find the PDF of the range \(R = X_{(3)} - X_{(1)}\).

Step 1: PDF and CDF \(f(x) = \lambda e^{-\lambda x}\),   \(F(x) = 1 - e^{-\lambda x}\) for \(x > 0\).
Step 2: Joint PDF of min and max \(f_{X_{(1)},X_{(3)}}(u, v) = 3(2)\,[F(v)-F(u)]^{1}\,f(u)\,f(v) = 6\,[e^{-\lambda u} - e^{-\lambda v}]\,\lambda e^{-\lambda u}\,\lambda e^{-\lambda v}\)
for \(0 < u < v < \infty\).
Step 3: Transform to \(R = V - U\) and \(S = U\). The inverse map is \(u = s\), \(v = r + s\), with Jacobian \(|J| = 1\).
\(f_{R,S}(r, s) = 6\,[e^{-\lambda s} - e^{-\lambda(r+s)}]\,\lambda^2 e^{-\lambda s}\,e^{-\lambda(r+s)} \)
\(= 6\lambda^2\,e^{-3\lambda s}\,e^{-\lambda r}\,[1 - e^{-\lambda r}]\)   for \(r > 0\), \(s > 0\).
Step 4: Integrate out \(s\) \(f_R(r) = \displaystyle\int_0^{\infty} 6\lambda^2\,e^{-\lambda r}[1 - e^{-\lambda r}]\,e^{-3\lambda s}\,ds = 6\lambda^2\,e^{-\lambda r}[1 - e^{-\lambda r}] \cdot \dfrac{1}{3\lambda}\)
\(= 2\lambda\, e^{-\lambda r}(1 - e^{-\lambda r})\)   for \(r > 0\)
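Integrating this density gives \(P(R \leq r) = \int_0^r 2\lambda e^{-\lambda t}(1-e^{-\lambda t})\,dt = (1 - e^{-\lambda r})^2\), which a simulation can confirm (Python sketch; \(\lambda = 2\) and \(r = 0.5\) are arbitrary illustrative values):

```python
import math
import random

# Simulation check of the range distribution: P(R <= r) = (1 - exp(-lam*r))**2.
# Illustrative assumptions: lam = 2.0, r = 0.5, 200k replications.
random.seed(4)
lam, r, trials = 2.0, 0.5, 200_000
hits = 0
for _ in range(trials):
    xs = [random.expovariate(lam) for _ in range(3)]
    hits += (max(xs) - min(xs)) <= r
theory = (1 - math.exp(-lam * r)) ** 2
print(hits / trials, theory)  # empirical vs theoretical CDF at r
```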

Node N5 — Section 5

Pattern Recognition & Examiner Traps

Trap 1: Confusing the CDF of the minimum. Students often write \(F_{X_{(1)}}(x) = 1 - [F(x)]^n\) by misremembering the formula. The correct version is \(\mathbf{1 - [1-F(x)]^n}\). The complement of "minimum exceeds x" uses the survival function, not the CDF itself.
WRONG \(F_{X_{(1)}}(x) = 1 - [F(x)]^n\) — uses the CDF, not the survival function.
RIGHT \(F_{X_{(1)}}(x) = 1 - [1 - F(x)]^n\) — because \(P(\min > x) = P(\text{all} > x) = [1-F(x)]^n\).
Trap 2: Forgetting to establish the support / domain. If the parent variable has support \(x > 1\), the order statistics also have support \(x > 1\). But examiners often give truncated or shifted distributions where the support matters for integration limits in expectations.
Trap 3: Misapplying the k-th order statistic formula for edge cases. When \(k=1\), the formula should reduce to the minimum formula. When \(k=n\), it should reduce to the maximum formula. Always check these boundary conditions. If your simplified formula doesn't match, there's an algebraic error.
Trap 4: Ignoring the independence requirement. The entire derivation relies on the \(X_i\) being independent. If the problem states "a sample" without saying "i.i.d.", you must explicitly state the assumption that underpins your work.
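Trap 3's boundary check is also easy to automate. The Python sketch below (Exp(1) parent, \(n = 6\), evaluation point \(x = 0.7\), all illustrative assumptions) evaluates the general formula at \(k = 1\) and \(k = n\) and compares against the dedicated min/max densities:

```python
import math

# Boundary check (Trap 3) done numerically: the general k-th order density
# at k = 1 and k = n must equal the minimum and maximum densities.
# Illustrative assumptions: Exp(1) parent, n = 6, evaluation point x = 0.7.
n, x = 6, 0.7
F, f = 1 - math.exp(-x), math.exp(-x)

def kth_density(k):
    return n * math.comb(n - 1, k - 1) * F ** (k - 1) * (1 - F) ** (n - k) * f

print(kth_density(1), n * (1 - F) ** (n - 1) * f)  # should match (minimum)
print(kth_density(n), n * F ** (n - 1) * f)        # should match (maximum)
```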
Examiner patterns to recognise:
  • "Given \(f(x) = \ldots\), find the PDF of the sample maximum" — immediately signals: find CDF first, then raise to power \(n\), then differentiate.
  • "Find the probability that the range exceeds \(r\)" — use joint PDF of min/max or work from the CDF approach.
  • "Show that the sample median is an unbiased estimator of the population mean" — compute \(E[X_{((n+1)/2)}]\) using the k-th order formula (this holds when the parent distribution is symmetric about its mean).
  • The Pareto/power-law example (\(f(x) = c/x^k\)) appears repeatedly because the CDF has a clean closed form and the resulting order statistic distributions are tractable.

Node N5 — Section 6

Connections

How N5 connects to neighbouring nodes:
  • ← N4 (Transformations): The change-of-variable technique from N4 is used to derive the distribution of the range and other functions of order statistics.
  • → N6 (Sampling Distributions): Order statistics are themselves sampling statistics. The chi-squared, t, and F distributions covered in N6 can be connected to order statistics in multivariate settings.
  • → N8-N12 (Inference): The sample maximum/minimum are often used as sufficient statistics (e.g., for the Uniform distribution). MLE in N7 sometimes involves order statistics (e.g., MLE of the upper bound of a Uniform distribution is the sample maximum).

Order statistics also connect to non-parametric methods (beyond the PTS2 syllabus), where the entire inference is based on ranks rather than raw values.


Node N5 — Section 7

Summary Table

| Quantity | CDF | PDF | Key Idea |
|---|---|---|---|
| Maximum \(X_{(n)}\) | \([F(x)]^n\) | \(n[F(x)]^{n-1}f(x)\) | All must be \(\leq x\) |
| Minimum \(X_{(1)}\) | \(1 - [1-F(x)]^n\) | \(n[1-F(x)]^{n-1}f(x)\) | Complement: all \(> x\) |
| k-th order \(X_{(k)}\) | Integrate the PDF | \(n\binom{n-1}{k-1} F^{k-1}(1-F)^{n-k}f\) | Multinomial binning |
| Joint min and max | | \(n(n-1)[F(v)-F(u)]^{n-2}f(u)f(v)\) | 5-bin partition |
| Uniform \(X_{(k)}\) | Beta CDF | Beta\((k, n-k+1)\) shape | \(E[X_{(k)}] = \frac{k}{n+1}\) |
  • Golden Rule: Always find the CDF first using probability reasoning, then differentiate to get the PDF. Never try to derive the PDF of an order statistic directly.
  • Boundary Check: When in doubt about the general k-th order formula, set \(k = 1\) and \(k = n\) to verify you recover the minimum and maximum results.
  • Support Matters: Order statistics inherit the support of the parent distribution. Always state the domain explicitly.
  • Integral Trick: When computing \(E[X_{(k)}]\) for complicated distributions, try the probability integral transform: convert to Uniform(0,1) order statistics first.

Node N5 — Section 8

Self-Assessment

Test your understanding by working through these before moving to N6:

Can you do all of these?
  • Given any parent CDF, write down the CDF and PDF of the sample minimum and maximum.
  • Derive the k-th order statistic PDF from first principles using the binomial argument.
  • Given \(f(x) = 3x^2\) on \(0 < x < 1\), find the distribution of the sample median for \(n = 5\).
  • Compute \(E[X_{(1)}]\) and \(E[X_{(n)}]\) for an i.i.d. exponential sample.
  • Write the joint PDF of \(X_{(1)}\) and \(X_{(n)}\) for a given parent distribution.
  • Use the probability integral transform to find \(E[X_{(k)}]\) for a given distribution.
Practice Problems
  • If \(X_1, \ldots, X_n\) are i.i.d. from \(f(x) = \theta x^{\theta - 1}\) on \((0,1)\), find the PDF of \(X_{(n)}\). [Answer: \(f_{X_{(n)}}(x) = n\theta x^{n\theta-1}\) on (0,1).]
  • Show that for \(X_i \sim \text{Exp}(1)\), the minimum \(X_{(1)} \sim \text{Exp}(n)\).
  • If \(X_1, X_2, X_3\) are i.i.d. \(U(0,1)\), find \(P(X_{(2)} > 1/2)\). [Answer: \(1 - F_{X_{(2)}}(1/2) = 1 - \left[3\left(\tfrac{1}{2}\right)^2\tfrac{1}{2} + \left(\tfrac{1}{2}\right)^3\right] = \tfrac{1}{2}\).]
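The last practice problem makes a good simulation target (Python sketch, illustrative; by symmetry of the median of U(0,1) draws about 1/2, the answer is \(1/2\)):

```python
import random

# Simulation check of the practice problem: P(X_(2) > 1/2) for three U(0,1)
# draws. The sample median is symmetric about 1/2, so the answer is 1/2.
random.seed(5)
trials = 100_000
hits = sum(sorted(random.random() for _ in range(3))[1] > 0.5 for _ in range(trials))
print(hits / trials)  # should be close to 0.5
```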

High-Leverage Questions

HLQ: Exam-Style Question with Worked Solution

15 marks | Order statistics / derivation | Multi-part

A random sample of size \(n = 4\) is drawn from the distribution with PDF \(f(x) = 2/x^3\) for \(x > 1\) (and 0 elsewhere). Let \(X_{(1)} < X_{(2)} < X_{(3)} < X_{(4)}\) be the order statistics.

(a) Find the CDF of the parent distribution. (2 marks)

(b) Find the PDF of the sample maximum \(X_{(4)}\). (3 marks)

(c) Find the expected value of the sample minimum \(X_{(1)}\). (3 marks)

(d) Find the joint PDF of \(X_{(1)}\) and \(X_{(4)}\). (3 marks)

(e) Find the probability that all four observations are less than 2. (2 marks)

(f) Find the probability that at least two observations exceed 1.5. (2 marks)


Part (a): Parent CDF \(F(x) = \displaystyle\int_1^x \frac{2}{t^3}\,dt = \left[-\frac{1}{t^2}\right]_1^x = 1 - \frac{1}{x^2}\)   for \(x > 1\).
\(F(1) = 0, \quad F(\infty) = 1. \quad \checkmark\)
Part (b): PDF of the Maximum \(f_{X_{(4)}}(x) = 4\,[F(x)]^3\,f(x) = 4\left(1 - \frac{1}{x^2}\right)^3 \cdot \frac{2}{x^3}\)
\(= \dfrac{8}{x^3}\left(1 - \dfrac{1}{x^2}\right)^3\)   for \(x > 1\).
Part (c): Expectation of the Minimum We need \(f_{X_{(1)}}(x)\) first:
\(F_{X_{(1)}}(x) = 1 - [1 - F(x)]^4 = 1 - \left(\dfrac{1}{x^2}\right)^4 = 1 - x^{-8}\)
\(f_{X_{(1)}}(x) = 8\,x^{-9}\)   for \(x > 1\).

\(E[X_{(1)}] = \displaystyle\int_1^{\infty} x \cdot 8 x^{-9}\,dx = 8\int_1^{\infty} x^{-8}\,dx = 8 \cdot \dfrac{1}{7} = \dfrac{8}{7} \)
Part (d): Joint PDF of Min and Max \(f_{X_{(1)}, X_{(4)}}(u, v) = 4(3)\,[F(v) - F(u)]^{2}\,f(u)\,f(v)\)
\(= 12\left[\left(1 - \frac{1}{v^2}\right) - \left(1 - \frac{1}{u^2}\right)\right]^2 \cdot \frac{2}{u^3} \cdot \frac{2}{v^3}\)
\(= 12\left[\frac{1}{u^2} - \frac{1}{v^2}\right]^2 \cdot \frac{4}{u^3 v^3}\)
\(= \dfrac{48}{u^3 v^3}\left(\dfrac{v^2 - u^2}{u^2 v^2}\right)^2 = \dfrac{48(v^2 - u^2)^2}{u^7 v^7}\)   for \(1 < u < v < \infty\).
Part (e): All Four Less Than 2 \(P(X_{(4)} \leq 2) = [F(2)]^4 = \left(1 - \frac{1}{4}\right)^4 = \left(\frac{3}{4}\right)^4 = \dfrac{81}{256} \)
Part (f): At Least Two Exceed 1.5 Let \(p = P(X > 1.5) = 1 - F(1.5) = \dfrac{1}{(1.5)^2} = \dfrac{1}{2.25} = \dfrac{4}{9}\).
The number of observations exceeding 1.5 is \(Y \sim \text{Binomial}(4, 4/9)\).
\(P(Y \geq 2) = 1 - P(Y = 0) - P(Y = 1)\)
\(= 1 - \left(\frac{5}{9}\right)^4 - 4\left(\frac{4}{9}\right)\left(\frac{5}{9}\right)^3\)
\(= 1 - \dfrac{625}{6561} - 4 \cdot \dfrac{500}{6561} = 1 - \dfrac{625 + 2000}{6561} = 1 - \dfrac{2625}{6561}\)
\(= \dfrac{3936}{6561} \approx 0.600\)
Summary of answers: (a) \(F(x) = 1 - 1/x^2\), \(x > 1\). (b) \(f_{X_{(4)}}(x) = \dfrac{8}{x^3}\left(1 - \dfrac{1}{x^2}\right)^3\), \(x > 1\). (c) \(E[X_{(1)}] = 8/7\). (d) \(f_{X_{(1)}, X_{(4)}}(u, v) = \dfrac{48(v^2 - u^2)^2}{u^7 v^7}\), \(1 < u < v < \infty\). (e) \(81/256\). (f) \(3936/6561 \approx 0.600\).
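The numerical answers can be cross-checked with a single simulation run (Python sketch using inverse-transform sampling \(F^{-1}(u) = (1-u)^{-1/2}\); seed and trial count are arbitrary):

```python
import random

# Cross-check of HLQ parts (c), (e), (f) by simulation.
# Sampling from f(x) = 2/x^3, x > 1, via F^{-1}(u) = (1 - u)**-0.5; n = 4.
random.seed(6)
trials = 200_000
min_total = 0.0
all_below_2 = at_least_two_above = 0
for _ in range(trials):
    xs = [(1 - random.random()) ** -0.5 for _ in range(4)]
    min_total += min(xs)
    all_below_2 += max(xs) < 2
    at_least_two_above += sum(x > 1.5 for x in xs) >= 2
print(min_total / trials)           # (c) should be close to 8/7 ~ 1.143
print(all_below_2 / trials)         # (e) should be close to 81/256 ~ 0.316
print(at_least_two_above / trials)  # (f) should be close to 0.600
```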