PRESTIGE ED
N6: Sampling Distributions
Node N6 — Section 1

Why This Concept Exists

Sampling distributions answer a fundamental question: what is the probability distribution of a statistic? Every confidence interval, every hypothesis test, every p-value in PTS2 depends on knowing the sampling distribution of a relevant statistic. Without this machinery, inference is impossible.

The three cornerstone distributions — chi-squared, t, and F — are all constructed from standard normal random variables and have deep interconnections. Every formula you need in N8 through N12 traces back to how these distributions are built. If you don't internalise how \(\chi^2\) arises from squaring normals, how t arises from a standard normal divided by the square root of a chi-squared over its degrees of freedom, and how F arises from a ratio of scaled chi-squareds, then every confidence interval formula in later nodes will feel like unmotivated memorisation.

Leverage: N6 is the mathematical scaffold for all of N8-N12. The construction proofs (chi-squared from normal, t from chi, F from chi-squared) are directly examinable. The degrees of freedom manipulation problems are a recurring question type worth 6-10 marks.

This node also introduces two powerful techniques that appear in exam questions: constructing statistics from heterogeneous normal samples (different means and variances, standardising them, and building chi-squared/t/F statistics from the result), and understanding additive properties of chi-squared and F distributions.


Node N6 — Section 2

Prerequisites

Before engaging with this node, you must be comfortable with:

  • Standard Normal distribution: \(Z \sim N(0,1)\) with PDF \(\phi(z) = \frac{1}{\sqrt{2\pi}}e^{-z^2/2}\). Know that if \(X \sim N(\mu, \sigma^2)\) then \(Z = \frac{X-\mu}{\sigma} \sim N(0,1)\).
  • Linear combinations of normals: If \(X_i \sim N(\mu_i, \sigma_i^2)\) are independent, then \(\sum a_i X_i \sim N(\sum a_i\mu_i, \sum a_i^2\sigma_i^2)\).
  • Moment generating functions (MGFs): The technique of using \(M_Z(t) = E[e^{tZ}]\) to identify distributions. If two variables have the same MGF, they have the same distribution.
  • Independence: The sample mean \(\bar{X}\) and sample variance \(S^2\) are independent for normal samples. This is a deep result (Basu's theorem) that must be treated as a black box at the PTS2 level.
  • Gamma function: \(\Gamma(\alpha) = \int_0^{\infty} x^{\alpha-1}e^{-x}\,dx\), and \(\Gamma(n) = (n-1)!\) for positive integers.
  • Sample variance formula: \(S^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2\), with Bessel's correction (\(n-1\) denominator).
Foundation note: The independence of \(\bar{X}\) and \(S^2\) for normal samples is non-trivial and is proved in advanced courses. For PTS2, accept it as given. It is the linchpin that makes the t-distribution work.

Node N6 — Section 3

Core Exposition

3.1 The Chi-Squared Distribution — Construction

The chi-squared distribution is the sum of squares of independent standard normals:

If \(Z_1, Z_2, \ldots, Z_\nu\) are i.i.d. \(N(0,1)\), then
\(Y = \displaystyle\sum_{i=1}^{\nu} Z_i^2 \sim \chi^2(\nu)\)

\(\nu\) is called the degrees of freedom.

The PDF of \(Y \sim \chi^2(\nu)\) is:

\(f(y) = \dfrac{1}{2^{\nu/2}\,\Gamma(\nu/2)}\, y^{\nu/2 - 1}\, e^{-y/2}\)   for \(y > 0\)

Key properties:

\(E[\chi^2(\nu)] = \nu\)   and   \(\text{Var}(\chi^2(\nu)) = 2\nu\)
MGF: \(M(t) = (1 - 2t)^{-\nu/2}\) for \(t < 1/2\)
Connection to sample variance: the pivotal result for normal samples is that if \(X_1, \ldots, X_n\) are i.i.d. \(N(\mu, \sigma^2)\), then \(\dfrac{(n-1)S^2}{\sigma^2} \sim \chi^2(n-1)\).

This is the bridge between algebraic computation (sample variance) and probability theory (chi-squared distribution). The proof uses Cochran's theorem and is far beyond PTS2 scope, but the result is fundamental.
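A quick Monte Carlo sketch of both facts (assuming numpy is available; variable names are illustrative): summing \(\nu\) squared standard normals produces a \(\chi^2(\nu)\), and \((n-1)S^2/\sigma^2\) from a normal sample behaves like \(\chi^2(n-1)\).

```python
import numpy as np

rng = np.random.default_rng(0)
reps = 200_000

# Sum of nu squared standard normals: should behave like chi-squared(nu)
nu = 5
y = (rng.standard_normal((reps, nu)) ** 2).sum(axis=1)
print(y.mean(), y.var())  # near nu = 5 and 2*nu = 10

# (n-1)S^2 / sigma^2 for an i.i.d. normal sample: chi-squared(n-1)
n, mu, sigma = 8, 3.0, 2.0
x = rng.normal(mu, sigma, size=(reps, n))
stat = (n - 1) * x.var(axis=1, ddof=1) / sigma**2
print(stat.mean(), stat.var())  # near n-1 = 7 and 2(n-1) = 14
```

The empirical mean and variance matching \(\nu\) and \(2\nu\) is a necessary (not sufficient) sanity check, but it makes the degrees-of-freedom bookkeeping concrete.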

3.2 Additivity of Chi-Squared

If \(X_1 \sim \chi^2(\nu_1)\) and \(X_2 \sim \chi^2(\nu_2)\) are independent, then
\(X_1 + X_2 \sim \chi^2(\nu_1 + \nu_2)\).

Proof via MGFs: \(M_{X_1+X_2}(t) = (1-2t)^{-\nu_1/2} \cdot (1-2t)^{-\nu_2/2} = (1-2t)^{-(\nu_1+\nu_2)/2}\). \(\checkmark\)
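The additivity can also be seen empirically (a sketch, assuming numpy): independent \(\chi^2(3)\) and \(\chi^2(4)\) draws summed together should match the \(\chi^2(7)\) moments, mean 7 and variance 14.

```python
import numpy as np

rng = np.random.default_rng(1)
reps = 200_000
nu1, nu2 = 3, 4

# Sum of independent chi-squared draws: should behave like chi-squared(nu1 + nu2)
s = rng.chisquare(nu1, reps) + rng.chisquare(nu2, reps)
print(s.mean(), s.var())  # near 7 and 14
```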

3.3 The t-Distribution — Construction

The t-distribution arises when dividing a standard normal by the scaled square root of an independent chi-squared:

If \(Z \sim N(0,1)\) and \(V \sim \chi^2(\nu)\) are independent, then
\(T = \dfrac{Z}{\sqrt{V/\nu}} \sim t(\nu)\)

The PDF of \(T \sim t(\nu)\) is:

\(f(t) = \dfrac{\Gamma((\nu+1)/2)}{\sqrt{\nu\pi}\:\Gamma(\nu/2)}\left(1 + \dfrac{t^2}{\nu}\right)^{-(\nu+1)/2}\)   for \(-\infty < t < \infty\)

The t-distribution is symmetric about zero, bell-shaped like the normal, but with heavier tails. As \(\nu \to \infty\), \(t(\nu) \to N(0,1)\).
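The construction can be simulated directly (a sketch, assuming numpy and scipy): build \(T = Z/\sqrt{V/\nu}\) from its ingredients and observe the heavier-than-normal tails via the variance, which is \(\nu/(\nu-2) > 1\).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
reps = 200_000
nu = 10

# Build T = Z / sqrt(V/nu) from an independent N(0,1) and chi-squared(nu)
z = rng.standard_normal(reps)
v = rng.chisquare(nu, reps)
t_draws = z / np.sqrt(v / nu)

# Heavier tails than N(0,1): Var(t(10)) = 10/8 = 1.25, not 1
print(t_draws.var(), stats.t.var(df=nu))
```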

3.4 The F-Distribution — Construction

The F-distribution arises as a ratio of scaled independent chi-squared variables:

If \(U \sim \chi^2(\nu_1)\) and \(V \sim \chi^2(\nu_2)\) are independent, then
\(F = \dfrac{U/\nu_1}{V/\nu_2} \sim F(\nu_1, \nu_2)\)

Note that each chi-squared is divided by its degrees of freedom before forming the ratio. The PDF is complex, but the structure is what matters for exams.

Key relationships:

If \(T \sim t(\nu)\), then \(T^2 \sim F(1, \nu)\).
\(F_{\alpha;\,\nu_1,\nu_2} = \dfrac{1}{F_{1-\alpha;\,\nu_2,\nu_1}}\)   (reciprocal property for critical values).
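Both relationships can be checked exactly with scipy.stats (assuming scipy is the tool at hand). In scipy's quantile convention, the upper-\(\alpha\) critical value \(F_{\alpha;\nu_1,\nu_2}\) is `f.ppf(1 - alpha, nu1, nu2)`.

```python
from scipy import stats

alpha, nu1, nu2 = 0.05, 4, 6

# Reciprocal property: F_{alpha; nu1, nu2} = 1 / F_{1-alpha; nu2, nu1}
upper = stats.f.ppf(1 - alpha, nu1, nu2)    # upper-alpha critical value of F(4, 6)
recip = 1.0 / stats.f.ppf(alpha, nu2, nu1)  # reciprocal, degrees of freedom swapped
print(upper, recip)  # equal

# T^2 ~ F(1, nu): the squared two-sided t critical value equals the F critical value
nu = 10
t_sq = stats.t.ppf(1 - alpha / 2, nu) ** 2
f_crit = stats.f.ppf(1 - alpha, 1, nu)
print(t_sq, f_crit)  # equal
```

The reciprocal property is exactly what lets printed F-tables carry only upper-tail critical values.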

3.5 Constructing Statistics from Heterogeneous Normals

A common exam pattern: you are given \(X_1, \ldots, X_n\) where each \(X_i \sim N(\mu_i, \sigma_i^2)\) with potentially different means and variances. You must standardise each individually and then combine them to produce a chi-squared, t, or F statistic.

If \(X_i \sim N(\mu_i, \sigma_i^2)\) are independent, define \(Z_i = \dfrac{X_i - \mu_i}{\sigma_i} \sim N(0,1)\). Then:
\(\displaystyle \sum_{i=1}^{n} Z_i^2 = \sum_{i=1}^{n}\left(\dfrac{X_i - \mu_i}{\sigma_i}\right)^2 \sim \chi^2(n)\).

If \(\mu_i\) are unknown and replaced by the sample mean \(\bar{X}\), one degree of freedom is lost:

\(\displaystyle \sum_{i=1}^{n}\dfrac{(X_i - \bar{X})^2}{\sigma^2} \sim \chi^2(n-1)\)    (when all \(\sigma_i = \sigma\), same variance).
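A simulation sketch of both statements (assuming numpy; the particular \(\mu_i\), \(\sigma_i\) values are illustrative): standardising heterogeneous normals individually gives a \(\chi^2(n)\), while replacing a common mean with \(\bar{X}\) costs one degree of freedom.

```python
import numpy as np

rng = np.random.default_rng(3)
reps = 200_000

# Heterogeneous normals: X_i ~ N(mu_i, sigma_i^2), standardised individually
mu = np.array([1.0, -2.0, 0.5, 4.0])
sig = np.array([0.5, 1.0, 2.0, 3.0])
x = rng.normal(mu, sig, size=(reps, 4))
q = (((x - mu) / sig) ** 2).sum(axis=1)
print(q.mean())  # near n = 4

# Common variance, mean estimated by the sample mean: one d.f. lost
sigma = 2.0
y = rng.normal(7.0, sigma, size=(reps, 4))
q2 = ((y - y.mean(axis=1, keepdims=True)) ** 2).sum(axis=1) / sigma**2
print(q2.mean())  # near n - 1 = 3
```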

3.6 Key Sampling Distributions for Normal Samples

\(\bar{X} \sim N(\mu, \sigma^2/n)\)
\(\dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0,1)\)
\(\dfrac{\bar{X} - \mu}{S/\sqrt{n}} \sim t(n-1)\)
\(\dfrac{(n-1)S^2}{\sigma^2} \sim \chi^2(n-1)\)
Why these matter: These four results are used in virtually every inference calculation in PTS2. The t-result (third line) is needed for confidence intervals when \(\sigma\) is unknown. The chi-squared result (fourth line) is needed for variance inference and for understanding F-tests.
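The third result — the workhorse of t-intervals — can be verified by simulation (a sketch, assuming numpy and scipy): the studentised mean from a normal sample should match the \(t(n-1)\) variance, \((n-1)/(n-3)\).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n, mu, sigma, reps = 9, 5.0, 2.0, 200_000

x = rng.normal(mu, sigma, size=(reps, n))
xbar = x.mean(axis=1)
s = x.std(axis=1, ddof=1)

# Studentised sample mean: should follow t(n-1), regardless of sigma
t_stat = (xbar - mu) / (s / np.sqrt(n))
print(t_stat.var(), stats.t.var(df=n - 1))  # both near 8/6
```

Note that \(\sigma = 2\) drops out entirely: that is precisely why the t-statistic is usable when \(\sigma\) is unknown.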

Node N6 — Section 4

Worked Examples

Example 1: Building a Chi-Squared Statistic from Different Normals

Let \(X_1, \ldots, X_5\) be independent random variables where \(X_i \sim N(i, i)\) for \(i = 1, 2, \ldots, 5\). Let \(Q = \sum_{i=1}^{5}\left(\dfrac{X_i - i}{\sqrt{i}}\right)^2\).

Find the distribution of \(Q\) and compute \(P(Q > 11.07)\).

Step 1: Standardise each \(X_i\). Define \(Z_i = \dfrac{X_i - i}{\sqrt{i}}\). Since \(X_i \sim N(i, i)\), we have \(Z_i \sim N(0, 1)\).
The \(Z_i\) are independent because the \(X_i\) are independent.
Step 2: Recognise \(Q\) as a sum of squared standard normals. \(Q = \displaystyle\sum_{i=1}^{5} Z_i^2 \sim \chi^2(5)\).
We have 5 independent \(N(0,1)\) variables squared and summed, so 5 degrees of freedom.
Step 3: Look up the probability. From chi-squared tables, with 5 d.f., the value 11.070 corresponds to the upper critical value at \(\alpha = 0.05\):
\(P(\chi^2(5) > 11.07) = 0.05\).
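The table lookup can be reproduced with scipy (assuming scipy is available); the survival function gives the upper-tail probability directly.

```python
from scipy import stats

# P(chi^2(5) > 11.07): survival function = 1 - CDF
p = stats.chi2.sf(11.07, df=5)
print(round(p, 3))  # 0.05
```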

Example 2: Constructing a t-Statistic

Let \(X \sim N(3, 4)\) and let \(S^2\) be the sample variance of an independent sample of size 11 drawn from a normal population with variance \(\sigma^2 = 1\). Find the distribution of \(T = \dfrac{X - 3}{2 S}\).

Step 1: Identify the standard normal component. \(Z = \dfrac{X - 3}{2} \sim N(0, 1)\) because \(X \sim N(3, 4)\) and \(\sigma_X = \sqrt{4} = 2\).
Step 2: Work with the sample variance. The sample has size \(n = 11\) and comes from a normal population with \(\sigma^2 = 1\), so \(\dfrac{(n-1)S^2}{\sigma^2} = 10S^2 \sim \chi^2(10)\).
Step 3: Form the ratio. \(T = \dfrac{X-3}{2S} = \dfrac{Z}{S} = \dfrac{Z}{\sqrt{10S^2/10}} = \dfrac{Z}{\sqrt{\chi^2(10)/10}} \sim t(10)\), since \(Z\) and \(S^2\) are independent by assumption.

If the population variance were not 1, the statistic would need rescaling before the t-construction applies. Exam questions of this type always specify or imply \(\sigma = 1\) for the population that supplies the sample variance.

Example 3: F-Distribution from Two Independent Sample Variances

Two independent samples are drawn from normal populations with the same variance \(\sigma^2\):
Sample 1: \(n_1 = 5\) observations, sample variance \(S_1^2\).
Sample 2: \(n_2 = 7\) observations, sample variance \(S_2^2\).

Find the distribution of \(F = \dfrac{S_1^2}{S_2^2}\).

Step 1: Individual chi-squared distributions. \(\dfrac{4 S_1^2}{\sigma^2} \sim \chi^2(4)\) and \(\dfrac{6 S_2^2}{\sigma^2} \sim \chi^2(6)\).
Step 2: Form the F-statistic. \(F = \dfrac{S_1^2}{S_2^2} = \dfrac{[\chi^2(4)/4] \cdot \sigma^2}{[\chi^2(6)/6] \cdot \sigma^2} = \dfrac{\chi^2(4)/4}{\chi^2(6)/6} \sim F(4, 6)\).
Key insight: The \(\sigma^2\) cancels out. The ratio of two sample variances from normal populations with the same variance follows an F-distribution with degrees of freedom \((n_1 - 1, n_2 - 1)\). This is the foundation of the F-test for equal variances.
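A simulation sketch of this cancellation (assuming numpy and scipy): the ratio of sample variances should match \(F(4, 6)\), whose mean is \(\nu_2/(\nu_2 - 2) = 6/4 = 1.5\), whatever \(\sigma\) is.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
n1, n2, sigma, reps = 5, 7, 3.0, 200_000

# Sample variances from two normal populations with the SAME variance
s1 = rng.normal(0, sigma, size=(reps, n1)).var(axis=1, ddof=1)
s2 = rng.normal(0, sigma, size=(reps, n2)).var(axis=1, ddof=1)

ratio = s1 / s2  # should follow F(n1-1, n2-1) = F(4, 6); sigma cancels
print(ratio.mean(), stats.f.mean(4, 6))  # both near 1.5
```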

Node N6 — Section 5

Pattern Recognition & Examiner Traps

Trap 1: Forgetting to standardise before squaring. \(X^2\) is NOT chi-squared unless \(X \sim N(0,1)\). If \(X \sim N(\mu, \sigma^2)\), you must first compute \(Z = (X-\mu)/\sigma\), then \(Z^2 \sim \chi^2(1)\). The most common error is squaring non-standardised normals.
WRONG: If \(X \sim N(5, 3)\), then \(X^2 \sim \chi^2(1)\).
RIGHT: If \(X \sim N(5, 3)\), then \(\dfrac{(X-5)^2}{3} \sim \chi^2(1)\). Standardise first, then square.
Trap 2: Missing degrees-of-freedom adjustments. When the mean is estimated from the sample (replaced by \(\bar{X}\)), one degree of freedom is consumed: \(\frac{(n-1)S^2}{\sigma^2} \sim \chi^2(n-1)\), not \(\chi^2(n)\). This is the single most tested degrees-of-freedom concept.
Trap 3: Confusing the F-distribution's numerator and denominator degrees of freedom. \(F(\nu_1, \nu_2) \neq F(\nu_2, \nu_1)\). The order matters. The reciprocal property is: \(F_{\alpha; \nu_1, \nu_2} = 1/F_{1-\alpha; \nu_2, \nu_1}\).
Trap 4: Assuming independence without justification. The t-distribution construction requires \(Z\) and \(V\) to be independent. For the sample mean and sample variance case, independence holds only for normal populations. For non-normal populations, \(\bar{X}\) and \(S^2\) are generally not independent.
Examiner patterns to recognise:
  • "Find the distribution of \(Y = \sum a_i X_i^2\)" — standardise each \(X_i\) individually to \(Z_i\); when the coefficients satisfy \(a_i = 1/\sigma_i^2\) (and the \(X_i\) are centred), the sum is \(\chi^2(\# \text{terms})\). Adjust for constraints (estimated parameters).
  • "Show that \(T \sim t(\nu)\)" — explicitly identify the numerator as \(N(0,1)\), the denominator as \(\sqrt{\chi^2(\nu)/\nu}\), and verify independence.
  • "Find the probability that \(F > c\)" — recognise as F-distribution, count d.f. in numerator and denominator, look up tables.
  • "Find E[T²]" or "Var(F)" — use known formulas for the moments: \(E[t(\nu)^2] = \nu/(\nu-2)\) for \(\nu > 2\), \(E[F(\nu_1,\nu_2)] = \nu_2/(\nu_2 - 2)\) for \(\nu_2 > 2\).
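The moment formulas in the last pattern can be sanity-checked against scipy (assuming scipy is available); since \(E[T] = 0\), \(E[T^2]\) equals the variance of \(t(\nu)\).

```python
from scipy import stats

nu = 10
# E[T^2] = Var(T) for T ~ t(nu), valid for nu > 2
t_second_moment = stats.t.var(df=nu)
print(t_second_moment, nu / (nu - 2))  # equal

nu1, nu2 = 5, 8
# E[F(nu1, nu2)] depends only on the denominator d.f., valid for nu2 > 2
f_mean = stats.f.mean(nu1, nu2)
print(f_mean, nu2 / (nu2 - 2))  # equal
```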

Node N6 — Section 6

Connections

How N6 connects to the rest of PTS2:
  • ← N4 (Transformations): The derivations of chi-squared, t, and F PDFs from first principles use transformation methods and Jacobian techniques.
  • → N7 (Point Estimation): Understanding sampling distributions is essential for evaluating estimators. The MSE of an estimator is computed with respect to its sampling distribution.
  • → N8 (Confidence Intervals, One-Sample): The \(t\)-interval uses the t-distribution result. The variance interval uses the chi-squared result. The foundational sampling distributions are used directly in CI construction.
  • → N9-N12 (Two-Sample Inference): The F-distribution is used to test equality of variances. Two-sample t-intervals depend on pooled variance having a chi-squared distribution.

In summary: N6 provides the probability distributions, and N8-N12 apply them to inference problems. The link is direct and non-negotiable.


Node N6 — Section 7

Summary Table

Distribution | Construction | Key Stats | Support | Key Use
\(\chi^2(\nu)\) | \(\sum_{i=1}^\nu Z_i^2, \; Z_i \sim N(0,1)\) | \(E = \nu, \; \text{Var} = 2\nu\) | \((0, \infty)\) | Variance inference
\(t(\nu)\) | \(\dfrac{Z}{\sqrt{V/\nu}}\) | \(E = 0 \;(\nu > 1)\) | \((-\infty, \infty)\) | Unknown \(\sigma\) CI
\(F(\nu_1,\nu_2)\) | \(\dfrac{U/\nu_1}{V/\nu_2}\) | \(E = \dfrac{\nu_2}{\nu_2-2} \;(\nu_2 > 2)\) | \((0, \infty)\) | Variance ratio test
The Three Pillars: Chi-squared, t, and F are all built from \(N(0,1)\). Chi-squared = sum of squares. t = normal / sqrt(chi²/d.f.). F = ratio of scaled chi-squares.
Degrees of Freedom Rule: Each estimated parameter costs 1 d.f. If you estimate \(\mu\) with \(\bar{X}\), you lose 1 d.f. from chi-squared.
Independence is Essential: The t-distribution requires \(Z\) and \(V\) to be independent. The F-distribution requires the two chi-squares to be independent.
t → N as ν → ∞: As degrees of freedom increase, the t-distribution converges to the standard normal. For \(\nu > 30\), the difference is negligible in practice.

Node N6 — Section 8

Self-Assessment

Test your understanding before moving to N7:

Can you do all of these?
  • Construct a \(\chi^2\), \(t\), and \(F\) statistic from given independent normal variables.
  • State the PDF, expectation, and variance of each of the three distributions.
  • Explain why \(\frac{(n-1)S^2}{\sigma^2} \sim \chi^2(n-1)\) and what each component means.
  • Given heterogeneous normals, compute \(\sum\frac{(X_i - \mu_i)^2}{\sigma_i^2}\) and identify its distribution.
  • Use the reciprocal property of the F-distribution: \(F_{\alpha;\nu_1,\nu_2} = 1/F_{1-\alpha;\nu_2,\nu_1}\).
  • Compute \(E[T^2]\) for \(T \sim t(\nu)\). [Answer: \(\nu/(\nu-2)\) if \(\nu > 2\).]
Practice Problems
  • If \(X \sim N(0,4)\), what is the distribution of \(X^2/4\)? [Answer: \(\chi^2(1)\).]
  • Prove that if \(Y \sim \chi^2(1)\), then \(E[Y] = 1\) and \(\text{Var}(Y) = 2\).
  • If \(X_1, X_2, X_3 \sim N(2,1)\), find \(P(\sum (X_i - 2)^2 < 7.815)\). [Answer: Use \(\chi^2(3)\), 7.815 is the 5% upper critical value, so answer is 0.95.]
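The two deterministic practice answers can be checked numerically (assuming numpy and scipy): \(X^2/4\) for \(X \sim N(0,4)\) should show the \(\chi^2(1)\) moments, and 7.815 should sit at the 95th percentile of \(\chi^2(3)\).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)

# X ~ N(0, 4): X^2/4 should behave like chi-squared(1), i.e. mean 1, variance 2
x = rng.normal(0, 2, 200_000)
w = x**2 / 4
print(w.mean(), w.var())  # near 1 and 2

# 7.815 is the upper 5% critical value of chi-squared(3)
cdf_val = stats.chi2.cdf(7.815, df=3)
print(round(cdf_val, 3))  # 0.95
```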

High-Leverage Questions

HLQ: Exam-Style Question with Worked Solution

14 MARKS · CONSTRUCTION PROOF · MULTI-PART

Let \(X_1, X_2, X_3, X_4\) be independent random variables with \(X_i \sim N(0, i)\) for \(i = 1, 2, 3, 4\).

(a) Find the distribution of \(Y = \dfrac{X_1^2}{1} + \dfrac{X_2^2}{2} + \dfrac{X_3^2}{3}\). (3 marks)

(b) Find the distribution of \(W = \dfrac{X_1^2}{1} + \dfrac{X_2^2}{2} + \dfrac{X_3^2}{3} + \dfrac{X_4^2}{4}\). (3 marks)

(c) Let \(T = \dfrac{X_1}{\sqrt{W/4}}\). What is the distribution of \(T\)? (4 marks)

(d) Compute \(E[W]\) and \(\text{Var}(W)\). (4 marks)


Part (a): Distribution of Y. Standardise: \(Z_i = \dfrac{X_i}{\sqrt{i}} \sim N(0,1)\) for \(i = 1, 2, 3\).
Then \(Z_i^2 \sim \chi^2(1)\) and the \(Z_i\) are independent.
So \(Y = Z_1^2 + Z_2^2 + Z_3^2 \sim \chi^2(3)\).
Part (b): Distribution of W. Similarly, \(Z_4 = \dfrac{X_4}{\sqrt{4}} = \dfrac{X_4}{2} \sim N(0,1)\), so \(Z_4^2 \sim \chi^2(1)\).
\(W = Z_1^2 + Z_2^2 + Z_3^2 + Z_4^2 \sim \chi^2(4)\).
Part (c): Distribution of T. \(T = \dfrac{X_1}{\sqrt{W/4}} = \dfrac{Z_1}{\sqrt{W/4}}\), since \(X_1 \sim N(0,1)\) means \(Z_1 = X_1\), and \(W \sim \chi^2(4)\).

The t-construction requires the numerator and the denominator's chi-squared to be independent. Here \(Z_1^2\) is one of the four summands of \(W\), so \(Z_1\) and \(W\) are not independent, and \(T\) does not follow a t-distribution.

Alternative exam-sensible interpretation: if the question intended a genuine t-statistic, the numerator variable must be excluded from the denominator's chi-squared sum. For instance, \(T' = \dfrac{X_1}{\sqrt{(X_2^2/2 + X_3^2/3 + X_4^2/4)/3}} \sim t(3)\), because \(Z_1\) does not appear in the denominator sum.
Part (d): Expectation and Variance of W. Since \(W \sim \chi^2(4)\):
\(E[W] = 4\).
\(\text{Var}(W) = 2 \times 4 = 8\).

Alternatively, from linearity:
\(E[Z_i^2] = 1\) for each \(i\), so \(E[W] = 1 + 1 + 1 + 1 = 4\).
\(\text{Var}(Z_i^2) = 2\) for each \(i\), so \(\text{Var}(W) = 2 + 2 + 2 + 2 = 8\).
Summary of answers: (a) \(Y \sim \chi^2(3)\). (b) \(W \sim \chi^2(4)\). (c) T is NOT t-distributed because numerator and denominator are not independent (\(Z_1\) appears in both). (d) \(E[W] = 4\), \(\text{Var}(W) = 8\).  \(\checkmark\)
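Parts (b), (c), and (d) can be checked by simulation (a sketch, assuming numpy): the moments of \(W\) should match \(\chi^2(4)\), and the correlation between \(Z_1^2\) and \(W\) makes the non-independence in part (c) visible.

```python
import numpy as np

rng = np.random.default_rng(7)
reps = 200_000

# W = Z1^2 + Z2^2 + Z3^2 + Z4^2, built from standardised X_i
z = rng.standard_normal((reps, 4))
w = (z**2).sum(axis=1)

# Part (d): E[W] and Var(W) for chi-squared(4)
print(w.mean(), w.var())  # near 4 and 8

# Part (c): Z1 and W are NOT independent, since Z1^2 is a summand of W.
# Theoretical corr(Z1^2, W) = Cov(Z1^2, W)/sqrt(2*8) = 2/4 = 0.5.
corr = np.corrcoef(z[:, 0] ** 2, w)[0, 1]
print(corr)  # clearly positive, near 0.5
```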