PRESTIGE ED
N2: Conditional Distributions, Independence & Covariance
Node N2 — Section 1

Why This Concept Exists

Conditional distributions and covariance are the two most frequently tested sub-topics within bivariate distributions. Once you can extract marginals (N1), the natural next step is to ask: given what I know about one variable, what does that tell me about the other? This is the essence of conditional probability extended to two dimensions.

Covariance, meanwhile, quantifies the direction of the linear relationship between X and Y. The formula \(\text{Cov}(X,Y) = E[XY] - E[X]E[Y]\) appears in virtually every bivariate exam question. Understanding what covariance means — not just computing it — is the difference between a distinction student and an average one.

Leverage: N2 scores 9/10 on the leverage ranking. Conditional probability questions typically appear in parts (b)-(c) of Q1, earning 4-6 marks apiece. Covariance questions are worth 4-8 marks and are nearly guaranteed to appear.

Independence is the conceptual bridge: if X and Y are independent, conditioning on X tells you nothing about Y, and covariance equals zero. But the converse is a notorious trap — zero covariance does NOT imply independence.


Node N2 — Section 2

Prerequisites

You need complete fluency with all N1 content:

  • Joint PMFs and PDFs: You must be able to compute marginals without hesitation.
  • Support regions: You must be able to identify and sketch the support of a joint distribution.
  • Expectations: You must be able to compute \(E[X]\), \(E[Y]\), \(E[X^2]\) from marginals, and \(E[XY]\) from the joint distribution.
  • Conditional probability (basic): From PTS1, \(P(A|B) = P(A \cap B) / P(B)\). This formula extends to the conditional PMF/PDF setting.
  • Covariance definition: Familiarity with \(\text{Var}(X) = E[X^2] - (E[X])^2\) from PTS1.
If you're unsure about any N1 concept: Return to N1 first. This node builds directly on the marginal computation machinery established there. Attempting N2 without solid N1 foundations will lead to compounding errors.

Node N2 — Section 3

Core Exposition

3.1 Conditional Distributions

The conditional distribution of Y given \(X = x\) tells us the probability distribution of Y restricted to the "slice" of the joint distribution where X takes the specific value x. This is the multivariate generalisation of \(P(A|B)\):

Discrete: \(f_{Y|X}(y|x) = \dfrac{f_{X,Y}(x,y)}{f_X(x)},\) defined for all \(y\), provided \(f_X(x) > 0\).

Continuous: \(f_{Y|X}(y|x) = \dfrac{f_{X,Y}(x,y)}{f_X(x)},\) defined for all \(y\) in the support, provided \(f_X(x) > 0\).

Key properties of a conditional PDF/PMF:

  • Non-negative: \(f_{Y|X}(y|x) \geq 0\) for all y (since both numerator and denominator are non-negative).
  • Normalises: \(\sum_y f_{Y|X}(y|x) = 1\) or \(\int f_{Y|X}(y|x)\,dy = 1\). This must hold for EACH fixed x. If you integrate the conditional and don't get 1, you've made an error.
  • Conditional expectation: \(E[g(Y)|X=x] = \sum_y g(y)\,f_{Y|X}(y|x)\) or \(\int g(y)\,f_{Y|X}(y|x)\,dy\).
Key intuition: Conditioning on \(X = x\) "slices" the joint distribution at that x-value and re-normalises to make it a valid distribution. The conditional distribution tells you how Y behaves given the information that X equals a specific value.
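The slice-and-renormalise recipe can be sketched in a few lines of plain Python. The joint PMF below is a made-up illustrative example, not one from this node:

```python
# Joint PMF stored as {(x, y): probability} — hypothetical illustrative values.
joint = {
    (1, 0): 0.2, (1, 1): 0.3,
    (2, 0): 0.1, (2, 1): 0.4,
}

def conditional_Y_given_X(joint, x):
    """Slice the joint at X = x, then re-normalise by the marginal f_X(x)."""
    marginal_x = sum(p for (xi, _), p in joint.items() if xi == x)
    return {y: p / marginal_x for (xi, y), p in joint.items() if xi == x}

cond = conditional_Y_given_X(joint, 1)   # {0: 0.4, 1: 0.6}
# The bullet list above requires this: the conditional sums to 1 for each fixed x.
assert abs(sum(cond.values()) - 1.0) < 1e-12
```

Note that the re-normalisation constant is exactly the marginal \(f_X(x)\) — forgetting the division is Trap 1 in Section 5.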

3.2 Independence

Two random variables X and Y are independent if and only if:

\(f_{X,Y}(x,y) = f_X(x) \cdot f_Y(y)\) for ALL \((x,y)\) in the support.

This requires two conditions to BOTH hold:

  • Factorisation condition: The functional form splits: \(f_{X,Y}(x,y) = g(x)\cdot h(y)\).
  • Support condition: The support is a Cartesian product: the set of allowed x-values does not depend on y, and vice versa.
Exam trap: Students check condition 1 but forget condition 2. If the support is triangular (\(0 < x < y < 1\)), X and Y are dependent even if the functional form factorises. This is one of the most common distinction-level traps on the exam.

Equivalently, X and Y are independent if and only if the conditional distribution equals the marginal: \(f_{Y|X}(y|x) = f_Y(y)\) for all x, y. This means "knowing X tells you nothing new about Y."
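For a discrete joint PMF, the definition can be checked mechanically: iterate over the full Cartesian product of x- and y-values, with missing cells counting as probability 0 — which is exactly how a non-product support betrays dependence. A minimal sketch with hypothetical numbers:

```python
def is_independent(joint, tol=1e-12):
    """Check f(x,y) == fX(x) * fY(y) at every (x, y) in the product grid."""
    xs = sorted({x for x, _ in joint})
    ys = sorted({y for _, y in joint})
    fX = {x: sum(joint.get((x, y), 0.0) for y in ys) for x in xs}
    fY = {y: sum(joint.get((x, y), 0.0) for x in xs) for y in ys}
    return all(abs(joint.get((x, y), 0.0) - fX[x] * fY[y]) <= tol
               for x in xs for y in ys)

# Independent: the joint is the product of its marginals by construction.
indep = {(x, y): px * py
         for x, px in [(0, 0.4), (1, 0.6)]
         for y, py in [(0, 0.3), (1, 0.7)]}

# Dependent: triangular support — cell (0, 1) is missing, i.e. probability 0,
# so the support condition fails even before any factorisation check.
dep = {(0, 0): 0.5, (1, 0): 0.25, (1, 1): 0.25}
```

Running `is_independent` on `indep` returns True and on `dep` returns False: at the missing cell (0, 1) the product of marginals is positive while the joint is 0.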

3.3 Covariance and Correlation

The covariance measures the degree to which X and Y vary together:

\(\text{Cov}(X,Y) = E[(X - E[X])(Y - E[Y])] = E[XY] - E[X]\,E[Y]\)

Interpretation:

  • Positive covariance: X and Y tend to be large at the same time, and small at the same time.
  • Negative covariance: When X is large, Y tends to be small, and vice versa.
  • Zero covariance: No linear relationship. (But there might be a non-linear one!)

The correlation coefficient standardises covariance to the range \([-1, 1]\):

\(\rho_{X,Y} = \dfrac{\text{Cov}(X,Y)}{\sigma_X \,\sigma_Y}, \quad \text{where } \sigma_X = \sqrt{\text{Var}(X)}\)

3.4 Key Properties

  • If X and Y are independent, then \(\text{Cov}(X,Y) = 0\) and \(\rho_{X,Y} = 0\).
  • The converse is FALSE: zero covariance does NOT imply independence. Counterexample is needed.
  • \(\text{Var}(X + Y) = \text{Var}(X) + \text{Var}(Y) + 2\text{Cov}(X,Y)\). Under independence, this simplifies to \(\text{Var}(X) + \text{Var}(Y)\).
  • \(\text{Cov}(aX + b, cY + d) = ac \cdot \text{Cov}(X,Y)\).
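The variance-of-a-sum identity is easy to sanity-check on any small discrete joint PMF; the numbers below are hypothetical:

```python
# Hypothetical joint PMF on {0,1} x {0,1}.
joint = {(0, 0): 0.1, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.4}

def E(g):
    """Expectation of g(X, Y) under the joint PMF."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

EX, EY = E(lambda x, y: x), E(lambda x, y: y)
varX = E(lambda x, y: x**2) - EX**2
varY = E(lambda x, y: y**2) - EY**2
cov = E(lambda x, y: x * y) - EX * EY
var_sum = E(lambda x, y: (x + y)**2) - (EX + EY)**2

# Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y), up to floating-point noise.
assert abs(var_sum - (varX + varY + 2 * cov)) < 1e-12
```

Here the covariance is negative (−0.02), so \(\text{Var}(X+Y)\) is slightly smaller than \(\text{Var}(X)+\text{Var}(Y)\) — the 2Cov term matters.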
[INTERACTIVE: Covariance visualiser showing positive/negative/zero cases — will be added later]

Node N2 — Section 4

Worked Examples

Example 1: Conditional PMF (Discrete)

Given the joint PMF:

\[\begin{array}{c|ccc} & x=1 & x=2 & x=3 \\ \hline y=0 & 0.10 & 0.15 & 0.05 \\ y=1 & 0.15 & 0.25 & 0.10 \\ y=2 & 0.05 & 0.10 & 0.05 \end{array}\]

Find the conditional distribution of Y given \(X = 2\):

Step 1: Marginal of X at x = 2.
\(P(X=2) = 0.15 + 0.25 + 0.10 = 0.50\)
Step 2: Conditional probabilities.
\(P(Y=0|X=2) = \dfrac{0.15}{0.50} = 0.30\)
\(P(Y=1|X=2) = \dfrac{0.25}{0.50} = 0.50\)
\(P(Y=2|X=2) = \dfrac{0.10}{0.50} = 0.20\)
Check: \(0.30 + 0.50 + 0.20 = 1.00\) \(\checkmark\)
Step 3: Conditional expectation.
\(E[Y|X=2] = 0(0.30) + 1(0.50) + 2(0.20) = 0.90\)
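The three steps of Example 1 can be replayed in a few lines of Python to confirm the hand computation:

```python
# The joint PMF table from Example 1.
joint = {
    (1, 0): 0.10, (2, 0): 0.15, (3, 0): 0.05,
    (1, 1): 0.15, (2, 1): 0.25, (3, 1): 0.10,
    (1, 2): 0.05, (2, 2): 0.10, (3, 2): 0.05,
}

pX2 = sum(p for (x, _), p in joint.items() if x == 2)   # Step 1: P(X=2) = 0.50
cond = {y: joint[(2, y)] / pX2 for y in (0, 1, 2)}      # Step 2: 0.30, 0.50, 0.20
EY_given_X2 = sum(y * p for y, p in cond.items())       # Step 3: E[Y|X=2] = 0.90

assert abs(sum(cond.values()) - 1.0) < 1e-9             # normalisation check
```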

Example 2: Conditional PDF (Continuous)

The joint PDF is \(f_{X,Y}(x,y) = 2\) for \(0 \leq x \leq y \leq 1\) (triangular support). Find \(f_{Y|X}(y|x)\) and \(E[Y|X=x]\).

Step 1: Marginal of X. For a fixed x, y ranges from x to 1:
\(f_X(x) = \displaystyle\int_x^1 2\,dy = 2(1 - x),\) for \(0 \leq x \leq 1\).
Step 2: Conditional PDF of Y given X = x.
\(f_{Y|X}(y|x) = \dfrac{f_{X,Y}(x,y)}{f_X(x)} = \dfrac{2}{2(1-x)} = \dfrac{1}{1-x},\) for \(x < y < 1\).
This is a uniform distribution on \((x, 1)\).
Step 3: Conditional expectation. For \(Y|X=x \sim \text{Uniform}(x, 1)\):
\(E[Y|X=x] = \dfrac{x+1}{2}\)
Verify by integration: \(\displaystyle\int_x^1 y \cdot \frac{1}{1-x}\,dy = \frac{1}{1-x}\cdot\frac{1-x^2}{2} = \frac{1+x}{2}\) \(\checkmark\)
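A quick numeric check of Example 2 at \(x = 0.5\) (a pure-Python midpoint sum — a sanity check, not a rigorous verification): the conditional should be Uniform(0.5, 1) with mean 0.75.

```python
x = 0.5
n = 10_000
width = (1 - x) / n
ys = [x + (k + 0.5) * width for k in range(n)]   # midpoint grid on (x, 1)

f = 1 / (1 - x)                                  # conditional density, constant in y
total = sum(f * width for _ in ys)               # should be ~1 (normalisation)
mean = sum(y * f * width for y in ys)            # should be ~(1 + x)/2 = 0.75
```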

Example 3: Zero Covariance But Dependent

Let X be uniform on \([-1, 1]\) and \(Y = X^2\). Then Y is completely determined by X, so they are clearly dependent.

Compute the covariance:
\(E[X] = \displaystyle\int_{-1}^1 x\cdot\frac{1}{2}\,dx = 0\) (odd integrand on a symmetric interval).
\(E[XY] = E[X^3] = \displaystyle\int_{-1}^1 x^3\cdot\frac{1}{2}\,dx = 0\) (odd integrand).
\(\text{Cov}(X,Y) = E[XY] - E[X]E[Y] = 0 - 0 \cdot E[Y] = 0\).
Zero covariance, but Y = X² means they are perfectly dependent.
Exam warning: If asked "Is zero covariance enough to conclude independence?" the answer is always NO, and you should supply a counterexample — this one (X uniform on \([-1,1]\), \(Y = X^2\)) is the standard choice.
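The counterexample can also be checked numerically (midpoint sums over a symmetric grid; purely illustrative):

```python
n = 100_000
width = 2 / n
xs = [-1 + (k + 0.5) * width for k in range(n)]   # midpoints of [-1, 1], density 1/2

EX  = sum(x * 0.5 * width for x in xs)            # ~0 by symmetry
EY  = sum(x**2 * 0.5 * width for x in xs)         # E[X^2] = 1/3
EXY = sum(x**3 * 0.5 * width for x in xs)         # E[X*Y] = E[X^3] ~ 0 (odd)
cov = EXY - EX * EY                               # ~0: X and Y are uncorrelated...

# ...yet dependent: Y > 1/4 happens exactly when |X| > 1/2,
# so P(Y > 1/4 | X > 1/2) = 1 while P(Y > 1/4) = 1/2.
```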

Node N2 — Section 5

Pattern Recognition & Examiner Traps

Trap 1: Forgetting to compute the marginal first. The conditional formula requires the marginal in the denominator. Students sometimes write down \(f_{Y|X}(y|x) = f_{X,Y}(x,y)\) without dividing by \(f_X(x)\), forgetting the normalisation step.
WRONG: \(f_{Y|X}(y|x) = 6xy\) (just the joint PDF)
RIGHT: \(f_{Y|X}(y|x) = \dfrac{6xy}{3x^2} = \dfrac{2y}{x}\) (after dividing by the marginal \(f_X(x) = 3x^2\))
Trap 2: Declaring independence from zero covariance. The examiner loves asking "Are X and Y independent?" after you've computed covariance = 0. The correct response: zero covariance is necessary but not sufficient for independence. You must check the factorisation and support conditions separately.
Trap 3: Wrong support in the conditional distribution. When conditioning on \(X = x\), the support of Y may depend on x. For a triangular support \(0 < x < y < 1\), conditioning on \(X = x\) means Y is supported on \((x, 1)\), not \((0, 1)\).
Examiner patterns to recognise:
  • "Find the conditional distribution of Y given \(X = x\)" — requires marginal of X as denominator.
  • "Find \(E[Y|X=x]\)" — integrate y times the conditional PDF.
  • "Show that X and Y are not independent" — find one pair where \(f_{XY} \neq f_X \cdot f_Y\), or note the non-Cartesian support.
  • "Compute the covariance" — requires \(E[XY] - E[X]E[Y]\). Always compute \(E[XY]\) from the joint PDF, not from marginals.

Node N2 — Section 6

Connections

Where N2 fits in the PTS2 architecture:
  • ← From N1: Joint and marginal distributions. Without these, conditional distributions cannot be computed.
  • → To N3: Conditional distributions on non-rectangular supports require careful handling of the support region (N3's focus).
  • → To N4 (Transformations): Understanding what joint distributions tell us about relationships between variables is essential for change-of-variable methods.
  • → To N6-N7 (Sampling & Estimation): The concept of conditional expectation underpins the theory of unbiased estimators and minimum variance estimators.
  • → To N9-N12 (Inference): Independence is fundamental to hypothesis testing. The null hypothesis often asserts independence, and we test this using observed data.

Node N2 — Section 7

Summary Table

Concept | Formula | Key Point | Common Error
Conditional PDF | \(f_{Y|X}(y|x) = \dfrac{f_{XY}(x,y)}{f_X(x)}\) | Must integrate to 1 for each x | Forgetting to divide by marginal
\(E[Y|X=x]\) | \(\int y\cdot f_{Y|X}(y|x)\,dy\) | Conditional on a specific x | Using joint instead of conditional PDF
Independence | \(f_{XY} = f_X \cdot f_Y\) | MUST check support too | Only checking factorisation
Covariance | \(E[XY] - E[X]E[Y]\) | Use joint PDF for \(E[XY]\) | Assuming \(E[XY] = E[X]E[Y]\)
Correlation | \(\rho = \dfrac{\text{Cov}}{\sigma_X\sigma_Y}\) | Always in \([-1, 1]\) | Not checking if result is in range
Independence ⇒ Cov = 0 | Always true | One-directional! | Claiming Cov = 0 ⇒ independent
Var(X + Y) | \(\text{Var}(X) + \text{Var}(Y) + 2\text{Cov}\) | Reduces to sum if indep. | Forgetting the 2Cov term

Node N2 — Section 8

Self-Assessment

Checklist — Can you do all of these?
  • Given a joint PMF table, compute \(P(Y=y|X=x)\) for specific values.
  • Given a joint PDF, find \(f_{Y|X}(y|x)\) including the correct support for the conditional.
  • Compute \(E[Y|X=x]\) for a specified x value or as a function of x.
  • Prove that two variables are independent by checking both factorisation and support.
  • Prove that two variables are dependent by finding a counterexample.
  • Compute \(\text{Cov}(X,Y)\) and \(\rho_{XY}\) from a joint distribution.
  • Prove that zero covariance does not imply independence using a counterexample.
  • Compute \(\text{Var}(X + Y)\) using the covariance formula.
Practice problems to attempt independently
  • If \(f_{X,Y}(x,y) = x+y\) on \([0,1]\times[0,1]\), find \(f_{Y|X}(y|x)\) and \(E[Y|X=0.5]\). [Answer: \(f_{Y|X}(y|0.5) = \dfrac{0.5+y}{0.5+0.5} = 0.5+y\), \(E[Y|X=0.5] = \int_0^1 y(0.5+y)\,dy = 0.5/2 + 1/3 = 7/12\)]
  • If X and Y are independent with \(X \sim \text{Exp}(\lambda)\) and \(Y \sim \text{Exp}(\mu)\), what is \(f_{X,Y}(x,y)\)? What is \(P(X < Y)\)?
  • Construct a counterexample showing that zero covariance does not imply independence.
  • If \(\text{Cov}(X,Y) = 0\), are X and Y uncorrelated? Are they independent? Explain.
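For the second practice problem, a Monte Carlo run lets you check your closed-form answer for \(P(X < Y)\) without it being printed here. The rates \(\lambda = 1\), \(\mu = 2\) are arbitrary choices for illustration; `random.expovariate` takes the rate parameter directly.

```python
import random

random.seed(0)                    # reproducible run
lam, mu = 1.0, 2.0                # arbitrary illustrative rates
n = 200_000

# Independence lets us draw X and Y separately.
hits = sum(random.expovariate(lam) < random.expovariate(mu) for _ in range(n))
estimate = hits / n               # compare against your formula in lam and mu
```

With 200,000 draws the standard error is around 0.001, so your closed-form value should agree to about two decimal places.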

High-Leverage Questions

HLQ: Exam-Style Question with Worked Solution

11 MARKS FINAL 2023 Q1(c-d) DISTINCTION

Let \(X\) and \(Y\) have joint PDF:

\[f_{X,Y}(x,y) = c(x + 2y) \quad \text{for } 0 \leq x \leq 1,\ 0 \leq y \leq 1\]

(a) Show that \(c = \frac{2}{3}\). (2 marks)

(b) Find the marginal PDFs of X and Y. (3 marks)

(c) Compute \(\text{Cov}(X,Y)\). (3 marks)

(d) Are X and Y independent? Justify your answer. (2 marks)

(e) Find \(f_{Y|X}(y|0.5)\). (1 mark)


Part (a): Finding c
\[\int_0^1\int_0^1 c(x+2y)\,dy\,dx = c\int_0^1\left[xy + y^2\right]_{y=0}^{y=1}dx = c\int_0^1(x+1)\,dx = c\left[\frac{x^2}{2} + x\right]_0^1 = \frac{3c}{2}\]
Set equal to 1: \(\dfrac{3c}{2} = 1 \implies c = \dfrac{2}{3}\).
Part (b): Marginals
Marginal of X:
\(f_X(x) = \displaystyle\int_0^1 \frac{2}{3}(x+2y)\,dy = \frac{2}{3}\left[xy + y^2\right]_0^1 = \frac{2}{3}(x + 1),\) for \(0 \leq x \leq 1\).

Marginal of Y:
\(f_Y(y) = \displaystyle\int_0^1 \frac{2}{3}(x+2y)\,dx = \frac{2}{3}\left[\frac{x^2}{2} + 2xy\right]_0^1 = \frac{2}{3}\left(\frac{1}{2} + 2y\right) = \frac{1}{3} + \frac{4}{3}y,\) for \(0 \leq y \leq 1\).
Part (c): Covariance
\(E[X] = \displaystyle\int_0^1 x\cdot\frac{2}{3}(x+1)\,dx = \frac{2}{3}\left[\frac{x^3}{3} + \frac{x^2}{2}\right]_0^1 = \frac{2}{3}\left(\frac{1}{3} + \frac{1}{2}\right) = \frac{2}{3}\cdot\frac{5}{6} = \frac{5}{9}\)
\(E[Y] = \displaystyle\int_0^1 y\left(\frac{1}{3} + \frac{4}{3}y\right)dy = \left[\frac{y^2}{6} + \frac{4y^3}{9}\right]_0^1 = \frac{1}{6} + \frac{4}{9} = \frac{11}{18}\)
\(E[XY] = \displaystyle\int_0^1\int_0^1 xy\cdot\frac{2}{3}(x+2y)\,dy\,dx = \frac{2}{3}\int_0^1\int_0^1(x^2y + 2xy^2)\,dy\,dx\)
\(= \frac{2}{3}\int_0^1\left[\frac{x^2y^2}{2} + \frac{2xy^3}{3}\right]_0^1\,dx = \frac{2}{3}\int_0^1\left(\frac{x^2}{2} + \frac{2x}{3}\right)dx\)
\(= \frac{2}{3}\left[\frac{x^3}{6} + \frac{x^2}{3}\right]_0^1 = \frac{2}{3}\left(\frac{1}{6} + \frac{1}{3}\right) = \frac{2}{3}\cdot\frac{1}{2} = \frac{1}{3}\)

\(\text{Cov}(X,Y) = \frac{1}{3} - \frac{5}{9}\cdot\frac{11}{18} = \frac{1}{3} - \frac{55}{162} = \frac{54-55}{162} = -\frac{1}{162}\)
Part (d): Independence
Test: \(f_X(x) \cdot f_Y(y) = \frac{2}{3}(x+1) \cdot \left(\frac{1}{3} + \frac{4}{3}y\right) = \frac{2}{9}(x+1)(1+4y)\)
This does NOT equal \(\frac{2}{3}(x+2y)\) (the joint PDF).
X and Y are NOT independent.
Part (e): Conditional PDF
\(f_{Y|X}(y|0.5) = \dfrac{f_{X,Y}(0.5,y)}{f_X(0.5)} = \dfrac{\frac{2}{3}(0.5+2y)}{\frac{2}{3}(1.5)} = \dfrac{0.5+2y}{1.5} = \dfrac{1+4y}{3},\) for \(0 \leq y \leq 1\).
Check: \(\displaystyle\int_0^1 \frac{1+4y}{3}\,dy = \frac{1}{3}\left[y + 2y^2\right]_0^1 = \frac{3}{3} = 1\) \(\checkmark\)
Summary: (a) \(c = 2/3\). (b) \(f_X(x) = \frac{2}{3}(x+1)\), \(f_Y(y) = \frac{1}{3} + \frac{4}{3}y\). (c) \(\text{Cov}(X,Y) = -1/162\). (d) Not independent. (e) \(f_{Y|X}(y|0.5) = \frac{1+4y}{3}\) on [0,1].
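All five answers can be cross-checked numerically with a 2-D midpoint sum over the unit square — a sanity check on the arithmetic, not a substitute for the exact working:

```python
c = 2 / 3
n = 400
w = 1 / n
pts = [(i + 0.5) * w for i in range(n)]          # midpoints of [0, 1]

def E(g):
    """E[g(X,Y)] under f(x,y) = c(x + 2y) on the unit square."""
    return sum(g(x, y) * c * (x + 2 * y) * w * w for x in pts for y in pts)

total = E(lambda x, y: 1.0)                      # ~1, so c = 2/3 normalises f
EX, EY = E(lambda x, y: x), E(lambda x, y: y)    # ~5/9 and ~11/18
cov = E(lambda x, y: x * y) - EX * EY            # ~ -1/162, matching part (c)
```

The tiny negative covariance (about −0.0062) is consistent with part (d): the variables are dependent, but only weakly correlated.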