Why This Concept Exists
Conditional distributions and covariance are the two most frequently tested sub-topics within bivariate distributions. Once you can extract marginals (N1), the natural next step is to ask: given what I know about one variable, what does that tell me about the other? This is the essence of conditional probability extended to two dimensions.
Covariance, meanwhile, quantifies the direction of the linear relationship between X and Y. The formula \(\text{Cov}(X,Y) = E[XY] - E[X]E[Y]\) appears in virtually every bivariate exam question. Understanding what covariance means — not just computing it — is the difference between a distinction student and an average one.
Independence is the conceptual bridge: if X and Y are independent, conditioning on X tells you nothing about Y, and covariance equals zero. But the converse is a notorious trap — zero covariance does NOT imply independence.
Prerequisites
You need complete fluency with all N1 content:
- Joint PMFs and PDFs: You must be able to compute marginals without hesitation.
- Support regions: You must be able to identify and sketch the support of a joint distribution.
- Expectations: You must be able to compute \(E[X]\), \(E[Y]\), \(E[X^2]\) from marginals, and \(E[XY]\) from the joint distribution.
- Conditional probability (basic): From PTS1, \(P(A|B) = P(A \cap B) / P(B)\). This formula extends to the conditional PMF/PDF setting.
- Covariance definition: Familiarity with \(\text{Var}(X) = E[X^2] - (E[X])^2\) from PTS1.
Core Exposition
3.1 Conditional Distributions
The conditional distribution of Y given \(X = x\) tells us the probability distribution of Y restricted to the "slice" of the joint distribution where X takes the specific value x. This is the multivariate generalisation of \(P(A|B)\):
Discrete: \(p_{Y|X}(y|x) = \dfrac{p_{X,Y}(x,y)}{p_X(x)},\) defined wherever \(p_X(x) > 0\).
Continuous: \(f_{Y|X}(y|x) = \dfrac{f_{X,Y}(x,y)}{f_X(x)},\) defined for all y at each x with \(f_X(x) > 0\).
Key properties of a conditional PDF/PMF:
- Non-negative: \(f_{Y|X}(y|x) \geq 0\) for all y (since both numerator and denominator are non-negative).
- Normalises: \(\sum_y f_{Y|X}(y|x) = 1\) or \(\int f_{Y|X}(y|x)\,dy = 1\). This must hold for EACH fixed x. If you integrate the conditional and don't get 1, you've made an error.
- Conditional expectation: \(E[g(Y)|X=x] = \sum_y g(y)\,f_{Y|X}(y|x)\) or \(\int g(y)\,f_{Y|X}(y|x)\,dy\).
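The normalisation property can be sanity-checked numerically. The plain-Python sketch below uses the triangular density \(f_{X,Y}(x,y) = 2\) on \(0 \le x \le y \le 1\) from Example 2 for concreteness; the midpoint-rule grid size is an arbitrary accuracy choice:

```python
# Numerical sanity check: the conditional PDF integrates to 1 at a fixed x.
# Joint density (from Example 2 below): f(x, y) = 2 on the triangle 0 <= x <= y <= 1.

def f_joint(x, y):
    return 2.0 if 0.0 <= x <= y <= 1.0 else 0.0

def f_marginal_x(x, n=100_000):
    # f_X(x) = integral of f(x, y) dy over [0, 1], midpoint rule
    h = 1.0 / n
    return sum(f_joint(x, (i + 0.5) * h) for i in range(n)) * h

def conditional_mass(x, n=100_000):
    # integral of f_{Y|X}(y|x) = f(x, y) / f_X(x) dy -- must equal 1
    fx = f_marginal_x(x, n)
    h = 1.0 / n
    return sum(f_joint(x, (i + 0.5) * h) / fx for i in range(n)) * h

print(round(conditional_mass(0.3), 6))  # 1.0
```

If this integral does not come out as 1, the conditional density (or the marginal in the denominator) is wrong.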
3.2 Independence
Two random variables X and Y are independent if and only if the joint factorises as the product of the marginals:
\(f_{X,Y}(x,y) = f_X(x)\,f_Y(y)\) for all x and y (with the analogous statement for PMFs).
In practice, this requires two conditions to BOTH hold:
- Factorisation condition: The functional form splits: \(f_{X,Y}(x,y) = g(x)\cdot h(y)\).
- Support condition: The support is a Cartesian product: the set of allowed x-values does not depend on y, and vice versa.
Equivalently, X and Y are independent if and only if the conditional distribution equals the marginal: \(f_{Y|X}(y|x) = f_Y(y)\) for all x, y. This means "knowing X tells you nothing new about Y."
3.3 Covariance and Correlation
The covariance measures the degree to which X and Y vary together:
\(\text{Cov}(X,Y) = E\big[(X - E[X])(Y - E[Y])\big] = E[XY] - E[X]E[Y].\)
Interpretation:
- Positive covariance: X and Y tend to be large at the same time, and small at the same time.
- Negative covariance: When X is large, Y tends to be small, and vice versa.
- Zero covariance: No linear relationship. (But there might be a non-linear one!)
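To make the interpretation concrete, here is a minimal Python sketch computing \(\text{Cov}(X,Y) = E[XY] - E[X]E[Y]\) from a small joint PMF. The table itself is hypothetical (invented for illustration, not from any exam):

```python
# Covariance from a joint PMF: Cov(X, Y) = E[XY] - E[X] E[Y].
# Hypothetical joint table p(x, y) for x, y in {0, 1} (illustration only).
pmf = {(0, 0): 0.2, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.6}

E_X  = sum(x * p for (x, y), p in pmf.items())
E_Y  = sum(y * p for (x, y), p in pmf.items())
E_XY = sum(x * y * p for (x, y), p in pmf.items())  # from the JOINT, not the marginals

cov = E_XY - E_X * E_Y
print(round(cov, 4))  # 0.11 -> positive: X and Y tend to be large together
```

Here most of the mass sits on (1, 1) and (0, 0), so the covariance comes out positive, matching the first bullet above.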
The correlation coefficient standardises covariance to the range \([-1, 1]\):
\(\rho_{X,Y} = \dfrac{\text{Cov}(X,Y)}{\sigma_X\,\sigma_Y} = \dfrac{\text{Cov}(X,Y)}{\sqrt{\text{Var}(X)\,\text{Var}(Y)}}.\)
3.4 Key Properties
- If X and Y are independent, then \(\text{Cov}(X,Y) = 0\) and \(\rho_{X,Y} = 0\).
- The converse is FALSE: zero covariance does NOT imply independence. A counterexample is required to show this (see Example 3 below).
- \(\text{Var}(X + Y) = \text{Var}(X) + \text{Var}(Y) + 2\text{Cov}(X,Y)\). Under independence, this simplifies to \(\text{Var}(X) + \text{Var}(Y)\).
- \(\text{Cov}(aX + b, cY + d) = ac \cdot \text{Cov}(X,Y)\).
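The variance-sum and linearity properties can be verified exactly on a small joint PMF (again a hypothetical table, chosen only for illustration):

```python
# Verify Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y), and Cov(aX+b, cY+d) = ac Cov(X, Y),
# on a small joint PMF (hypothetical table, for illustration only).
pmf = {(0, 0): 0.2, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.6}

def E(g):
    # expectation of g(X, Y) under the joint PMF
    return sum(g(x, y) * p for (x, y), p in pmf.items())

var_X = E(lambda x, y: x * x) - E(lambda x, y: x) ** 2
var_Y = E(lambda x, y: y * y) - E(lambda x, y: y) ** 2
cov   = E(lambda x, y: x * y) - E(lambda x, y: x) * E(lambda x, y: y)

var_sum = E(lambda x, y: (x + y) ** 2) - E(lambda x, y: x + y) ** 2
assert abs(var_sum - (var_X + var_Y + 2 * cov)) < 1e-12

# linearity property with a = 2, b = 3, c = -1, d = 5
cov_lin = (E(lambda x, y: (2 * x + 3) * (-y + 5))
           - E(lambda x, y: 2 * x + 3) * E(lambda x, y: -y + 5))
assert abs(cov_lin - 2 * (-1) * cov) < 1e-12
print("both identities hold")
```

Note that the shifts b and d drop out entirely, and the negative scaling c = -1 flips the sign of the covariance.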
Worked Examples
Example 1: Conditional PMF (Discrete)
Given the joint PMF:
Find the conditional distribution of Y given \(X = 2\). The denominator is the marginal \(P(X = 2) = 0.50\), obtained by summing the joint probabilities along the \(X = 2\) row:
\(P(Y=1|X=2) = \dfrac{0.25}{0.50} = 0.50\)
\(P(Y=2|X=2) = \dfrac{0.10}{0.50} = 0.20\)
Check: the conditional probabilities sum to \(0.30 + 0.50 + 0.20 = 1.00\) \(\checkmark\) (the 0.30 is the remaining cell of the \(X = 2\) row: \(0.15/0.50\)).
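The same computation is easy to automate. The joint table below is hypothetical (the exam's table is not reproduced in these notes): its \(X = 2\) row was chosen to reproduce the conditional values above, and the remaining cells are purely illustrative:

```python
# Conditional PMF from a joint table: P(Y = y | X = x) = p(x, y) / P(X = x).
# Hypothetical joint PMF -- only the X = 2 row mirrors the worked example.
pmf = {(1, 0): 0.10, (1, 1): 0.20,
       (2, 0): 0.15, (2, 1): 0.25, (2, 2): 0.10,
       (3, 1): 0.20}

def conditional_Y_given_X(x):
    px = sum(p for (xx, y), p in pmf.items() if xx == x)  # marginal P(X = x)
    return {y: p / px for (xx, y), p in pmf.items() if xx == x}

cond = conditional_Y_given_X(2)
print({y: round(p, 2) for y, p in cond.items()})  # {0: 0.3, 1: 0.5, 2: 0.2}
assert abs(sum(cond.values()) - 1.0) < 1e-9       # normalisation check
```

The final assertion is the same check as above: the conditional probabilities for each fixed x must sum to 1.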
Example 2: Conditional PDF (Continuous)
The joint PDF is \(f_{X,Y}(x,y) = 2\) for \(0 \leq x \leq y \leq 1\) (triangular support). Find \(f_{Y|X}(y|x)\) and \(E[Y|X=x]\).
\(f_X(x) = \displaystyle\int_x^1 2\,dy = 2(1 - x),\) for \(0 \leq x \leq 1\).
So \(f_{Y|X}(y|x) = \dfrac{f_{X,Y}(x,y)}{f_X(x)} = \dfrac{2}{2(1-x)} = \dfrac{1}{1-x},\) for \(x \leq y \leq 1\). This is a uniform distribution on \((x, 1)\).
\(E[Y|X=x] = \dfrac{x+1}{2}\)
Verify by integration: \(\displaystyle\int_x^1 y \cdot \frac{1}{1-x}\,dy = \frac{1}{1-x}\cdot\frac{1-x^2}{2} = \frac{1+x}{2}\) \(\checkmark\)
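A quick numerical cross-check of the same result (plain-Python midpoint-rule integration; the grid size is an arbitrary accuracy choice):

```python
# Check E[Y | X = x] = (1 + x) / 2 for the uniform conditional
# f_{Y|X}(y|x) = 1 / (1 - x) on (x, 1).

def cond_mean(x, n=100_000):
    # midpoint rule for the integral of y * 1/(1 - x) over y in (x, 1)
    h = (1.0 - x) / n
    return sum((x + (i + 0.5) * h) / (1.0 - x) for i in range(n)) * h

for x in (0.0, 0.25, 0.5, 0.9):
    assert abs(cond_mean(x) - (1 + x) / 2) < 1e-9
print("E[Y|X=x] matches (1+x)/2 on the test grid")
```

As expected for a uniform distribution on \((x, 1)\), the conditional mean is simply the midpoint of the interval.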
Example 3: Zero Covariance But Dependent
Let X be uniform on \([-1, 1]\) and \(Y = X^2\). Then Y is completely determined by X, so they are clearly dependent.
\(E[XY] = E[X^3] = \displaystyle\int_{-1}^1 x^3\cdot\frac{1}{2}\,dx = 0\) (odd integrand).
\(\text{Cov}(X,Y) = E[XY] - E[X]E[Y] = 0 - 0 \cdot E[Y] = 0\).
Zero covariance, but Y = X² means they are perfectly dependent.
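This counterexample can also be checked numerically. The sketch below additionally demonstrates the dependence directly: independence would force \(E[X^2 \cdot Y] = E[X^2]E[Y]\), i.e. \(E[X^4] = (E[X^2])^2\), but \(\tfrac{1}{5} \neq \tfrac{1}{9}\):

```python
# X ~ Uniform(-1, 1), Y = X**2: covariance is zero, yet X and Y are dependent.

def E(g, n=200_000):
    # midpoint-rule expectation against the density 1/2 on (-1, 1)
    h = 2.0 / n
    return sum(g(-1.0 + (i + 0.5) * h) * 0.5 for i in range(n)) * h

cov = E(lambda x: x * x**2) - E(lambda x: x) * E(lambda x: x**2)
assert abs(cov) < 1e-9  # Cov(X, Y) = E[X^3] - E[X] E[X^2] = 0

# Independence would force E[X^2 * Y] = E[X^2] E[Y], i.e. E[X^4] = (E[X^2])^2.
# But E[X^4] = 1/5 while (E[X^2])^2 = 1/9, so X and Y are dependent.
assert abs(E(lambda x: x**4) - 1 / 5) < 1e-6
assert abs(E(lambda x: x**2) ** 2 - 1 / 9) < 1e-6
print("Cov = 0, but X and Y = X^2 are dependent")
```

The symmetry of the density makes every odd moment vanish, which is exactly why the covariance is zero despite the perfect functional relationship.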
Pattern Recognition & Examiner Traps
- "Find the conditional distribution of Y given \(X = x\)" — requires marginal of X as denominator.
- "Find \(E[Y|X=x]\)" — integrate y times the conditional PDF.
- "Show that X and Y are not independent" — find one pair where \(f_{XY} \neq f_X \cdot f_Y\), or note the non-Cartesian support.
- "Compute the covariance" — requires \(E[XY] - E[X]E[Y]\). Always compute \(E[XY]\) from the joint PDF, not from marginals.
Connections
- ← From N1: Joint and marginal distributions. Without these, conditional distributions cannot be computed.
- → To N3: Conditional distributions on non-rectangular supports require careful handling of the support region (N3's focus).
- → To N4 (Transformations): Understanding what joint distributions tell us about relationships between variables is essential for change-of-variable methods.
- → To N6-N7 (Sampling & Estimation): The concept of conditional expectation underpins the theory of unbiased estimators and minimum variance estimators.
- → To N9-N12 (Inference): Independence is fundamental to hypothesis testing. The null hypothesis often asserts independence, and we test this using observed data.
Summary Table
| Concept | Formula | Key Point | Common Error |
|---|---|---|---|
| Conditional PDF | \(f_{Y\mid X}(y\mid x) = \dfrac{f_{XY}(x,y)}{f_X(x)}\) | Must integrate to 1 for each x | Forgetting to divide by marginal |
| \(E[Y\mid X=x]\) | \(\int y\cdot f_{Y\mid X}(y\mid x)\,dy\) | Conditional on a specific x | Using joint instead of conditional PDF |
| Independence | \(f_{XY} = f_X \cdot f_Y\) | MUST check support too | Only checking factorisation |
| Covariance | \(E[XY] - E[X]E[Y]\) | Use joint PDF for E[XY] | Assuming E[XY] = E[X]E[Y] |
| Correlation | \(\rho = \dfrac{\text{Cov}}{\sigma_X\sigma_Y}\) | Always in [-1, 1] | Not checking if result is in range |
| Independence ⇒ Cov = 0 | Always true | One-directional! | Claiming Cov = 0 ⇒ independent |
| Var(X + Y) | \(\text{Var}(X) + \text{Var}(Y) + 2\text{Cov}\) | Reduces to sum if indep. | Forgetting the 2Cov term |
Self-Assessment
- Given a joint PMF table, compute \(P(Y=y|X=x)\) for specific values.
- Given a joint PDF, find \(f_{Y|X}(y|x)\) including the correct support for the conditional.
- Compute \(E[Y|X=x]\) for a specified x value or as a function of x.
- Prove that two variables are independent by checking both factorisation and support.
- Prove that two variables are dependent by finding a counterexample.
- Compute \(\text{Cov}(X,Y)\) and \(\rho_{XY}\) from a joint distribution.
- Prove that zero covariance does not imply independence using a counterexample.
- Compute \(\text{Var}(X + Y)\) using the covariance formula.
- If \(f_{X,Y}(x,y) = x+y\) on \([0,1]\times[0,1]\), find \(f_{Y|X}(y|x)\) and \(E[Y|X=0.5]\). [Answer: \(f_{Y|X}(y|0.5) = \dfrac{0.5+y}{0.5+0.5} = 0.5+y\), \(E[Y|X=0.5] = \int_0^1 y(0.5+y)\,dy = 0.5/2 + 1/3 = 7/12\)]
- If X and Y are independent with \(X \sim \text{Exp}(\lambda)\) and \(Y \sim \text{Exp}(\mu)\), what is \(f_{X,Y}(x,y)\)? What is \(P(X < Y)\)?
- Construct a counterexample showing that zero covariance does not imply independence.
- If \(\text{Cov}(X,Y) = 0\), are X and Y uncorrelated? Are they independent? Explain.
HLQ: Exam-Style Question with Worked Solution
Let \(X\) and \(Y\) have joint PDF \(f_{X,Y}(x,y) = c(x + 2y)\) for \(0 \leq x \leq 1\), \(0 \leq y \leq 1\), and 0 otherwise.
(a) Show that \(c = \frac{2}{3}\). (2 marks)
(b) Find the marginal PDFs of X and Y. (3 marks)
(c) Compute \(\text{Cov}(X,Y)\). (3 marks)
(d) Are X and Y independent? Justify your answer. (2 marks)
(e) Find \(f_{Y|X}(y|0.5)\). (1 mark)
Solution:
(a) Normalisation requires \(\displaystyle\int_0^1\int_0^1 c(x+2y)\,dy\,dx = 1\): \[c\int_0^1\left[xy + y^2\right]_{y=0}^{y=1}dx = c\int_0^1(x + 1)\,dx = c\left[\frac{x^2}{2} + x\right]_0^1 = \frac{3c}{2}.\] So \(\dfrac{3c}{2} = 1 \implies c = \dfrac{2}{3}\).
(b) Marginal of X:
\(f_X(x) = \displaystyle\int_0^1 \frac{2}{3}(x+2y)\,dy = \frac{2}{3}\left[xy + y^2\right]_0^1 = \frac{2}{3}(x + 1),\) for \(0 \leq x \leq 1\).
Marginal of Y:
\(f_Y(y) = \displaystyle\int_0^1 \frac{2}{3}(x+2y)\,dx = \frac{2}{3}\left[\frac{x^2}{2} + 2xy\right]_0^1 = \frac{2}{3}\left(\frac{1}{2} + 2y\right) = \frac{1}{3} + \frac{4}{3}y,\) for \(0 \leq y \leq 1\).
(c) \(E[X] = \displaystyle\int_0^1 x\cdot\frac{2}{3}(x+1)\,dx = \frac{2}{3}\left[\frac{x^3}{3} + \frac{x^2}{2}\right]_0^1 = \frac{2}{3}\cdot\frac{5}{6} = \frac{5}{9}\)
\(E[Y] = \displaystyle\int_0^1 y\left(\frac{1}{3} + \frac{4}{3}y\right)dy = \left[\frac{y^2}{6} + \frac{4y^3}{9}\right]_0^1 = \frac{1}{6} + \frac{4}{9} = \frac{11}{18}\)
\(E[XY] = \displaystyle\int_0^1\int_0^1 xy\cdot\frac{2}{3}(x+2y)\,dy\,dx = \frac{2}{3}\int_0^1\int_0^1(x^2y + 2xy^2)\,dy\,dx\)
\(= \frac{2}{3}\int_0^1\left[\frac{x^2y^2}{2} + \frac{2xy^3}{3}\right]_0^1\,dx = \frac{2}{3}\int_0^1\left(\frac{x^2}{2} + \frac{2x}{3}\right)dx\)
\(= \frac{2}{3}\left[\frac{x^3}{6} + \frac{x^2}{3}\right]_0^1 = \frac{2}{3}\left(\frac{1}{6} + \frac{1}{3}\right) = \frac{2}{3}\cdot\frac{1}{2} = \frac{1}{3}\)
\(\text{Cov}(X,Y) = \frac{1}{3} - \frac{5}{9}\cdot\frac{11}{18} = \frac{1}{3} - \frac{55}{162} = \frac{54-55}{162} = -\frac{1}{162}\)
(d) \(f_X(x)\,f_Y(y) = \frac{2}{3}(x+1)\cdot\frac{1}{3}(1+4y)\), which does NOT equal the joint PDF \(\frac{2}{3}(x+2y)\) (for instance, at \(x = y = 0\) the product is \(\frac{2}{9}\) but the joint is 0). Alternatively, \(\text{Cov}(X,Y) = -\frac{1}{162} \neq 0\) already rules out independence.
X and Y are NOT independent.
(e) \(f_{Y|X}(y|0.5) = \dfrac{f_{X,Y}(0.5,y)}{f_X(0.5)} = \dfrac{\frac{2}{3}(0.5 + 2y)}{\frac{2}{3}(1.5)} = \dfrac{1 + 4y}{3},\) for \(0 \leq y \leq 1\).
Check: \(\displaystyle\int_0^1 \frac{1+4y}{3}\,dy = \frac{1}{3}\left[y + 2y^2\right]_0^1 = \frac{1}{3}(1 + 2) = 1\) \(\checkmark\)
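As a final sanity check, the HLQ answers can be confirmed by numerical integration (plain-Python midpoint rule; the grid size is an arbitrary accuracy choice):

```python
# Cross-check the HLQ answers for f(x, y) = (2/3)(x + 2y) on the unit square.

def integrate2d(g, n=400):
    # midpoint rule on an n-by-n grid over [0, 1] x [0, 1]
    h = 1.0 / n
    return sum(g((i + 0.5) * h, (j + 0.5) * h)
               for i in range(n) for j in range(n)) * h * h

f = lambda x, y: (2.0 / 3.0) * (x + 2.0 * y)

assert abs(integrate2d(f) - 1.0) < 1e-6           # (a) c = 2/3 normalises
E_X  = integrate2d(lambda x, y: x * f(x, y))      # expect 5/9
E_Y  = integrate2d(lambda x, y: y * f(x, y))      # expect 11/18
E_XY = integrate2d(lambda x, y: x * y * f(x, y))  # expect 1/3
cov  = E_XY - E_X * E_Y                           # expect -1/162
print(round(162 * cov, 3))  # close to -1.0
```

The tiny negative covariance, \(-\frac{1}{162}\), is a good reminder that dependence can be strong even when the linear association is weak.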