
Statistics

This page reviews concepts from statistics that are useful for econometrics.

Related pages: Probability

§1. Sampling

Calibration

(Sample reweighting)

Methods

§2. Statistics

Statistics for random scalars

Let \(X^n = (X_i)_{i=1}^n\) denote an i.i.d. random sample of size \(n\).

Sample mean (an unbiased estimator of true mean):

\[\begin{equation} \overline{X^n} = \frac{1}{n}\sum_{i=1}^n X_i \end{equation}\]

Sample variance (an unbiased estimator of true variance):

\[\begin{equation} \widehat{Var}(X^n) = \frac{1}{n-1}\sum_{i=1}^n (X_i - \overline{X^n })^2 \end{equation}\]

Sample standard deviation: \begin{equation} \widehat{SD}(X^n) = \sqrt{\frac{1}{n-1}\sum_{i=1}^n (X_i - \overline{X^n })^2} \end{equation}
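A minimal sketch of these three estimators in Python (assuming NumPy; the simulated data are purely illustrative). Note that `ddof=1` selects the \(n-1\) divisor used above.

```python
# Minimal sketch of the scalar sample statistics above (illustrative data).
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=3.0, size=500)   # i.i.d. sample X^n

mean = x.mean()                                # sample mean
var = x.var(ddof=1)                            # sample variance, 1/(n-1) divisor
sd = x.std(ddof=1)                             # sample standard deviation

# The same variance written out explicitly:
n = x.size
var_manual = ((x - mean) ** 2).sum() / (n - 1)
assert np.isclose(var, var_manual)
```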

Sampling distribution of statistics

Sampling variance of the sample mean: \begin{align} \operatorname{Var}(\overline{X^n}) & = \operatorname{Var}( \frac{1}{n}\sum_{i=1}^{n}X_i ) \nonumber \\ & = \frac{1}{n^2} \operatorname{Var}(\sum_{i=1}^{n}X_i) \nonumber \\ & = \frac{1}{n^2} \sum_{i=1}^{n}\operatorname{Var}(X_i) \nonumber \\ & = \frac{n \sigma^2 }{n^2} \nonumber \\ & = \frac{ \sigma^2 }{n} \end{align}

where \(\sigma^2\) is the true variance of \(X_i\).

Standard error of the sample mean: \begin{equation} \operatorname{SE}(\overline{X^n}) = \frac{\sigma}{\sqrt{n}} \end{equation}
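A quick simulation sketch of these two results (assuming NumPy; the distribution, sample size, and replication count are illustrative): the variance of the sample mean across repeated samples should be close to \(\sigma^2/n\).

```python
# Simulation check that Var(sample mean) is close to sigma^2 / n.
import numpy as np

rng = np.random.default_rng(1)
sigma, n, reps = 3.0, 100, 20_000

# Draw `reps` independent samples of size n and record each sample mean.
means = rng.normal(loc=0.0, scale=sigma, size=(reps, n)).mean(axis=1)

print(means.var())          # simulated sampling variance of the mean
print(sigma**2 / n)         # theoretical value: 0.09
print(sigma / np.sqrt(n))   # standard error of the sample mean: 0.3
```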

Statistics for a weighted sample

Let \(X_w^n\) denote an i.i.d. random sample \((X_i)_{i=1}^n\) with analytical weights \((w_i)_{i=1}^n\).

Weighted sample mean: \begin{equation} \overline{X_w^n } = \frac{ \sum_{i=1}^n w_i X_i }{ \sum_{i=1}^n w_i } \end{equation}

Weighted sample variance: \begin{equation} \widehat{\operatorname{Var}}(X_w^n) = \frac{ \sum_{i=1}^n w_i (X_i-\overline{X_w^n })^2 }{ \sum_{i=1}^n w_i - \frac{\sum_{i=1}^n w_i^2 }{\sum_{i=1}^n w_i} } \end{equation}

Sampling variance of the weighted sample mean: \begin{align} \operatorname{Var}(\overline{X_w^n}) &= \operatorname{Var}(\frac{ \sum_{i=1}^n w_i X_i }{ \sum_{i=1}^n w_i }) \nonumber \\ &= \frac{1}{(\sum_{i=1}^n w_i)^2} \operatorname{Var}(\sum_{i=1}^{n}w_iX_i) \nonumber \\ &= \frac{1}{(\sum_{i=1}^n w_i)^2} \sum_{i=1}^{n} w_i^2 \operatorname{Var}(X_i) \nonumber \\ &= \frac{\sum_{i=1}^n w_i^2 }{(\sum_{i=1}^n w_i)^2} \sigma^2 \nonumber \\ &= \frac{ \sigma^2 }{n_\text{eff}} \end{align}

where the effective sample size \(n_\text{eff}\) is given by:

\[\begin{equation*} n_\text{eff} = \frac{(\sum_{i=1}^n w_i)^2}{\sum_{i=1}^n w_i^2 } \end{equation*}\]
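A sketch of the weighted-sample formulas above (assuming NumPy; the data and weights are illustrative). In the last line, the weighted sample variance is plugged in for \(\sigma^2\).

```python
# Sketch of weighted mean, weighted variance, and effective sample size.
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=200)
w = rng.uniform(0.5, 2.0, size=200)       # analytical weights (illustrative)

w_sum = w.sum()
mean_w = (w * x).sum() / w_sum            # weighted sample mean

# Weighted sample variance with the corrected denominator.
var_w = (w * (x - mean_w) ** 2).sum() / (w_sum - (w ** 2).sum() / w_sum)

# Effective sample size and sampling variance of the weighted mean.
n_eff = w_sum ** 2 / (w ** 2).sum()
var_of_mean_w = var_w / n_eff             # plug-in estimate of sigma^2 / n_eff
```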

Statistics for random vectors

Let \((X_i)_{i=1}^n\) be i.i.d. random vectors in \(\mathbb{R}^k\). Stack the observations row-wise into the data matrix

\[\begin{equation} X = \begin{pmatrix} X_1^\top \\ X_2^\top \\ \vdots \\ X_n^\top \end{pmatrix} \in \mathbb{R}^{n \times k}. \end{equation}\]

Sample variance-covariance matrix estimator (here with the \(1/n\) divisor), where \(\bar{X} = \frac{1}{n}\sum_{i=1}^n X_i\) denotes the sample mean vector:

\[\begin{equation} \hat{\Sigma} = \frac{1}{n} X^\top X - \bar{X} \bar{X}^\top = \frac{1}{n} \sum_{i=1}^n (X_i - \bar{X})(X_i - \bar{X})^\top. \end{equation}\]
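A sketch of the two equivalent forms of \(\hat{\Sigma}\) (assuming NumPy; `np.cov` with `bias=True` uses the same \(1/n\) divisor):

```python
# Sketch of the 1/n variance-covariance matrix estimator (illustrative data).
import numpy as np

rng = np.random.default_rng(3)
n, k = 500, 3
X = rng.normal(size=(n, k))               # data matrix, one observation per row

xbar = X.mean(axis=0)                     # sample mean vector, shape (k,)
Sigma_hat = X.T @ X / n - np.outer(xbar, xbar)

# Equivalent centered form, and agreement with np.cov using the 1/n divisor.
Sigma_hat_centered = (X - xbar).T @ (X - xbar) / n
assert np.allclose(Sigma_hat, Sigma_hat_centered)
assert np.allclose(Sigma_hat, np.cov(X, rowvar=False, bias=True))
```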

§3. Basic asymptotics

Convergence in probability

TBD

Law of Large Numbers (LLN)

Convergence in distribution

TBD

Central Limit Theorem (CLT)

§4. Parameter estimation

Unbiased estimators

TBD

Consistent estimators

TBD

Confidence intervals

§5. Hypothesis testing

  1. Define the null and alternative hypotheses.
  2. Select a test statistic \(T\).
  3. Derive the distribution of the test statistic \(T\) under the null hypothesis (e.g., a t distribution with known degrees of freedom; a normal distribution with known mean and variance).
  4. Select a significance level \(\alpha\) (e.g., 5%).
  5. Compute from the observations the observed value \(t\) of the test statistic \(T\).
  6. Decide to either reject the null hypothesis in favor of the alternative or not reject it.
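A minimal worked example of these steps (assuming NumPy and SciPy; the data and significance level are illustrative): a two-sided one-sample t test of the null hypothesis that the mean of \(X_i\) is \(0\).

```python
# Two-sided one-sample t test of H0: mean = 0 against H1: mean != 0.
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
x = rng.normal(loc=0.3, scale=1.0, size=50)   # illustrative observations

alpha = 0.05                                  # step 4: significance level
res = stats.ttest_1samp(x, popmean=0.0)       # steps 2-3 and 5: t statistic and p-value

# Under H0 the statistic follows a t distribution with n - 1 degrees of freedom.
reject = res.pvalue < alpha                   # step 6: reject H0 if p-value < alpha
print(res.statistic, res.pvalue, reject)
```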

Null and alternative hypotheses

\(H_0\) and \(H_1\)

Significance levels

TBD

P-value

Type I and type II errors

False positive and false negative.

§6. Order statistics

TBD