Statistics
This page reviews concepts from statistics that are useful for econometrics.
Related pages: Probability
§1. Sampling¶
Calibration¶
Calibration (also called sample reweighting) adjusts sample weights so that the weighted totals of auxiliary variables match known population totals.
Methods (a raking sketch follows the list):
- Post-stratification
- Raking
- Entropy balancing
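A minimal raking (iterative proportional fitting) sketch with numpy; the `rake` helper and the sex/age margins below are hypothetical illustrations, not a reference implementation.

```python
# Raking: repeatedly rescale weights so that, for each calibration variable,
# the weighted category shares match the target population margins.
import numpy as np

def rake(categories, targets, max_iter=100, tol=1e-8):
    """categories: {variable: array of labels}; targets: {variable: {label: share}}.
    Assumes every target category appears in the sample. Returns weights summing to n."""
    n = len(next(iter(categories.values())))
    w = np.ones(n)
    for _ in range(max_iter):
        max_change = 0.0
        for var, labels in categories.items():
            labels = np.asarray(labels)
            total, factors = w.sum(), np.ones(n)
            for cat, share in targets[var].items():
                mask = labels == cat
                factors[mask] = share * total / w[mask].sum()
            w = w * factors
            max_change = max(max_change, np.abs(factors - 1.0).max())
        if max_change < tol:   # weighted margins match all targets
            break
    return w * n / w.sum()

# Hypothetical example: calibrate to 50/50 sex and 60/40 age-group margins.
sex = np.array(["m", "m", "m", "m", "f", "f"])
age = np.array(["young", "old", "young", "old", "young", "young"])
weights = rake({"sex": sex, "age": age},
               {"sex": {"m": 0.5, "f": 0.5},
                "age": {"young": 0.6, "old": 0.4}})
```

Post-stratification corresponds to a single pass over one fully cross-classified variable; entropy balancing instead solves a constrained optimization that matches the target moments while keeping the weights as close as possible to the base weights.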
§2. Statistics¶
Statistics for random scalars¶
Let \(X^n = (X_i)_{i=1}^n\) denote an i.i.d. random sample of size \(n\).
Sample mean (an unbiased estimator of the true mean): \begin{equation} \overline{X^n} = \frac{1}{n}\sum_{i=1}^n X_i \end{equation}
Sample variance (an unbiased estimator of the true variance): \begin{equation} \widehat{\operatorname{Var}}(X^n) = \frac{1}{n-1}\sum_{i=1}^n (X_i - \overline{X^n})^2 \end{equation}
Sample standard deviation: \begin{equation} \widehat{SD}(X^n) = \sqrt{\frac{1}{n-1}\sum_{i=1}^n (X_i - \overline{X^n })^2} \end{equation}
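A quick numpy check of these formulas on hypothetical data; `ddof=1` gives the \(n-1\) denominators used above.

```python
import numpy as np

x = np.array([2.1, 3.4, 1.9, 4.0, 2.8])   # hypothetical sample
n = len(x)

mean = x.mean()            # sample mean
var = x.var(ddof=1)        # unbiased sample variance
sd = x.std(ddof=1)         # sample standard deviation

# The explicit formulas agree with numpy's ddof=1 versions.
assert np.isclose(var, ((x - mean) ** 2).sum() / (n - 1))
assert np.isclose(sd, np.sqrt(var))
```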
Sampling distribution of statistics¶
Sampling variance of the sample mean: \begin{align} \operatorname{Var}(\overline{X^n}) & = \operatorname{Var}( \frac{1}{n}\sum_{i=1}^{n}X_i ) \nonumber \\ & = \frac{1}{n^2} \operatorname{Var}(\sum_{i=1}^{n}X_i) \nonumber \\ & = \frac{1}{n^2} \sum_{i=1}^{n}\operatorname{Var}(X_i) \nonumber \\ & = \frac{n \sigma^2 }{n^2} \nonumber \\ & = \frac{ \sigma^2 }{n} \end{align}
where \(\sigma^2\) is the true variance of \(X_i\).
Standard error of the sample mean: \begin{equation} \operatorname{SE}(\overline{X^n}) = \frac{\sigma}{\sqrt{n}} \end{equation}
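A small Monte Carlo sketch, with hypothetical parameters, illustrating that the spread of \(\overline{X^n}\) across repeated samples is close to \(\sigma/\sqrt{n}\).

```python
import numpy as np

rng = np.random.default_rng(0)
sigma, n, reps = 2.0, 50, 20_000   # hypothetical true SD, sample size, replications

# Draw `reps` independent samples of size n and compute each sample mean.
means = rng.normal(loc=1.0, scale=sigma, size=(reps, n)).mean(axis=1)

empirical_se = means.std(ddof=1)        # SD of the sample mean across replications
theoretical_se = sigma / np.sqrt(n)     # sigma / sqrt(n)
print(empirical_se, theoretical_se)     # the two values should be close
```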
Statistics for a weighted sample¶
Let \(X_w^n\) denote an i.i.d. random sample \((X_i)_{i=1}^n\) with analytical weights \((w_i)_{i=1}^n\).
Weighted sample mean: \begin{equation} \overline{X_w^n } = \frac{ \sum_{i=1}^n w_i X_i }{ \sum_{i=1}^n w_i } \end{equation}
Weighted sample variance: \begin{equation} \widehat{\operatorname{Var}}(X_w^n) = \frac{ \sum_{i=1}^n w_i (X_i-\overline{X_w^n })^2 }{ \sum_{i=1}^n w_i - \frac{\sum_{i=1}^n w_i^2 }{\sum_{i=1}^n w_i} } \end{equation}
Sampling variance of the weighted sample mean: \begin{align} \operatorname{Var}(\overline{X_w^n}) &= \operatorname{Var}(\frac{ \sum_{i=1}^n w_i X_i }{ \sum_{i=1}^n w_i }) \nonumber \\ &= \frac{1}{(\sum_{i=1}^n w_i)^2} \operatorname{Var}(\sum_{i=1}^{n}w_iX_i) \nonumber \\ &= \frac{1}{(\sum_{i=1}^n w_i)^2} \sum_{i=1}^{n} w_i^2 \operatorname{Var}(X_i) \nonumber \\ &= \frac{\sum_{i=1}^n w_i^2 }{(\sum_{i=1}^n w_i)^2} \sigma^2 \nonumber \\ &= \frac{ \sigma^2 }{n_\text{eff}} \end{align}
where the effective sample size \(n_\text{eff}\) is given by \begin{equation} n_\text{eff} = \frac{(\sum_{i=1}^n w_i)^2}{\sum_{i=1}^n w_i^2} \end{equation}
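A numpy sketch of the weighted-sample formulas on hypothetical data; the last line plugs the weighted sample variance in for \(\sigma^2\) to get an estimated standard error.

```python
import numpy as np

x = np.array([2.1, 3.4, 1.9, 4.0, 2.8])      # hypothetical sample
w = np.array([1.0, 0.5, 2.0, 1.5, 1.0])      # hypothetical analytical weights

wmean = np.average(x, weights=w)             # weighted sample mean

sw, sw2 = w.sum(), (w ** 2).sum()
wvar = (w * (x - wmean) ** 2).sum() / (sw - sw2 / sw)   # weighted sample variance

n_eff = sw ** 2 / sw2                        # effective sample size
se_wmean = np.sqrt(wvar / n_eff)             # estimated SE of the weighted mean
```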
Statistics for random vectors¶
Let \((X_i)_{i=1}^n\) be i.i.d. random vectors in \(\mathbb{R}^k\). Stack the observations row-wise into the data matrix \(\mathbf{X} = (X_1, \dots, X_n)^\top \in \mathbb{R}^{n \times k}\).
Sample variance-covariance matrix estimator (unbiased): \begin{equation} \widehat{\operatorname{Var}}(X^n) = \frac{1}{n-1}\sum_{i=1}^n (X_i - \overline{X^n})(X_i - \overline{X^n})^\top \end{equation} where \(\overline{X^n} = \frac{1}{n}\sum_{i=1}^n X_i\) is the sample mean vector.
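A numpy sketch on hypothetical data; `np.cov` with `rowvar=False` uses the same \(1/(n-1)\) estimator.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))    # hypothetical data: n = 100 draws of a k = 3 vector

xbar = X.mean(axis=0)                                 # sample mean vector
S = (X - xbar).T @ (X - xbar) / (X.shape[0] - 1)      # explicit 1/(n-1) formula

assert np.allclose(S, np.cov(X, rowvar=False))        # matches numpy's estimator
```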
§3. Basic asymptotics¶
Convergence in probability¶
TBD
Law of Large Numbers (LLN)¶
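For reference, the weak law of large numbers for the i.i.d. sample above, with \(\operatorname{E}[X_i] = \mu\): \begin{equation} \overline{X^n} \xrightarrow{p} \mu \quad \text{as } n \to \infty \end{equation}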
Convergence in distribution¶
TBD
Central Limit Theorem (CLT)¶
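For reference, the Lindeberg-Lévy central limit theorem for the i.i.d. sample above, with mean \(\mu\) and finite variance \(\sigma^2\): \begin{equation} \sqrt{n}\,(\overline{X^n} - \mu) \xrightarrow{d} \mathcal{N}(0, \sigma^2) \end{equation}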
§4. Parameter estimation¶
Unbiased estimators¶
TBD
Consistent estimators¶
TBD
Confidence intervals¶
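As a sketch, the standard large-sample \(1-\alpha\) confidence interval for the mean, based on the normal approximation and the estimated standard error: \begin{equation} \overline{X^n} \pm z_{1-\alpha/2}\,\frac{\widehat{SD}(X^n)}{\sqrt{n}} \end{equation} where \(z_{1-\alpha/2}\) is the standard normal quantile (e.g., \(z_{0.975} \approx 1.96\) for \(\alpha = 5\%\)).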
§5. Hypothesis testing¶
- Define a hypothesis.
- Select a test statistic \(T\).
- Derive the distribution of the test statistic \(T\) under the null hypothesis (e.g., a \(t\) distribution with known degrees of freedom; a normal distribution with known mean and variance).
- Select a significance level \(\alpha\) (e.g., 5%).
- Compute the observed value \(t\) of the test statistic \(T\) from the observations.
- Decide to either reject the null hypothesis in favor of the alternative or not reject it (a worked sketch follows this list).
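A worked one-sample \(t\)-test sketch following the steps above; the data, the null value \(\mu_0 = 3\), and \(\alpha = 0.05\) are hypothetical.

```python
import numpy as np
from scipy import stats

x = np.array([2.1, 3.4, 1.9, 4.0, 2.8, 3.1, 2.5])   # hypothetical observations
mu0, alpha = 3.0, 0.05                               # H0: mu = mu0, two-sided H1
n = len(x)

# Under H0, T = (sample mean - mu0) / (sample SD / sqrt(n)) follows t(n - 1).
t_obs = (x.mean() - mu0) / (x.std(ddof=1) / np.sqrt(n))

# Decision at level alpha: reject H0 if |t| exceeds the critical value,
# equivalently if the two-sided p-value is below alpha.
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 1)
p_value = 2 * stats.t.sf(abs(t_obs), df=n - 1)
reject = abs(t_obs) > t_crit

# Cross-check against scipy's built-in one-sample t-test.
res = stats.ttest_1samp(x, popmean=mu0)
assert np.isclose(res.statistic, t_obs) and np.isclose(res.pvalue, p_value)
```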
Null and alternative hypotheses¶
The null hypothesis \(H_0\) is the maintained claim being tested; the alternative hypothesis \(H_1\) is the claim favored when \(H_0\) is rejected.
Significance levels¶
TBD
P-value¶
Type I and type II errors¶
Type I error: rejecting \(H_0\) when it is true (a false positive). Type II error: failing to reject \(H_0\) when it is false (a false negative).
§6. Order statistics¶
TBD