| Symbol | Meaning |
|---|---|
| μ | Population Mean |
| x̄ | Sample Mean |
| σ | Population Std Dev |
| s | Sample Std Dev |
| α | Alpha (Type I Error Rate) |
| β | Beta (Type II Error Rate) |
| χ² | Chi-Squared Statistic |
| F | F-Statistic (ANOVA/Reg) |
| r | Pearson Correlation |
| R² | Variance Explained |
| η² | Eta Squared (Effect) |
| d | Cohen's d (Effect) |
| n | Sample Size |
| df | Degrees of Freedom |
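As sample size increases, the sampling distribution of the mean approaches a normal distribution, regardless of the population's shape (provided the population variance is finite). n ≥ 30 is a common rule of thumb, not a guarantee: heavily skewed populations need larger samples.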
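A quick simulation can make this concrete. The sketch below assumes an Exponential(1) population (right-skewed, mean 1, SD 1) chosen purely for illustration, and checks that the means of many samples of size 30 cluster around 1 with a spread near σ/√n.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

def sample_mean(n):
    """Mean of n draws from an Exponential(1) population (skewed, mean 1, SD 1)."""
    return statistics.fmean(rng.expovariate(1.0) for _ in range(n))

n = 30
means = [sample_mean(n) for _ in range(2000)]

center = statistics.fmean(means)   # should sit near the population mean, 1
spread = statistics.stdev(means)   # should sit near sigma / sqrt(n)
expected_spread = 1 / n ** 0.5

print(f"mean of sample means: {center:.3f}")
print(f"SD of sample means:   {spread:.3f} (theory: {expected_spread:.3f})")
```

Plotting a histogram of `means` would show a roughly bell-shaped curve even though the individual draws are strongly skewed.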
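A range of plausible values for the true parameter. "X% confidence" means that X% of intervals built this way from repeated samples would contain the true parameter.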
| Level | Z-Crit | Interpretation |
|---|---|---|
| 90% | 1.645 | Narrow, higher error |
| 95% | 1.960 | Standard baseline |
| 99% | 2.576 | Wide, conservative |
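The critical values in the table can be reproduced from the standard normal distribution. This is a minimal sketch using only the standard library; `mean_ci` is a hypothetical helper and the `scores` list is made-up data. It uses the normal approximation, so for small samples a t critical value would be the more careful choice.

```python
from statistics import NormalDist, fmean, stdev

def z_critical(level):
    """Two-sided critical value: the z that leaves (1 - level) / 2 in each tail."""
    return NormalDist().inv_cdf(1 - (1 - level) / 2)

def mean_ci(data, level=0.95):
    """Normal-approximation confidence interval for the mean (large-sample sketch)."""
    m = fmean(data)
    se = stdev(data) / len(data) ** 0.5   # standard error of the mean
    half_width = z_critical(level) * se
    return m - half_width, m + half_width

for level in (0.90, 0.95, 0.99):
    print(level, round(z_critical(level), 3))

scores = [72, 75, 78, 80, 81, 83, 85, 88, 90, 94]  # made-up illustration data
low, high = mean_ci(scores, 0.95)
print(f"95% CI for the mean: ({low:.1f}, {high:.1f})")
```

Raising the confidence level widens the interval because the critical value grows while the standard error stays the same.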
| Decision | H0 True | H0 False |
|---|---|---|
| Reject H0 | Type I Error (α) | True Positive (Power = 1 − β) |
| Fail to Reject H0 | True Negative (1 − α) | Type II Error (β) |
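The four cells of the decision table can be estimated by simulation. The sketch below assumes a two-sided one-sample z-test of H0: μ = 0 with known σ = 1 and n = 25, and a hypothetical true effect of 0.5 for the "H0 false" case; all of these numbers are illustrative choices.

```python
import random
from statistics import NormalDist, fmean

rng = random.Random(1)  # fixed seed so the run is repeatable
ALPHA = 0.05
Z_CRIT = NormalDist().inv_cdf(1 - ALPHA / 2)

def rejects_h0(true_mean, n=25, sigma=1.0):
    """Run one z-test of H0: mu = 0 on a fresh sample; True means H0 was rejected."""
    sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
    z = fmean(sample) / (sigma / n ** 0.5)
    return abs(z) > Z_CRIT

def rejection_rate(true_mean, trials=4000):
    """Share of simulated studies in which H0 was rejected."""
    return sum(rejects_h0(true_mean) for _ in range(trials)) / trials

type_i_rate = rejection_rate(0.0)   # H0 true: every rejection is a Type I error
power = rejection_rate(0.5)         # H0 false: rejections are correct decisions
type_ii_rate = 1 - power            # H0 false but not rejected

print(f"Type I rate ~ {type_i_rate:.3f}, power ~ {power:.3f}, Type II rate ~ {type_ii_rate:.3f}")
```

The Type I rate should land close to α, while power depends on the effect size, the sample size, and α together.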
Most software prints a cryptic block. Here is how to decode the matrix.
STANDARD RULE:
MASTER CUSTOM RULE:
Parametric tests assume approximately normal data (or residuals) and, for most group comparisons, equal variances. If an assumption fails, use a variance-robust variant or the rank-based non-parametric equivalent.
| Wanted Test | Broken Assumption | Use Instead |
|---|---|---|
| Indep. t-Test | Unequal Variance | Welch t-Test |
| Indep. t-Test | Non-Normal (Outliers) | Mann-Whitney U |
| Paired t-Test | Non-Normal | Wilcoxon Signed Rank |
| ANOVA | Non-Normal | Kruskal-Wallis |
| Pearson (r) | Non-Normal / Non-Linear | Spearman (ρ) |
H0: μ1 = μ2.
Compares means of 2 independent groups. Welch's version does not assume equal variances, which makes it the safer default.
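To show what the unequal-variance comparison computes, here is a minimal sketch of the Welch t statistic and its Welch-Satterthwaite degrees of freedom. `welch_t` is a hypothetical helper and both groups are made-up data. It stops short of a p-value, which needs the t distribution; in practice `scipy.stats.ttest_ind(a, b, equal_var=False)` returns both the statistic and the p-value.

```python
from statistics import fmean, variance

def welch_t(a, b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom) for two samples."""
    va, vb = variance(a) / len(a), variance(b) / len(b)  # squared standard errors
    t = (fmean(a) - fmean(b)) / (va + vb) ** 0.5
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

group_a = [5.1, 4.9, 5.6, 5.8, 6.0, 5.3]        # made-up measurements
group_b = [4.2, 4.8, 4.4, 4.1, 4.9, 4.5, 4.3]   # made-up measurements

t, df = welch_t(group_a, group_b)
print(f"t = {t:.2f}, df = {df:.1f}")
```

The degrees of freedom are usually fractional and fall between the smaller group's n − 1 and the pooled n1 + n2 − 2.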
H0: μ_diff = 0.
Compares means from same group at different times (Before vs After).
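A paired test reduces to a one-sample test on the per-subject differences. The sketch below assumes made-up before/after scores for six subjects; `paired_t` is a hypothetical helper. `scipy.stats.ttest_rel` performs the same test and also reports a p-value.

```python
from statistics import fmean, stdev

def paired_t(before, after):
    """Return (t statistic, degrees of freedom) for paired measurements."""
    diffs = [a - b for a, b in zip(after, before)]   # one difference per subject
    n = len(diffs)
    t = fmean(diffs) / (stdev(diffs) / n ** 0.5)     # mean difference over its standard error
    return t, n - 1

before = [82, 75, 90, 68, 77, 85]   # made-up scores
after = [85, 79, 91, 72, 80, 86]    # same subjects, measured later

t, df = paired_t(before, after)
print(f"t = {t:.3f}, df = {df}")
```

Because each subject serves as their own control, between-subject variation drops out of the standard error.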
H0: μ1 = μ2 = μ3...
Compares means of 3+ groups. If p < 0.05, at least one group mean differs, but the test does not say which one, so follow up with a post-hoc test.
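The F-statistic is the ratio of between-group to within-group variance. This is a minimal sketch with tiny made-up groups so the arithmetic can be checked by hand; `one_way_anova_f` is a hypothetical helper, and `scipy.stats.f_oneway` is the usual library route when a p-value is needed.

```python
from statistics import fmean

def one_way_anova_f(*groups):
    """Return (F statistic, df between, df within) for a one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = fmean(all_values)
    k, n_total = len(groups), len(all_values)
    # Variation of group means around the grand mean, weighted by group size
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean
    ss_within = sum((x - fmean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

f_stat, df_b, df_w = one_way_anova_f([1, 2, 3], [2, 3, 4], [5, 6, 7])
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

Here the between-group sum of squares is 26 and the within-group sum of squares is 6, giving F = (26/2) / (6/6) = 13.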
H0: Variables are independent.
Tests association between 2 categorical variables (e.g. Pet Type x City).
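The chi-squared statistic compares observed counts with the counts expected if the two variables were independent. The sketch below uses a made-up 2x2 table; `chi_square_independence` is a hypothetical helper. Note that `scipy.stats.chi2_contingency` applies Yates' continuity correction to 2x2 tables by default, so its statistic would be smaller than this uncorrected value.

```python
def chi_square_independence(table):
    """Return (chi-squared statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total  # count under independence
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

counts = [[30, 10],   # made-up counts: rows are one category, columns the other
          [20, 40]]
chi2, df = chi_square_independence(counts)
print(f"chi2 = {chi2:.2f}, df = {df}")
```

With these counts the expected cells are 20, 20, 30, 30, which gives χ² = 5 + 5 + 3.33 + 3.33 ≈ 16.67 on 1 degree of freedom.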
H0: r = 0.
Measures the strength and direction of the linear relationship between 2 continuous variables (-1 to 1).
H0: Distributions equal.
Non-parametric alternative to independent t-test (compares ranks, not means).
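Because the test works on order rather than magnitude, the U statistic can be computed by counting pairwise "wins". This is a minimal sketch with made-up data that includes one extreme outlier; `mann_whitney_u` is a hypothetical helper, and `scipy.stats.mannwhitneyu` is the library route when a p-value is needed.

```python
def mann_whitney_u(a, b):
    """Return (U for a, U for b): pairwise wins, with ties counted as half a win."""
    u_a = sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)
    u_b = len(a) * len(b) - u_a   # the two U values always sum to n_a * n_b
    return u_a, u_b

group_a = [1.1, 2.3, 2.9, 50.0]        # made-up data with one extreme outlier
group_b = [3.0, 3.4, 4.1, 4.8, 5.2]    # made-up data

u_a, u_b = mann_whitney_u(group_a, group_b)
print(u_a, u_b)
```

The outlier of 50.0 would pull group A's mean far above group B's, yet it only contributes 5 wins to U, the same as any value above 5.2 would.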
H0: Median diff = 0.
Non-parametric alternative to paired t-test.
H0: Medians equal.
Non-parametric alternative to ANOVA (3+ groups).
y = β0 + β1x1 + β2x2 + ... + ε
Models the relationship between predictors (x) and a continuous outcome (y).
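For a single predictor, the least-squares coefficients have a closed form. The sketch below fits y = β0 + β1x to made-up study-hours data and reports R²; `fit_line` is a hypothetical helper. Multiple predictors would need a matrix solver or a modelling library.

```python
from statistics import fmean

def fit_line(x, y):
    """Ordinary least squares for one predictor. Returns (b0, b1, r_squared)."""
    x_bar, y_bar = fmean(x), fmean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx              # slope: change in y per unit of x
    b0 = y_bar - b1 * x_bar     # intercept: predicted y at x = 0
    ss_res = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return b0, b1, 1 - ss_res / ss_tot

hours = [1, 2, 3, 4, 5]          # made-up predictor
score = [52, 55, 61, 64, 68]     # made-up outcome

b0, b1, r2 = fit_line(hours, score)
print(f"score = {b0:.1f} + {b1:.1f} * hours, R^2 = {r2:.3f}")
```

The residuals (observed minus predicted) play the role of ε in the equation above.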
Predicts the probability of a binary outcome (0/1, Yes/No).
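A logistic model passes a linear predictor through the sigmoid function to get a probability. The sketch below uses hypothetical coefficients that were not fitted to any data; it only illustrates how log-odds map to a probability and how a coefficient converts to an odds ratio.

```python
import math

def predicted_probability(x, b0, b1):
    """P(y = 1 | x) under a one-predictor logistic model."""
    log_odds = b0 + b1 * x
    return 1 / (1 + math.exp(-log_odds))   # sigmoid maps log-odds to (0, 1)

B0, B1 = -4.0, 0.08   # hypothetical coefficients, not fitted to any data

p_at_50 = predicted_probability(50, B0, B1)   # log-odds = 0, so probability = 0.5
p_at_80 = predicted_probability(80, B0, B1)
odds_ratio = math.exp(B1)   # multiplicative change in odds per one-unit increase in x

print(f"P(x=50) = {p_at_50:.2f}, P(x=80) = {p_at_80:.2f}, odds ratio = {odds_ratio:.3f}")
```

Coefficients are additive on the log-odds scale, which is why they are usually reported as odds ratios.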
STATISTICAL SIGNIFICANCE ≠ PRACTICAL SIGNIFICANCE
A small p-value only suggests the effect is unlikely to be zero; it says nothing about how large the effect is. Effect size tells you whether you should care.
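Cohen's d expresses a mean difference in units of the pooled standard deviation. This is a minimal sketch for two independent groups with made-up data; `cohens_d` is a hypothetical helper.

```python
from statistics import fmean, variance

def cohens_d(a, b):
    """Standardized mean difference using the pooled sample standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (fmean(a) - fmean(b)) / pooled_var ** 0.5

treatment = [2, 4, 6]   # made-up data
control = [1, 3, 5]     # made-up data

d = cohens_d(treatment, control)
print(f"d = {d:.2f}")
```

Both groups have variance 4, so the pooled SD is 2 and the one-point mean difference gives d = 0.5, conventionally read as a medium effect.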
Always report these elements in APA format:
| Concept | Python | R |
|---|---|---|
| Mean | df.mean() | mean(x) |
| SD | df.std() | sd(x) |
| Summary | df.describe() | summary(x) |
| Crosstab | pd.crosstab() | table() |
| Plot | seaborn | ggplot2 |
Never evaluate a model on the data it was trained on: the score will be inflated and will hide overfitting (a form of data leakage).
Splits data into 'k' chunks. Trains on k-1, tests on the 1 remaining. Repeats k times. Averages the scores. More robust than a single split.
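The procedure can be sketched in plain Python. `k_fold_splits` and `cross_validate` are hypothetical helpers, and the scoring callback is a placeholder standing in for "fit on train, evaluate on test". Libraries such as scikit-learn provide a ready-made `KFold` class for real work.

```python
import random

def k_fold_splits(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) pairs, one per fold."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)        # shuffle once so folds are random
    folds = [indices[i::k] for i in range(k)]   # k roughly equal chunks
    for held_out, test_idx in enumerate(folds):
        train_idx = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        yield train_idx, test_idx

def cross_validate(n_samples, k, score_fn):
    """Average score_fn(train, test) over the k folds. score_fn is a hypothetical callback."""
    scores = [score_fn(train, test) for train, test in k_fold_splits(n_samples, k)]
    return sum(scores) / len(scores)

splits = list(k_fold_splits(10, 5))
# Placeholder scorer: returns the test fold's share of the data; replace with fit + evaluate.
avg = cross_validate(10, 5, lambda train, test: len(test) / 10)
print(len(splits), avg)
```

Every observation lands in the test set exactly once, so no score is ever computed on data the model saw during that fold's training.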
Compares Actual Truth vs. Model Prediction for binary classification.
| | Pred Positive (1) | Pred Negative (0) |
|---|---|---|
| Actual Pos (1) | True Positive (TP) | False Negative (FN) - Type II Error |
| Actual Neg (0) | False Positive (FP) - Type I Error | True Negative (TN) |
If 99% of emails are normal and 1% is spam, a model predicting "normal" every time has 99% accuracy but is completely useless. Do not use Accuracy for imbalanced data!
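The spam example can be checked directly. The sketch below builds 1,000 made-up labels with 1% spam, scores a model that always predicts "normal", and compares accuracy with recall and precision; `confusion_counts` is a hypothetical helper.

```python
def confusion_counts(actual, predicted):
    """Return (TP, FN, FP, TN) for binary labels where 1 = positive (spam)."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == 1 and p == 1 for a, p in pairs)
    fn = sum(a == 1 and p == 0 for a, p in pairs)
    fp = sum(a == 0 and p == 1 for a, p in pairs)
    tn = sum(a == 0 and p == 0 for a, p in pairs)
    return tp, fn, fp, tn

actual = [1] * 10 + [0] * 990   # made-up labels: 1% spam, 99% normal
predicted = [0] * 1000          # model that always says "normal"

tp, fn, fp, tn = confusion_counts(actual, predicted)
accuracy = (tp + tn) / len(actual)
recall = tp / (tp + fn)   # share of real spam that was caught
# Precision is undefined when nothing is predicted positive; 0.0 is used here by convention.
precision = tp / (tp + fp) if tp + fp else 0.0

print(f"accuracy = {accuracy:.2f}, recall = {recall:.2f}, precision = {precision:.2f}")
```

Accuracy comes out at 0.99 while recall is 0.00: the model never catches a single spam email.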
Fixes: SMOTE (Oversampling), Undersampling, Class Weights.
ROC (Receiver Operating Characteristic): Plots True Positive Rate (Recall) vs False Positive Rate across different probability thresholds.
AUC (Area Under Curve): A single metric evaluating model performance independent of threshold.
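AUC has a convenient interpretation: it equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. The sketch below computes it that way on four made-up predictions; `roc_auc` is a hypothetical helper, and scikit-learn's `roc_auc_score` is the usual library route.

```python
def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

labels = [0, 0, 1, 1]             # made-up ground truth
scores = [0.1, 0.4, 0.35, 0.8]    # made-up predicted probabilities

auc = roc_auc(labels, scores)
print(auc)
```

Three of the four positive/negative pairs are ordered correctly, so the AUC is 0.75. A value of 0.5 corresponds to random ranking and 1.0 to perfect separation.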