Machine Learning Cheat Sheet
The core ideas of Machine Learning distilled into a single, scannable reference — perfect for review or quick lookup.
Quick Reference
Supervised Learning
A machine learning paradigm in which models are trained on labeled datasets containing input-output pairs. The algorithm learns a mapping function from inputs to outputs, enabling it to predict correct labels for previously unseen data. Common tasks include classification and regression.
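As a minimal sketch of learning from labeled input-output pairs, here is a 1-nearest-neighbour classifier on a hypothetical toy dataset (all points and labels are made up for illustration):

```python
import math

# Supervised learning in miniature: labeled training pairs (input -> label).
train_X = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.2, 4.8)]
train_y = ["a", "a", "b", "b"]

def predict(x):
    """Predict the label of the closest training point (1-NN)."""
    dists = [math.dist(x, p) for p in train_X]
    return train_y[dists.index(min(dists))]

print(predict((0.2, 0.1)))  # near the "a" points -> "a"
print(predict((4.9, 5.1)))  # near the "b" points -> "b"
```

The mapping from inputs to labels is learned (here, memorized) from the labeled examples, then applied to unseen points.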
Unsupervised Learning
A machine learning approach where algorithms learn patterns from unlabeled data without predefined output categories. The system discovers inherent structure, groupings, or relationships within the data on its own. Key techniques include clustering, dimensionality reduction, and anomaly detection.
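Clustering, the most common unsupervised technique, can be sketched with a tiny k-means loop on hypothetical 1-D data (no labels are ever provided; the groupings emerge from the data):

```python
# Minimal k-means sketch (k = 2) on unlabeled 1-D data.
data = [1.0, 1.2, 0.8, 8.0, 8.2, 7.9]
centroids = [data[0], data[3]]  # naive initialisation

for _ in range(10):
    # Assignment step: each point joins its nearest centroid.
    clusters = [[], []]
    for x in data:
        idx = min((0, 1), key=lambda i: abs(x - centroids[i]))
        clusters[idx].append(x)
    # Update step: move each centroid to its cluster's mean.
    centroids = [sum(c) / len(c) for c in clusters]

print(centroids)  # roughly [1.0, 8.03]
```

The algorithm discovers the two groupings on its own, which is the defining trait of unsupervised learning.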
Neural Networks
Computational models inspired by the biological neural networks of the human brain, consisting of interconnected layers of artificial neurons (nodes). Each connection has a weight that is adjusted during training, and neurons apply activation functions to produce outputs. Deep neural networks with many hidden layers can learn complex, hierarchical representations of data.
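A single forward pass through a tiny network makes the weighted-sum-plus-activation idea concrete. The weights and biases below are arbitrary placeholders, not trained values:

```python
import math

def sigmoid(z):
    """A classic activation function squashing any input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def layer(inputs, weights, biases):
    # Each neuron: weighted sum of inputs plus bias, then activation.
    return [sigmoid(sum(w * x for w, x in zip(ws, inputs)) + b)
            for ws, b in zip(weights, biases)]

# Hypothetical 2-input -> 2-hidden -> 1-output network.
hidden = layer([0.5, -1.0], weights=[[0.1, 0.8], [-0.4, 0.2]], biases=[0.0, 0.1])
output = layer(hidden, weights=[[1.5, -0.7]], biases=[0.2])
print(output)
```

Training would adjust the weights and biases; stacking more such layers is what makes a network "deep".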
Gradient Descent
An iterative optimization algorithm that minimizes a model's loss function by repeatedly updating parameters in the direction of the negative gradient (steepest descent). The learning rate controls the step size, and variants like stochastic gradient descent (SGD) and Adam improve efficiency by using subsets of data or adaptive learning rates.
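The update rule is easiest to see on a one-variable toy loss. For f(x) = (x − 3)², the gradient is 2(x − 3), and repeated steps against the gradient converge to the minimum at x = 3:

```python
# Gradient descent on the toy loss f(x) = (x - 3)^2.
def grad(x):
    return 2 * (x - 3)  # derivative of the loss

x = 0.0    # initial parameter value
lr = 0.1   # learning rate: controls the step size
for _ in range(100):
    x -= lr * grad(x)  # step in the direction of steepest decrease

print(round(x, 4))  # → 3.0 (converged to the minimum)
```

Too large a learning rate would overshoot and diverge; too small a rate would converge slowly, which is why adaptive variants like Adam are popular.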
Overfitting
A modeling error that occurs when a machine learning model learns the training data too well, capturing noise and random fluctuations rather than the underlying pattern. An overfit model performs excellently on training data but poorly on unseen test data. Techniques like regularization, dropout, cross-validation, and early stopping help prevent overfitting.
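Early stopping, one of the prevention techniques listed above, can be sketched with hypothetical loss curves: training loss keeps falling, but validation loss turns upward once the model starts memorizing noise.

```python
# Early-stopping sketch. The loss values are hypothetical: training loss
# keeps improving while validation loss bottoms out, then rises (overfitting).
train_loss = [1.0, 0.7, 0.5, 0.35, 0.25, 0.18, 0.12, 0.08]
val_loss   = [1.1, 0.8, 0.6, 0.50, 0.48, 0.52, 0.60, 0.70]

best, patience, wait, stop_epoch = float("inf"), 2, 0, None
for epoch, v in enumerate(val_loss):
    if v < best:
        best, wait = v, 0       # validation improved: keep training
    else:
        wait += 1               # no improvement this epoch
        if wait >= patience:
            stop_epoch = epoch  # halt before overfitting worsens
            break

print(stop_epoch, best)
```

Training halts shortly after the validation minimum (best = 0.48 at epoch 4), even though training loss was still falling.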
Bias-Variance Tradeoff
A fundamental concept describing the tension between two sources of error in machine learning models. Bias is error from overly simplistic assumptions causing the model to miss relevant patterns (underfitting), while variance is error from excessive sensitivity to training data fluctuations (overfitting). The optimal model balances both to minimize total error.
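For squared-error loss, this tradeoff has an exact decomposition of the expected prediction error at a point x:

```latex
\mathbb{E}\!\left[(y - \hat{f}(x))^2\right]
  = \underbrace{\left(\mathbb{E}[\hat{f}(x)] - f(x)\right)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\!\left[\left(\hat{f}(x) - \mathbb{E}[\hat{f}(x)]\right)^2\right]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```

Here f is the true function, f̂ the learned model (a random quantity over training sets), and σ² the noise floor that no model can remove. Simple models shrink the variance term at the cost of bias; flexible models do the reverse.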
Feature Engineering
The process of using domain knowledge to create, transform, or select input variables (features) that improve a machine learning model's predictive performance. Good feature engineering can dramatically boost model accuracy and is often more impactful than choosing a more complex algorithm. It includes tasks like normalization, encoding categorical variables, and creating interaction terms.
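Two of the steps named above, normalization and categorical encoding, can be sketched in plain Python (the feature values are hypothetical):

```python
# Min-max normalisation: rescale a numeric feature into [0, 1].
ages = [20, 35, 50]
lo, hi = min(ages), max(ages)
scaled = [(a - lo) / (hi - lo) for a in ages]
print(scaled)  # → [0.0, 0.5, 1.0]

# One-hot encoding: turn a categorical feature into binary indicator columns.
colours = ["red", "blue", "red"]
categories = sorted(set(colours))  # ['blue', 'red']
one_hot = [[1 if c == cat else 0 for cat in categories] for c in colours]
print(one_hot)  # → [[0, 1], [1, 0], [0, 1]]
```

Libraries such as scikit-learn provide production-grade versions of these transforms, but the underlying operations are this simple.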
Decision Trees
A supervised learning algorithm that makes predictions by recursively splitting data based on feature values, forming a tree-like structure of decisions. Each internal node represents a test on a feature, each branch represents an outcome of that test, and each leaf node holds a prediction. They are intuitive and interpretable but prone to overfitting without pruning or ensemble methods.
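The smallest possible tree, a one-level "stump", shows the node/branch/leaf structure directly. The feature, threshold, and labels here are a hypothetical toy rule:

```python
# A decision stump: one internal node testing a feature against a threshold,
# with each branch ending in a leaf prediction.
def stump(temperature):
    if temperature <= 15.0:    # internal node: test on the feature
        return "wear a coat"   # leaf prediction, left branch
    return "no coat needed"    # leaf prediction, right branch

print(stump(8.0))   # → "wear a coat"
print(stump(22.0))  # → "no coat needed"
```

A full tree learner picks each split (feature and threshold) to best separate the labels, then recurses on the two branches until a stopping rule, or pruning, halts growth.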
Ensemble Methods
Techniques that combine multiple individual models to produce a single, stronger predictive model. By aggregating the predictions of several base learners, ensembles reduce variance, bias, or both, and generally outperform any single constituent model. Major approaches include bagging (e.g., Random Forests), boosting (e.g., XGBoost, AdaBoost), and stacking.
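Majority voting, the aggregation used in bagging-style ensembles, can be sketched with three hypothetical base classifiers (their decision thresholds are made up for illustration):

```python
from collections import Counter

# Three hypothetical base classifiers with different decision thresholds.
def model_a(x): return "spam" if x > 0.5 else "ham"
def model_b(x): return "spam" if x > 0.3 else "ham"
def model_c(x): return "spam" if x > 0.7 else "ham"

def ensemble(x):
    """The majority label across the base learners is the final prediction."""
    votes = [m(x) for m in (model_a, model_b, model_c)]
    return Counter(votes).most_common(1)[0][0]

print(ensemble(0.6))  # two of three vote "spam" → "spam"
print(ensemble(0.2))  # all three vote "ham" → "ham"
```

Because the models err in different places, the vote is more reliable than any single member; boosting and stacking combine learners in more elaborate ways but share this core idea.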
Transfer Learning
A technique where a model trained on one task is repurposed as the starting point for a different but related task. Instead of training from scratch, the pre-trained model's learned representations are fine-tuned on a smaller, task-specific dataset. This approach significantly reduces the data and computation required and is especially powerful in deep learning for computer vision and natural language processing.
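A stripped-down sketch of the idea: a "pre-trained" feature extractor is frozen, and only a small head is fit on the new task. Everything here (the extractor, the data, the training loop) is a hypothetical miniature; real setups fine-tune deep networks the same way.

```python
# Transfer-learning sketch: freeze the learned representation, train the head.
def pretrained_features(x):
    # Frozen representation "learned" on a previous task (never updated here).
    return [x, x * x]

# New-task data, consistent with y = 2*x + 1*x^2 in feature space.
data = [(1.0, 3.0), (2.0, 8.0), (3.0, 15.0)]

w = [0.0, 0.0]  # trainable head weights (the only parameters we update)
lr = 0.01
for _ in range(2000):
    for x, y in data:
        f = pretrained_features(x)
        err = sum(wi * fi for wi, fi in zip(w, f)) - y
        w = [wi - lr * err * fi for wi, fi in zip(w, f)]  # update head only

print([round(wi, 2) for wi in w])  # ≈ [2.0, 1.0]
```

Because the representation is reused rather than relearned, only two parameters needed training, which is why transfer learning needs far less task-specific data and compute.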