Part 1: Understanding Observability in Machine Learning
What is Observability? Overview of observability in general and why it matters specifically for ML, including how it supports monitoring and troubleshooting ML systems. [Link]
ML Observability: The Essentials Dive into the core components of ML observability, including data logging, model performance tracking, and anomaly detection. [Link]
The Value of Performance Tracing in Machine Learning Explanation of performance tracing and its role in identifying bottlenecks and inefficiencies in ML workflows. [Link]
Part 2: Core Machine Learning Evaluation Metrics
Precision, Recall, and F1 Score: Balancing Accuracy and Completeness Definitions and importance of precision, recall, and their harmonic mean—the F1 score—in classification tasks. [Link][Link][Link]
Understanding AUC and PR AUC The significance of the area under the Receiver Operating Characteristic curve (ROC AUC) and the area under the Precision-Recall curve (PR AUC) in evaluating model performance. [Link][Link]
Calibration Curves: Ensuring Reliable Probabilities The role of calibration curves in assessing the reliability of probability predictions from classifiers. [Link]
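As a quick reference for the precision, recall, and F1 entry in this part, the definitions can be sketched in a few lines of plain Python. This is a minimal illustration, not a replacement for a library implementation: the function name and the toy labels are hypothetical, labels are assumed to be binary with 1 as the positive class, and returning 0.0 when a denominator is zero is an assumed convention.

```python
def precision_recall_f1(y_true, y_pred):
    """Return (precision, recall, f1) for binary labels; 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    # Assumed convention: a metric with an empty denominator is reported as 0.0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0  # harmonic mean
    return precision, recall, f1


# Toy labels (illustrative only): 3 true positives, 1 false positive, 1 false negative.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 1, 0, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
```

Precision answers "of the items flagged positive, how many were right?", recall answers "of the actual positives, how many were found?", and F1 penalizes a large gap between the two.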
Part 3: Advanced Evaluation Techniques and Metrics
Mean Absolute Percentage Error (MAPE) and R-Squared: Measuring Prediction Accuracy An explanation of MAPE and R-squared metrics for evaluating regression models. [Link][Link]
Normalized Discounted Cumulative Gain (NDCG): Ranking Model Evaluation Discussion on NDCG for evaluating ranking and recommendation systems. [Link]
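BLEU, BERTScore, and ROUGE: Evaluating Natural Language Processing Models Overview of text-generation evaluation metrics used in assessing the performance of NLP models: BLEU and ROUGE compare n-gram overlap with reference text, while BERTScore compares contextual embeddings. [Link]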
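For the regression and ranking entries in this part, the sketch below shows one common way to compute MAPE, R-squared, and NDCG in plain Python. All function names and sample values are hypothetical. The NDCG version assumes linear gain (the relevance value itself) with a log2 position discount; other formulations use an exponential gain of 2^rel - 1. MAPE as written assumes no true value is zero.

```python
import math


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent. Assumes no y_true value is 0."""
    return 100.0 * sum(abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def dcg(relevances):
    """Discounted cumulative gain with linear gain (an assumed variant)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))


def ndcg(relevances, k=None):
    """DCG of the given ranking divided by the DCG of the ideal ranking."""
    ranked = relevances[:k] if k else relevances
    ideal = sorted(relevances, reverse=True)
    ideal = ideal[:k] if k else ideal
    ideal_dcg = dcg(ideal)
    return dcg(ranked) / ideal_dcg if ideal_dcg else 0.0


# Toy regression data (illustrative only).
y_true = [100.0, 200.0, 400.0]
y_pred = [110.0, 180.0, 400.0]
mape_value = mape(y_true, y_pred)
r2_value = r_squared(y_true, y_pred)

# Graded relevance of results in the order the model ranked them (illustrative only).
ranking = [3, 2, 3, 0, 1]
ndcg_value = ndcg(ranking)
```

An NDCG of 1.0 means the model's ordering matches the ideal ordering by relevance; lower values indicate that relevant items were placed further down the list.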
Part 4: Statistical Measures and Tests for Deep Understanding
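PSI, KL Divergence, and Jensen-Shannon Divergence: Measuring Distribution Changes Insight into the Population Stability Index (PSI), Kullback-Leibler (KL) divergence, and Jensen-Shannon divergence, statistical measures used to detect shifts in data distribution or model outputs over time. [Link][Link][Link]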
Kolmogorov-Smirnov Test: Statistical Hypothesis Testing Description of the KS test for comparing two samples, with applications in model validation and A/B testing. [Link]
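The drift measures and the KS test listed in this part can be illustrated with a short plain-Python sketch. It is a minimal sketch under stated assumptions: function names and sample values are hypothetical, the distributions are assumed to be already binned into matching buckets, empty bins are floored at a small EPS (the value is a judgment call), and only the two-sample KS statistic is computed, not its p-value.

```python
import bisect
import math

EPS = 1e-6  # assumed floor for empty bins, to avoid log(0)


def _smooth(dist):
    """Floor each bin at EPS and renormalize so the bins sum to 1."""
    floored = [max(x, EPS) for x in dist]
    total = sum(floored)
    return [x / total for x in floored]


def _kl(p, q):
    """KL(p || q) for already-smoothed distributions, in nats."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def kl_divergence(p, q):
    return _kl(_smooth(p), _smooth(q))


def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric, bounded above by ln(2)."""
    p, q = _smooth(p), _smooth(q)
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * _kl(p, m) + 0.5 * _kl(q, m)


def psi(expected, actual):
    """Population Stability Index over matching bins."""
    e, a = _smooth(expected), _smooth(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: largest gap between the empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    gap = 0.0
    for x in a + b:
        cdf_a = bisect.bisect_right(a, x) / len(a)
        cdf_b = bisect.bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


# Binned baseline vs. production proportions (illustrative only).
baseline = [0.25, 0.25, 0.25, 0.25]
production = [0.10, 0.20, 0.30, 0.40]
psi_value = psi(baseline, production)
js_value = js_divergence(baseline, production)
ks_value = ks_statistic([1, 2, 3], [4, 5, 6])
```

With this formulation PSI equals the sum of the two directed KL divergences, which is why it is symmetric while KL divergence on its own is not. The KS statistic works on raw samples rather than bins, so it needs no bucketing choice.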