Course Information
Description
Advances knowledge by working with real-world data sets and the problems they bring, such as missing data, mismatched formats, and ambiguity. Students learn several machine learning algorithms, such as logistic regression and random forest, the trade-off between bias and variance, and how to select the best model using cross-validation, sensitivity, and specificity. Students also learn to distinguish between algorithms, apply them properly, and analyze and prepare data for their use, going beyond data engineering.
Total Credits
3
Course Competencies
- Differentiate between supervised and unsupervised learning paradigms to determine the appropriate modeling strategy for a given business problem
- Construct simple and multiple linear regression models to quantify relationships between variables and predict continuous outcomes
- Evaluate the integrity of regression models by testing the core assumptions of linearity and independence
- Perform feature selection using Pearson’s correlation coefficient (r) to identify and prioritize variables with strong linear relationships
- Execute data preprocessing tasks, including one-hot encoding, normalization, and standardization, to engineer "model-ready" features
- Apply classification algorithms, such as logistic regression and random forest, to categorize data and solve predictive problems
- Assess predictive performance using tools such as R^2, cross-validation, and confusion matrices to gauge model reliability and accuracy
- Optimize model outcomes by implementing data balancing techniques and managing the trade-off between bias and variance
- Describe the theoretical impact of normality and homoscedasticity on regression results to identify potential model limitations
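As an illustration of the evaluation competency above, the following is a minimal sketch of how sensitivity and specificity can be read off a binary confusion matrix. It is not taken from the course materials: the function names (confusion_counts, sensitivity, specificity) and the label lists are hypothetical, and it assumes labels are coded 0 and 1 with 1 as the positive class.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels, assuming 1 is the positive class."""
    tp = fp = tn = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if actual == 1 and predicted == 1:
            tp += 1  # true positive: positive case correctly flagged
        elif actual == 0 and predicted == 1:
            fp += 1  # false positive: negative case wrongly flagged
        elif actual == 0 and predicted == 0:
            tn += 1  # true negative: negative case correctly passed over
        else:
            fn += 1  # false negative: positive case missed
    return tp, fp, tn, fn


def sensitivity(tp, fn):
    """Share of actual positives the model identified (true positive rate)."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Share of actual negatives the model identified (true negative rate)."""
    return tn / (tn + fp)


# Made-up labels for illustration only; not real course data.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
print("sensitivity:", sensitivity(tp, fn))
print("specificity:", specificity(tn, fp))
```

In practice a library such as scikit-learn would typically compute these quantities; the plain-Python version here is only meant to show what the numbers represent.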
This Outline is under development.