Supervised Machine Learning: Classification | Coursera (IBM)
This course provides an introduction to one of the primary families of supervised Machine Learning models: Classification. It covers training predictive models to classify categorical outcomes and using error metrics to compare different models. The practical portion of the course emphasizes best practices for classification, such as train-test splits and managing datasets with unbalanced classes. Upon completion, you will be able to:
- Recognize the applications and advantages of classification and ensemble classification methods
- Understand and implement logistic regression models
- Understand and implement decision trees and tree-ensemble models
- Understand and implement other ensemble methods for classification
- Utilize various error metrics to evaluate and select the most suitable classification model for your data
- Apply techniques like oversampling and undersampling to address unbalanced class issues in datasets
This course is designed for aspiring data scientists who want hands-on experience with supervised Machine Learning classification techniques in a business context. To benefit fully from the course, you should be familiar with Python programming and have a foundational knowledge of Data Cleaning, Exploratory Data Analysis, Calculus, Linear Algebra, Probability, and Statistics.
Notice!
Always refer to the modules in your course for the most accurate and up-to-date information.
Attention!
If you have any questions that are not covered in this post, please feel free to leave them in the comments section below. Thank you for your engagement.
WEEK 1 QUIZ
1. The output of a logistic regression model applied to a data sample _____________.
- is the probability of the sample being in a certain class.
- Use a one-versus-all technique, where for each class you fit a binary classifier to that class versus all of the other classes.
- The precision-recall curve.
- False
- False
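One of the answers above mentions the one-versus-all technique for extending logistic regression to multiple classes. A minimal sketch of that idea, assuming scikit-learn and its bundled iris dataset (not part of the quiz itself):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

X, y = load_iris(return_X_y=True)  # 3 classes

# One-versus-all: fit one binary logistic regression per class,
# each separating that class from all of the other classes.
ovr = OneVsRestClassifier(LogisticRegression(max_iter=1000))
ovr.fit(X, y)

print(len(ovr.estimators_))      # 3 binary classifiers, one per class
print(ovr.predict_proba(X[:2]))  # per-class probabilities for two samples
```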
You are evaluating a binary classifier. There are 50 positive outcomes in the test data, and 100 observations. Using a 50% threshold, the classifier predicts 40 positive outcomes, of which 10 are incorrect. The next five questions refer to this scenario.
What is the classifier’s Precision on the test sample?
- 75%
What is the classifier’s Recall on the test sample?
- 60%
What is the classifier’s F1 score on the test sample?
- 66.7%
Increasing the threshold to 60% results in 5 additional positive predictions, all of which are correct. Which of the following statements about this new model (compared with the original model that had a 50% threshold) is TRUE?
- The area under the ROC curve would remain the same.
The threshold is now increased further, to 70%. Which of the following statements is TRUE?
- The Recall of the classifier would increase or remain the same.
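For reference, the Precision, Recall, and F1 answers can be checked directly from the scenario’s counts. A minimal sketch of the arithmetic (only the numbers stated in the question are assumed):

```python
# Counts from the scenario: 100 observations, 50 actual positives,
# 40 predicted positives of which 10 are incorrect.
tp = 40 - 10          # 30 true positives
fp = 10               # false positives
fn = 50 - tp          # 20 actual positives that were missed

precision = tp / (tp + fp)                          # 30 / 40 = 0.75
recall = tp / (tp + fn)                             # 30 / 50 = 0.60
f1 = 2 * precision * recall / (precision + recall)  # 0.9 / 1.35 ≈ 0.667

print(f"Precision: {precision:.1%}")  # 75.0%
print(f"Recall:    {recall:.1%}")     # 60.0%
print(f"F1 score:  {f1:.1%}")         # 66.7%
```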
WEEK 2 QUIZ
1. Which one of the following statements is true regarding K Nearest Neighbors?
- K Nearest Neighbors (KNN) assumes that points which are close together are similar.
- K nearest neighbors (KNN) needs to remember the entire training dataset in order to classify a new data sample.
- KNN can be used for both classification and regression.
- False
- True
- True
- Ensure that features have similar influence on the distance calculation
- It is sensitive to the curse of dimensionality
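Two of these answers (equal feature influence on the distance calculation, and memorizing the entire training set) explain why KNN is usually paired with a feature scaler in practice. A minimal sketch, assuming scikit-learn and its bundled iris dataset:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42)

# Scaling first gives every feature a similar influence on the
# Euclidean distances KNN uses to find a sample's nearest neighbors.
knn = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))
knn.fit(X_train, y_train)
print(f"Test accuracy: {knn.score(X_test, y_test):.3f}")
```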
WEEK 3 QUIZ
- SVMs do not use a cost function. They use regularization instead of a cost function.
- SVMs use a loss function that penalizes vectors prone to misclassification
- SVMs use same loss function as logistic regression
- SVMs use the Hinge Loss function as a cost function
- Support Vector Machine models are non-linear.
- Support Vector Machine models rarely overfit on training data.
- Support Vector Machine models can be used for regression but not for classification.
- Support Vector Machine models can be used for classification but not for regression.
- True
- False
- smooth the input data to reduce the chance of overfitting
- lessen the impact that some minor misclassifications have on the cost function
- encourage the model to ignore outliers during training
- bring all features to a common scale to ensure they have equal weight
- modifying the standard sigmoid function
- using the kernel trick
- projecting the feature space onto a lower dimensional space
- incorporating polynomial regression

- Support Version Machine
- Super Vector Machine
- Support Vector Machine
- Machine Learning
- Nystroem
- RBF Sampler
- Regularization
- Linear SVC
- Add features, or Logistic
- Simple, Logistic or LinearSVC
- SVC with RBF
- LinearSVC, or Kernel Approximation
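The last two answer groups refer to scikit-learn’s kernel approximation tools: on datasets too large for a kernelized SVC, an approximate RBF feature map (e.g. Nystroem) combined with LinearSVC is a common substitute. A minimal sketch under that assumption, on synthetic data:

```python
from sklearn.datasets import make_classification
from sklearn.kernel_approximation import Nystroem
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

# Nystroem builds an explicit approximate RBF feature map, so a fast
# linear SVM can stand in for a slower kernelized SVC.
model = make_pipeline(
    Nystroem(kernel="rbf", gamma=0.1, n_components=100, random_state=0),
    LinearSVC(C=1.0),
)
model.fit(X, y)
print(f"Training accuracy: {model.score(X, y):.3f}")
```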

WEEK 4 QUIZ
1. These are all characteristics of decision trees, EXCEPT:
- They have well rounded decision boundaries
How are leaf values calculated for regression decision trees?
- average value of the predicted variable
- They are very visual and easy to interpret
- Find the split that minimizes the gini impurity.
- Decrease the max depth.
- They tend to overfit.
- Tune the hyperparameters of your model using cross-validation.
- Training data is the model
- The model is just parameters, fitting can be slow, prediction is fast, and the decision boundary is simple and less flexible
- Greedy Search
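Several of these answers describe how a tree is grown: a greedy search over candidate splits, each scored by Gini impurity. A minimal sketch of that scoring, using hypothetical class counts:

```python
def gini(counts):
    """Gini impurity of a node given its per-class sample counts."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)

def split_impurity(left_counts, right_counts):
    """Weighted Gini impurity of a candidate split; greedy search
    picks the split that minimizes this quantity."""
    n_left, n_right = sum(left_counts), sum(right_counts)
    n = n_left + n_right
    return (n_left / n) * gini(left_counts) + (n_right / n) * gini(right_counts)

# Hypothetical parent node: 50 samples of class A, 50 of class B.
print(gini([50, 50]))                      # 0.5 (maximally impure)
print(split_impurity([40, 10], [10, 40]))  # 0.32 (a purer candidate split)
```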
WEEK 5 QUIZ
1. The term Bagging stands for bootstrap aggregating.
- True
- Tune number of trees as a hyperparameter that needs to be optimized
- Boosting methods
- The Pasting method of Bootstrap aggregation
- Random Forest
- Models need to output predicted probabilities
- Random Forest
- Random Trees, Random Forest, Bagging
- Boosting
- Fits entire data set
- 0-1 Loss Function
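One answer above treats the number of trees as a hyperparameter to be optimized. A minimal sketch of doing that with cross-validation, assuming scikit-learn and synthetic data:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# Compare forest sizes with cross-validation, treating the number
# of trees as a hyperparameter to tune.
for n_trees in (10, 100, 300):
    rf = RandomForestClassifier(n_estimators=n_trees, random_state=0)
    score = cross_val_score(rf, X, y, cv=5).mean()
    print(f"{n_trees:>3} trees: CV accuracy = {score:.3f}")
```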
WEEK 6 QUIZ
1. Which of the following statements about Downsampling is TRUE?
- Downsampling is likely to decrease Precision.
- Random Upsampling results in excessive focus on the more frequently-occurring class.
- Synthetic Upsampling generates observations that were not part of the original data.
- Model Explanations
- Model-Agnostic Explanations
- Large inconsistency between surrogate models and black-box models
- Stratify the samples
- Oversampling
- Synthetic Oversampling
- Blagging
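For context on these answers: random oversampling duplicates minority-class rows, synthetic oversampling (e.g. SMOTE) interpolates new ones, and Blagging refers to balanced bagging. A minimal sketch of random oversampling using only scikit-learn utilities (the data here is hypothetical):

```python
import numpy as np
from sklearn.utils import resample

rng = np.random.RandomState(0)
# Hypothetical unbalanced dataset: 90 negatives, 10 positives.
X = rng.randn(100, 3)
y = np.array([0] * 90 + [1] * 10)

X_min, y_min = X[y == 1], y[y == 1]
X_maj, y_maj = X[y == 0], y[y == 0]

# Random oversampling: resample the minority class with replacement
# until the classes are balanced. Synthetic oversampling (e.g. SMOTE)
# would instead generate new, interpolated minority points.
X_up, y_up = resample(X_min, y_min, replace=True,
                      n_samples=len(y_maj), random_state=0)

X_bal = np.vstack([X_maj, X_up])
y_bal = np.concatenate([y_maj, y_up])
print(np.bincount(y_bal))  # [90 90]
```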