
Supervised Machine Learning: Classification | Coursera Quiz Answers

Quiz answers for Supervised Machine Learning: Classification, part of the IBM Machine Learning Professional Certificate and the IBM Introduction to Machine Learning Specialization.
Estimated read time: 14 min

This course provides an introduction to one of the primary families of supervised Machine Learning models: Classification. It covers training predictive models to classify categorical outcomes and using error metrics to compare different models. The practical portion of the course emphasizes best practices for classification, such as train-test splits and managing datasets with unbalanced classes. Upon completion, you will be able to:

  • Recognize the applications and advantages of classification and ensemble classification methods
  • Understand and implement logistic regression models
  • Understand and implement decision trees and tree-ensemble models
  • Understand and implement other ensemble methods for classification
  • Utilize various error metrics to evaluate and select the most suitable classification model for your data
  • Apply techniques like oversampling and undersampling to address unbalanced class issues in datasets

This course is designed for aspiring data scientists who want hands-on experience with supervised Machine Learning classification techniques in a business context. To benefit fully from the course, you should be familiar with Python programming and have a foundational knowledge of Data Cleaning, Exploratory Data Analysis, Calculus, Linear Algebra, Probability, and Statistics.


Notice!
Always refer to the materials in your course for the most accurate and up-to-date information.

Attention!
If you have any questions that are not covered in this post, please feel free to leave them in the comments section below. Thank you for your engagement.

WEEK 1 QUIZ

1. The output of a logistic regression model applied to a data sample _____________.
  • is the probability of the sample being in a certain class.
2. Describe how any binary classification model can be extended from its basic form on two classes, to work on multiple classes.
  • Use a one-versus all technique, where for each class you fit a binary classifier to that class versus all of the other classes.
3. Which tool is most appropriate for measuring the performance of a classifier on unbalanced classes?
  • The precision-recall curve.
4. (True/False) One of the requirements of logistic regression is that you need a variable with two classes.
  • False
5. (True/False) The shape of ROC curves is the leading indicator of an overfitted logistic regression.
  • False
6. Consider this scenario for Questions 6 to 10.
You are evaluating a binary classifier. There are 50 positive outcomes in the test data, and 100 observations. Using a 50% threshold, the classifier predicts 40 positive outcomes, of which 10 are incorrect.
What is the classifier’s Precision on the test sample?
  • 75%
7. Same scenario as Question 6.
What is the classifier’s Recall on the test sample?
  • 60%
8. Same scenario as Question 6.
What is the classifier’s F1 score on the test sample?
  • 66.7%
9. Same scenario as Question 6.
Increasing the threshold to 60% results in 5 additional positive predictions, all of which are correct. Which of the following statements about this new model (compared with the original model that had a 50% threshold) is TRUE?
  • The area under the ROC curve would remain the same.
10. Same scenario as Question 6.
The threshold is now increased further, to 70%. Which of the following statements is TRUE?
  • The Recall of the classifier would increase or remain the same.
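The Precision, Recall, and F1 answers above can be checked by hand from the scenario's counts. A minimal Python sketch (illustrative only, not course code):

```python
# Worked example for the Week 1 scenario:
# 100 observations, 50 actual positives; the classifier predicts 40 positives,
# 10 of which are wrong -> TP = 30, FP = 10, FN = 50 - 30 = 20.

tp, fp, fn = 30, 10, 20

precision = tp / (tp + fp)                          # 30 / 40 = 0.75
recall = tp / (tp + fn)                             # 30 / 50 = 0.60
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean ~ 0.667

print(f"Precision: {precision:.1%}")  # Precision: 75.0%
print(f"Recall:    {recall:.1%}")     # Recall:    60.0%
print(f"F1 score:  {f1:.1%}")         # F1 score:  66.7%
```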

WEEK 2 QUIZ

1. Which one of the following statements is true regarding K Nearest Neighbors?
  • K Nearest Neighbors (KNN) assumes that points which are close together are similar.
2. Which one of the following statements is most accurate?
  • K nearest neighbors (KNN) needs to remember the entire training dataset in order to classify a new data sample.
3. Which one of the following statements is most accurate about K Nearest Neighbors (KNN)?
  • KNN can be used for both classification and regression.
4. (True/False) K Nearest Neighbors with large k tend to be the best classifiers.
  • False
5. When building a KNN classifier for a variable with 2 classes, it is advantageous to set the neighbor count k to an odd number.
  • True
6. The Euclidean distance between two points will always be shorter than the Manhattan distance:
  • True
7. The main purpose of scaling features before fitting a k nearest neighbor model is to:
  • Ensure that features have similar influence on the distance calculation
8. These are all pros of the k nearest neighbor algorithm EXCEPT:
  • It is sensitive to the curse of dimensionality
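Two of the answers above (Euclidean vs. Manhattan distance, and why feature scaling matters) are easy to verify with a small sketch; the points and feature values below are made up for illustration:

```python
import math

def euclidean(a, b):
    # Straight-line distance: sqrt of summed squared differences.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    # Grid distance: sum of absolute differences per coordinate.
    return sum(abs(x - y) for x, y in zip(a, b))

a, b = (1.0, 2.0), (4.0, 6.0)
print(euclidean(a, b))  # 5.0 (a 3-4-5 triangle)
print(manhattan(a, b))  # 7.0 -- never smaller than the Euclidean distance

# Why scaling matters for KNN: a feature in large units dominates the distance.
income_a, income_b = (30000.0, 25.0), (60000.0, 65.0)  # (income, age)
print(euclidean(income_a, income_b))  # ~30000 -- age barely contributes
```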


WEEK 3 QUIZ

1. Select the TRUE statement regarding the cost function for SVMs:
  •  SVMs do not use a cost function. They use regularization instead of a cost function.
  •  SVMs use a loss function that penalizes vectors prone to misclassification
  •  SVMs use same loss function as logistic regression
  •  SVMs use the Hinge Loss function as a cost function
2. Which statement about Support Vector Machines is TRUE?
  •  Support Vector Machine models are non-linear.
  •  Support Vector Machine models rarely overfit on training data.
  •  Support Vector Machine models can be used for regression but not for classification.
  •  Support Vector Machine models can be used for classification but not for regression.
3. (True/False) A large c term will penalize the SVM coefficients more heavily.
  •  True
  •  False
4. Regularization in the context of support vector machine (SVM) learning is meant to _________________.
  •  smooth the input data to reduce the chance of overfitting
  •  lessen the impact that some minor misclassifications have on the cost function
  •  encourage the model to ignore outliers during training
  •  bring all features to a common scale to ensure they have equal weight
5. Support vector machines can be extended to work with nonlinear classification boundaries by ___________________.
  •  modifying the standard sigmoid function
  •  using the kernel trick
  •  projecting the feature space onto a lower dimensional space
  •  incorporating polynomial regression
6. Select the image that displays the line at the optimal point at which the phone usage data can be split to create a decision boundary.
7. The image below shows a decision boundary with a clear margin. Such a decision boundary belongs to what type of machine learning model?
  •  Support Version Machine
  •  Super Vector Machine
  •  Support Vector Machine
  •  Machine Learning
8. SVM with kernels can be very slow on large datasets. To speed up SVM training, which methods can you use to map low-dimensional data into a higher-dimensional space beforehand?
  •  Nystroem
  •  RBF Sampler
  •  Regularization
  •  Linear SVC
9. Concerning the Machine Learning workflow, what model choice would you pick if you have "Few" features and a "Medium" amount of data?
  •  Add features, or Logistic
  •  Simple, Logistic or LinearSVC
  •  SVC with RBF
  •  LinearSVC, or Kernel Approximation
10. Select the image that best displays the line that separates the classes.
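Question 1's answer, the Hinge Loss, can be illustrated with a toy function; the scores below are hypothetical decision values, not course code:

```python
# Hinge loss, the cost function behind SVMs. Labels are +1/-1 and
# `score` is the raw decision value (w . x + b).

def hinge_loss(y, score):
    # Zero loss once the point is on the correct side with margin >= 1;
    # loss grows linearly for points inside the margin or misclassified.
    return max(0.0, 1.0 - y * score)

print(hinge_loss(+1, 2.5))  # 0.0 -- correct, outside the margin
print(hinge_loss(+1, 0.3))  # 0.7 -- correct, but inside the margin
print(hinge_loss(-1, 0.3))  # 1.3 -- misclassified
```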

WEEK 4 QUIZ

1. These are all characteristics of decision trees, EXCEPT:
  • They have well rounded decision boundaries
2. Decision trees used as classifiers compute the value assigned to a leaf as a ratio: the number of observations of one class divided by the total number of observations in that leaf (e.g., the number of customers younger than 50 divided by the total number of customers in that leaf).
How are leaf values calculated for regression decision trees?
  • average value of the predicted variable
3. These are two main advantages of decision trees:
  • They are very visual and easy to interpret
4. How can you determine the split for each node of a decision tree?
  • Find the split that minimizes the gini impurity.
5. Which of the following describes a way to regularize a decision tree to address overfitting?
  • Decrease the max depth.
6. What is a disadvantage of decision trees?
  • They tend to overfit.
7. What method can you use to minimize overfitting of a machine learning model?
  • Tune the hyperparameters of your model using cross-validation.
8. Concerning Classification algorithms, what is a characteristic of K-Nearest Neighbors?
  • Training data is the model
9. Concerning Classification algorithms, what are the characteristics of Logistic Regression?
  • The model is just parameters, fitting can be slow, prediction is fast, and the decision boundary is simple and less flexible
10. When evaluating all possible splits of a decision tree what can be used to find the best split regardless of what happened in prior or future steps?
  • Greedy Search
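Questions 4 and 10 above (minimizing Gini impurity via greedy search) can be sketched for a single numeric feature. `best_split` and the toy data are illustrative assumptions, not library code; real implementations repeat this search over every feature:

```python
# Greedy split search with Gini impurity for binary labels (0/1).

def gini(labels):
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n  # fraction of class 1
    return 1.0 - p ** 2 - (1.0 - p) ** 2

def best_split(xs, ys):
    # Try every candidate threshold; keep the one with the lowest
    # size-weighted impurity of the two child nodes (greedy: the choice
    # ignores prior and future splits).
    best = (None, float("inf"))
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if score < best[1]:
            best = (t, score)
    return best

xs = [1, 2, 3, 10, 11, 12]
ys = [0, 0, 0, 1, 1, 1]
print(best_split(xs, ys))  # (3, 0.0) -- threshold 3 separates the classes perfectly
```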

WEEK 5 QUIZ

1. The term Bagging stands for bootstrap aggregating.
  • True
2. This is the best way to choose the number of trees to build on a Bagging ensemble.
  • Tune number of trees as a hyperparameter that needs to be optimized
3. Which type of Ensemble modeling approach is NOT a special case of model averaging?
  • Boosting methods
4. What is an ensemble model that needs you to look at out of bag error?
  • Random Forest
5. What is the main condition to use stacking as ensemble method?
  • Models need to output predicted probabilities
6. This tree ensemble method only uses a subset of the features for each tree:
  • Random Forest
7. Order these tree ensembles in order of most randomness to least randomness:
  • Random Trees, Random Forest, Bagging
8. This is an ensemble model that does not use bootstrapped samples to fit the base trees, takes residuals into account, and fits the base trees iteratively:
  • Boosting
9. When comparing the two ensemble methods Bagging and Boosting, what is one characteristic of Boosting?
  • Fits entire data set
10. What is the most frequently discussed loss function in boosting algorithms?
  • 0-1 Loss Function
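Question 1's bootstrap aggregating can be sketched end to end; the one-rule `fit_stump` "model" below is a made-up stand-in for a decision tree, used only to show the bootstrap-then-vote mechanics:

```python
# Bagging = bootstrap aggregating: train each base model on a bootstrap
# sample (drawn with replacement), then combine predictions by majority vote.

import random
from collections import Counter

random.seed(0)

data = [(x, int(x > 5)) for x in range(11)]  # label is 1 when x > 5

def fit_stump(sample):
    # A hypothetical one-rule "model": threshold at the mean x of its sample.
    threshold = sum(x for x, _ in sample) / len(sample)
    return lambda x: int(x > threshold)

models = []
for _ in range(25):
    bootstrap = [random.choice(data) for _ in data]  # sample WITH replacement
    models.append(fit_stump(bootstrap))

def bagged_predict(x):
    votes = Counter(m(x) for m in models)
    return votes.most_common(1)[0][0]  # majority vote across base models

print(bagged_predict(9), bagged_predict(1))  # 1 0
```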

WEEK 6 QUIZ

1. Which of the following statements about Downsampling is TRUE?
  • Downsampling is likely to decrease Precision.
2. Which of the following statements about Random Upsampling is TRUE?
  • Random Upsampling results in excessive focus on the more frequently-occurring class.
3. Which of the following statements about Synthetic Upsampling is TRUE?
  • Synthetic Upsampling generates observations that were not part of the original data.
4. What can help humans to interpret the behaviors and methods of Machine Learning models more easily?
  • Model Explanations
5. What type of explanation method can be used to explain different types of Machine Learning models no matter the model structures and complexity?
  • Model-Agnostic Explanations
6. What reason might a Global Surrogate model fail?
  • Large inconsistency between surrogate models and black-box models
7. When working with unbalanced sets, what should be done to the samples so the class balance remains consistent in both the train and test set?
  • Stratify the samples
8. What approach are you using when trying to increase the size of a minority class so that it is similar to the size of the majority class?
  • Oversampling
9. What approach are you using when you create a new sample of a minority class that does not yet exist?
  • Synthetic Oversampling
10. What intuitive technique is used for unbalanced datasets that ensures a continuous downsample for each of the bootstrap samples?
  • Blagging
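Synthetic oversampling (Questions 3 and 9) boils down to interpolating between minority-class samples, the core idea behind SMOTE. This bare-bones sketch uses made-up points and random neighbor pairs rather than true nearest neighbors:

```python
# Generate minority-class points that were NOT in the original data by
# interpolating between pairs of existing minority samples.

import random

random.seed(1)

minority = [(1.0, 1.0), (2.0, 1.5), (1.5, 2.0)]  # far fewer than the majority class

def synthesize(points, n_new):
    new_points = []
    for _ in range(n_new):
        a, b = random.sample(points, 2)  # pick two distinct minority samples
        t = random.random()              # interpolation factor in [0, 1)
        # New point lies on the segment between a and b.
        new_points.append(tuple(ai + t * (bi - ai) for ai, bi in zip(a, b)))
    return new_points

synthetic = synthesize(minority, 4)
print(synthetic)  # 4 new points lying between existing minority samples
```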
