
Unsupervised Machine Learning | Coursera Quiz Answers

Answers for Unsupervised Machine Learning, part of the IBM Machine Learning Professional Certificate and the IBM Introduction to Machine Learning Specialization.

This course introduces you to a key aspect of Machine Learning: Unsupervised Learning. It covers how to extract insights from datasets that lack a target or labeled variable. You will explore various clustering and dimension reduction algorithms, and learn how to choose the most appropriate one for your data. The practical component emphasizes best practices for applying unsupervised learning techniques.

By the end of the course, you will be able to identify problems suited to Unsupervised Learning, explain why high-dimensional data makes clustering difficult, describe and implement common clustering and dimensionality-reduction algorithms, apply clustering where appropriate, compare the performance of per-cluster models, and understand the metrics used to evaluate clusters.

This course is designed for aspiring data scientists seeking practical experience with Unsupervised Machine Learning techniques in a business context. To maximize your learning, you should be familiar with programming in a Python environment and have a basic understanding of Data Cleaning, Exploratory Data Analysis, Calculus, Linear Algebra, Probability, and Statistics.


Notice!
Always refer to the modules in your course for the most accurate and up-to-date information.

Attention!
If you have any questions that are not covered in this post, please feel free to leave them in the comments section below. Thank you for your engagement.

WEEK 1 QUIZ

1. What is the implication of a small standard deviation of the clusters?
  • The standard deviation of a cluster defines how tightly its points are grouped around the centroid. With a small standard deviation, the points will be closer to the centroids.
2. After we plot our elbow and we find the inflection point, what does that point indicate to us?
  • The ideal number of clusters.
3. What is one of the most suitable ways to choose K when the number of clusters is unclear?
  • By evaluating Clustering performance such as Inertia and Distortion.
4. Which statement correctly describes the use of distortion and inertia?
  • When the similarity of the points within a cluster is more important, you should use distortion; if you are more concerned that clusters have similar numbers of points, you should use inertia.
5. Which method is commonly used to select the right number of clusters?
  • The elbow method (see the sketch after this quiz).
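
As a quick illustration of inertia and the elbow method, here is a minimal sketch assuming synthetic data from scikit-learn's make_blobs (an assumption for illustration, not the course's lab data):

from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic data: 4 well-separated blobs (made up for this example).
X, _ = make_blobs(n_samples=300, centers=4, cluster_std=0.8, random_state=42)

# Inertia: sum of squared distances from each point to its closest centroid.
for k in range(1, 9):
    km = KMeans(n_clusters=k, n_init=10, random_state=42).fit(X)
    print(f"k={k}: inertia={km.inertia_:.1f}")

# Plotting inertia against k and looking for the 'elbow' (the inflection
# point where inertia stops dropping sharply) suggests the ideal k.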

WEEK 2 QUIZ

1. What is the other name we can give to the L2 distance?
  • Euclidean Distance
2. Which of the following statements is a business case for the use of the Manhattan distance (L1)?
  • We use it in business cases where there is very high dimensionality.
3. What is the key feature of the Cosine Distance?
  • The Cosine Distance takes into account the angle between two points (vectors), ignoring their magnitude.
4. Which of the following statements is an example of a business case where we can use the Cosine Distance?
  • Cosine is better for data such as text where location of occurrence is less important.
5. Which distance metric is useful when we have text documents and we want to group similar topics together?
  • Jaccard (see the distance sketch after this quiz).
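
To make these metrics concrete, here is a small sketch using SciPy's standard distance functions on made-up vectors (the data is illustrative only):

from scipy.spatial.distance import cityblock, cosine, euclidean, jaccard

a = [1.0, 2.0, 3.0]
b = [2.0, 4.0, 6.0]

print("L2 (Euclidean):", euclidean(a, b))  # straight-line distance
print("L1 (Manhattan):", cityblock(a, b))  # sum of absolute differences

# b is a scaled copy of a, so the cosine distance is 0: only the angle
# between the vectors matters, not their magnitude.
print("Cosine:", cosine(a, b))

# Jaccard compares sets; here, word-presence vectors for two documents.
doc1 = [1, 1, 0, 1]
doc2 = [1, 0, 0, 1]
print("Jaccard:", jaccard(doc1, doc2))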

WEEK 3 QUIZ

1. When using DBSCAN, how does the algorithm determine that a cluster is complete and that it is time to move to a different point of the data set and potentially start a new cluster?
  • When no point is left unvisited by the chain reaction.
2. Which of the following statements correctly defines the strengths of the DBSCAN algorithm?
  • No need to specify the number of clusters (cf. K-means), allows for noise, and can handle arbitrary-shaped clusters.
3. Which of the following statements correctly defines the weaknesses of the DBSCAN algorithm?
  • It needs two parameters as input, finding appropriate values of ε (eps) and the minimum number of points (min_samples) can be difficult, and it does not do well with clusters of different density.
4. (True/False) Complete linkage refers to the maximum pairwise distance between clusters.
  • True
5. Which of the following linkage methods computes the inertia and picks the pair that will ultimately minimize the inertia value?
  • Ward linkage (see the sketch after this quiz).
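
A minimal DBSCAN sketch on synthetic moon-shaped data (an assumption, not the course dataset), with Ward-linkage agglomerative clustering alongside for comparison:

from sklearn.cluster import DBSCAN, AgglomerativeClustering
from sklearn.datasets import make_moons

X, _ = make_moons(n_samples=300, noise=0.05, random_state=0)

# The two inputs DBSCAN needs: eps (the neighborhood radius) and
# min_samples (minimum neighbors for a core point). No number of
# clusters is specified, and arbitrary shapes are handled.
db = DBSCAN(eps=0.2, min_samples=5).fit(X)

n_clusters = len(set(db.labels_)) - (1 if -1 in db.labels_ else 0)
print("clusters found:", n_clusters)
print("noise points (label -1):", list(db.labels_).count(-1))

# Ward linkage merges the pair of clusters that minimizes the resulting inertia.
ward = AgglomerativeClustering(n_clusters=2, linkage="ward").fit(X)
print("ward labels:", set(ward.labels_))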

WEEK 4 QUIZ

1. Select the option that best completes the following sentence: For data with many features, principal components analysis
  • generates new features that are linear combinations of the original features.
2. Which option correctly lists the steps for implementing PCA in Python?
   1. Fit PCA to data
   2. Scale the data
   3. Determine the desired number of components based on total explained variance
   4. Define a PCA object
  • 2, 4, 1, 3 (see the sketch after this quiz)
3. Given the following matrix for lengths of singular vectors, how do we rank the vectors in terms of importance?
  [11 0 0 0
   0 3 0 0
   0 0 2 0
   0 0 0 1]
  • v1, v2, v3, v4
4. Given two principal components v1, v2, let's say that feature f1 contributed 0.15 to v1 and 0.25 to v2. Feature f2 contributed -0.11 to v1 and 0.4 to v2. Which feature is more important according to their total contribution to the components?
  • f2, because |-0.11| + |0.4| > |0.15| + |0.25|
5. (True/False) In PCA, the first principal component represents the most important feature in the dataset.
  • False
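
The step order from question 2, sketched in Python (the Iris dataset and the 95% variance threshold are assumptions for illustration):

import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X = load_iris().data

X_scaled = StandardScaler().fit_transform(X)  # step 2: scale the data
pca = PCA()                                   # step 4: define a PCA object
pca.fit(X_scaled)                             # step 1: fit PCA to the data

# Step 3: pick the number of components from total explained variance.
cumvar = np.cumsum(pca.explained_variance_ratio_)
print("components for 95% of the variance:", int(np.argmax(cumvar >= 0.95)) + 1)

# The singular values play the role of the diagonal matrix in question 3:
# the larger the value, the more important the corresponding component.
print("singular values:", pca.singular_values_.round(2))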

WEEK 5 QUIZ

1. What is the main difference between kernel PCA and linear PCA?
  • Kernel PCA tends to uncover non-linear structure within the dataset by implicitly increasing the dimensionality of the space via the kernel trick.
2. (True/False) Multi-Dimensional Scaling (MDS) focuses on maintaining the geometric distances between points.
  • True
3. Which of the following data types is more suitable for Kernel PCA than PCA?
  • Data where the classes are not linearly separable.
4. By applying MDS, you are able to:
  • Find embeddings for the points so that their pairwise distances are as similar as possible to the original distances.
5. Which one of the following hyperparameters is NOT considered when using GridSearchCV for Kernel PCA?
  • n_clusters (see the sketch after this quiz).
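
A rough sketch contrasting linear PCA, kernel PCA, and MDS on data that is not linearly separable (concentric circles; the RBF kernel and the gamma value are illustrative assumptions):

from sklearn.datasets import make_circles
from sklearn.decomposition import PCA, KernelPCA
from sklearn.manifold import MDS

X, _ = make_circles(n_samples=400, factor=0.3, noise=0.05, random_state=0)

# Linear PCA cannot 'unfold' the circles; the kernel trick implicitly maps
# the data to a higher-dimensional space where the classes become separable.
X_pca = PCA(n_components=2).fit_transform(X)
X_kpca = KernelPCA(n_components=2, kernel="rbf", gamma=10).fit_transform(X)

# MDS looks for an embedding whose pairwise distances stay close to the
# original geometric distances.
X_mds = MDS(n_components=2, random_state=0).fit_transform(X)

# Hyperparameters typically tuned for KernelPCA in a grid search include
# kernel, gamma, and n_components (but not n_clusters).
print(X_pca.shape, X_kpca.shape, X_mds.shape)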

WEEK 6 QUIZ

1. (True/False) In some applications, NMF can make for more human interpretable latent features.
  • True
2. Which of the following sets of features is the least suited to NMF?
  • Monthly returns of a set of stock portfolios.
3. (True/False) The NMF can produce different outputs depending on its initialization.
  • True
4. Which option is the sparse representation of the matrix below?
  [(1, 1, 2), (1, 2, 3), (3, 4, 1), (2, 4, 4), (4, 3, 1)]
  • [[2 0 0 0],
     [0 3 0 0],
     [0 0 0 1],
     [0 4 1 0]]

5. In Practice lab: Non-Negative Matrix Factorization, why did we use "pairwise_distances" from scikit-learn?
  • To calculate the pairwise distance between the NMF-encoded version of the original dataset and the encoded query dataset (see the sketch below).
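
Two small sketches tying these ideas together; the data, shapes, and hyperparameters are made up for illustration, and the (row, col, value) triples shown here follow the generic COO (coordinate) format rather than the quiz's exact notation:

import numpy as np
from scipy.sparse import coo_matrix
from sklearn.decomposition import NMF
from sklearn.metrics import pairwise_distances

# A sparse (coordinate) representation stores only the nonzero entries
# as (row, col, value) triples.
rows, cols, vals = [0, 1, 2, 3], [0, 1, 3, 2], [2, 3, 1, 4]
print(coo_matrix((vals, (rows, cols)), shape=(4, 4)).toarray())

# NMF factorizes a non-negative matrix X into W @ H; transform() returns
# the encoded (latent) representation W.
rng = np.random.RandomState(0)
X = rng.rand(20, 10)  # non-negative by construction
nmf = NMF(n_components=3, init="nndsvda", random_state=0, max_iter=500)
W = nmf.fit_transform(X)

# Encode a query set with the same model, then compute pairwise distances
# between the encoded query and the encoded dataset, as in the practice lab.
W_query = nmf.transform(rng.rand(2, 10))
D = pairwise_distances(W_query, W, metric="cosine")
print("closest items:", D.argmin(axis=1))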
