Unsupervised Machine Learning | Coursera (IBM)
This course introduces you to a key aspect of Machine Learning: Unsupervised Learning. It covers how to extract insights from datasets that lack a target or labeled variable. You will explore various clustering and dimension reduction algorithms, and learn how to choose the most appropriate one for your data. The practical component emphasizes best practices for applying unsupervised learning techniques.
By the end of the course, you will be able to identify problems suited for Unsupervised Learning, understand the challenge of high-dimensional data in clustering, describe and implement common clustering and dimensionality-reduction algorithms, perform clustering where applicable, compare the performance of different clustering models, and understand the metrics used to evaluate clusters.
This course is designed for aspiring data scientists seeking practical experience with Unsupervised Machine Learning techniques in a business context. To maximize your learning, you should be familiar with programming in a Python environment and have a basic understanding of Data Cleaning, Exploratory Data Analysis, Calculus, Linear Algebra, Probability, and Statistics.
Notice!
Always refer to the modules in your course for the most accurate and up-to-date information.
Attention!
If you have any questions that are not covered in this post, please feel free to leave them in the comments section below. Thank you for your engagement.
WEEK 1 QUIZ
1. What is the implication of a small standard deviation of the clusters?
- The standard deviation of a cluster defines how tightly the points are grouped around its centroid. With a small standard deviation, the points will be closer to the centroid.
- The ideal number of clusters.
- By evaluating clustering performance metrics such as inertia and distortion.
- When the similarity of the points within a cluster is more important, you should use distortion; if you are more concerned that clusters have similar numbers of points, then you should use inertia.
- The elbow method (see the sketch below).
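As an illustration of the standard-deviation and elbow-method answers above, here is a minimal sketch (not from the course labs; the dataset and parameter values are made up) that generates blobs with a small `cluster_std` and inspects KMeans inertia for several values of k:

```python
# Minimal sketch: blobs with a small cluster_std sit tightly around their
# centroids; inertia across k values is the basis of the elbow method.
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# cluster_std controls how tightly points are spread around each centroid
X, _ = make_blobs(n_samples=500, centers=4, cluster_std=0.6, random_state=42)

inertias = []
for k in range(1, 9):
    km = KMeans(n_clusters=k, n_init=10, random_state=42).fit(X)
    inertias.append(km.inertia_)  # sum of squared distances to the closest centroid

for k, inertia in zip(range(1, 9), inertias):
    print(f"k={k}: inertia={inertia:.1f}")  # look for the 'elbow' where the drop flattens
```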
WEEK 2 QUIZ
1. What is the other name we can give to the L2 distance?
- Euclidean Distance
- We use it in business cases where there is very high dimensionality.
- The Cosine Distance, which takes into account the angle between two points.
- Cosine is better for data such as text, where the location of occurrence is less important (see the distance comparison sketch after this quiz).
- Jaccard
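To see the difference between the L2 (Euclidean) and cosine distances in practice, here is a small sketch using scikit-learn's pairwise distance helpers; the count vectors are invented purely for illustration:

```python
# Minimal sketch: Euclidean distance reacts to magnitude, cosine distance
# only to the angle between vectors (useful for text-like count data).
import numpy as np
from sklearn.metrics.pairwise import euclidean_distances, cosine_distances

a = np.array([[1, 0, 2, 0]])    # e.g. word counts for a short document
b = np.array([[10, 0, 20, 0]])  # same proportions, ten times the counts

print(euclidean_distances(a, b))  # large: L2 is sensitive to magnitude
print(cosine_distances(a, b))     # ~0: the vectors point in the same direction
```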
WEEK 3 QUIZ
1. When using DBSCAN, how does the algorithm determine that a cluster is complete and it is time to move to a different point of the data set and potentially start a new cluster?
- When no point is left unvisited by the chain reaction.
- No need to specify the number of clusters (cf. K-means), allows for noise, and can handle arbitrary-shaped clusters.
- It needs two parameters as input, finding appropriate values of ε (eps) and the minimum number of points (min_samples) can be difficult, and it does not do well with clusters of different density (see the DBSCAN sketch after this quiz).
- True
- Ward linkage
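The DBSCAN answers above can be tried with a short sketch like the following (the eps and min_samples values are illustrative, not from the course): it finds arbitrary-shaped clusters without being told how many, and marks noise points with the label -1.

```python
# Minimal sketch: DBSCAN takes eps and min_samples, needs no n_clusters,
# handles arbitrary-shaped clusters, and labels noise as -1.
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_moons

X, _ = make_moons(n_samples=300, noise=0.05, random_state=42)

db = DBSCAN(eps=0.2, min_samples=5).fit(X)
labels = db.labels_

print("clusters found:", len(set(labels) - {-1}))  # discovered, not specified
print("noise points:", (labels == -1).sum())
```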
WEEK 4 QUIZ
1. Select the option that best completes the following sentence: For data with many features, principal component analysis
- generates new features that are linear combinations of the original features.
1. Fit PCA to data
2. Scale the data
3. Determine the desired number of components based on total explained variance
4. Define a PCA object
- 2, 4, 1, 3 (scale the data, define a PCA object, fit PCA to the data, then choose the number of components from total explained variance; see the sketch after this quiz).
[11 0 0 0
0 3 0 0
0 0 2 0
0 0 0 1]
- v1, v2, v3, v4
- v2 because |-0.11| + |0.4| > |0.15| + |0.25|
- False
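Here is a minimal sketch of the PCA workflow in the order given above, assuming scikit-learn and using the iris data purely as a stand-in dataset:

```python
# Minimal sketch of the order above: scale -> define -> fit -> choose the
# number of components from cumulative explained variance.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X = load_iris().data

X_scaled = StandardScaler().fit_transform(X)  # 2. scale the data
pca = PCA()                                   # 4. define a PCA object
pca.fit(X_scaled)                             # 1. fit PCA to data

# 3. determine the desired number of components (e.g. >= 95% explained variance)
cumvar = np.cumsum(pca.explained_variance_ratio_)
n_components = int(np.argmax(cumvar >= 0.95)) + 1
print("components kept:", n_components, "| explained variance:", cumvar[n_components - 1])
```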
WEEK 5 QUIZ
1. What is the main difference between kernel PCA and linear PCA?
- Kernel PCA tends to uncover non-linear structure within the dataset by implicitly increasing the dimensionality of the space thanks to the kernel trick (see the sketch after this quiz).
- True
- Data where the classes are not linearly separable.
- Find embeddings for the points so that their distances are as similar as possible to the original distances.
- n_clusters
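To illustrate the kernel PCA answer, here is a small sketch (the kernel choice and gamma value are assumptions, not course values) on data where the classes are not linearly separable:

```python
# Minimal sketch: kernel PCA (RBF kernel) can untangle classes that are not
# linearly separable, which linear PCA cannot do.
from sklearn.datasets import make_circles
from sklearn.decomposition import PCA, KernelPCA

X, y = make_circles(n_samples=400, factor=0.3, noise=0.05, random_state=42)

lin = PCA(n_components=2).fit_transform(X)                               # circles stay tangled
rbf = KernelPCA(n_components=2, kernel="rbf", gamma=10).fit_transform(X) # roughly separable

print("linear PCA shape:", lin.shape, "| kernel PCA shape:", rbf.shape)
```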
WEEK 6 QUIZ
1. (True/False) In some applications, NMF can make for more human-interpretable latent features.
- True
- Monthly returns of a set of stock portfolios.
- True
[(1, 1, 2), (1, 2, 3), (3, 4, 1), (2, 4, 4), (4, 3, 1)]
[[2 0 0 0], ✔
[0 3 0 0],
[0 0 0 1],
[0 4 1 0]]
[[0 0 0 1],
[0 2 0 0],
[0 0 0 3],
[0 4 1 0]]
[[1 0 0 0],
[0 3 0 0],
[0 2 0 0],
[0 0 4 2]]
[[0 0 0 2],
[0 3 4 0],
[0 0 0 0],
[0 0 1 0]]
5. In Practice lab: Non-Negative Matrix Factorization, why did we use "pairwise_distances" from scikit-learn?
- To calculate the pairwise distances between the NMF-encoded version of the original dataset and the encoded query dataset (sketched below).
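The pairwise_distances answer above can be illustrated with a short sketch; the random data, component count, and metric below are assumptions for demonstration, not the course's actual lab setup:

```python
# Minimal sketch of the idea: encode both the corpus and a query with the same
# NMF model, then use pairwise_distances to find the closest corpus items.
import numpy as np
from sklearn.decomposition import NMF
from sklearn.metrics import pairwise_distances

rng = np.random.default_rng(0)
corpus = rng.random((100, 20))   # non-negative "document" features (illustrative)
query = rng.random((3, 20))      # items to match against the corpus

nmf = NMF(n_components=5, init="nndsvda", max_iter=500, random_state=0)
corpus_enc = nmf.fit_transform(corpus)  # NMF-encoded corpus
query_enc = nmf.transform(query)        # encode the query with the same model

dist = pairwise_distances(query_enc, corpus_enc, metric="cosine")
nearest = dist.argmin(axis=1)           # index of the closest corpus item per query
print(nearest)
```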