Suppose we want to cluster 7 observations into 3 clusters using the K-Means clustering algorithm. After the first iteration, the clusters C1, C2, and C3 contain the following observations:

C1: {(2,2), (5,5), (8,8)}

C2: {(1,5), (5,1)}

C3: {(6,6), (10,10)}

What will be the cluster centroids if you want to proceed for a second iteration?

C1: (2,2), C2: (0,0), C3: (5,5)
C1: (5,5), C2: (3,3), C3: (8,8)
C1: (6,6), C2: (4,4), C3: (9,9)
Difficulty Level: 1
Positive Marks: 1.00
Negative Marks: 0.33
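To verify the update, each new centroid is simply the mean of its cluster's points; a minimal sketch (NumPy assumed):

    import numpy as np

    # Cluster memberships after the first iteration (from the question above).
    clusters = {
        "C1": np.array([[2, 2], [5, 5], [8, 8]]),
        "C2": np.array([[1, 5], [5, 1]]),
        "C3": np.array([[6, 6], [10, 10]]),
    }

    # The new centroid of each cluster is the mean of its member points.
    for name, points in clusters.items():
        print(name, points.mean(axis=0))   # C1: (5,5), C2: (3,3), C3: (8,8)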
You apply LDA to a dataset with 6 classes and 20 features. What is the maximum number of dimensions the data can be reduced to using LDA?
5
Difficulty Level: 1
Positive Marks: 1.00
Negative Marks: 0.00
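LDA can project onto at most min(number of classes - 1, number of features) dimensions, so here min(6 - 1, 20) = 5. A sketch with scikit-learn on synthetic data (the data itself is an assumption; only the dimension bound matters):

    import numpy as np
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

    rng = np.random.default_rng(0)
    X = rng.normal(size=(120, 20))       # 20 features
    y = np.repeat(np.arange(6), 20)      # 6 classes, 20 samples each

    # n_components may not exceed min(n_classes - 1, n_features) = 5.
    lda = LinearDiscriminantAnalysis(n_components=5)
    X_reduced = lda.fit_transform(X, y)
    print(X_reduced.shape)               # (120, 5)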
In single linkage hierarchical clustering, how many distance computations are required for a dataset of 100 points before any clusters are merged?
4950
Difficulty Level: 1
Positive Marks: 1.00
Negative Marks: 0.00
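Before the first merge, every pair of points must be compared, i.e. C(100, 2) = 100 x 99 / 2 = 4950 distances. A one-line check:

    from math import comb
    print(comb(100, 2))   # 4950 pairwise distance computations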
You apply K-means clustering to a dataset with 1500 points and set K = 5. After 10 iterations, the algorithm converges. How many distance computations does K-means perform in this scenario (a proxy for its time complexity)?
75000
Difficulty Level: 1
Positive Marks: 1.00
Negative Marks: 0.00
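Each iteration computes the distance from every point to every centroid, so the total work is n x K x iterations = 1500 x 5 x 10:

    n, K, iterations = 1500, 5, 10
    print(n * K * iterations)   # 75000 distance computations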
Suppose we have three data points x1 = [3,3], x2 = [6,6], and x3 = [8,8], and we start with initial centroids c1 = [4,4] and c2 = [7,7]. After one iteration, what will be the new centroid c1 if the cluster assignments are recalculated?
[4,4]
[3,3]
[4.5,4.5]
[6.5,6.5]
Difficulty Level: 1
Positive Marks: 1.00
Negative Marks: 0.33
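Working through the question above: x1 is the only point closer to c1 = [4,4], so the recomputed c1 is the mean of {x1}, i.e. [3,3]. A minimal sketch of the assignment-and-update step (NumPy assumed):

    import numpy as np

    X = np.array([[3, 3], [6, 6], [8, 8]])
    centroids = np.array([[4, 4], [7, 7]])

    # Assign each point to its nearest centroid (Euclidean distance).
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    labels = dists.argmin(axis=1)        # [0, 1, 1]

    # Recompute c1 as the mean of the points assigned to it.
    print(X[labels == 0].mean(axis=0))   # [3. 3.]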
A point (1,1) is chosen to be the first centroid while running the KMeans++ algorithm on a dataset.

Suppose the other points in the dataset are:

(2,3), (1,4), (5,6), (-1,-1), (-1,4), (3,-5), (-9,7), (8,7), (1,-4), (5,-6), (-4,7)

Then which of the above-mentioned points is most likely to be chosen as the next centroid for the given data?

(-1,4)
(-1,-1)
(-9,7)
(8,7)
Difficulty Level: 1
Positive Marks: 2.00
Negative Marks: 0.66
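K-Means++ samples the next centroid with probability proportional to D(x)^2, the squared distance to the nearest already-chosen centroid, so the point farthest from (1,1) is the most likely pick. A sketch of the computation (it confirms (-9,7), whose D^2 = 136 is the largest):

    import numpy as np

    centroid = np.array([1, 1])
    points = np.array([[2, 3], [1, 4], [5, 6], [-1, -1], [-1, 4], [3, -5],
                       [-9, 7], [8, 7], [1, -4], [5, -6], [-4, 7]])

    # D(x)^2: squared distance of each point to the nearest chosen centroid.
    d2 = ((points - centroid) ** 2).sum(axis=1)
    probs = d2 / d2.sum()                # K-Means++ sampling probabilities

    print(points[d2.argmax()], probs.max())   # [-9 7] is the most likely pick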
Which of the following is a key advantage of K-medoids over K-means?
K-medoids is faster than K-means
K-medoids is more sensitive to outliers
K-medoids is less sensitive to outliers
K-medoids requires fewer iterations to converge
Difficulty Level: 1
Positive Marks: 2.00
Negative Marks: 0.66

In hierarchical clustering, which method tends to create elongated clusters by linking the closest points between clusters?
Complete Linkage
Average Linkage
Ward's Method
Single Linkage
Difficulty Level: 1
Positive Marks: 2.00
Negative Marks: 0.66
Unlike K-means, the K-medoids algorithm is known for its robustness to outliers. If you apply K-medoids to a dataset containing several outliers, how does the choice of medoids (as opposed to centroids in K-means) contribute to this robustness?
Medoids are less sensitive to outliers because they are the mean of the points in a cluster, which minimizes the influence of outliers.
Medoids are actual data points, which ensures that extreme outliers do not skew the central point of the cluster as much as a centroid would.
Medoids are recalculated in each iteration to minimize the influence of outliers, unlike centroids which remain fixed.
Medoids ignore outliers altogether, focusing only on the central data points in the cluster.
Difficulty Level: 1
Positive Marks: 2.00
Negative Marks: 0.66
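A small numeric illustration of the idea behind the question above: an outlier drags the mean (the K-means centroid) far more than the medoid, which must itself be one of the data points. The values below are made up for illustration (1-D for simplicity):

    import numpy as np

    cluster = np.array([1.0, 2.0, 3.0, 4.0, 100.0])   # 100.0 is an outlier

    centroid = cluster.mean()   # 22.0, heavily skewed by the outlier

    # Medoid: the actual point with minimum total distance to all others.
    costs = np.abs(cluster[:, None] - cluster[None, :]).sum(axis=1)
    medoid = cluster[costs.argmin()]   # 3.0, barely affected

    print(centroid, medoid)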
You have a dataset with three classes, each with its own covariance matrix. However, when applying LDA, you decide to assume that all three classes share the same covariance matrix. How does this assumption affect the resulting decision boundaries, and under what conditions is this assumption justified?
The assumption results in quadratic decision boundaries, justified when the covariance matrices are significantly different.
The assumption results in linear decision boundaries, justified when the covariance matrices are similar.
The assumption has no effect on the decision boundaries.
The assumption results in circular decision boundaries, justified when the covariance matrices are identical.
Difficulty Level: 1
Positive Marks: 2.00
Negative Marks: 0.66
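A hedged sketch of the distinction with scikit-learn (synthetic data assumed): LDA pools a single shared covariance estimate, which yields linear decision boundaries, while QDA fits one covariance per class, which yields quadratic boundaries:

    import numpy as np
    from sklearn.discriminant_analysis import (LinearDiscriminantAnalysis,
                                               QuadraticDiscriminantAnalysis)

    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal(m, 1.0, size=(50, 2)) for m in (0, 4, 8)])
    y = np.repeat([0, 1, 2], 50)

    # LDA: one pooled covariance  -> linear decision boundaries.
    # QDA: a covariance per class -> quadratic decision boundaries.
    lda = LinearDiscriminantAnalysis().fit(X, y)
    qda = QuadraticDiscriminantAnalysis().fit(X, y)
    print(lda.score(X, y), qda.score(X, y))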
A dataset has four points: [1,2], [2,3], [6,8], [7,9]. You apply single-linkage hierarchical clustering. What is the Euclidean distance between the first two clusters merged? (Answer to two decimals.)
1.41
Difficulty Level: 1
Positive Marks: 2.00
Negative Marks: 0.00
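The first merge joins the two closest points, [1,2] and [2,3] (or the tied pair [6,8] and [7,9]), at distance sqrt(2) = 1.41. A check with SciPy, where the third column of a linkage row is the merge distance:

    import numpy as np
    from scipy.cluster.hierarchy import linkage

    X = np.array([[1, 2], [2, 3], [6, 8], [7, 9]])
    Z = linkage(X, method="single")   # Euclidean distance by default

    print(round(Z[0, 2], 2))          # 1.41, the first merge distance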
A dataset has four points: [2,2], [4,4], [5,5], [8,8]. Using complete-linkage hierarchical clustering, what is the Euclidean distance between the first two clusters merged? (Answer to two decimals.)
1.41
Difficulty Level: 1
Positive Marks: 2.00
Negative Marks: 0.00
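Complete linkage behaves like single linkage on the very first merge (all clusters are still singletons), so it joins the closest pair, [4,4] and [5,5], at distance sqrt(2) = 1.41. The same SciPy check applies:

    import numpy as np
    from scipy.cluster.hierarchy import linkage

    X = np.array([[2, 2], [4, 4], [5, 5], [8, 8]])
    Z = linkage(X, method="complete")   # Euclidean distance by default

    print(round(Z[0, 2], 2))            # 1.41: [4,4] and [5,5] merge first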
Consider a dataset that is to be clustered using the single-linkage hierarchical clustering algorithm. This algorithm defines the distance between two clusters as the minimum distance between any single data point in the first cluster and any single data point in the second cluster. If this dataset contains outliers, which of the following statements best describes the potential impact on the resultant clustering structure?
Outliers will have no impact on the clustering structure as single-linkage clustering is robust to outliers.
Outliers may lead to the formation of clusters that are smaller in size compared to the clusters formed without outliers, as the algorithm tends to merge clusters with close proximity.
Outliers may cause the algorithm to merge separate clusters that would not have been merged in the absence of outliers, as the minimum distance between clusters can be significantly reduced by the presence of an outlier.
Outliers may cause the algorithm to create larger clusters, as the maximum distance between clusters can be significantly increased by the presence of an outlier.

Difficulty Level: 1
Positive Marks: 2.00
Negative Marks: 0.66
In bottom-up clustering, if you start with 5 individual points, how many merges are required to form a single cluster?
3
4
5
6
Difficulty Level: 1
Positive Marks: 2.00
Negative Marks: 0.66
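Agglomerative clustering always performs exactly n - 1 merges to go from n singletons to one cluster, so 5 points need 4 merges. SciPy's linkage matrix makes this visible, since it has one row per merge (the 5 random points below are just a placeholder):

    import numpy as np
    from scipy.cluster.hierarchy import linkage

    X = np.random.default_rng(2).normal(size=(5, 2))   # any 5 points
    Z = linkage(X, method="single")

    print(len(Z))   # 4 rows = 4 merges for 5 starting points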
Which of the following metrics can be used to measure the dissimilarity between two clusters in hierarchical clustering?

  1. Single-link
  2. Complete-link
  3. Average-link
1 and 3
1 and 2
2 and 3
1, 2 and 3
Difficulty Level: 1
Positive Marks: 2.00
Negative Marks: 0.66