Please help stressing


Matlab Project
Computer Software for Sciences COSC2836
March 2021

The goal of this project is to:
1. Learn a clustering algorithm called K-means clustering
2. Implement this algorithm using Matlab
3. Use the implemented code to cluster sample data
4. Change different parameters of the code and examine the effect of each parameter on the final clustering result
5. Write a report describing how the code was implemented, how the experiments were performed, and what the results were

Description

In this part, we learn what clustering is and what K-means clustering is. There are thousands of resources on the internet that describe this algorithm. Please search for more descriptions if the following definition is not enough for you to understand it.

Clustering is the task of dividing a population (a set of objects or data points) into a number of groups such that data points in the same group (called a cluster) are more similar to each other than to data points in other groups. In simple words, the aim is to segregate groups with similar traits and assign them to clusters.

Clustering itself is not one specific algorithm but the general task to be solved. It can be achieved by various algorithms that differ significantly in their understanding of what constitutes a cluster and how to find clusters efficiently. Every methodology follows a different set of rules for defining the 'similarity' among data points. More than 100 clustering algorithms are known, but only a few are in popular use.

One type of clustering model is the iterative algorithm in which the notion of similarity is derived from the closeness of a data point to the centroid of a cluster. The K-means clustering algorithm is a popular algorithm in this category. In these models, the number of clusters has to be specified beforehand, which makes prior knowledge of the dataset essential.
K-means is an iterative clustering algorithm that aims to group n data points (x_1, x_2, ..., x_n) into k (≤ n) clusters S = {S_1, S_2, ..., S_k} in which each data point belongs to the cluster with the nearest mean.

Links: https://en.wikipedia.org/wiki/Algorithm https://en.wikipedia.org/wiki/Cluster_(statistics) https://en.wikipedia.org/wiki/Mean

This algorithm works in the following steps:
1. Specify the desired number of clusters K: Let us choose k=2 for 5 data points in 2-D space.
2. Randomly assign each data point to a cluster (initialization step): Let's assign three points to cluster 1, shown using red colour, and two points to cluster 2, shown using grey colour.
3. Compute cluster centroids: The centroid of the data points in the red cluster is shown using the red cross, and that of the grey cluster using the grey cross.
4. Re-assign each point to the closest cluster centroid (assignment step): Note that only the data point at the bottom is assigned to the red cluster even though it is closer to the centroid of the grey cluster. Thus, we assign that data point to the grey cluster.
5. Re-compute cluster centroids (update step): Now, re-compute the centroids for both clusters.
6. Repeat steps 4 and 5 until no improvements are possible: We repeat the 4th and 5th steps until we reach an optimum, when there is no further switching of data points between the two clusters for two successive repeats. This marks the termination of the algorithm unless another stopping criterion is explicitly specified.

The K-means algorithm's objective is to find a set of clusters S = {S_1, S_2, ..., S_k} for data points (x_1, x_2, ..., x_n) that minimizes the within-cluster variances [1]. The following equation formulates the mentioned objective:

\underset{S}{\arg\min} \sum_{i=1}^{k} \sum_{x \in S_i} \lVert x - \mu_i \rVert^2

where \mu_i is the mean of the data points in the cluster S_i.

[1] It finds local minima.
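As a rough illustration of the objective described above, the following sketch computes the cluster means and the within-cluster sum of squared distances for one fixed assignment. The five points and the initial assignment are made up for this example (they are not the points from the assignment's figures), and the variable names are illustrative only.

```matlab
% Five made-up 2-D points and an arbitrary assignment to k = 2 clusters.
% These values are assumptions for illustration, not data from the project.
X   = [1 1; 1.5 2; 3 4; 5 7; 3.5 5];
idx = [1; 1; 1; 2; 2];
k   = 2;

% Centroid of each cluster: the mean of its points, taken down the rows.
mu = zeros(k, size(X, 2));
for i = 1:k
    mu(i, :) = mean(X(idx == i, :), 1);
end

% Objective: sum of squared distances from each point to its cluster mean.
J = 0;
for i = 1:k
    members = X(idx == i, :);
    delta   = members - repmat(mu(i, :), size(members, 1), 1);
    J = J + sum(delta(:) .^ 2);
end
fprintf('Within-cluster sum of squares: %.4f\n', J);
```

Running the assignment and update steps should never increase this quantity, which is one way to sanity-check an implementation.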
Answered 3 days after Mar 31, 2021


Kshitij answered on Apr 03 2021
148 Votes
knns/computeCentroids.m
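function centroids = computeCentroids(X, idx, K)
% Update step: each centroid is the mean of the points assigned to it.
n = size(X, 2);               % number of features per data point
centroids = zeros(K, n);

for i = 1:K
    xi = X(idx == i, :);      % points currently assigned to cluster i
    ck = size(xi, 1);         % number of points in cluster i
    % Sum along dimension 1 so that any number of features works and a
    % single-point cluster is not collapsed to a scalar.
    % Note: an empty cluster (ck == 0) yields a NaN centroid.
    centroids(i, :) = sum(xi, 1) / ck;
end
end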
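One detail worth checking in a centroid update is the summation dimension when a cluster holds a single point. The short sketch below, using a made-up point, shows how `sum` without a dimension argument behaves on a single row compared with `sum(..., 1)`.

```matlab
xi = [3 4];               % a cluster containing a single made-up 2-D point

s_default = sum(xi);      % sums along the first non-singleton dimension,
                          % so the two coordinates are added together
s_rows    = sum(xi, 1);   % sums down the rows, one value per column

disp(s_default);
disp(s_rows);
```

Passing the dimension explicitly keeps the result a 1-by-n row regardless of how many points the cluster contains.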
knns/getClosestCentroids.m
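function indices = getClosestCentroids(X, centroids)
% Assignment step: label each point with the index of its nearest centroid,
% using squared Euclidean distance.
K = size(centroids, 1);       % number of clusters
m = size(X, 1);               % number of data points
indices = zeros(m, 1);
for i = 1:m
    % Start with centroid 1 as the current best match.
    k = 1;
    min_dist = sum((X(i,:) - centroids(1,:)) .^ 2);
    for j = 2:K
        dist = sum((X(i,:) - centroids(j,:)) .^ 2);
        if dist < min_dist    % strict comparison: ties keep the lower index
            min_dist = dist;
            k = j;
        end
    end
    indices(i) = k;
end
end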
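The nested loops above can also be expressed with one pass per centroid. The following is a sketch of that alternative, not part of the submitted answer; the data values are made up and the variable names `D` and `delta` are illustrative.

```matlab
% Made-up data: three 2-D points and two centroids.
X         = [0 0; 0 1; 10 10];
centroids = [0 0.5; 10 10];

m = size(X, 1);
K = size(centroids, 1);

% D(i, j) holds the squared Euclidean distance from point i to centroid j.
D = zeros(m, K);
for j = 1:K
    delta   = X - repmat(centroids(j, :), m, 1);
    D(:, j) = sum(delta .^ 2, 2);
end

% Row-wise minimum; on ties, min returns the first (lowest) index.
[~, indices] = min(D, [], 2);
disp(indices');
```

This trades a little extra memory (the m-by-K distance matrix) for fewer interpreted loop iterations, which may matter for larger data sets.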
knns/initCentroids.m
function centroids = initCentroids(X, K)
% Initialization: pick K distinct data points at random as the starting centroids.
randidx = randperm(size(X, 1), K);   % K unique row indices into X
centroids = X(randidx, :);
end
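The three files above cover one step each, but the listing does not show a loop that ties them together. The following is a minimal sketch of such a driver, assuming `initCentroids`, `getClosestCentroids`, and `computeCentroids` are on the MATLAB path. The function name `runKMeans`, its signature, and the stopping rule (stop when no assignment changes or after `maxIters` iterations) are assumptions for illustration, not the author's code.

```matlab
function [idx, centroids] = runKMeans(X, K, maxIters)
% runKMeans  Hypothetical driver; name and signature are assumptions.
%   X        m-by-n data matrix, one data point per row
%   K        number of clusters
%   maxIters maximum number of allowed iterations
    centroids = initCentroids(X, K);      % random data points as centroids
    idx = zeros(size(X, 1), 1);           % no assignments yet

    for iter = 1:maxIters
        newIdx = getClosestCentroids(X, centroids);   % assignment step
        if isequal(newIdx, idx)
            break;                        % no point switched clusters
        end
        idx = newIdx;
        centroids = computeCentroids(X, idx, K);      % update step
    end
end
```

A call might look like `[idx, centroids] = runKMeans(X, 2, 10);` for a data matrix `X`. Because the initialization is random, repeated runs can end in different local minima.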
knns/reportKNN.docx
1- Implement the above-mentioned K-means algorithm using MATLAB (it is highly recommended that you write a separate function for each step,
1. The inputs of your code will be:
a. The number of maximum allowed iterations
b. K (number of clusters)
2. Use the Forgy method in the initialization step
3. Use squared Euclidean distance in the assignment step to find the nearest mean to a data point.
4....