Module 4
Clustering:-
Introduction - Similarity measures - Clustering criteria - Distance functions - K-means clustering, Hierarchical Clustering, Density based clustering (DBSCAN)
Combining Multiple Learners:- Voting, Bagging, Boosting
By:
Sherry O. Panicker
MCA, M. Phil
MCA@NirmalaCollegeMuvattupuzha
Introduction
• Clustering, or cluster analysis, is a machine learning technique that groups an unlabelled dataset. It can be defined as "a way of grouping the data points into different clusters, consisting of similar data points. The objects with the possible similarities remain in a group that has less or no similarities with another group."
• Clustering is the process of grouping a set of data objects into multiple groups or clusters so that objects within a cluster have high similarity, but are very dissimilar to objects in other clusters.
• Dissimilarities and similarities are assessed based on the attribute values and often involve distance measures.
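The idea that similarity is assessed from attribute values through a distance measure can be illustrated with a minimal sketch. It assumes Euclidean distance as the measure and uses two small, made-up groups of 2-D points; the names `euclidean`, `mean_within`, `mean_between`, `cluster_a` and `cluster_b` are hypothetical and chosen only for this illustration.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two points given as equal-length tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def mean_within(cluster):
    """Average distance over all unordered pairs of points inside one cluster."""
    pairs = [(p, q) for i, p in enumerate(cluster) for q in cluster[i + 1:]]
    return sum(euclidean(p, q) for p, q in pairs) / len(pairs)


def mean_between(c1, c2):
    """Average distance over all pairs with one point taken from each cluster."""
    total = sum(euclidean(p, q) for p in c1 for q in c2)
    return total / (len(c1) * len(c2))


# Illustrative data only (an assumption, not taken from the slides):
# each tuple holds two attribute values of one object.
cluster_a = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5)]
cluster_b = [(8.0, 8.0), (8.5, 9.0), (9.0, 8.5)]

if __name__ == "__main__":
    print("within A :", mean_within(cluster_a))
    print("within B :", mean_within(cluster_b))
    print("between  :", mean_between(cluster_a, cluster_b))
```

For a good grouping, the within-cluster averages should come out much smaller than the between-cluster average, which is the "high similarity inside, dissimilar outside" property described above. Other distance measures could be substituted for `euclidean` without changing the rest of the sketch.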
Application areas of Clustering Techniques:
Clustering is used as a data mining tool in biology, security, business intelligence, and search. Typical applications include:
• Market segmentation
• Customer segmentation
• Image segmentation
• Statistical data analysis
• Social network analysis
• Anomaly detection
• Search engines
• Identification of cancer cells
• Land use
• Biology
Amazon uses clustering in its recommendation system to provide recommendations based on a user's past product searches. Netflix also uses this technique to recommend movies and series to its users based on their watch history.