Data mining is the process of extracting useful information from raw data. It is applied to tasks such as clustering, prediction analysis, and association rule generation, using a range of data mining tools and techniques. Among these approaches, clustering is one of the most effective and widely used techniques for extracting useful information from raw data.
Clustering is the method of grouping similar data points together and separating them from dissimilar ones, so that useful information can be analyzed from the dataset. There are many types of clustering, including density-based, hierarchical, and partitioning-based clustering. The k-means algorithm is an efficient, widely used partitioning-based algorithm for clustering the data in an input dataset.
In k-means clustering, each centroid is calculated as the arithmetic mean of the data points assigned to its cluster. The Euclidean distance from each data point to the centroids is then calculated, and each point is assigned to the cluster with the nearest centroid, so that similar points are grouped together and dissimilar points are separated. Prediction analysis is the method applied to the input dataset to predict current and future situations from that data.
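The two k-means steps described above (assigning points by Euclidean distance, then recomputing each centroid as an arithmetic mean) can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: the function names `euclidean`, `mean_point`, and `kmeans`, the random choice of initial centroids, and the `iterations` and `seed` parameters are all assumptions made for the example.

```python
import math
import random


def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def mean_point(points):
    """Arithmetic mean of a non-empty list of points, taken per coordinate."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(points, k, iterations=100, seed=0):
    """Plain k-means sketch: returns (centroids, labels)."""
    rng = random.Random(seed)
    # Assumption: initial centroids are k points sampled from the data.
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        labels = [
            min(range(k), key=lambda c: euclidean(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its cluster members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            # Keep the old centroid if a cluster ends up empty.
            new_centroids.append(mean_point(members) if members else centroids[c])
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, labels
```

Calling `kmeans(points, 2)` on a list of coordinate tuples returns the final centroids and one cluster index per point. Results depend on the initial centroids, so a different `seed` may produce a different clustering on less clearly separated data.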
In predictive analysis, clustering is first applied to group similar data and separate dissimilar data; a classification technique is then applied to the clustered data to classify it for prediction. A wide array of data mining techniques and tools exists, and these keep evolving to keep pace with modern innovations.
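The cluster-then-classify flow described above can be illustrated with a small sketch. The text does not say which classifier is applied to the clustered data, so this example assumes the simplest possible one: each cluster is labeled with the majority class of its members, and a new point is given the class of its nearest cluster. The names `nearest`, `cluster`, `fit_cluster_classifier`, and `predict`, and the explicit `initial_centroids` argument, are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def nearest(point, centroids):
    """Index of the centroid closest to `point` (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda c: math.dist(point, centroids[c]))


def cluster(points, centroids, iterations=50):
    """Tiny k-means loop starting from the given initial centroids."""
    for _ in range(iterations):
        labels = [nearest(p, centroids) for p in points]
        updated = []
        for c, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == c]
            # Mean of the members, or the old centroid if the cluster is empty.
            updated.append(
                tuple(sum(x) / len(members) for x in zip(*members)) if members else old
            )
        if updated == centroids:
            break  # converged
        centroids = updated
    return centroids, labels


def fit_cluster_classifier(points, classes, initial_centroids):
    """Cluster the points, then record the majority class of each cluster."""
    centroids, labels = cluster(points, list(initial_centroids))
    majority = {}
    for c in range(len(centroids)):
        votes = Counter(cls for cls, lab in zip(classes, labels) if lab == c)
        majority[c] = votes.most_common(1)[0][0] if votes else None
    return centroids, majority


def predict(point, centroids, majority):
    """Assumed classifier: return the majority class of the nearest cluster."""
    return majority[nearest(point, centroids)]
```

A majority vote per cluster is only one way to classify clustered data. Any other classifier could be trained on the clustered points instead, for example one classifier per cluster or a classifier that takes the cluster index as an extra feature.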