Performance Analysis of Different Data Mining Techniques over Heart Disease Dataset
Pages: 220-224
Abstract
Data mining is an analytic process designed to explore data (usually large amounts of data, typically business or market related, also known as “big data”) in search of consistent patterns and/or systematic relationships between variables, and then to validate the findings by applying the detected patterns to new subsets of data. The ultimate goal of data mining is prediction; predictive data mining is the most common type of data mining and the one with the most direct business applications. Classification trees are used to predict the membership of cases or objects in the classes of a categorical dependent variable from their measurements on one or more predictor variables. Classification tree analysis is one of the main techniques used in data mining. During my research, I analyzed various classification algorithms and compared their performance with respect to the time taken to build the model under different distance functions. The results are tested on a dataset taken from the UCI repository. The aim is to judge the efficiency of different data mining algorithms on the Heart Disease dataset and to determine the optimum algorithm. The performance analysis depends on many factors, including the validation mode, the distance function, and the nature of the dataset.
Keywords: Data Mining, Classification, Classification Techniques, Distance function, KEEL Tool, Performance Analysis.
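The kind of comparison described in the abstract, where the same classifier is run under different distance functions and both correctness and elapsed time are recorded, can be sketched as follows. This is a minimal illustration only: the abstract does not name the algorithms that were compared, so the k-nearest-neighbour classifier here is an assumption chosen because it depends directly on a distance function. The data points are made up for the example and are not the UCI Heart Disease data, and the function names (`knn_predict`, `evaluate`) are hypothetical.

```python
import math
import time
from collections import Counter


def euclidean(a, b):
    """Straight-line (L2) distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """City-block (L1) distance between two equal-length points."""
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_predict(train_X, train_y, query, k, distance):
    """Return the majority label among the k training points nearest to query."""
    neighbors = sorted(
        zip(train_X, train_y),
        key=lambda pair: distance(pair[0], query),
    )
    votes = Counter(label for _, label in neighbors[:k])
    return votes.most_common(1)[0][0]


def evaluate(train_X, train_y, test_X, test_y, k, distance):
    """Classify every test point; return (accuracy, elapsed seconds)."""
    start = time.perf_counter()
    predictions = [knn_predict(train_X, train_y, q, k, distance) for q in test_X]
    elapsed = time.perf_counter() - start
    correct = sum(p == t for p, t in zip(predictions, test_y))
    return correct / len(test_y), elapsed


# Made-up two-feature toy data (NOT the Heart Disease dataset).
TRAIN_X = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (8.0, 8.0), (8.5, 9.0), (9.0, 8.0)]
TRAIN_Y = [0, 0, 0, 1, 1, 1]
TEST_X = [(1.2, 1.4), (8.7, 8.2)]
TEST_Y = [0, 1]

if __name__ == "__main__":
    for name, fn in (("euclidean", euclidean), ("manhattan", manhattan)):
        accuracy, seconds = evaluate(TRAIN_X, TRAIN_Y, TEST_X, TEST_Y, 3, fn)
        print(f"{name}: accuracy={accuracy:.2f}, time={seconds:.6f}s")
```

On a real dataset the timing would normally be averaged over several runs and combined with a validation scheme such as k-fold cross-validation, since a single measurement on a small sample is dominated by noise.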
Article published in International Journal of Current Engineering and Technology, Vol. 4, No. 1 (Feb. 2014)