Attribute and Instance based Data Reduction Using Important Labeling
Pages: 651-654
Abstract
Bulk data is generated from various sources. Data reduction techniques shrink a dataset while preserving its important, representative data. A lot of work has been done on data reduction using machine learning. Data can be reduced in terms of attributes and in terms of instances, but existing approaches study attribute reduction and instance reduction independently. The proposed system reduces the data in terms of both attributes and instances. For attribute reduction, a feature selection technique is used: a filter method that keeps only the important attributes in the dataset. For instance reduction, mapping-based granulation and an important-instance labeling technique are used. The mapping technique reduces the multidimensional data to one-dimensional data, and granules of the one-dimensional data are then created using the k-means algorithm. Based on the Hausdorff distance and the data crowding degree, unimportant instances are filtered from the dataset. The system will be tested on multiple datasets from the UCI repository. Its efficiency will be measured in terms of classification accuracy and execution time.
Keywords: Data reduction, dimension reduction, feature selection, granulation, important labeling, kNN
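
The pipeline outlined in the abstract (filter-based attribute selection, mapping to one dimension, k-means granulation, then instance filtering) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the function names are hypothetical, the filter score is taken to be the absolute Pearson correlation with the class label, and the one-dimensional mapping is taken to be the Euclidean norm of each min-max scaled row. The abstract does not define how the Hausdorff distance and crowding degree are combined, so the sketch does not reproduce them; it substitutes a plain placeholder rule that keeps the instances nearest each granule centre.

```python
import numpy as np


def select_features(X, y, k):
    """Filter-style selection (assumed score): |Pearson correlation| with y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    denom[denom == 0] = np.inf  # constant columns receive a score of 0
    scores = np.abs(Xc.T @ yc) / denom
    return np.sort(np.argsort(scores)[::-1][:k])


def map_to_1d(X):
    """Assumed mapping: norm of each row after min-max scaling the columns."""
    X = np.asarray(X, dtype=float)
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0  # avoid division by zero for constant columns
    return np.linalg.norm((X - X.min(axis=0)) / span, axis=1)


def assign(values, centres):
    """Index of the nearest centre for every scalar value."""
    return np.argmin(np.abs(values[:, None] - centres[None, :]), axis=1)


def kmeans_1d(values, n_granules, n_iter=50):
    """Plain Lloyd's k-means on scalars, initialised at evenly spaced quantiles."""
    centres = np.quantile(values, np.linspace(0, 1, n_granules + 2)[1:-1])
    for _ in range(n_iter):
        labels = assign(values, centres)
        new = np.array([
            values[labels == g].mean() if np.any(labels == g) else centres[g]
            for g in range(n_granules)
        ])
        if np.allclose(new, centres):
            break
        centres = new
    return assign(values, centres), centres


def reduce_instances(values, labels, centres, keep_fraction=0.5):
    """Placeholder importance rule (NOT the Hausdorff/crowding criterion):
    keep the fraction of each granule that lies nearest its centre."""
    keep = []
    for g in range(len(centres)):
        idx = np.flatnonzero(labels == g)
        if idx.size == 0:
            continue
        n_keep = max(1, int(np.ceil(keep_fraction * idx.size)))
        order = np.argsort(np.abs(values[idx] - centres[g]), kind="stable")
        keep.extend(idx[order[:n_keep]].tolist())
    return np.sort(np.array(keep, dtype=int))


# Synthetic two-class data: columns 0-1 carry the class signal, 2-4 are noise.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 100)
informative = y[:, None] * 3.0 + rng.normal(size=(200, 2))
X = np.hstack([informative, rng.normal(size=(200, 3))])

cols = select_features(X, y, k=2)                  # attribute reduction
values = map_to_1d(X[:, cols])                     # multi-dimensional -> 1-D
labels, centres = kmeans_1d(values, n_granules=4)  # granulation
kept = reduce_instances(values, labels, centres)   # instance reduction
X_reduced, y_reduced = X[np.ix_(kept, cols)], y[kept]
print(X.shape, "->", X_reduced.shape)
```

The reduced pair `X_reduced`, `y_reduced` could then be passed to a classifier such as kNN to compare classification accuracy and execution time against the full dataset, which is the evaluation the abstract describes.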