Mahalanobis Distance-based Over-Sampling Technique
Pages: 287-291
Abstract
In data classification, instances are distributed among multiple classes. When this distribution is uneven across classes, the data are skewed, and this skewness constitutes data imbalance. An imbalanced dataset poses problems for classification and reduces classification accuracy. The problem is most acute for classification of the minority class. A number of techniques have been proposed for balancing a dataset without hampering the classification accuracy of the majority class. Adaptive Mahalanobis Distance-based Over-sampling (AMDO) is an over-sampling strategy that works on mixed-type datasets. In the proposed approach, the efficiency of the AMDO technique is improved with the help of Principal Component Analysis (PCA). This technique uses Generalized Singular Value Decomposition (GSVD) for mixed-type data. The experimental analysis will be performed on multiple multi-class imbalanced benchmark datasets. System performance is measured in terms of accuracy and execution time.
Keywords: Imbalanced data, data skewness, oversampling, Mahalanobis distance, hybrid data
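To illustrate the idea behind Mahalanobis distance-based over-sampling mentioned in the abstract, the following is a minimal sketch, not the authors' AMDO algorithm. It assumes purely numeric features (no mixed-type handling, no PCA or GSVD step) and simply generates synthetic minority samples that lie at the same Mahalanobis distance from the class mean as a randomly chosen seed sample. The function names `mahalanobis` and `oversample_minority`, the `ridge` regularisation term, and the `seed` parameter are hypothetical choices made for this example.

```python
import numpy as np


def mahalanobis(x, mean, cov):
    """Mahalanobis distance of point x from a distribution (mean, cov)."""
    diff = x - mean
    return float(np.sqrt(diff @ np.linalg.solve(cov, diff)))


def oversample_minority(X_min, n_new, ridge=1e-6, seed=0):
    """Generate n_new synthetic samples for one minority class.

    Each synthetic sample has the same Mahalanobis distance from the
    class mean as a randomly picked seed sample, but a random direction.
    Assumption: X_min is an (n_samples, n_features) numeric array.
    """
    rng = np.random.default_rng(seed)
    n_features = X_min.shape[1]

    mean = X_min.mean(axis=0)
    # Small ridge keeps the covariance positive definite (assumption).
    cov = np.cov(X_min, rowvar=False) + ridge * np.eye(n_features)
    L = np.linalg.cholesky(cov)  # cov = L @ L.T

    # Pick seed samples (with replacement) from the minority class.
    seeds = X_min[rng.integers(0, len(X_min), size=n_new)]

    # Whiten: z = L^{-1} (x - mean); then ||z|| is the Mahalanobis distance.
    Z = np.linalg.solve(L, (seeds - mean).T).T
    radii = np.linalg.norm(Z, axis=1, keepdims=True)

    # Draw random unit directions in the whitened space.
    directions = rng.normal(size=(n_new, n_features))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)

    # Map back to the original space: x' = mean + L z'.
    X_new = mean + (radii * directions) @ L.T
    return X_new, seeds
```

Because the new points are built in the whitened space and mapped back through the Cholesky factor, each one stays on the same covariance-shaped contour as its seed. A fuller implementation along the lines of the abstract would additionally need to handle categorical attributes and the dimensionality-reduction step.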