Local Deviation Coefficient based Outlier Detection for Scattered Data
Pages : 863-867
Abstract
Outlier detection is applied in a variety of domains, such as intrusion detection, health care monitoring, and human gait analysis. There are two main types of outliers: global and local. Global outliers are the extreme values in a dataset, whereas local outliers lie within the overall range of the data but deviate markedly from the values around them. Much work has been done on outlier detection. LOF, incremental LOF, and memory-efficient LOF for streaming data are well-known techniques. Existing approaches focus on the degree of deviation of local outlier points from clustered data and fail to capture the degree of dispersion. The proposed system focuses on finding local outliers in scattered data. For outlier detection, both the degree of deviation and the degree of dispersion are calculated. To make the system memory efficient, dimensionality reduction and undersampling are applied. Principal component analysis (PCA) is used for dimensionality reduction. For undersampling, a rough clustering algorithm based on multi-level queries eliminates points that are safely non-outliers, which reduces the number of points passed on for further processing. A density-based local outlier detection algorithm for scattered data (E2DLOS) is then applied to the remaining non-safe points, and the top m outliers are identified. The system will be tested on various datasets downloaded from the UCI repository.
Keywords: Outlier detection, rough clustering, scattered data, PCA, undersampling, feature extraction, dimension reduction
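
For orientation, the final step described in the abstract (scoring points by local density and reporting the top m outliers) can be illustrated with the classic LOF score that the abstract names as a baseline. This is only a minimal sketch of standard LOF, not the proposed E2DLOS algorithm, and it omits the PCA and rough-clustering stages. The function names `lof_scores` and `top_m_outliers` are hypothetical, and the sketch assumes exactly k neighbours per point (ties in the k-distance are not expanded) and no duplicate points.

```python
import math

def knn(points, i, k):
    """Return the k nearest neighbours of points[i] as (distance, index) pairs."""
    dists = sorted(
        (math.dist(points[i], points[j]), j)
        for j in range(len(points)) if j != i
    )
    return dists[:k]

def lof_scores(points, k):
    """Classic LOF score per point; values well above 1 suggest a local outlier."""
    n = len(points)
    neighbours = [knn(points, i, k) for i in range(n)]
    # k-distance: distance from each point to its k-th nearest neighbour
    k_dist = [nb[-1][0] for nb in neighbours]

    # Local reachability density: inverse of the mean reachability distance
    lrd = []
    for i in range(n):
        reach = sum(max(k_dist[j], d) for d, j in neighbours[i])
        reach = max(reach, 1e-12)  # guard against division by zero
        lrd.append(len(neighbours[i]) / reach)

    # LOF: mean density of the neighbours relative to the point's own density
    return [
        sum(lrd[j] for _, j in neighbours[i]) / (len(neighbours[i]) * lrd[i])
        for i in range(n)
    ]

def top_m_outliers(points, k, m):
    """Indices of the m points with the highest LOF scores (hypothetical helper)."""
    scores = lof_scores(points, k)
    return sorted(range(len(points)), key=lambda i: scores[i], reverse=True)[:m]
```

As a usage example, calling `top_m_outliers` on a small tight cluster plus one distant point, with `k=3` and `m=1`, should return the index of the distant point, since its neighbours are far denser than it is.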