Disease detection for multilabel big dataset using MLAM, Naive Bayes, Adaboost classification
Pages : 86-91
Abstract
Modern medical information systems in Intensive Care Units (ICUs) can record patient events in relational databases every second. Mining these enormous volumes of medical data benefits both caregivers and patients. Given a set of electronic patient records, a system that efficiently assigns disease labels can facilitate medical database management and benefit other researchers, e.g. pathologists. Because the data grows day by day, processing it becomes increasingly difficult. In this paper, a framework is proposed that achieves this goal using Hadoop MapReduce. A patient's medical chart and note data are used to extract distinctive features. To encode patient features, a Bag-of-Words encoding method is applied to both chart and note data. This paper also proposes a model that takes into account both global information and local correlations between diseases. Associated diseases are represented by a graph structure that is embedded in the proposed sparsity-based framework. The proposed algorithm captures disease relevance when labeling disease codes, rather than making an independent decision for each specific disease. In addition, Naive Bayes and AdaBoost classifiers are used for disease classification for evaluation purposes.
Keywords: ICD code labeling, multi-label learning, sparsity-based regularization, disease correlation embedding, MapReduce.
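
The Bag-of-Words encoding mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the paper's implementation: the whitespace tokenizer, the function names `build_vocabulary` and `encode_bag_of_words`, and the sample note snippets are all assumptions made for illustration. It only shows the general idea of turning free-text notes into fixed-length count vectors that a classifier such as Naive Bayes or AdaBoost could consume.

```python
from collections import Counter


def build_vocabulary(documents):
    """Map each distinct token to a column index, in sorted order."""
    tokens = sorted({tok for doc in documents for tok in doc.lower().split()})
    return {tok: idx for idx, tok in enumerate(tokens)}


def encode_bag_of_words(document, vocabulary):
    """Return a count vector for one document; unknown tokens are ignored."""
    counts = Counter(document.lower().split())
    vector = [0] * len(vocabulary)
    for token, count in counts.items():
        if token in vocabulary:
            vector[vocabulary[token]] = count
    return vector


# Hypothetical note snippets, invented for illustration only.
notes = [
    "patient reports chest pain",
    "chest x-ray shows no acute disease",
    "pain managed pain resolved",
]

vocabulary = build_vocabulary(notes)
vectors = [encode_bag_of_words(note, vocabulary) for note in notes]
```

Each row of `vectors` has one entry per vocabulary token, so every patient note maps to a vector of the same length regardless of how long the note is. Chart data would presumably need to be discretized into tokens first; how the paper does that is not stated in the abstract.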