OUTLIER DETECTION ON HIGH DIMENSIONAL DATA USING MINIMUM VECTOR VARIANCE (MVV)
Abstract
High-dimensional data arise in practice when the number of variables p exceeds the number of observations n. A common problem as the number of dimensions increases is that data points increasingly resemble outliers. Outliers are observations that do not follow the pattern of the data distribution and lie far from the center of the data. They need to be detected because they can distort the results of an analysis. One method used to detect outliers is the Mahalanobis distance. To obtain a robust Mahalanobis distance, the Minimum Vector Variance (MVV) method is used. This study compares the MVV method with the classical Mahalanobis distance method for detecting outliers in non-invasive blood glucose level data, in both the p>n and n>p cases. The test results show that the MVV method performs better when n>p. MVV is more effective than the classical method at identifying the minimum data group and the outlying data points.
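The contrast between the classical and the robust Mahalanobis distance can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`mahalanobis_sq`, `mvv_estimate`), the default subset size `h`, the random restarts, the stopping tolerance, and the synthetic data are all assumptions made for illustration. The sketch searches for a subset of `h` observations whose covariance matrix S has a small vector variance Tr(S²), using a simple concentration-style iteration, and then measures distances from that subset's mean and covariance. It targets the n>p case; for p>n the subset covariance is singular, and the pseudo-inverse used here is only a numerical fallback.

```python
import numpy as np

def mahalanobis_sq(X, center, cov):
    """Squared Mahalanobis distance of each row of X from `center`."""
    diff = X - center
    inv = np.linalg.pinv(cov)  # assumption: pseudo-inverse tolerates a singular cov
    return np.einsum("ij,jk,ik->i", diff, inv, diff)

def mvv_estimate(X, h=None, n_starts=20, max_iter=100, seed=0):
    """Hypothetical helper: approximate MVV location and scatter.

    Keeps the h-subset with the smallest vector variance Tr(S^2)
    found over several random starts.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    if h is None:
        h = (n + p + 1) // 2  # assumed default subset size
    best = None
    for _ in range(n_starts):
        idx = rng.choice(n, size=h, replace=False)
        prev = np.inf
        for _ in range(max_iter):
            center = X[idx].mean(axis=0)
            cov = np.cov(X[idx], rowvar=False)
            vv = np.trace(cov @ cov)  # vector variance of the subset
            if best is None or vv < best[0]:
                best = (vv, center, cov)
            if prev - vv < 1e-12:  # no further improvement
                break
            prev = vv
            # Concentration step: keep the h points closest to the current fit.
            d2 = mahalanobis_sq(X, center, cov)
            idx = np.argsort(d2)[:h]
    return best[1], best[2]

# Synthetic illustration (not the blood glucose data): 90 clean points
# plus 10 shifted points in p = 3 dimensions.
rng = np.random.default_rng(42)
clean = rng.normal(size=(90, 3))
shifted = rng.normal(size=(10, 3)) + 8.0
X = np.vstack([clean, shifted])

# Classical distances use the mean and covariance of all observations.
d2_classical = mahalanobis_sq(X, X.mean(axis=0), np.cov(X, rowvar=False))

# Robust distances use the MVV-style location and scatter.
center_r, cov_r = mvv_estimate(X)
d2_robust = mahalanobis_sq(X, center_r, cov_r)
```

In this sketch the classical estimates are computed from all observations, including the shifted ones, so the shifted points inflate the covariance and shrink their own distances. The robust distances are computed from a subset that the iteration tends to concentrate on the bulk of the data, which is the intuition behind preferring a robust Mahalanobis distance for outlier detection.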
Copyright (c) 2022 Andi Harismahyanti A., Indahwati Indahwati, Anwar Fitrianto, Erfiani Erfiani
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors who publish with this Journal agree to the following terms:
- Authors retain copyright and grant the journal the right of first publication, with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work’s authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal’s published version of the work, with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g. in institutional repositories or on their websites) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published works.