Paper
Conformal prediction based on principal component analysis for high-dimensional outlier detection
27 September 2024
Xiaoyu Qian, Jinru Wu, Youwu Lin
Proceedings Volume 13281, International Conference on Cloud Computing, Performance Computing, and Deep Learning (CCPCDL 2024); 1328106 (2024) https://doi.org/10.1117/12.3050772
Event: International Conference on Cloud Computing, Performance Computing, and Deep Learning, 2024, Zhengzhou, China
Abstract
Current outlier detection methods have many limitations in high-dimensional settings. Traditional statistical approaches rely on strong assumptions, which limits their practicality and generality. Machine learning methods perform better, but they suffer from low interpretability and reliability because of their complex mechanisms and the absence of confidence estimation. Outlier detection methods based on principal component analysis (PCA) have shown some advantages by extracting important features from high-dimensional data, but they have not entirely solved these problems. Conformal prediction, a finite-sample, distribution-free uncertainty quantification method recently applied to outlier detection, produces a set-valued prediction with a Type-I error guarantee and false discovery rate (FDR) control. This paper proposes PCA-CP, a new distribution-free method for high-dimensional outlier detection that combines principal component analysis and conformal prediction. PCA-CP addresses the shortcomings of previous methods, offering high generality and reliability while maintaining high performance. Experiments on simulated and real data show that PCA-CP outperforms previous methods, achieving higher power and a lower false discovery rate.
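The abstract does not give the details of PCA-CP, so the following is only a minimal sketch of how PCA and conformal prediction can be combined for outlier detection, not the authors' implementation. It assumes a split-conformal setup in which the PCA reconstruction error serves as the nonconformity score, conformal p-values are computed against a held-out calibration set of inliers, and the Benjamini-Hochberg procedure is applied to the p-values for FDR control. The function names, the choice of score, and the synthetic data in `demo` are all illustrative assumptions.

```python
import numpy as np


def fit_pca(X, k):
    """Fit PCA by SVD; return the column means and the top-k principal directions."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:k]


def reconstruction_error(X, mean, components):
    """Squared residual after projecting onto the PCA subspace (assumed score)."""
    centered = X - mean
    projected = centered @ components.T @ components
    return np.sum((centered - projected) ** 2, axis=1)


def conformal_pvalues(cal_scores, test_scores):
    """Split-conformal p-value: (1 + #{calibration scores >= test score}) / (n + 1)."""
    cal_sorted = np.sort(cal_scores)
    n = len(cal_sorted)
    n_greater_equal = n - np.searchsorted(cal_sorted, test_scores, side="left")
    return (1 + n_greater_equal) / (n + 1)


def benjamini_hochberg(pvals, alpha):
    """Return a boolean mask of rejections under the Benjamini-Hochberg procedure."""
    m = len(pvals)
    order = np.argsort(pvals)
    below = pvals[order] <= alpha * np.arange(1, m + 1) / m
    reject = np.zeros(m, dtype=bool)
    if below.any():
        last = np.max(np.nonzero(below)[0])
        reject[order[: last + 1]] = True
    return reject


def demo(seed=0, alpha=0.1):
    """Synthetic example: inliers near a 3-D subspace of R^50, outliers off it."""
    rng = np.random.default_rng(seed)
    loadings = rng.normal(size=(3, 50))

    def inliers(n):
        return rng.normal(size=(n, 3)) @ loadings + 0.1 * rng.normal(size=(n, 50))

    train, cal = inliers(500), inliers(500)
    test = np.vstack([inliers(100), 3.0 * rng.normal(size=(20, 50))])
    is_outlier = np.arange(len(test)) >= 100

    mean, components = fit_pca(train, k=3)           # fit on training inliers only
    cal_scores = reconstruction_error(cal, mean, components)
    test_scores = reconstruction_error(test, mean, components)
    pvals = conformal_pvalues(cal_scores, test_scores)
    return pvals, benjamini_hochberg(pvals, alpha), is_outlier
```

Calling `pvals, reject, is_outlier = demo()` returns the conformal p-values, the points flagged as outliers at the chosen FDR level, and the ground-truth labels of the synthetic test set. The calibration set is kept separate from the data used to fit PCA because the validity of split-conformal p-values relies on the calibration scores being exchangeable with the scores of test inliers.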
(2024) Published by SPIE. Downloading of the abstract is permitted for personal use only.
Xiaoyu Qian, Jinru Wu, and Youwu Lin "Conformal prediction based on principal component analysis for high-dimensional outlier detection", Proc. SPIE 13281, International Conference on Cloud Computing, Performance Computing, and Deep Learning (CCPCDL 2024), 1328106 (27 September 2024); https://doi.org/10.1117/12.3050772
KEYWORDS
Principal component analysis
Calibration
Machine learning
Data modeling
Reliability
Matrices
Statistical methods
