Meteorological modeling applies data mining techniques to data captured from multiple sources in order to predict environmental changes. The machine learning techniques most commonly used to process meteorological data are decision trees, rule-based methods, neural networks, naive Bayes, Bayesian belief networks, and support vector machines. These techniques require accurate data to produce effective models. Meteorological datasets can contain outliers and errors that significantly degrade the accuracy of the generated models, which many sectors of society rely on, including agriculture, natural disaster management, and meteorological forecasting. This paper proposes a method for eliminating outliers from meteorological data to improve model accuracy: a blind thresholding algorithm is applied to the principal components (PCs) obtained from L1-norm and L2-norm Principal Component Analysis (PCA) to identify and discard outliers in the dataset.
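The general idea of thresholding principal-component scores to discard outliers can be illustrated with a minimal sketch. It makes several assumptions that do not come from the paper: it uses only conventional L2-norm PCA (computed by SVD) and omits L1-norm PCA, and it substitutes a simple median/MAD rule with a hypothetical cutoff `k` for the blind thresholding algorithm, whose details are not given here. The function names and the synthetic data are illustrative only.

```python
import numpy as np

def pca_scores(X, n_components=2):
    """Project mean-centered data onto its leading L2-norm principal components."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def flag_outliers(scores, k=3.5):
    """Flag rows whose score on any component lies more than k robust
    deviations from the median (assumed rule, not the paper's algorithm)."""
    med = np.median(scores, axis=0)
    mad = np.median(np.abs(scores - med), axis=0)
    mad = np.where(mad == 0, 1e-12, mad)  # avoid division by zero
    robust_z = 0.6745 * np.abs(scores - med) / mad
    return (robust_z > k).any(axis=1)

# Synthetic stand-in for a meteorological table: 200 samples, 4 variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
X[:3] += 25.0  # inject three gross errors

mask = flag_outliers(pca_scores(X))
X_clean = X[~mask]  # samples retained for model training
```

In this sketch the threshold is derived from the score distribution itself rather than from labeled examples, which is one plausible reading of "blind"; the cutoff `k = 3.5` is a common rule of thumb and would need tuning on real data.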