Laboratory measurements, paleontological data, and well logs are often used in mineralogical and chemical analyses to classify rock samples. Employing digital intelligence techniques may enhance the accuracy of classification predictions while also speeding up the whole classification process. We aim to develop a comprehensive approach for categorizing igneous rock types based on their global geochemical characteristics. Our strategy integrates advanced clustering, classification, data mining, and statistical methods, applied to a worldwide geochemical data set of ~25,000 points from 15 igneous rock types. In this pioneering study, we employed hierarchical clustering, linear projection analysis, and multidimensional scaling to determine the frequency distribution and oxide content of igneous rock types globally. The study included eight classifiers: Logistic Regression (LR), Gradient Boosting (GB), Random Forest (RF), K-Nearest Neighbors (KNN), Support Vector Machine (SVM), Artificial Neural Network (ANN), and two ensemble-based classifier models, EN-1 and EN-2. EN-1 aggregated the predictions of LR, GB, and RF, whereas EN-2 aggregated the predictions of all ML models used in our study. EN-2 achieved an accuracy of 99.2%, ANN 98.2%, and EN-1 98%. EN-2 provided the best performance: its receiver operating characteristic (ROC) curve rose highest at the outset and remained highest over the longest range. Based on feature ranking, SiO2 was the most important feature, followed by K2O and Na2O. Our findings indicate that ensemble models enhance the accuracy and reliability of predictions by effectively capturing diverse patterns and correlations within the data. Consequently, this leads to more precise rock typing globally.
Keywords: Data mining; Ensemble-based method; Global distribution; Machine learning; Volcanic rocks.
Copyright © 2024 Elsevier B.V. All rights reserved.
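As an illustration of the EN-1 idea described in the abstract (an ensemble aggregating LR, GB, and RF), the following is a minimal sketch using scikit-learn. The abstract does not state how the predictions were aggregated, so the soft-voting scheme here is an assumption, as are all hyperparameters. The data are synthetic stand-ins generated with `make_classification`, not the geochemical data set, so the printed accuracy says nothing about the reported results.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    VotingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for oxide features (e.g. SiO2, K2O, Na2O, ...).
# Assumption: 10 numeric features and 3 classes, purely for illustration.
X, y = make_classification(
    n_samples=600,
    n_features=10,
    n_informative=6,
    n_redundant=0,
    n_classes=3,
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# EN-1-style ensemble: LR + GB + RF.
# Assumption: soft voting (averaged class probabilities); the source
# only says the three models' outputs were aggregated.
en1 = VotingClassifier(
    estimators=[
        ("lr", make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))),
        ("gb", GradientBoostingClassifier(n_estimators=50, random_state=0)),
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ],
    voting="soft",
)
en1.fit(X_train, y_train)

# Held-out accuracy on the synthetic data only.
accuracy = en1.score(X_test, y_test)
print(f"EN-1-style ensemble accuracy on synthetic data: {accuracy:.3f}")
```

An EN-2-style variant would, under the same assumption, add KNN, SVM, and ANN estimators to the `estimators` list; note that soft voting requires each member to expose class probabilities (for example `SVC(probability=True)`).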