Nikolaos A. Michael , Mustafa A. Al Ibrahim , Christian Scheibe , Neil Craigie
Applied Geochemistry, volume 186, Article 106355. Published 2025-03-11. DOI: 10.1016/j.apgeochem.2025.106355. URL: https://www.sciencedirect.com/science/article/pii/S0883292725000782
Best practices of utilizing principal component analysis in chemostratigraphic studies
Principal Component Analysis (PCA) is a powerful tool for interpreting the chemical composition of geological rock samples. However, what are the best practices when dealing with these data and workflows? A key step in analyzing PCA variables is the associated eigenvector analysis, which is used to determine element-mineral links and relationships between elements and geological conditions (e.g. depositional environment, diagenesis, weathering). We present two examples to demonstrate the usefulness of the technique: one from carbonate sediments, the other from siliciclastics.
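As a rough illustration of what eigenvector analysis looks like in practice, the sketch below runs PCA on a small synthetic element table and reads off the PC1 loadings. Everything here is an assumption made for illustration: the element list, the made-up "carbonate fraction" that drives the data, and the choice to standardize each element and decompose the correlation matrix. The paper's actual preprocessing may differ.

```python
import numpy as np

def pca_eigenvectors(X):
    """Return (eigenvalues, eigenvectors) of the correlation matrix of X.

    X has shape (n_samples, n_elements). Columns of the returned eigenvector
    matrix are the principal axes, sorted by decreasing eigenvalue.
    """
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)  # standardize each element
    corr = (Z.T @ Z) / (len(Z) - 1)                   # correlation matrix
    evals, evecs = np.linalg.eigh(corr)               # eigh returns ascending order
    order = np.argsort(evals)[::-1]
    return evals[order], evecs[:, order]

# Synthetic, purely illustrative data (not from the paper): a hypothetical
# carbonate fraction drives Ca and Sr up and Si and Al down.
rng = np.random.default_rng(0)
n = 200
carbonate = rng.uniform(0.0, 1.0, size=n)
elements = ["Ca", "Sr", "Si", "Al"]
X = np.column_stack([
    carbonate + rng.normal(0, 0.05, n),        # Ca
    carbonate + rng.normal(0, 0.05, n),        # Sr
    (1 - carbonate) + rng.normal(0, 0.05, n),  # Si
    (1 - carbonate) + rng.normal(0, 0.05, n),  # Al
])

evals, evecs = pca_eigenvectors(X)
pc1 = dict(zip(elements, evecs[:, 0]))  # PC1 loading of each element
print(pc1)
```

In this toy case Ca and Sr load on PC1 with one sign and Si and Al with the opposite sign, which is the kind of grouping one would then try to relate to mineralogy. The overall sign of an eigenvector is arbitrary, so only the relative signs carry meaning.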
The second part of the paper focuses on how eigenvectors and principal components (PCs) behave as the dataset changes. To this end, thousands of experiments were performed on different lithologies and data subsets from carbonate, siliciclastic and mixed carbonate-siliciclastic sediments, to understand how the relative position of the elements in eigenvector space shifts when the quantity of data is increased or decreased. From these experiments we deduced best practices for undertaking such analysis in the future.
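One way such a subsampling experiment could be set up is sketched below. This is not the authors' procedure: the synthetic six-variable dataset, the function names `principal_axes` and `stability`, and the use of the absolute cosine between a subset eigenvector and the full-data eigenvector as a stability score are all assumptions chosen to keep the example small.

```python
import numpy as np

def principal_axes(X):
    """Eigenvectors of the correlation matrix, columns sorted by decreasing eigenvalue."""
    evals, evecs = np.linalg.eigh(np.corrcoef(X, rowvar=False))
    return evecs[:, np.argsort(evals)[::-1]]

def stability(X, pc, n_sub, n_trials, rng):
    """Mean |cosine| between full-data axis `pc` and the same axis from random subsets.

    The absolute value removes the arbitrary sign of each eigenvector.
    A value near 1 means the axis barely moves when data are removed.
    """
    ref = principal_axes(X)[:, pc]
    sims = []
    for _ in range(n_trials):
        idx = rng.choice(len(X), size=n_sub, replace=False)
        sims.append(abs(ref @ principal_axes(X[idx])[:, pc]))
    return float(np.mean(sims))

# Synthetic data (illustrative only): one strong factor, one weaker factor,
# and independent noise on six hypothetical variables.
rng = np.random.default_rng(1)
n = 2000
f1 = rng.normal(0, 3.0, n)
f2 = rng.normal(0, 1.5, n)
a = np.array([1, 1, 1, -1, -1, -1])
b = np.array([1, -1, 0, 1, -1, 0])
X = np.outer(f1, a) + np.outer(f2, b) + rng.normal(0, 1.0, (n, 6))

for n_sub in (30, 100, 1000):
    s1 = stability(X, 0, n_sub, 100, rng)  # PC1
    s4 = stability(X, 3, n_sub, 100, rng)  # PC4
    print(f"n={n_sub:5d}  PC1 stability={s1:.3f}  PC4 stability={s4:.3f}")
```

In this toy setup the higher-order axes have nearly equal eigenvalues, so their directions are poorly determined by small subsets, while PC1 is separated from the rest by a large eigenvalue gap.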
From the experiments, we demonstrate that a stable model for PC1 and PC2 (i.e. the first and second most important sources of statistical variation) exists with only 100 samples. For higher-order PCs (PC3-PC6), thousands of samples are sometimes required for a stable model. This implies that the results of PCA cannot be expected to be the same in each study with respect to higher-order PCs. A geological interpretation can be transferred from one study to another only if the eigenvectors from the reference dataset are applied to the new dataset, and only if the new data are represented in the original study.
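The idea of applying reference eigenvectors to a new dataset can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's workflow: the helper names (`fit_reference`, `project`, `within_reference_range`) are hypothetical, the data are synthetic, and the per-element min-max test is only one crude stand-in for checking that new samples are "represented" in the reference study.

```python
import numpy as np

def fit_reference(X_ref):
    """Store what is needed to reuse a reference PCA on other datasets."""
    evals, evecs = np.linalg.eigh(np.corrcoef(X_ref, rowvar=False))
    order = np.argsort(evals)[::-1]
    return {
        "mean": X_ref.mean(axis=0),
        "std": X_ref.std(axis=0, ddof=1),
        "axes": evecs[:, order],     # reference eigenvectors as columns
        "min": X_ref.min(axis=0),
        "max": X_ref.max(axis=0),
    }

def project(model, X_new):
    """Scores of X_new on the reference axes, scaled with the reference mean and std."""
    return ((X_new - model["mean"]) / model["std"]) @ model["axes"]

def within_reference_range(model, X_new):
    """Per sample: True if every element lies inside the reference min-max range."""
    return np.all((X_new >= model["min"]) & (X_new <= model["max"]), axis=1)

# Synthetic reference and new datasets (illustrative only).
rng = np.random.default_rng(2)
mix = rng.normal(size=(4, 4))
X_ref = rng.normal(size=(300, 4)) @ mix
X_new = rng.normal(size=(20, 4)) @ mix

model = fit_reference(X_ref)
scores_new = project(model, X_new)            # comparable with reference scores
covered = within_reference_range(model, X_new)
print(scores_new[:3])
print(f"{covered.sum()} of {len(covered)} new samples fall inside the reference range")
```

The design point is that the new data are never used to refit the axes or the scaling. Refitting PCA on the new dataset would produce a different set of eigenvectors, so scores from the two studies would no longer refer to the same axes.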
About the journal:
Applied Geochemistry is an international journal devoted to publication of original research papers, rapid research communications and selected review papers in geochemistry and urban geochemistry which have some practical application to an aspect of human endeavour, such as the preservation of the environment, health, waste disposal and the search for resources. Papers on applications of inorganic, organic and isotope geochemistry and geochemical processes are therefore welcome provided they meet the main criterion. Spatial and temporal monitoring case studies are only of interest to our international readership if they present new ideas of broad application.
Topics covered include: (1) Environmental geochemistry (including natural and anthropogenic aspects, and protection and remediation strategies); (2) Hydrogeochemistry (surface and groundwater); (3) Medical (urban) geochemistry; (4) The search for energy resources (in particular unconventional oil and gas or emerging metal resources); (5) Energy exploitation (in particular geothermal energy and CCS); (6) Upgrading of energy and mineral resources where there is a direct geochemical application; and (7) Waste disposal, including nuclear waste disposal.