Best practices of utilizing principal component analysis in chemostratigraphic studies

IF 3.1 · Zone 3 (Earth Sciences) · JCR Q1 (Geochemistry & Geophysics)
Nikolaos A. Michael, Mustafa A. Al Ibrahim, Christian Scheibe, Neil Craigie
Journal: Applied Geochemistry, Volume 186, Article 106355
Published: 2025-03-11
DOI: 10.1016/j.apgeochem.2025.106355
URL: https://www.sciencedirect.com/science/article/pii/S0883292725000782
Open access: no
Citations: 0

Abstract


Principal Component Analysis (PCA) is a powerful tool for interpreting the chemical composition of geological rock samples. However, what are the best practices when dealing with these data and workflows? A key step in the analysis of PCA variables is the accompanying eigenvector analysis, which is used to determine element-mineral links and the relationships between elements and geological conditions (e.g. depositional environment, diagenesis, weathering). We present two examples to demonstrate the usefulness of the technique: one from carbonate sediments, the other from siliciclastics.
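As an illustration of the eigenvector analysis described above, the following is a minimal sketch in Python and not the authors' workflow. It assumes that PCA is computed from the correlation matrix of element concentrations. The element names, the synthetic "carbonate" and "clay" signals, and the helper `pca_eigenvectors` are hypothetical and exist only for this example.

```python
import numpy as np

def pca_eigenvectors(X):
    """Eigenvalues (descending) and eigenvectors (as columns) of the
    correlation matrix of X (rows = samples, columns = elements)."""
    corr = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)   # eigh: symmetric matrices
    order = np.argsort(eigvals)[::-1]         # largest eigenvalue first
    return eigvals[order], eigvecs[:, order]

# Synthetic data (assumption): three elements follow a hypothetical
# "carbonate" signal and two follow a hypothetical "clay" signal.
rng = np.random.default_rng(0)
n_samples = 200
carbonate = rng.normal(size=n_samples)
clay = rng.normal(size=n_samples)
elements = ["Ca", "Mg", "Sr", "Al", "K"]
signals = [carbonate, carbonate, carbonate, clay, clay]
X = np.column_stack([s + 0.1 * rng.normal(size=n_samples) for s in signals])

eigvals, eigvecs = pca_eigenvectors(X)
explained = eigvals / eigvals.sum()

# Each row of eigvecs holds one element's loadings on PC1, PC2, ...
for name, loading in zip(elements, eigvecs):
    print(f"{name:>2}  PC1={loading[0]:+.2f}  PC2={loading[1]:+.2f}")
print("explained variance:", np.round(explained, 3))
```

In this synthetic case, elements that share a source plot together in eigenvector space. With real data, linking such groups to minerals or geological conditions still requires geological judgment.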
The second part of the paper focuses on the behavior of eigenvectors and principal components (PCs) as the dataset changes. For this, thousands of experiments were performed on different lithologies and data subsets of carbonate, siliciclastic and mixed carbonate-siliciclastic sediments to understand how the relative positions of the elements in eigenvector space shift as the quantity of data is increased or decreased. From these experiments we deduce best practices for undertaking such analyses in the future.
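A subsampling experiment of this kind can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how stability was measured, so the sketch uses the mean absolute cosine between the eigenvectors of a random subset and those of the full dataset. The functions `eigenvectors` and `stability`, the mixing matrix, and the synthetic data are all hypothetical.

```python
import numpy as np

def eigenvectors(X):
    """Eigenvectors of the correlation matrix, ordered by descending eigenvalue."""
    vals, vecs = np.linalg.eigh(np.corrcoef(X, rowvar=False))
    return vecs[:, np.argsort(vals)[::-1]]

def stability(X, size, n_pc, n_trials, rng):
    """Mean |cosine| between subset and full-data eigenvectors, per PC.
    The absolute value removes the arbitrary sign of each eigenvector."""
    ref = eigenvectors(X)[:, :n_pc]
    total = np.zeros(n_pc)
    for _ in range(n_trials):
        idx = rng.choice(len(X), size=size, replace=False)
        sub = eigenvectors(X[idx])[:, :n_pc]
        total += np.abs(np.sum(ref * sub, axis=0))
    return total / n_trials

# Synthetic data (assumption): three latent factors of decreasing strength
# mixed into six variables, plus unit-variance noise.
rng = np.random.default_rng(1)
n_samples = 2000
scales = np.array([3.0, 2.0, 1.0])
mixing = np.array([
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, -1, -1, -1],
    [1, -1, 0, 1, -1, 0],
], dtype=float)
latent = rng.normal(size=(n_samples, 3)) * scales
X = latent @ mixing + rng.normal(size=(n_samples, 6))

# Values near 1 mean the subset reproduces the full-data eigenvector.
for size in (50, 100, 500, 2000):
    print(size, np.round(stability(X, size, n_pc=3, n_trials=50, rng=rng), 3))
```

Comparing the columns of the printed output across subset sizes shows how quickly each PC settles toward the full-data eigenvector in this toy setting.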
The experiments demonstrate that a stable model for the PC1 and PC2 variables (i.e. the first and second most important sources of statistical variation) can be obtained from only 100 samples. For higher-order PCs (PC3-PC6), thousands of samples are sometimes required for a stable model. This implies that the results of PCA cannot be expected to be the same in each study with respect to the higher-order PCs. A geological interpretation can be transferred from one study to another only if the eigenvectors from the reference dataset are applied to the new data, and only if the new data are represented in the original (reference) study.
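Applying reference eigenvectors to a new dataset might look like the sketch below. It is an assumption-laden illustration, not the paper's procedure: the class `ReferencePCA` is hypothetical, the new data are standardized with the reference mean and standard deviation, and "represented in the original study" is approximated by a simple per-element range check.

```python
import numpy as np

class ReferencePCA:
    """Stores the scaling and eigenvectors of a reference dataset
    so that new samples can be projected into the same PC space."""

    def __init__(self, X_ref):
        self.mean = X_ref.mean(axis=0)
        self.std = X_ref.std(axis=0, ddof=1)
        Z = (X_ref - self.mean) / self.std
        vals, vecs = np.linalg.eigh(np.cov(Z, rowvar=False))
        self.eigvecs = vecs[:, np.argsort(vals)[::-1]]
        # Per-element range of the reference data (assumed coverage test).
        self.lo, self.hi = X_ref.min(axis=0), X_ref.max(axis=0)

    def scores(self, X):
        """Project X using the reference scaling and eigenvectors."""
        return ((X - self.mean) / self.std) @ self.eigvecs

    def is_represented(self, X):
        """True for samples whose every element lies in the reference range."""
        return np.all((X >= self.lo) & (X <= self.hi), axis=1)

# Synthetic reference and new datasets (assumption, for illustration only).
rng = np.random.default_rng(2)
X_ref = rng.normal(loc=10.0, scale=2.0, size=(500, 4))
X_new = rng.normal(loc=10.0, scale=2.0, size=(5, 4))
X_new[0, 0] = 100.0  # deliberately far outside the reference range

model = ReferencePCA(X_ref)
new_scores = model.scores(X_new)        # PC scores in the reference space
print(model.is_represented(X_new))      # flags samples outside the range
```

The design point is that the new study never refits PCA. It reuses the reference mean, scaling and eigenvectors, so PC scores from both studies share the same axes and can be compared directly.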
Source journal: Applied Geochemistry (Geosciences: Geochemistry & Geophysics)
CiteScore: 6.10
Self-citation rate: 8.80%
Articles published: 272
Review time: 65 days
Journal description: Applied Geochemistry is an international journal devoted to publication of original research papers, rapid research communications and selected review papers in geochemistry and urban geochemistry which have some practical application to an aspect of human endeavour, such as the preservation of the environment, health, waste disposal and the search for resources. Papers on applications of inorganic, organic and isotope geochemistry and geochemical processes are therefore welcome provided they meet the main criterion. Spatial and temporal monitoring case studies are only of interest to our international readership if they present new ideas of broad application. Topics covered include: (1) Environmental geochemistry (including natural and anthropogenic aspects, and protection and remediation strategies); (2) Hydrogeochemistry (surface and groundwater); (3) Medical (urban) geochemistry; (4) The search for energy resources (in particular unconventional oil and gas or emerging metal resources); (5) Energy exploitation (in particular geothermal energy and CCS); (6) Upgrading of energy and mineral resources where there is a direct geochemical application; and (7) Waste disposal, including nuclear waste disposal.