Effect of signal-to-noise ratio on the automatic clustering of X-ray diffraction patterns from combinatorial libraries

Yuanxun Zhou, Biao Wu, Jianhao Wang, Hong Wang
Materials Genome Engineering Advances, vol. 2, no. 1. Published 2024-02-26. DOI: 10.1002/mgea.27
Full text: https://onlinelibrary.wiley.com/doi/10.1002/mgea.27

Abstract

Hierarchical clustering has been applied to identify X-ray diffraction (XRD) patterns from high-throughput characterization of combinatorial materials chips. Because data quality usually correlates with acquisition time, studying clustering performance as a function of data quality is important for optimizing the efficiency of high-throughput experiments. This work investigates the effect of signal-to-noise ratio on the performance of hierarchical clustering, using 29 distance metrics, for XRD patterns from an Fe−Co−Ni ternary combinatorial materials chip. The clustering accuracy, evaluated by the F1 score, fluctuates only slightly as the signal-to-noise ratio varies from 15.5 to 22.3 dB under the experimental conditions. This suggests that, although collecting a visually high-quality diffraction pattern may take 40–50 s, the measurement time could be reduced to as little as 4 s without substantial loss in the accuracy of phase identification by hierarchical clustering. Among the 29 distance metrics, Pearson χ² gives the highest mean F1 score (0.77) and the lowest standard deviation (0.008). Visualization by metric multidimensional scaling shows that the distance matrices calculated with Pearson χ² are controlled mainly by XRD peak shifts.
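
The workflow summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several pieces are assumptions made for illustration: the Pearson χ² distance is taken as Σ (pᵢ − qᵢ)² / qᵢ and symmetrized by averaging both directions, the signal-to-noise ratio is taken as a power ratio in dB, the F1 score is a pair-counting F1, average linkage is used, and the diffraction patterns are synthetic Gaussian peaks. All function names (`pearson_chi2`, `chi2_distance_matrix`, `snr_db`, `pairwise_f1`, `synthetic_pattern`) are hypothetical.

```python
from itertools import combinations

import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def pearson_chi2(p, q, eps=1e-12):
    """Pearson chi-squared distance (assumed form): sum((p - q)^2 / q)."""
    return float(np.sum((p - q) ** 2 / (q + eps)))


def chi2_distance_matrix(patterns):
    """Pairwise distances, symmetrized so hierarchical clustering can use them."""
    n = len(patterns)
    d = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            d[i, j] = pearson_chi2(patterns[i], patterns[j])
    return 0.5 * (d + d.T)  # assumption: average the two directions


def snr_db(signal, noise):
    """Signal-to-noise ratio in dB, assuming a mean-power ratio."""
    return 10.0 * np.log10(np.mean(signal ** 2) / np.mean(noise ** 2))


def pairwise_f1(truth, pred):
    """Pair-counting F1: a pair is 'positive' if both items share a cluster."""
    tp = fp = fn = 0
    for i, j in combinations(range(len(truth)), 2):
        same_true = bool(truth[i] == truth[j])
        same_pred = bool(pred[i] == pred[j])
        tp += int(same_true and same_pred)
        fp += int(same_pred and not same_true)
        fn += int(same_true and not same_pred)
    return 2 * tp / (2 * tp + fp + fn)


def synthetic_pattern(two_theta, center, rng, noise_sigma):
    """One Gaussian peak on a flat background plus noise, normalized to sum 1."""
    peak = np.exp(-0.5 * ((two_theta - center) / 0.3) ** 2)
    noisy = peak + 0.05 + rng.normal(0.0, noise_sigma, two_theta.size)
    noisy = np.clip(noisy, 1e-6, None)  # keep intensities strictly positive
    return noisy / noisy.sum()


# Synthetic data: two "phases" whose peaks shift slightly within each phase.
rng = np.random.default_rng(0)
two_theta = np.linspace(40.0, 55.0, 600)
centers = [44.0, 44.1, 44.2, 50.0, 50.1, 50.2]
truth = [0, 0, 0, 1, 1, 1]
patterns = [synthetic_pattern(two_theta, c, rng, noise_sigma=0.01) for c in centers]

# Distance matrix -> condensed form -> average-linkage tree -> two clusters.
D = chi2_distance_matrix(patterns)
Z = linkage(squareform(D), method="average")
labels = fcluster(Z, t=2, criterion="maxclust")
f1 = pairwise_f1(truth, labels)
print(labels, f1)
```

Raising `noise_sigma` in `synthetic_pattern` and re-running the clustering gives a rough feel for how the F1 score responds to lower signal-to-noise ratios, although the synthetic peaks here are far simpler than measured XRD patterns.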

