Automatic clustering of single-molecule break junction data through task-oriented representation learning

IF 9.6 · Region 1 (Materials Science) · JCR Q1 (Materials Science, Multidisciplinary)
Yi-Heng Zhao, Shen-Wen Pang, Heng-Zhi Huang, Shao-Wen Wu, Shao-Hua Sun, Zhen-Bing Liu, Zhi-Chao Pan
Rare Metals 44 (5), 3244-3257. Published 2025-02-11. DOI: 10.1007/s12598-024-03089-7
https://link.springer.com/article/10.1007/s12598-024-03089-7
Citations: 0

Abstract


Clustering is a pivotal data analysis method for deciphering the charge transport properties of single molecules in break junction experiments. However, given the high dimensionality and variability of the data, feature extraction remains a bottleneck in the development of efficient clustering methods. In this regard, extensive research over the past two decades has focused on feature engineering and dimensionality reduction in break junction conductance. However, extracting highly relevant features without expert knowledge remains an unresolved challenge. To address this issue, we propose a deep clustering method driven by task-oriented representation learning (CTRL) in which the clustering module serves as a guide for the representation learning (RepL) module. First, we determine an optimal autoencoder (AE) structure through a neural architecture search (NAS) to ensure efficient RepL; second, the RepL process is guided by a joint training strategy that combines AE reconstruction loss with the clustering objective. The results demonstrate that CTRL achieves excellent performance on both the generated and experimental data. Further inspection of the RepL step reveals that joint training robustly learns more compact features than the unconstrained AE or traditional dimensionality reduction methods, significantly reducing misclustering possibilities. Our method provides a general end-to-end automatic clustering solution for analyzing single-molecule break junction data.
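The joint training strategy described above can be illustrated with a minimal sketch. The abstract does not state which clustering objective CTRL uses or how the two terms are weighted, so everything specific here is an assumption: a k-means-style compactness term stands in for the clustering objective, `gamma` is a hypothetical weighting factor, and the one-layer encoder and decoder are placeholders rather than the architecture found by the NAS step. Pure Python is used only to keep the example self-contained.

```python
import math
import random


def encode(x, w_enc):
    """Linear layer followed by tanh: trace (length d) -> latent vector (length m)."""
    return [math.tanh(sum(xi * w for xi, w in zip(x, col))) for col in w_enc]


def decode(z, w_dec):
    """Linear layer: latent vector (length m) -> reconstructed trace (length d)."""
    return [sum(zi * w for zi, w in zip(z, col)) for col in w_dec]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def joint_loss(batch, w_enc, w_dec, centroids, gamma=0.1):
    """Return (total, recon, cluster, labels) for one batch of traces.

    recon   : mean squared reconstruction error of the autoencoder
    cluster : mean squared distance of each latent point to its nearest
              centroid (an assumed stand-in for the clustering objective)
    total   : recon + gamma * cluster, the quantity joint training would minimize
    """
    recon = 0.0
    cluster = 0.0
    labels = []
    for x in batch:
        z = encode(x, w_enc)
        x_hat = decode(z, w_dec)
        recon += sq_dist(x, x_hat) / len(x)
        dists = [sq_dist(z, c) for c in centroids]
        k = min(range(len(centroids)), key=dists.__getitem__)
        labels.append(k)          # hard assignment to the nearest centroid
        cluster += dists[k]
    n = len(batch)
    recon, cluster = recon / n, cluster / n
    return recon + gamma * cluster, recon, cluster, labels


# Synthetic stand-in data: 20 "traces" of 16 points, 2 latent dimensions, 3 clusters.
random.seed(0)
d, m, k = 16, 2, 3
batch = [[random.gauss(0, 1) for _ in range(d)] for _ in range(20)]
w_enc = [[random.gauss(0, 0.1) for _ in range(d)] for _ in range(m)]
w_dec = [[random.gauss(0, 0.1) for _ in range(m)] for _ in range(d)]
centroids = [[random.gauss(0, 1) for _ in range(m)] for _ in range(k)]

total, recon, cluster, labels = joint_loss(batch, w_enc, w_dec, centroids)
```

Setting `gamma=0` in this sketch recovers a plain, unconstrained autoencoder loss, which is the baseline the abstract compares joint training against. A larger `gamma` pushes the encoder toward features that sit closer to their cluster centroids, at some cost in reconstruction quality.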

Source journal: Rare Metals
Category: Engineering and Technology - Materials Science, Multidisciplinary
CiteScore: 12.10
Self-citation rate: 12.50%
Articles published: 2919
Review time: 2.7 months
About the journal: Rare Metals is a monthly peer-reviewed journal published by the Nonferrous Metals Society of China. It serves as a platform for engineers and scientists to communicate and disseminate original research articles in the field of rare metals. The journal focuses on a wide range of topics including metallurgy, processing, and determination of rare metals. Additionally, it showcases the application of rare metals in advanced materials such as superconductors, semiconductors, composites, and ceramics.