Automatic clustering of single-molecule break junction data through task-oriented representation learning
Authors: Yi-Heng Zhao, Shen-Wen Pang, Heng-Zhi Huang, Shao-Wen Wu, Shao-Hua Sun, Zhen-Bing Liu, Zhi-Chao Pan
Journal: Rare Metals 44 (5), pp. 3244-3257 (Journal Article; JCR Q1, Materials Science, Multidisciplinary; impact factor 9.6)
Published: 2025-02-11
DOI: 10.1007/s12598-024-03089-7
URL: https://link.springer.com/article/10.1007/s12598-024-03089-7
Clustering is a pivotal data analysis method for deciphering the charge transport properties of single molecules in break junction experiments. However, given the high dimensionality and variability of the data, feature extraction remains a bottleneck in the development of efficient clustering methods. Extensive research over the past two decades has therefore focused on feature engineering and dimensionality reduction for break junction conductance data. Nevertheless, extracting highly relevant features without expert knowledge remains an unresolved challenge. To address this issue, we propose a deep clustering method driven by task-oriented representation learning (CTRL) in which the clustering module serves as a guide for the representation learning (RepL) module. First, we determine an optimal autoencoder (AE) structure through a neural architecture search (NAS) to ensure efficient RepL; second, the RepL process is guided by a joint training strategy that combines AE reconstruction loss with the clustering objective. The results demonstrate that CTRL achieves excellent performance on both generated and experimental data. Further inspection of the RepL step reveals that joint training robustly learns more compact features than the unconstrained AE or traditional dimensionality reduction methods, significantly reducing the likelihood of misclustering. Our method provides a general end-to-end automatic clustering solution for analyzing single-molecule break junction data.
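The joint training strategy described in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact clustering objective or how the two terms are weighted, so everything below is an assumption made for illustration: a k-means-style term (mean squared distance from each latent vector to its nearest centroid), a mean-squared-error reconstruction term, and a hypothetical `weight` parameter that balances them. The function names are likewise hypothetical and are not taken from the paper.

```python
from typing import List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def reconstruction_loss(traces: List[Vector], reconstructions: List[Vector]) -> float:
    """Mean squared error between input traces and the autoencoder output."""
    total = sum(squared_distance(x, x_hat) for x, x_hat in zip(traces, reconstructions))
    return total / (len(traces) * len(traces[0]))


def clustering_loss(latents: List[Vector], centroids: List[Vector]) -> float:
    """Assumed k-means-style objective: mean squared distance from each
    latent vector to its nearest cluster centroid."""
    total = sum(min(squared_distance(z, c) for c in centroids) for z in latents)
    return total / len(latents)


def joint_loss(
    traces: List[Vector],
    reconstructions: List[Vector],
    latents: List[Vector],
    centroids: List[Vector],
    weight: float = 0.1,  # hypothetical balance factor, not from the paper
) -> float:
    """Reconstruction loss plus a weighted clustering term."""
    return reconstruction_loss(traces, reconstructions) + weight * clustering_loss(
        latents, centroids
    )


if __name__ == "__main__":
    # Toy values only; real conductance traces are far higher-dimensional.
    traces = [[1.0, 2.0], [3.0, 4.0]]
    reconstructions = [[1.0, 2.0], [3.0, 2.0]]
    latents = [[0.0], [2.0]]
    centroids = [[0.0], [3.0]]
    print(joint_loss(traces, reconstructions, latents, centroids, weight=0.5))
```

In a setup of this kind, minimizing the combined value with respect to the encoder parameters pulls latent vectors toward their centroids while the reconstruction term keeps the representation informative, which is one plausible reading of how a clustering module can guide representation learning.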
About the journal:
Rare Metals is a monthly peer-reviewed journal published by the Nonferrous Metals Society of China. It serves as a platform for engineers and scientists to communicate and disseminate original research articles in the field of rare metals. The journal focuses on a wide range of topics including metallurgy, processing, and determination of rare metals. Additionally, it showcases the application of rare metals in advanced materials such as superconductors, semiconductors, composites, and ceramics.