Two-stage semi-supervised machine learning for classification of Ti-rich nanoparticles and microparticles measured by spICP-TOFMS
Raven L. Buckman Johnson, Hark Karkee and Alexander Gundlach-Graham
Journal of Analytical Atomic Spectrometry, 2025, 7, 1658-1665. Published 2025-05-31. DOI: 10.1039/D5JA00108K
https://pubs.rsc.org/en/content/articlelanding/2025/ja/d5ja00108k
Abstract
Single-particle inductively coupled plasma time-of-flight mass spectrometry (spICP-TOFMS) can be used to measure metal-containing nanoparticles (NPs) and sub-micron particles (μPs) at environmentally relevant concentrations. Multielement fingerprints measured by spICP-TOFMS can also be used to differentiate natural and anthropogenic particle types. Thus, the approach offers a promising route to classify, quantify, and track anthropogenic NPs and μPs in natural systems. However, biases in spICP-TOFMS data caused by analytical sensitivities, Poisson detection statistics, and elemental variability at the single-particle level complicate particle-type classification. To overcome these inherent biases, we have developed a multi-stage semi-supervised machine learning (SSML) strategy that identifies and subsequently trains on systematic noise in spICP-TOFMS data to produce more robust particle-type classifications. Here, we apply our two-stage SSML model to classify individual Ti-containing NPs and μPs via spICP-TOFMS analysis. To build our model, we use spICP-TOFMS to measure neat suspensions of anthropogenic TiO2 particles (E171) and of three natural titanium-containing particle types: rutile, ilmenite, and biotite. Element mass amounts recorded per particle are used to classify particle type by SSML; systematic particle misclassifications are then identified and recorded as uncertainty classes. Next, a second SSML model is trained with these uncertainty classes added as particle-type categories. With two-stage SSML, we demonstrate low false-positive rates (≤5%) and moderate particle recoveries (50–90%) for all anthropogenic and natural particle types. Two-stage SSML is a streamlined, hands-off method to identify and overcome bias in spICP-TOFMS training data that provides a robust particle-type classification.
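The two-stage idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a plain nearest-centroid classifier stands in for the SSML model, the per-particle Ti and Fe masses are invented toy values, and the names `two_stage_fit`, `min_fraction`, and the `uncertain:` label format are hypothetical. The sketch only shows how systematic stage-one misclassifications can be turned into extra uncertainty classes before a second model is trained.

```python
from collections import Counter, defaultdict
from statistics import mean

def fit_centroids(X, y):
    """Return {label: centroid} for feature rows X with labels y."""
    groups = defaultdict(list)
    for row, label in zip(X, y):
        groups[label].append(row)
    return {lab: [mean(col) for col in zip(*rows)] for lab, rows in groups.items()}

def predict(centroids, X):
    """Assign each row to the label of its nearest centroid (squared Euclidean)."""
    def dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return [min(centroids, key=lambda lab: dist(row, centroids[lab])) for row in X]

def two_stage_fit(X, y, min_fraction=0.2):
    """Stage 1: fit and find systematic confusions. Stage 2: refit with uncertainty classes."""
    stage1 = fit_centroids(X, y)
    pred = predict(stage1, X)
    class_sizes = Counter(y)
    confusions = Counter((t, p) for t, p in zip(y, pred) if t != p)
    # Assumption: a confusion is "systematic" if it affects at least
    # min_fraction of the true class in the training data.
    systematic = {pair for pair, n in confusions.items()
                  if n / class_sizes[pair[0]] >= min_fraction}
    # Relabel the affected training particles with a new uncertainty class.
    y2 = [f"uncertain:{t}->{p}" if (t, p) in systematic else t
          for t, p in zip(y, pred)]
    return fit_centroids(X, y2), y2

# Toy training data (invented): [Ti mass, Fe mass] per particle, arbitrary units.
# Small ilmenite particles have Fe below detection and so resemble E171.
X = [[1.0, 0.0], [2.0, 0.0], [3.0, 0.0], [2.0, 0.0],               # E171
     [4.0, 4.0], [5.0, 5.0], [6.0, 6.0], [1.0, 0.0], [2.0, 0.0]]   # ilmenite
y = ["E171"] * 4 + ["ilmenite"] * 5

stage2, y2 = two_stage_fit(X, y)
new_particles = [[3.0, 0.0], [1.2, 0.0], [5.5, 5.2]]
print(predict(stage2, new_particles))
```

In this toy case the small, Fe-free particles end up in an uncertainty class instead of being counted as E171, which mirrors the trade-off the abstract reports: fewer false positives at the cost of lower particle recovery.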