Implementing tile-based Fisher ratio analysis of two-dimensional gas chromatography time-of-flight mass spectrometry data to obtain a master peak table of all detected analyte compounds in many petroleum-based samples
Rachel C. Halvorsen, Wenjing Ma, Caitlin N. Cain, Hep Ingham, Rachel E. Mohler, Robert E. Synovec
Journal of Chromatography Open, Volume 8, Article 100249. Published 2025-08-31. DOI: 10.1016/j.jcoa.2025.100249. https://www.sciencedirect.com/science/article/pii/S2772391725000477
Abstract
Historically, tile-based Fisher ratio (F-ratio) analysis of comprehensive two-dimensional gas chromatography time-of-flight mass spectrometry (GC × GC-TOFMS) data was developed for supervised experimental designs with defined sample classes: the analysis produces a hit list, and the analytes that most significantly distinguish the sample classes appear at the top of that list. In this traditional application, a user-specified F-ratio threshold is used to discard most hits in order to focus on the top hits. To broaden the scope of tile-based F-ratio analysis, in the present study we explore the ability of the software to discover all analyte components that are detected in a set of samples. This takes full advantage of the tiling aspect of the software, which uncovers every analyte that exhibits sufficient signal relative to the baseline noise across all samples to be deemed detectable and hence to produce an F-ratio. For this study, nine petroleum samples, i.e., two hydrobates (light naphthas), two reformates, four naphthas, and a “heavy” gasoline, are simultaneously analyzed and statistically compared via p-testing to blank chromatograms to produce one comprehensive hit list. The pin locations and signal areas at the top m/z F-ratio are used together with replicate blanks to generate a master peak table (MPT). The MPT is in turn used to generate sample-specific peak tables (SSPTs), one for each injection replicate of each petroleum sample (class), that are naturally retention-time aligned via the F-ratio software. The nine petroleum samples vary to a large extent in the identity and number of analytes present. Indeed, while a total of ∼715 analytes were found across all nine samples, only ∼260 of these analytes are fully shared across all sample classes. The number of analytes in the nine petroleum samples ranged from an average of 335 analytes for one of the hydrobates to 669 analytes for two of the naphthas. This workflow also facilitated generating simulated distillation curves for the nine petroleum samples to provide further insight.
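To make two of the quantities in the abstract concrete, the following is a minimal Python sketch, not the authors' software. It assumes the textbook one-way ANOVA definition of the F-ratio (between-class variance divided by within-class variance) applied to the replicate signals of a single tile at a single m/z, and it assumes each peak table can be reduced to a set of analyte identifiers. The function names `fisher_ratio` and `summarize_peak_tables`, and all data values, are hypothetical.

```python
from statistics import fmean


def fisher_ratio(class_signals):
    """Textbook one-way ANOVA F-ratio for one tile at one m/z.

    class_signals: list of lists, one inner list of replicate signal
    values per sample class (assumed layout, for illustration only).
    """
    k = len(class_signals)                       # number of classes
    n_total = sum(len(g) for g in class_signals)
    grand_mean = fmean(x for g in class_signals for x in g)

    # Between-class sum of squares, weighted by replicates per class.
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2
                     for g in class_signals)
    # Within-class sum of squares (replicate scatter about class means).
    ss_within = sum((x - fmean(g)) ** 2
                    for g in class_signals for x in g)

    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


def summarize_peak_tables(tables):
    """Count analytes found in any class and analytes shared by all.

    tables: dict mapping a class name to a set of analyte IDs
    (a hypothetical reduction of per-class peak tables).
    """
    sets = list(tables.values())
    total = len(set.union(*sets))            # detected in at least one class
    shared = len(set.intersection(*sets))    # detected in every class
    return total, shared


# Hypothetical replicate signals for a sample class versus blanks.
f_value = fisher_ratio([[4.0, 5.0, 6.0], [1.0, 2.0, 3.0]])

# Hypothetical analyte IDs for two classes.
total, shared = summarize_peak_tables({
    "naphtha": {"A1", "A2", "A3"},
    "reformate": {"A2", "A3", "A4"},
})
```

In this framing, the abstract's ∼715 total analytes correspond to the union across classes and the ∼260 fully shared analytes to the intersection.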