Torsion angular bin strings: algorithmic update and additional validation
Jessica Braun, Djahan Lamei, Philippe H Hünenberger, Gregory A Landrum, Sereina Riniker
Journal of Cheminformatics, published 2026-04-13. DOI: 10.1186/s13321-026-01194-6
Abstract
In our previous work, we introduced the concept of torsion angular bin strings (TABS), which is a discrete vector representation of a conformer's torsional angles. Through this discretization, conformational states can be counted, yielding an estimate of the upper limit of the expected conformational ensemble size (nTABS). Besides nTABS being used as a quantitative measure of molecular flexibility, TABS itself is a way of grouping the conformers of a molecule without picking thresholds. This feature of TABS is especially valuable, as selecting suitable thresholds for metrics such as heavy-atom root-mean-square deviation (RMSD) or shape Tanimoto is highly system-dependent and can thus be challenging when working with large sets of molecules. Here, we describe the update to the nTABS algorithm of the TABS package since the last release. In addition, we present a classification study of conformer ensembles by TABS and compare it to classifications by a shape Tanimoto metric.
Scientific contribution
In contrast to our previous implementation, which handled molecular topological symmetry by enumerating all possible combinations that were simply permutations of one another, the new implementation treats TABS as mathematical objects governed by group theory, specifically Burnside's Lemma. This approach requires substantially less code and delivers a notable improvement in computational speed. The study also builds upon our previously developed framework for categorization comparisons between TABS and heavy-atom RMSD. Here, we show the results of a similar comparison with a shape Tanimoto metric, which further support the hypothesis that TABS encode the shape of conformers in a meaningful way.
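To make the counting idea concrete, the following is a minimal sketch of how Burnside's Lemma can replace explicit enumeration when counting symmetry-distinct bin strings. It is not the TABS package's implementation: the function names (`cycles`, `count_orbits_burnside`, `count_orbits_bruteforce`) are hypothetical, and it assumes that topological symmetry is supplied as a complete permutation group acting on torsion indices and that symmetry only permutes positions without relabeling bins.

```python
from itertools import product
from math import prod


def cycles(perm):
    """Split a permutation (perm[i] is the image of index i) into its cycles."""
    seen, out = set(), []
    for start in range(len(perm)):
        if start in seen:
            continue
        cycle, i = [], start
        while i not in seen:
            seen.add(i)
            cycle.append(i)
            i = perm[i]
        out.append(cycle)
    return out


def count_orbits_burnside(bins, group):
    """Count bin strings that are distinct up to the permutations in `group`.

    bins[i] is the number of bins of torsion i (an assumed input format).
    Burnside's Lemma: number of orbits = average number of strings fixed
    by each group element. A string is fixed by a permutation exactly when
    it is constant on every cycle, so each cycle contributes one free choice.
    """
    total = 0
    for perm in group:
        fixed = 1
        for cyc in cycles(perm):
            # Symmetry-equivalent torsions must share the same number of bins.
            if len({bins[i] for i in cyc}) != 1:
                raise ValueError("permutation mixes torsions with different bin counts")
            fixed *= bins[cyc[0]]
        total += fixed
    return total // len(group)


def count_orbits_bruteforce(bins, group):
    """Reference count by enumerating every string and canonicalizing it."""
    canonical = set()
    for s in product(*(range(b) for b in bins)):
        # The smallest permuted copy serves as the orbit representative.
        canonical.add(min(tuple(s[p[i]] for i in range(len(s))) for p in group))
    return len(canonical)


# Three torsions with three bins each; torsions 0 and 1 are interchangeable.
bins = [3, 3, 3]
group = [(0, 1, 2), (1, 0, 2)]  # identity and the swap of positions 0 and 1
print(count_orbits_burnside(bins, group))    # 18, versus 27 without symmetry
print(count_orbits_bruteforce(bins, group))  # 18
```

The Burnside version visits each group element once, whereas the brute-force reference visits every one of the 27 strings and every permutation of each. That difference is the kind of saving the abstract attributes to the group-theoretic treatment, although the actual package may organize the computation differently.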
About the journal:
Journal of Cheminformatics is an open access journal publishing original peer-reviewed research in all aspects of cheminformatics and molecular modelling.
Coverage includes, but is not limited to:
chemical information systems, software and databases, and molecular modelling,
chemical structure representations and their use in structure, substructure, and similarity searching of chemical substance and chemical reaction databases,
computer and molecular graphics, computer-aided molecular design, expert systems, QSAR, and data mining techniques.