Journal of Chemometrics: Latest Articles

A Dynamic Iterative Data Cleaning Strategy Based on Model Feedback to Enhance the Prediction Accuracy of Nanocellulose Emulsions
IF 2.3, Zone 4, Chemistry
Journal of Chemometrics, Pub Date: 2025-07-12, DOI: 10.1002/cem.70046
Long Wang, Zi'ang Xia, Yao Zhang, Xiaoyu Liu, Chaojie Li, Xue Li, Jiahao Dai, Mingshun Bi, Jingxue Yang, Heng Zhang
Abstract: The effectiveness of artificial neural networks, a key technology in artificial intelligence, depends greatly on the quality of the input data. Data cleaning, a crucial component of data preprocessing, plays a vital role in enhancing the accuracy, robustness, and generalization capabilities of neural network models. In this study, a Feedback-Driven Iterative Cleaning (FDIC) framework, guided by model performance, was developed and applied to droplet size prediction models for nanocellulose-stabilized Pickering emulsion systems. After randomly removing between 1% and 40% of the data, an artificial neural network model was established using CNC particle size (X1), CNC concentration (X2), and the oil–water volume ratio of CNC to oil-phase monomer (X3) as input variables, with emulsion droplet size (Y) as the quantitative index. The model's accuracy after data removal was evaluated using the coefficient of determination (R²), mean squared error (MSE), and mean absolute scaled error (MASE). The main finding was that targeted removal of a small portion of the data significantly improved the predictive power of the model. Specifically, removing 5% of the dataset gave optimal performance, with R² improving from 0.5307 without cleaning to 0.7258, an MSE of 183.4917, and a MASE of 0.4060. This result suggests a significant and quantifiable improvement in model accuracy through the iterative cleaning process. The study also revealed a nonlinear relationship between the number of iterations and the model's generalization ability. This finding offers a novel methodological tool for data governance and demonstrates significant value in dynamic environments.
Volume 39, Issue 7
Citations: 0
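The feedback loop described in this abstract can be illustrated with a minimal sketch. It is not the authors' FDIC implementation: a one-variable least-squares line stands in for the neural network, the synthetic data, the 5% step, and the 40% cap are assumptions chosen for illustration, and MASE is left out because the abstract does not state its scaling baseline. The hypothetical function `feedback_clean` drops the highest-residual training points in batches and keeps a batch removal only if validation R² improves.

```python
import random

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def r2_and_mse(model, xs, ys):
    """Coefficient of determination and mean squared error on (xs, ys)."""
    slope, intercept = model
    preds = [slope * x + intercept for x in xs]
    my = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot, ss_res / len(ys)

def feedback_clean(train, valid, step=0.05, max_removed=0.40):
    """Hypothetical cleaning loop: accept a removal only if validation R² improves."""
    current = list(train)
    model = fit_line(*zip(*current))
    best_r2, _ = r2_and_mse(model, *zip(*valid))
    batch = max(1, int(step * len(train)))
    budget = int(max_removed * len(train))
    removed = 0
    while removed + batch <= budget:
        slope, intercept = model
        # Rank training points by absolute residual under the current model.
        ranked = sorted(current, key=lambda p: abs(p[1] - (slope * p[0] + intercept)))
        candidate = ranked[:-batch]
        cand_model = fit_line(*zip(*candidate))
        cand_r2, _ = r2_and_mse(cand_model, *zip(*valid))
        if cand_r2 <= best_r2:
            break  # model feedback says further cleaning no longer helps
        current, model, best_r2 = candidate, cand_model, cand_r2
        removed += batch
    return current, model, best_r2

# Synthetic data (assumption): a clean linear trend plus two corrupted training points.
rng = random.Random(0)
train = [(float(x), 2.0 * x + 1.0 + rng.gauss(0, 0.5)) for x in range(40)]
corrupted = [(38.0, train[38][1] + 60.0), (39.0, train[39][1] + 60.0)]
train[38], train[39] = corrupted
valid = [(x + 0.5, 2.0 * (x + 0.5) + 1.0 + rng.gauss(0, 0.5)) for x in range(40)]

baseline_r2, _ = r2_and_mse(fit_line(*zip(*train)), *zip(*valid))
cleaned, model, best_r2 = feedback_clean(train, valid)
print(f"validation R2: {baseline_r2:.4f} -> {best_r2:.4f}, kept {len(cleaned)} of {len(train)}")
```

The stopping rule is the design choice worth noting: cleaning halts as soon as a removal fails to improve the validation score, which is one simple way to reflect the non-monotonic relationship between cleaning iterations and generalization that the abstract reports.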
Nondestructive Identification of Paper Based on Relative Formation Time Using Three-Dimensional Fluorescence Spectroscopy Combined With Supervised Learning
IF 2.3, Zone 4, Chemistry
Journal of Chemometrics, Pub Date: 2025-07-11, DOI: 10.1002/cem.70043
Xiaohong Chen, Yuhuan He, Lan Cui, Hongda Li, Xiaojing Wu
Abstract: To achieve nondestructive analysis and identification of the relative formation time of paper evidence, and to address the difficulty of document authenticity identification in forensic science, this study selected three-dimensional fluorescence spectroscopy data of paper evidence of the same brand and model, collected in the same storage environment within the last decade (2012–2023). After preprocessing steps such as scattering elimination, noise smoothing, and principal component analysis (PCA), machine learning algorithms such as K-nearest neighbor (KNN) and linear discriminant analysis (LDA) were employed to classify and predict specific feature bands. The accuracy of KNN and LDA was 94.5% and 98.9%, respectively. Furthermore, relative formation time prediction was conducted by LDA for paper samples in the sample library, achieving an accuracy of 98.0%. Finally, the established model was successfully applied to an actual case involving suspected "forged official documents." It accurately determined the relative formation time of the forged paper, and the analysis results were consistent with the suspect's confession.
Volume 39, Issue 7
Citations: 0
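As a rough illustration of the KNN classification step mentioned in this abstract, the following sketch classifies a sample by majority vote among its nearest neighbours in a low-dimensional score space. The two-dimensional "PCA scores", the year labels, and the function name `knn_predict` are made up for the example and are not data or code from the study.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    by_distance = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical PCA scores for paper samples from two production years.
scores = [(0.1, 0.2), (0.3, 0.1), (0.2, 0.4), (2.1, 1.9), (1.8, 2.2), (2.3, 2.0)]
years = ["2012", "2012", "2012", "2023", "2023", "2023"]

old_sample = knn_predict(scores, years, (0.2, 0.3))
new_sample = knn_predict(scores, years, (2.0, 2.1))
print(old_sample, new_sample)
```

With two classes, an odd `k` avoids tied votes; in practice `k` would be chosen by validation rather than fixed in advance.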
XAI-2DCOS: Enhancing Interpretability in Spectral Deep Learning Models Through 2D Correlation Spectroscopy
IF 2.3, Zone 4, Chemistry
Journal of Chemometrics, Pub Date: 2025-07-11, DOI: 10.1002/cem.70045
Jhonatan Contreras, Thomas Bocklitz
Abstract: Deep learning (DL) has significantly advanced Raman spectra analysis, achieving high accuracy and efficiency. However, the complexity and opacity of DL models limit their application in areas where understanding and transparency are essential. To address this, we present XAI-2DCOS, an innovative eXplainable Artificial Intelligence (XAI) framework that employs 2D correlation spectroscopy (2DCOS). Traditionally, 2DCOS reveals the sequence of molecular changes under varying conditions. We repurpose it to enhance the interpretability of DL models by linking changes in spectral features to model outputs, identifying critical wavenumbers and showing how their variations affect model accuracy. We applied XAI-2DCOS to a DL model trained on a dataset of oil Raman spectra, demonstrating its ability to identify critical spectral features that align with domain knowledge. To improve robustness, we integrated a conditional generative adversarial network (CGAN) for data augmentation. The CGAN generates synthetic data, ensuring the presence of spectra across the entire probability range. A normalized relevance score quantifies the contribution of each wavenumber to the model's prediction. A predictive probability map delineates decision boundaries within the PCA space. Synchronous 2DCOS maps are used to guide spectral adjustments that improve prediction confidence for specific class predictions. These adjustments can affect multiple output classes with differential scaling of output activations, suggesting that crossing a threshold can shift the model decision. Our results show that XAI-2DCOS improves the interpretability and reliability of DL models applied to Raman spectra. Furthermore, CGAN data augmentation extends the applicability of XAI-2DCOS to smaller datasets.
Volume 39, Issue 7
Full text (PDF): https://onlinelibrary.wiley.com/doi/epdf/10.1002/cem.70045
Citations: 0
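The synchronous 2DCOS map that this abstract relies on has a compact standard definition: mean-centre the spectra along the perturbation axis, then take the covariance between every pair of spectral channels. The sketch below shows only that generic calculation, with a made-up three-channel dataset; it does not reproduce the XAI-2DCOS framework, the relevance score, or the CGAN step, and `synchronous_2dcos` is a hypothetical name.

```python
def synchronous_2dcos(spectra):
    """Synchronous 2D correlation map for spectra given as rows (one row per perturbation step)."""
    m, n = len(spectra), len(spectra[0])
    mean = [sum(row[j] for row in spectra) / m for j in range(n)]
    # Dynamic spectra: deviation of each spectrum from the mean (reference) spectrum.
    dynamic = [[row[j] - mean[j] for j in range(n)] for row in spectra]
    return [
        [sum(d[a] * d[b] for d in dynamic) / (m - 1) for b in range(n)]
        for a in range(n)
    ]

# Toy data (assumption): channel 0 rises, channel 1 falls, channel 2 stays constant.
spectra = [
    [1.0, 4.0, 2.0],
    [2.0, 3.0, 2.0],
    [3.0, 2.0, 2.0],
    [4.0, 1.0, 2.0],
]
phi = synchronous_2dcos(spectra)
for row in phi:
    print([round(value, 3) for value in row])
```

A positive cross-peak means two channels change in the same direction and a negative one means they change in opposite directions, which is what makes the map usable as a guide for coordinated spectral adjustments.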
Editorial: Honoring Prof. Age K. Smilde
IF 2.3, Zone 4, Chemistry
Journal of Chemometrics, Pub Date: 2025-07-10, DOI: 10.1002/cem.70052
Rasmus Bro
Abstract: It is both a privilege and an emotional moment for me to write this editorial for the special issue of the Journal of Chemometrics honoring Prof. Age K. Smilde, who recently retired. For me, and for countless others in our field, Prof. Smilde (also more informally known as Age) has been more than a scholar; he has been a mentor, a collaborator, and an inspiration whose contributions have left a huge mark on the world of chemometrics.
Looking back, it feels almost surreal to think of my early days in academia, 30 years ago, when I was navigating the complex world of multi-way tensor analysis. At the time, Age seemed to me to be the quintessential ‘all-knowing’ professor. His mastery of the field, combined with a willingness to mentor and nurture young scientists, made a profound difference in my career. I remember a conference where he explained the complexity of tensor rank. I quickly grasped the problem and slightly arrogantly said: I will fix it. I tried. I was very fast and 100% wrong. I never managed to make even the slightest progress!
He played a pivotal role in helping me craft some of my earliest papers, including one of the first approaches to tensor regression. Our discussions on the properties of multi-way arrays and their applications remain etched in my memory, not just as lessons in science, but as moments of shared curiosity.
Age's career is nothing short of extraordinary. From his foundational work at the University of Groningen to his tenure at the University of Amsterdam, where he led the group later known as Biosystems Data Analysis, Age has consistently been at the forefront of methodological advancements in not just chemometrics. His work on multi-way analysis, data integration, and systems biology has truly shaped the respective fields. It is no surprise that he has been honored with numerous awards, such as the prestigious Herman Wold Gold Medal and the Kowalski Award, reflecting his pioneering contributions and global recognition.
What sets Age apart is his ability to foster collaboration and build bridges within the scientific community. He introduced me to some of the most significant researchers not only in chemometrics but also in psychometrics, widening my horizons and opening doors that would otherwise have remained closed. His efforts to create platforms for collaboration, such as co-founding TRICAP and contributing to international chemometric meetings, have enriched our discipline.
Reflecting on the arc of our careers, I cannot help but smile at the realization that the ‘old’ professor who once seemed so far ahead of me is, in fact, only a few years my senior. Time has a way of leveling us, and today I count Age as not only a colleague but also a dear friend and peer. His wisdom, humility, and warmth continue to inspire, and his legacy will undoubtedly endure through the countless students, collaborators, and researchers he has influenced.
This special issue is a testam…
Volume 39, Issue 7
Full text (PDF): https://onlinelibrary.wiley.com/doi/epdf/10.1002/cem.70052
Citations: 0
Accurate and Rational Collision Cross Section Prediction Using Voxel-Projected Area and Deep Learning
IF 2.3, Zone 4, Chemistry
Journal of Chemometrics, Pub Date: 2025-07-08, DOI: 10.1002/cem.70040
Jiongyu Wang, Yuxuan Liao, Ting Xie, Ruixi Chen, Jiahui Lai, Zhimin Zhang, Hongmei Lu
Abstract: Ion mobility spectrometry–mass spectrometry (IMS-MS) enables rapid acquisition of collision cross section (CCS), a critical physicochemical property for analyte characterization. Despite CCS being theoretically defined as the rotationally averaged projected area of 3D atomic spheres, existing models have underutilized this geometric insight. Here, we present a projected area–based CCS prediction method (PACCS). It integrates voxel-projected area approximation, graph neural network (GNN)–extracted features, and m/z to achieve accurate and rational CCS prediction. A voxel-based algorithm efficiently calculates molecular projected areas by leveraging Fibonacci grid sampling and discretizing 3D conformers into voxel grids. PACCS demonstrates exceptional performance, achieving a median relative error (MedRE) of 1.03% and a coefficient of determination (R²) of 0.994 on the test set. Comparison on an external test set against AllCCS2, GraphCCS, SigmaCCS, CCSbase, and DeepCCS highlights the superiority of PACCS, with 80.1% of predictions exhibiting < 3% error. Notably, PACCS exhibits broad applicability across diverse molecular types, including environmental contaminants (R² = 0.954–0.979) and structurally complex phycotoxins (R² = 0.961), highlighting its robustness and versatility. Computational efficiency is enhanced via parallelization, enabling large-scale CCS database generation (e.g., 5.9 million entries for ChEMBL within 10 h). Ablation studies confirm the pivotal role of voxel-projected areas (Pearson correlation coefficients > 0.988), while stability analyses reveal minimal sensitivity to conformational variability (standard deviation of R² is 0.00003). PACCS provides an open-source, scalable solution for expanding CCS databases, advancing compound identification in metabolomics and environmental analysis.
Volume 39, Issue 7
Citations: 0
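The geometric idea behind this abstract, a rotationally averaged projected area estimated on a grid, can be sketched in a few lines. This is a simplified illustration and not the PACCS code: it rasterizes projected atom disks on a 2D pixel grid rather than voxelizing the conformer in 3D, and the atom coordinates, radii, grid spacing, number of directions, and all function names are assumptions.

```python
import math

def fibonacci_sphere(n):
    """Return n roughly uniform unit vectors placed along a golden-angle spiral."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        z = 1.0 - 2.0 * (i + 0.5) / n
        ring = math.sqrt(1.0 - z * z)
        points.append((ring * math.cos(golden_angle * i), ring * math.sin(golden_angle * i), z))
    return points

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def plane_basis(direction):
    """Two orthonormal vectors spanning the plane perpendicular to a unit direction."""
    helper = (1.0, 0.0, 0.0) if abs(direction[0]) < 0.9 else (0.0, 1.0, 0.0)
    u = cross(direction, helper)
    norm = math.sqrt(dot(u, u))
    u = tuple(c / norm for c in u)
    return u, cross(direction, u)

def projected_area(atoms, direction, cell=0.05):
    """Area of the union of projected atom disks, counted on a square grid."""
    u, v = plane_basis(direction)
    occupied = set()
    for x, y, z, radius in atoms:
        cu, cv = dot((x, y, z), u), dot((x, y, z), v)
        for i in range(math.floor((cu - radius) / cell), math.floor((cu + radius) / cell) + 1):
            for j in range(math.floor((cv - radius) / cell), math.floor((cv + radius) / cell) + 1):
                px, py = (i + 0.5) * cell, (j + 0.5) * cell
                if (px - cu) ** 2 + (py - cv) ** 2 <= radius ** 2:
                    occupied.add((i, j))
    return len(occupied) * cell * cell

def average_projected_area(atoms, n_directions=30, cell=0.05):
    """Mean projected area over Fibonacci-sampled viewing directions."""
    directions = fibonacci_sphere(n_directions)
    return sum(projected_area(atoms, d, cell) for d in directions) / n_directions

# Hypothetical inputs: (x, y, z, radius) in angstroms.
single = average_projected_area([(0.0, 0.0, 0.0, 1.7)])
pair = average_projected_area([(0.0, 0.0, 0.0, 1.7), (1.0, 0.0, 0.0, 1.7)])
print(round(single, 3), round(pair, 3))
```

For a single sphere the estimate should approach the disk area πr², which gives a quick sanity check on the grid spacing; overlapping atoms contribute less than the sum of their individual disks because shared grid cells are counted once.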
Frequency-Domain Alignment of Heterogeneous, Multidimensional Separations Data Through Complex Orthogonal Procrustes Analysis
IF 2.3, Zone 4, Chemistry
Journal of Chemometrics, Pub Date: 2025-07-07, DOI: 10.1002/cem.70042
Michael Sorochan Armstrong
Abstract: Multidimensional separations data have the capacity to reveal detailed information about complex biological samples. However, data analysis has been an ongoing challenge in the area because the peaks that represent chemical factors may drift over the course of several analytical runs along the first- and second-dimension retention times. This makes higher-level analyses of the data difficult, because a 1–1 comparison of samples is seldom possible without sophisticated preprocessing routines. This work offers a very simple solution to the alignment problem through an orthogonal Procrustes analysis of the frequency-domain representation of the data, in which, for each coefficient, relative drift and amplitude are represented as a complex number. Its performance on synthetically generated data presenting nonlinear retention distortions is evaluated, in addition to its applicability to quantitative problems using experimental calibration and untargeted metabolomics data. This analysis is extremely simple and can be recreated using just a few lines of code, relying only on fast algorithms for matrix multiplication and Fourier transforms.
Volume 39, Issue 7
Full text (PDF): https://onlinelibrary.wiley.com/doi/epdf/10.1002/cem.70042
Citations: 0
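The abstract notes that the analysis needs only Fourier transforms and matrix multiplication. The sketch below shows one way such an alignment could look, under several assumptions that are not taken from the paper: one-dimensional synthetic chromatograms, a circular drift shared by all runs, "orthogonal" read as unitary for complex data, and NumPy as the numerical library. The function name `unitary_procrustes` is hypothetical. The transform Q minimizing ‖AQ − B‖ over unitary matrices is obtained from the singular value decomposition of AᴴB.

```python
import numpy as np

def unitary_procrustes(A, B):
    """Unitary Q minimizing the Frobenius norm of A @ Q - B (complex Procrustes)."""
    U, _, Vh = np.linalg.svd(A.conj().T @ B)
    return U @ Vh

# Synthetic runs (assumption): four chromatograms, each with two Gaussian peaks.
t = np.arange(64)
centers = [(15, 40), (20, 45), (25, 35), (30, 50)]
heights = [(1.0, 0.5), (0.8, 1.2), (1.5, 0.7), (0.6, 0.9)]
reference = np.array([
    sum(h * np.exp(-0.5 * ((t - c) / 2.0) ** 2) for c, h in zip(cs, hs))
    for cs, hs in zip(centers, heights)
])
drifted = np.roll(reference, 3, axis=1)  # retention drift of three points

# Work on the frequency-domain representation: one complex coefficient per frequency.
A = np.fft.fft(drifted, axis=1)
B = np.fft.fft(reference, axis=1)
Q = unitary_procrustes(A, B)
aligned = np.fft.ifft(A @ Q, axis=1).real

before = np.linalg.norm(drifted - reference) / np.linalg.norm(reference)
after = np.linalg.norm(aligned - reference) / np.linalg.norm(reference)
print(f"relative misalignment: {before:.3f} -> {after:.2e}")
```

In the frequency domain a shift in retention time becomes a phase rotation of each coefficient, so a pure drift corresponds to a unitary transform and the toy case aligns almost exactly; real data with nonlinear distortions would not be recovered this cleanly.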
Self-Optimizing Radial Basis Function Support Vector Classifier (SO-RBFSVC)
IF 2.3, Zone 4, Chemistry
Journal of Chemometrics, Pub Date: 2025-05-26, DOI: 10.1002/cem.70038
Qudus Ayodeji Thanni, Peter de Boves Harrington
Abstract: Support vector classifiers (SVCs) typically use radial basis function (RBF) kernels to map data into higher dimensional spaces that may improve the linear separation of otherwise nonseparable classes. We present a novel self-optimizing radial basis function support vector classifier (SO-RBFSVC) that integrates response surface methodology (RSM), two-dimensional cubic spline interpolation, and bootstrapped Latin partitions (BLPs) for automated hyperparameter tuning. The SO-RBFSVC simultaneously optimizes the RBF kernel width (σ) and cost parameter (C) using an interpolated response surface obtained from generalized prediction accuracies. The SO-RBFSVC was compared to other self-optimizing classifiers (super SVC [sSVC] and super partial least squares discriminant analysis [sPLS-DA]). Four datasets were evaluated: (i) hemp and marijuana discrimination using proton nuclear magnetic resonance spectra, (ii) barley growth location using near-infrared spectra, (iii) glass-type identification based on elemental composition, and (iv) wine cultivar classification from physicochemical properties. External validation results showed that SO-RBFSVC performed comparably to the other models, achieving error rates of 0.4 ± 0.5% for hemp/marijuana, 7 ± 1% for glass, and 6 ± 1% for wine, while outperforming the linear models with 10 ± 1% error for the barley NIR data. For the first time, generalized sensitivity analysis (GSA) was applied to quantify model linearity. GSA revealed high nonlinearity in the barley dataset, justifying a nonlinear model. The SO-RBFSVC provides robust, automated classifier tuning for low- and high-dimensional datasets, offering ease of use.
Volume 39, Issue 6
Full text (PDF): https://onlinelibrary.wiley.com/doi/epdf/10.1002/cem.70038
Citations: 0
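To make the role of the kernel width σ concrete, here is a small sketch of an RBF kernel matrix. It assumes the common parametrization exp(−‖x − y‖² / (2σ²)); the abstract does not state which scaling the authors use, and the sample points and function names are hypothetical. The response-surface tuning, spline interpolation, and Latin partitions are not reproduced.

```python
import math

def rbf_kernel(x, y, sigma):
    """Gaussian RBF similarity between two feature vectors (assumed 2*sigma^2 scaling)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))

def kernel_matrix(samples, sigma):
    """Symmetric matrix of pairwise RBF similarities."""
    return [[rbf_kernel(a, b, sigma) for b in samples] for a in samples]

samples = [(0.0, 0.0), (1.0, 0.0), (3.0, 4.0)]
narrow = kernel_matrix(samples, sigma=0.5)
wide = kernel_matrix(samples, sigma=5.0)
print([round(v, 4) for v in narrow[0]])
print([round(v, 4) for v in wide[0]])
```

A narrow σ makes every sample similar only to itself, which tends toward overfitting, while a wide σ makes all samples look alike; this trade-off, together with the cost parameter C, is what an automated tuning scheme has to balance.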
How Are Chemometric Models Validated? A Systematic Review of Linear Regression Models for NIRS Data in Food Analysis
IF 2.3, Zone 4, Chemistry
Journal of Chemometrics, Pub Date: 2025-05-19, DOI: 10.1002/cem.70036
Jokin Ezenarro, Daniel Schorn-García
Abstract: Chemometric models play a critical role in the spectroscopic analysis of food, particularly with near-infrared spectroscopy (NIRS), enabling the accurate prediction and monitoring of physicochemical properties. Although chemometric methods have proven to be useful tools in NIRS analysis, their reliability depends on rigorous validation to ensure the soundness of their predictions and their applicability. This systematic review examines validation strategies applied to regression models in NIRS-based food analysis, emphasising the use of cross-validation, external validation and figures of merit (FoM) as key evaluation tools. The literature search identified trends in validation methodologies, highlighting frequent reliance on partial least squares (PLS) regression and common flaws in validation methodologies and their reporting. While external validation is considered the best approach, many studies lack it and rely solely on cross-validation, which may lead to overoptimistic estimates of model performance. Furthermore, inconsistencies in the selection and definition of FoM hinder direct comparison across studies. This review underscores the need for increased methodological transparency and rigour in the validation of chemometric models to enhance their reliability.
Volume 39, Issue 6
Full text (PDF): https://onlinelibrary.wiley.com/doi/epdf/10.1002/cem.70036
Citations: 0
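The distinction this review draws between cross-validation and external validation can be shown with two figures of merit computed side by side. The sketch below is illustrative only: a univariate least-squares calibration stands in for PLS, the data are synthetic, and the names `rmsecv`, `cv_error`, and `ext_error` are assumptions. RMSECV comes from k-fold resampling of the calibration set, while the external error (often reported as RMSEP) comes from samples never used for fitting.

```python
import math
import random

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def rmse(errors):
    return math.sqrt(sum(e * e for e in errors) / len(errors))

def rmsecv(xs, ys, k=5):
    """Root mean squared error of k-fold cross-validation (interleaved folds)."""
    errors = []
    for fold in range(k):
        train = [(x, y) for i, (x, y) in enumerate(zip(xs, ys)) if i % k != fold]
        held_out = [(x, y) for i, (x, y) in enumerate(zip(xs, ys)) if i % k == fold]
        slope, intercept = fit_line([x for x, _ in train], [y for _, y in train])
        errors.extend(y - (slope * x + intercept) for x, y in held_out)
    return rmse(errors)

# Synthetic calibration and external sets (assumption): same linear relation, same noise.
rng = random.Random(1)
cal_x = [i * 0.5 for i in range(30)]
cal_y = [1.2 * x + 0.3 + rng.gauss(0, 0.2) for x in cal_x]
ext_x = [0.25 + i for i in range(12)]
ext_y = [1.2 * x + 0.3 + rng.gauss(0, 0.2) for x in ext_x]

cv_error = rmsecv(cal_x, cal_y)
slope, intercept = fit_line(cal_x, cal_y)
ext_error = rmse([y - (slope * x + intercept) for x, y in zip(ext_x, ext_y)])
print(f"RMSECV = {cv_error:.3f}, external RMSE = {ext_error:.3f}")
```

Here both sets come from the same distribution, so the two errors are similar; the overoptimism the review warns about appears when external samples differ from the calibration set (new batches, seasons, or instruments), which cross-validation alone cannot reveal.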
De Novo Design of HIV-1 Integrase-LEDGF/p75 Inhibitors Through Deep Reinforcement Learning and Virtual Screening
IF 2.3, Zone 4, Chemistry
Journal of Chemometrics, Pub Date: 2025-05-12, DOI: 10.1002/cem.70037
Hai-Bo Sun, Hai-Long Wu, Tong Wang, An-Qi Chen, Ru-Qin Yu
Abstract: Human immunodeficiency virus (HIV) has far-reaching impacts on global public health. Acquired immunodeficiency syndrome (AIDS) has caused millions of deaths globally, with thousands still getting infected. Therefore, developing HIV-1 integrase inhibitors is crucial for controlling AIDS by slowing virus replication and transmission. This study is grounded in the framework of deep reinforcement learning, aiming to design de novo inhibitors of the HIV-1 integrase-Lens Epithelial-Derived Growth Factor/p75 interaction and subsequently employing molecular docking to screen potential therapeutic compounds. Initially, a molecular generation model was established based on the long short-term memory algorithm and refined through transfer learning to obtain a preliminary generative model. Subsequently, a deep reinforcement learning strategy was employed, using inhibition activity as the reward value, making the model more likely to generate molecules with desirable properties. The results indicate that the reinforced generation model not only generates novel and effective SMILES structures with medicinal potential but also demonstrates strong binding affinity between the generated molecules and the target protein, as indicated by molecular docking experiments. Ultimately, through virtual screening, we identified six lead compounds with the potential to become inhibitors of the interaction between Lens Epithelial-Derived Growth Factor/p75 and HIV-1 integrase, providing an effective and practical strategy for de novo drug design of HIV-1 integrase inhibitors.
Volume 39, Issue 5
Citations: 0
A Novel Two-Parameter Estimation Technique for Handling Multicollinearity in Inverse Gaussian Regression Model
IF 2.3, Zone 4, Chemistry
Journal of Chemometrics, Pub Date: 2025-05-08, DOI: 10.1002/cem.70032
Ishrat Riaz, Aamir Sanaullah, Mustafa M. Hasaballah, Oluwafemi Samson Balogun, Mahmoud E. Bakr
Abstract: This study focuses on the prevalent issue of multicollinearity in the inverse Gaussian regression model (IGRM), which arises when predictor variables have a high degree of correlation. The typical maximum likelihood estimator (MLE) proves to be highly unstable when dealing with linearly linked regressors. As a result, the accuracy of the model may suffer because of inflated variances and inaccurate coefficient estimates. To improve parameter estimation accuracy and combat multicollinearity, this paper suggests an alternative biased estimator for the IGRM that integrates a two-parameter framework. This novel two-parameter estimator is a general estimator that takes the maximum likelihood, ridge, and Stein estimators as special cases. The theoretical characteristics of the estimator, including its bias and mean squared error (MSE), are developed and then compared theoretically with the previous estimators in terms of the mean square error matrix (MMSE) criterion. Moreover, the optimal values of the biasing parameters for the proposed estimator are also obtained. An extensive simulation study and a real-world dataset are examined to assess the practical relevance of the proposed estimator. The empirical results show that, in comparison to conventional estimators, including MLE, ridge, and Stein estimators, the suggested estimator considerably lowers the MSE and improves the parameter estimation accuracy. These results illustrate the novel approach's potential for dealing with multicollinearity in the IGRM. These findings contribute to the continuing development of reliable estimation methods for generalized linear models (GLMs).
Volume 39, Issue 5
Citations: 0
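For orientation, the special cases named in this abstract have well-known textbook forms in generalized linear models. The block below writes them in generic notation that is assumed here and may differ from the paper's: X is the design matrix, Ŵ the estimated weight matrix from the final iteration of the MLE fit, and k and c the biasing parameters. The paper's own two-parameter estimator is not given in the abstract and is deliberately not reproduced.

```latex
% Ridge-type estimator for a GLM (reduces to the MLE when k = 0)
\hat{\beta}_{\mathrm{ridge}}(k)
  = \left( X^{\top} \hat{W} X + k I_p \right)^{-1} X^{\top} \hat{W} X \, \hat{\beta}_{\mathrm{MLE}},
  \qquad k \ge 0

% Stein-type shrinkage estimator (reduces to the MLE when c = 1)
\hat{\beta}_{\mathrm{Stein}}(c) = c \, \hat{\beta}_{\mathrm{MLE}},
  \qquad 0 < c \le 1

% Scalar MSE used to compare biased estimators: total variance plus squared bias
\mathrm{MSE}(\hat{\beta})
  = \operatorname{tr}\!\left[ \operatorname{Cov}(\hat{\beta}) \right]
  + \left\lVert \mathrm{E}(\hat{\beta}) - \beta \right\rVert^{2}
```

Both special cases trade a small bias for a reduction in variance, which is why a biased estimator can achieve a lower MSE than the MLE when the regressors are strongly collinear and XᵀŴX is close to singular.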