Domen Hudnik, Naja Bohanec, Igor Drobnak, Peter Ernst, Alexander Hanke, Matej Horvat, Franz Innerbichler, Miha Mikelj, Tilen Praper, Vasja Progar, Nika Valenčič, Matjaž Omladič
Journal of Chemometrics, volume 37, issue 12. Published 2023-10-24. DOI: 10.1002/cem.3521 (https://onlinelibrary.wiley.com/doi/10.1002/cem.3521)
Automatic peak annotation and area estimation of glycan map peaks directly from chromatograms
Recent advances in analytics, such as automated sample preparation and high-throughput methods, have made the evaluation of analytical results the current bottleneck in biosimilar bioprocess development. Automated chromatogram integration and annotation is currently efficient only for simple chromatograms. In the increasingly competitive field of biosimilars, this is a serious drawback, because the chromatographic analytical methods that provide some of the most valuable physicochemical quality attributes of the product also require careful chromatogram integration and annotation. This work focuses on the glycan mapping analytical method as used in the development of monoclonal antibody biosimilars and evaluates more than 2000 chromatograms spanning the life cycle of multiple biosimilar development projects. It proposes a modified workflow that uses machine learning algorithms to determine the proportion of specific relevant glycan species in a sample directly from the chromatogram. Data preparation and analysis are performed using a pipeline approach. A pipeline is a modular data-processing design in which the signal “travels” through a series of active modules. Each module performs a specific function or transformation on the signal and passes the transformed signal to the next module. The pipeline is designed so that modules can be improved and exchanged independently. The module functions currently implemented are chromatogram resampling by spline interpolation, baseline removal by asymmetric least squares, peak alignment using parametric time warping, and quantification of the relative proportion of a glycan species using partial least squares regression. The hyper-parameters of the pipeline are then optimized using the Nelder–Mead method.
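The modular design described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`resample`, `als_baseline`, `run_pipeline`), the default values of `lam` and `p`, and the synthetic chromatogram are all assumptions made for illustration. Only two of the four modules are shown (spline resampling and asymmetric least squares baseline removal); parametric time warping and partial least squares regression are omitted for brevity.

```python
import numpy as np
from scipy import sparse
from scipy.interpolate import CubicSpline
from scipy.sparse.linalg import spsolve


def resample(t, y, t_new):
    """Resample a chromatogram onto a common time grid with a cubic spline."""
    return CubicSpline(t, y)(t_new)


def als_baseline(y, lam=1e5, p=0.01, n_iter=10):
    """Estimate a baseline by asymmetric least squares (illustrative defaults)."""
    n = len(y)
    # Second-difference operator; lam penalises roughness of the baseline.
    D = sparse.diags([1.0, -2.0, 1.0], [0, 1, 2], shape=(n - 2, n))
    penalty = lam * (D.T @ D)
    w = np.ones(n)
    for _ in range(n_iter):
        z = spsolve((sparse.diags(w) + penalty).tocsc(), w * y)
        # Points above the current baseline (peaks) get the small weight p.
        w = np.where(y > z, p, 1.0 - p)
    return z


def run_pipeline(signal, modules):
    """Pass the signal through each module in series."""
    for module in modules:
        signal = module(signal)
    return signal


# Synthetic chromatogram (assumption): one Gaussian peak on a sloped baseline.
t_raw = np.linspace(0.0, 10.0, 300)
y_raw = np.exp(-0.5 * ((t_raw - 5.0) / 0.2) ** 2) + 0.5 + 0.05 * t_raw

# Each module maps a (time, intensity) pair to a new (time, intensity) pair,
# so modules can be swapped or improved independently.
t_grid = np.linspace(0.0, 10.0, 501)
modules = [
    lambda s: (t_grid, resample(s[0], s[1], t_grid)),
    lambda s: (s[0], s[1] - als_baseline(s[1])),
]
t_out, y_out = run_pipeline((t_raw, y_raw), modules)
```

Because every module shares the same input and output shape, replacing the baseline step with another method only requires changing one entry of the hypothetical `modules` list.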
The approach stands out for its ability to accommodate a broad range of samples, covering multiple proteins at different stages of biosimilar development, analyzed using different adaptations of the glycan map analytical method. The pipeline offers an intuitive, flexible, and creatively simple method design capable of providing reliable results for a wide range of glycan species essential for biosimilar development. It enables transparent, faster, and less subjective evaluation of analytical raw data (from sample to result). Furthermore, our automated approach maintained an accuracy comparable with manual integration, demonstrating its readiness for implementation in a conservative and highly regulated environment. The presented methodology reduces the cost and time of biosimilar development and should be applicable to any chromatogram-based analytical method.
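The hyper-parameter optimization mentioned in the abstract can be sketched along the following lines. This is a toy setup under stated assumptions, not the authors' procedure: the chromatograms are synthetic, the reference proportions stand in for manual integration results, quantification uses simple window sums in place of partial least squares regression, and the names `pipeline_error`, `make_chromatogram`, `window_a`, and `window_b` are hypothetical. The two tuned hyper-parameters are the smoothness `lam` and asymmetry `p` of the baseline step, searched in log10 space with SciPy's Nelder–Mead implementation.

```python
import numpy as np
from scipy import sparse
from scipy.optimize import minimize
from scipy.sparse.linalg import spsolve


def als_baseline(y, lam, p, n_iter=10):
    """Asymmetric least squares baseline estimate."""
    n = len(y)
    D = sparse.diags([1.0, -2.0, 1.0], [0, 1, 2], shape=(n - 2, n))
    penalty = lam * (D.T @ D)
    w = np.ones(n)
    for _ in range(n_iter):
        z = spsolve((sparse.diags(w) + penalty).tocsc(), w * y)
        w = np.where(y > z, p, 1.0 - p)
    return z


t = np.linspace(0.0, 10.0, 201)
window_a = np.abs(t - 4.0) < 1.0  # integration window of the peak of interest
window_b = np.abs(t - 6.0) < 1.0  # integration window of the other peak


def make_chromatogram(frac, drift):
    """Two equal-width peaks with area split frac : (1 - frac) on a drifting baseline."""
    peak_a = frac * np.exp(-0.5 * ((t - 4.0) / 0.2) ** 2)
    peak_b = (1.0 - frac) * np.exp(-0.5 * ((t - 6.0) / 0.2) ** 2)
    return peak_a + peak_b + 0.3 + drift * t


# Reference proportions (assumption: stand-ins for manual integration results).
fractions = np.array([0.2, 0.5, 0.8])
chromatograms = [make_chromatogram(f, d) for f, d in zip(fractions, [0.02, 0.06, 0.10])]


def pipeline_error(theta):
    """RMSE between estimated and reference proportions for given hyper-parameters."""
    lam = 10.0 ** np.clip(theta[0], 0.0, 9.0)
    p = 10.0 ** np.clip(theta[1], -4.0, -0.3)
    errors = []
    for frac, y in zip(fractions, chromatograms):
        corrected = y - als_baseline(y, lam, p)
        area_a = corrected[window_a].sum()
        total = area_a + corrected[window_b].sum()
        if total <= 0.0:
            return 1.0  # degenerate baseline; report the worst possible error
        errors.append(area_a / total - frac)
    return float(np.sqrt(np.mean(np.square(errors))))


x0 = np.array([3.0, -1.0])  # deliberately rough start: lam = 1e3, p = 0.1
start_error = pipeline_error(x0)
result = minimize(pipeline_error, x0, method="Nelder-Mead",
                  options={"maxiter": 40, "xatol": 1e-2, "fatol": 1e-6})
```

Nelder–Mead is derivative-free, which suits an objective like this one that is evaluated by running the whole pipeline. Inspecting `result.x` and comparing `result.fun` with `start_error` shows how far the search moved from the starting point.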
About the journal:
The Journal of Chemometrics is devoted to the rapid publication of original scientific papers, reviews and short communications on fundamental and applied aspects of chemometrics. It also provides a forum for the exchange of information on meetings and other news relevant to the growing community of scientists who are interested in chemometrics and its applications. Short, critical review papers are a particularly important feature of the journal, in view of the multidisciplinary readership at which it is aimed.