Artificial intelligence analysis of FTIR and CD spectroscopic data for predicting and quantifying the length and content of protein secondary structures
Authors: P. Haris, J. A. Hering
Journal: Biomedical Spectroscopy and Imaging (published 2021-03-24)
DOI: 10.3233/BSI-210210 (https://doi.org/10.3233/BSI-210210)
Abstract
Alongside NMR and X-ray crystallography, FTIR and CD spectroscopy are widely considered useful for determining protein secondary structure. Both techniques can yield data in a few minutes from small quantities of protein, which makes them well suited to proteomics research. Here we explore the possibility of using artificial intelligence techniques to simultaneously analyse both FTIR and CD spectroscopic data for an identical set of proteins. Neural network analysis was carried out on normalised regions of the FTIR (1700-1600 cm−1) and CD (180-259 nm) spectral data, both with and without boxcar averaging, in order to quantify the average length and percentage content of secondary structures. A hybrid genetic algorithm/neural network approach that automatically selects structure-sensitive wavelengths/frequencies was used to quantify the protein secondary structure. Using this algorithm we also identified the region of the CD spectrum that contains the most structure-sensitive information. This region lies between 214 and 251 nm, suggesting that it alone may be sufficient to rapidly determine secondary structure content from CD spectral data. Overall, CD spectroscopic analysis produced better results than FTIR spectroscopy when selected wavelengths were used, although FTIR was better when the entire regions, 1700-1600 cm−1 (FTIR) and 180-259 nm (CD), were subjected to neural network analysis. Application of an Adaptive Neuro-Fuzzy Inference System (ANFIS) with fuzzy subtractive clustering to the spectral data led to a slightly better prediction of the average helix/sheet length for FTIR spectroscopy than for CD. Our findings reveal the potential of artificial intelligence techniques not only for extracting structural information but also for better understanding the relationship between complex spectral data and biologically important information.
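The preprocessing mentioned in the abstract (restricting a spectrum to a region, boxcar averaging, and normalising) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not specify the boxcar width, the normalisation scheme, or the order of the two steps, so the non-overlapping window of 4 points, the min-max scaling, and the averaging-before-normalising order are all assumptions. The function names and the synthetic sine-wave "spectrum" are hypothetical and stand in for real CD data.

```python
import math
from typing import List, Sequence


def select_region(axis: Sequence[float], values: Sequence[float],
                  low: float, high: float) -> List[float]:
    """Keep only the intensities whose axis value lies in [low, high]."""
    return [v for x, v in zip(axis, values) if low <= x <= high]


def boxcar_average(values: Sequence[float], width: int) -> List[float]:
    """Average consecutive, non-overlapping windows of `width` points.

    Assumption: the boxcar is used to reduce the number of inputs.
    A trailing partial window is averaged over the points it contains.
    """
    if width < 1:
        raise ValueError("width must be at least 1")
    averaged = []
    for start in range(0, len(values), width):
        window = values[start:start + width]
        averaged.append(sum(window) / len(window))
    return averaged


def min_max_normalise(values: Sequence[float]) -> List[float]:
    """Scale values linearly to the range [0, 1] (assumed scheme)."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # A flat spectrum carries no shape information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lowest) / (highest - lowest) for v in values]


# Synthetic stand-in for a CD spectrum sampled every 1 nm from 180 to 259 nm.
wavelengths = list(range(180, 260))
signal = [math.sin(w / 10.0) for w in wavelengths]

region = select_region(wavelengths, signal, 180, 259)
features = min_max_normalise(boxcar_average(region, width=4))
print(len(region), len(features))
```

With a 1 nm step the 180-259 nm region holds 80 points, which a window of 4 reduces to 20 values that could then serve as neural network inputs. The same functions would apply to the 1700-1600 cm−1 FTIR region by passing wavenumbers as the axis.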
About the journal:
Biomedical Spectroscopy and Imaging (BSI) is a multidisciplinary journal devoted to the timely publication of basic and applied research that uses spectroscopic and imaging techniques in different areas of life science including biology, biochemistry, biotechnology, bionanotechnology, environmental science, food science, pharmaceutical science, physiology and medicine. Scientists are encouraged to submit their work for publication in the form of original articles, brief communications, rapid communications, reviews and mini-reviews. Techniques covered include, but are not limited to, the following: • Vibrational Spectroscopy (Infrared, Raman, Terahertz) • Circular Dichroism Spectroscopy • Magnetic Resonance Spectroscopy (NMR, ESR) • UV-vis Spectroscopy • Mössbauer Spectroscopy • X-ray Spectroscopy (Absorption, Emission, Photoelectron, Fluorescence) • Neutron Spectroscopy • Mass Spectroscopy • Fluorescence Spectroscopy • X-ray and Neutron Scattering • Differential Scanning Calorimetry • Atomic Force Microscopy • Surface Plasmon Resonance • Magnetic Resonance Imaging • X-ray Imaging • Electron Imaging • Neutron Imaging • Raman Imaging • Infrared Imaging • Terahertz Imaging • Fluorescence Imaging • Near-infrared Spectroscopy.