Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics: Latest Articles

Machine Learning Classification of Antimicrobial Peptides Using Reduced Alphabets
M. Othman, Sujay Ratna, Anant Tewari, Anthony M. Kang, I. Vaisman
{"title":"Machine Learning Classification of Antimicrobial Peptides Using Reduced Alphabets","authors":"M. Othman, Sujay Ratna, Anant Tewari, Anthony M. Kang, I. Vaisman","doi":"10.1145/3233547.3233657","DOIUrl":"https://doi.org/10.1145/3233547.3233657","url":null,"abstract":"Antimicrobial peptides (AMPs) are being considered as a promising replacement for antibiotics. They take action in the body's adaptive immune system. While their effect inside the body is primarily known, the problem of correctly identifying AMPs based on their sequence features remains a subject of active investigation. Here we optimize the use of the reduced alphabet, which simplifies the 20-letter amino acid alphabet to 2-4 letters, and the use of N-grams, short strings of amino acids, to find a correlation between a profile of N-gram frequencies. The calculations were carried out using Java programs written for this study and WEKA machine learning software. Classification using machine learning methods was then conducted for AMP subclasses, including antibacterial, antifungal, and antiviral peptides. The results show that reduced alphabets with N-gram frequency analysis are a promising alternative in the area of AMP classification and prediction. All AMP sequences were retrieved from different sources. The AMP set consists of 7984 sequences, not necessarily of any specific class. We also used class-specific AMP sets (antibacterial, antiviral, and antifungal). A raw negative set consisting of 20258 non-AMPs was built using sequence fragments from annotated protein sequence databases. The classification of AMPs against non-AMPs was successful. Models achieved a maximum accuracy of 87.71% using frequency N-gram analysis, alphabet reduction option 47, and the RF model with 10 trees cross-validation. Classification using more specific classes of AMPs was conducted next. First, classification of ABPs against non-ABP AMPs achieved a maximum accuracy of 86.83% using frequency N-gram analysis, alphabet reduction option 47, and the RF model, while the bagging algorithm achieved 84.35%. Second, classification of AVPs against non-AVP AMPs achieved accuracies of 92.75% and 92.30% using frequency N-gram analysis, alphabet reduction options 47 and 29, respectively, with the RF model. This experiment also consisted of many other successful trials. RF significantly outperforms each of the other six learning algorithms. Alphabet reduction 47 most often yielded the highest classification accuracies. This finding implies that a 4-cluster alphabet is optimal for N-gram frequency analysis and machine learning. Our results suggest that the classifiers produced possess great predictive power and can be of significant use in various biological and medical applications, potentially saving tens or hundreds of thousands of lives.","PeriodicalId":131906,"journal":{"name":"Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122509419","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
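The reduced-alphabet N-gram profile described in the abstract above can be illustrated with a minimal Python sketch. The four-group mapping below (`GROUPS`) is a hypothetical grouping by rough physicochemical class, chosen only for illustration; it is not the paper's "alphabet reduction option 47", whose definition is not given in this listing. The function names are likewise assumptions, and the authors' own implementation used Java programs with WEKA.

```python
from collections import Counter
from itertools import product

# Hypothetical 4-letter reduced alphabet (an assumption for illustration,
# NOT the paper's "option 47"): each group letter stands for several residues.
GROUPS = {
    "H": "AVLIMFWC",  # hydrophobic
    "P": "STNQYGP",   # polar or small
    "B": "KRH",       # basic
    "A": "DE",        # acidic
}
# Invert the grouping into a residue -> group-letter lookup table.
REDUCE = {aa: group for group, members in GROUPS.items() for aa in members}


def reduce_sequence(seq):
    """Map a peptide to the reduced alphabet, skipping unknown residues."""
    return "".join(REDUCE[aa] for aa in seq.upper() if aa in REDUCE)


def ngram_frequencies(seq, n=2):
    """Return relative frequencies of all reduced-alphabet N-grams in seq."""
    reduced = reduce_sequence(seq)
    total = len(reduced) - n + 1  # number of overlapping N-grams
    counts = Counter(reduced[i:i + n] for i in range(max(total, 0)))
    alphabet = sorted(GROUPS)
    # Emit every possible N-gram so all peptides share one fixed-length profile.
    return {
        "".join(gram): (counts["".join(gram)] / total if total > 0 else 0.0)
        for gram in product(alphabet, repeat=n)
    }
```

With a 4-letter alphabet and n=2 the profile has 16 entries, which is what makes it usable as a fixed-length feature vector for a classifier such as a random forest.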
KmerEstimate: A Streaming Algorithm for Estimating k-mer Counts with Optimal Space Usage
S. Behera, Sutanu Gayen, J. Deogun, N. V. Vinodchandran
{"title":"KmerEstimate: A Streaming Algorithm for Estimating k-mer Counts with Optimal Space Usage","authors":"S. Behera, Sutanu Gayen, J. Deogun, N. V. Vinodchandran","doi":"10.1145/3233547.3233587","DOIUrl":"https://doi.org/10.1145/3233547.3233587","url":null,"abstract":"The frequency distribution of k-mers (substrings of length k in a DNA/RNA sequence) is very useful for many bioinformatics applications that use next-generation sequencing (NGS) data. Some examples of these include de Bruijn graph based assembly, read error correction, genome size prediction, and digital normalization. In developing tools for such applications, counting (or estimating) k-mers with low frequency is a pre-processing phase. However, computing the k-mer frequency histogram becomes computationally challenging for large-scale genomic data. We present KmerEstimate, a streaming algorithm that approximates the count of k-mers with a given frequency in a genomic data set. Our algorithm is based on a well-known adaptive sampling based streaming algorithm due to Bar-Yossef et al. for approximating distinct elements in a data stream. We implemented and tested our algorithm on several data sets. The results of our algorithm are better than those of other streaming approaches used so far for this problem (notably ntCard, the state-of-the-art streaming approach) and are within a 0.6% error rate. It uses less memory than ntCard as the sample size is almost 85% less than that of ntCard. In addition, our algorithm has provable approximation and space usage guarantees. We also show certain space complexity lower bounds. The source code of our algorithm is available at https://github.com/srbehera11/KmerEstimate.","PeriodicalId":131906,"journal":{"name":"Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122982487","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 9
Identification of Non-invasive Cytokine Biomarkers for Polycystic Ovary Syndrome Using Supervised Machine Learning
D. S. Perry, J. Gunawardena, N. Orsi
{"title":"Identification of Non-invasive Cytokine Biomarkers for Polycystic Ovary Syndrome Using Supervised Machine Learning","authors":"D. S. Perry, J. Gunawardena, N. Orsi","doi":"10.1145/3233547.3233611","DOIUrl":"https://doi.org/10.1145/3233547.3233611","url":null,"abstract":"Polycystic ovary syndrome (PCOS) is a common endocrine disorder that affects up to 20% of women; however, diagnosis is commonly unreliable and non-quantitative. Here we use supervised machine learning and measurements of 51 cytokines from a large cohort of patients to identify a low-dimensional set of potential biomarkers for diagnosis of PCOS. Both whole blood and individual follicular fluid (FF) aspirates were collected from women during pre-intracytoplasmic sperm injection with in vitro fertilization (ICSI/IVF) oocyte retrieval and linked with patients' PCOS status as diagnosed by the Rotterdam criteria (n = 69 PCOS, n = 222 non-PCOS). We trained a binary support vector machine (SVM) using a random subset of patient data to determine the cytokine profile associated with PCOS. Our resultant model includes 3 variables and is 76% accurate. This provides insight into the immunological basis of PCOS and may define a potential non-invasive quantitative strategy for diagnosis.","PeriodicalId":131906,"journal":{"name":"Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics","volume":"114 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125157505","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Prediction of Clinical Outcomes of Spinal Muscular Atrophy Using Motion Tracking Data and Elastic Net Regression
David Chen, S. Rust, Enju Lin, Simon M. Lin, Leslie Nelson, L. Alfano, L. Lowes
{"title":"Prediction of Clinical Outcomes of Spinal Muscular Atrophy Using Motion Tracking Data and Elastic Net Regression","authors":"David Chen, S. Rust, Enju Lin, Simon M. Lin, Leslie Nelson, L. Alfano, L. Lowes","doi":"10.1145/3233547.3233572","DOIUrl":"https://doi.org/10.1145/3233547.3233572","url":null,"abstract":"Spinal muscular atrophy (SMA) is a common muscle disease that can lead to a high rate of infant mortality. It is important to be able to quickly and accurately diagnose SMA as well as track disease progression throughout the treatment process. This study introduced a framework for deriving movement features from motion tracking data, and applied a regularized regression method to predict the gold standard clinical measure for SMA, the CHOP INTEND Extremities Scores (CIES). Our results showed that the CIES could be predicted with good accuracy using derived motion features and Elastic Net regression. An RMSE of 8.5 points on CIES was achieved in both cross-validation and prediction on the held-out set. A high ROC-AUC of 0.91 was achieved for discriminating SMA infants from controls at both the session and subject levels. It was concluded that motion tracking devices could potentially be used as a low-cost yet effective method to assess and monitor infants with SMA.","PeriodicalId":131906,"journal":{"name":"Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics","volume":"73 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115215407","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Detecting Chromosomal Inversions from Dense SNPs by Combining PCA and Association Tests
R. J. Nowling, S. Emrich
{"title":"Detecting Chromosomal Inversions from Dense SNPs by Combining PCA and Association Tests","authors":"R. J. Nowling, S. Emrich","doi":"10.1145/3233547.3233571","DOIUrl":"https://doi.org/10.1145/3233547.3233571","url":null,"abstract":"Principal Component Analysis (PCA) of dense single nucleotide polymorphism (SNP) data has wide-ranging applications in population genetics, including detection of chromosomal inversions. SNPs associated with each PC can be identified through single-SNP association tests performed between SNP genotypes and PC coordinates; this approach has several advantages over thresholding loading factors or sparse PCA methods. Insect vector SNP data often have a high proportion of unknown (uncalled) genotypes, however, that cannot be reliably imputed and prevent the direct usage of association tests. Building on our previous work, we propose a novel method for adjusting the association tests to handle these unknown genotypes. We demonstrate the utility of the method through two applications: detecting chromosomal inversions and characterizing differentiation processes captured by PCA. When applied to SNP data from the 2L and 2R chromosome arms of 34 karyotyped Anopheles gambiae and Anopheles coluzzii mosquitoes, our method clearly identifies the 2La, 2Rb, 2Rc, 2Rj, and 2Ru inversions. Using our method to identify SNPs associated with 2L-PC3, we observed one of the two insecticide-resistance variants in the Rdl gene; our results suggest that the PC is capturing differentiation driven by insecticide usage.","PeriodicalId":131906,"journal":{"name":"Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121410571","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
clustQ
R. Alapati, Debswapna Bhattacharya
{"title":"clustQ","authors":"R. Alapati, Debswapna Bhattacharya","doi":"10.1145/3233547.3233570","DOIUrl":"https://doi.org/10.1145/3233547.3233570","url":null,"abstract":"The structure of a protein largely determines its functional properties. Hence, knowledge of the protein's 3D structure is an important aspect in determining solutions to fundamental biological problems. Structure prediction algorithms generally employ a clustering algorithm to select the optimal model for a target from a large number of predicted conformations (a.k.a. decoys). Despite significant advancement in clustering-based optimal decoy selection methods, these approaches often cannot deliver high performance in terms of the time taken to cluster a large number of protein structures owing to the computational cost associated with pairwise structural superpositions. Here, we propose a superposition-free approach to protein decoy clustering, called clustQ, based on weighted internal distance comparisons. Experimental results suggest that the novel weighting scheme is helpful in both reproducing the decoy-native similarity score and estimating the pairwise clustering based predicted quality score in a computationally efficient manner. clustQ attains performance comparable to the state-of-the-art multi-model decoy quality estimation methods participating in the latest Critical Assessment of protein Structure Prediction (CASP) experiments irrespective of target difficulty. Moreover, the clustQ predicted score offers a unique way to reliably estimate target difficulty without knowledge of the experimental structure. clustQ is freely available at http://watson.cse.eng.auburn.edu/clustQ/.","PeriodicalId":131906,"journal":{"name":"Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics","volume":"229 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115888453","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
Use of the Informatics for Integrating Biology and the Bedside (i2b2) Population to Test Serum Bilirubin Levels and Risk for Inflammatory Bowl Diseases and the Involvement of Uridine Glucuronosyltransferase Genes
C. Gallagher
{"title":"Use of the Informatics for Integrating Biology and the Bedside (i2b2) Population to Test Serum Bilirubin Levels and Risk for Inflammatory Bowl Diseases and the Involvement of Uridine Glucuronosyltransferase Genes","authors":"C. Gallagher","doi":"10.1145/3233547.3233638","DOIUrl":"https://doi.org/10.1145/3233547.3233638","url":null,"abstract":"Chronic inflammation associated with inflammatory bowel disease (IBD) results in increased oxidative stress that damages the colonic microenvironment. A low level of serum bilirubin, an endogenous antioxidant, has been associated with increased risk for Crohn's disease (CD), but no study has tested another common IBD ulcerative colitis (UC). Bilirubin is metabolized in the liver by uridine glucuronosyltransferase 1A1 (UGT1A1) exclusively. Genetic variants cause functional changes in UGT1A1 which result in hyperbilirubinemia, which can be toxic to tissues if untreated and results in a characteristic jaundiced appearance. Approximately 10% of the Caucasian population is homozygous for the microsatellite polymorphism UGT1A1*28, which results in increased total serum bilirubin levels due to reduced transcriptional efficiency of UGT1A1 and an overall 70% reduction in UGT1A1 enzymatic activity. The aim of this study was to examine whether bilirubin levels are associated with the risk for ulcerative colitis (UC). Using the Informatics for Integrating Biology and the Bedside (i2b2), a large case-control population was identified from a single tertiary care center, Penn State Hershey Medical Center (PSU). Similarly, a validation cohort was identified at Virginia Commonwealth University Medical Center. Logistic regression analysis was performed to determine the risk of developing UC with lower concentrations of serum bilirubin. From the PSU cohort, a subset of terminal ileum tissue was obtained at the time of surgical resection to analyze UGT1A1 gene expression (which encodes the enzyme responsible for bilirubin metabolism). 
Similar to CD patients, UC patients also demonstrated reduced levels of total serum bilirubin. Upon segregating serum bilirubin levels into quartiles, risk of UC increased with reduced concentrations of serum bilirubin. These results were confirmed in our validation cohort. UGT1A1 gene expression was up-regulated in the terminal ileum of a subset of UC patients. Lower levels of the antioxidant bilirubin may reduce the capability of UC patients to remove reactive oxygen species leading to an increase in intestinal injury. One potential explanation for these lower bilirubin levels may be up-regulation of UGT1A1 gene expression, which encodes the only enzyme involved in conjugating bilirubin. Therapeutics that reduce oxidative stress may be beneficial for these patients.","PeriodicalId":131906,"journal":{"name":"Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116129189","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
ACM Notice of Article Removal: Deep Learning Based Medical Diagnosis System Using Multiple Data Sources - originally published in the ACM Digital Library on 29-Aug-2018
Qinghan Xue, M. Chuah
{"title":"ACM Notice of Article Removal: Deep Learning Based Medical Diagnosis System Using Multiple Data Sources - originally published in the ACM Digital Library on 29-Aug-2018","authors":"Qinghan Xue, M. Chuah","doi":"10.1145/3233547.3233730","DOIUrl":"https://doi.org/10.1145/3233547.3233730","url":null,"abstract":"Recently, many researchers have conducted data mining over medical data to uncover hidden patterns and use them to learn prediction models for clinical decision making and personalized medicine. While such healthcare learning models can achieve encouraging results, they seldom incorporate existing expert knowledge into their frameworks and hence prediction accuracy for individual patients can still be improved. However, expert knowledge spans across various websites and multiple databases with heterogeneous representations and hence is difficult to harness for improving learning models. In addition, patients' queries at medical consult websites are often ambiguous in their specified terms and hence the returned responses may not contain the information they seek. To tackle these problems, we first design a knowledge extraction framework that can generate an aggregated dataset to characterize diseases by integrating heterogeneous medical data sources. Then, based on the integrated dataset, we propose an end-to-end deep learning based medical diagnosis system (DL-MDS) to provide disease diagnosis for authorized users. 
Evaluations on real-world data demonstrate that our proposed system achieves good performance on diseases diagnosis with a diverse set of patients' queries.","PeriodicalId":131906,"journal":{"name":"Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128810092","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Knowledge Extraction of Long-Term Complications from Clinical Narratives of Blood Cancer Patients with HCT Treatments
Weizhong Zhu, J. B. Teh, Haiqing Li, S. Armenian
{"title":"Knowledge Extraction of Long-Term Complications from Clinical Narratives of Blood Cancer Patients with HCT Treatments","authors":"Weizhong Zhu, J. B. Teh, Haiqing Li, S. Armenian","doi":"10.1145/3233547.3233635","DOIUrl":"https://doi.org/10.1145/3233547.3233635","url":null,"abstract":"Interactive information extraction (IE) systems supported by biomedical ontologies are intelligent natural language processing (NLP) tools to understand literature and clinical narratives and discover meaningful domain knowledge from unstructured text. This study developed integrated IE systems to detect treatment complications of blood cancer patients from Electronic Medical Records (EMR) in the Long-Term Follow-Up (LTFU) protocol following Hematopoietic Cell Transplantation (HCT). The performance of the proposed approach was very encouraging compared to the gold-standard datasets manually reviewed by domain experts. In addition, the NLP system identified a significant number of cases not caught by experts.","PeriodicalId":131906,"journal":{"name":"Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128510142","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
MAPS
Jinbu Wang, B. Chen
{"title":"MAPS","authors":"Jinbu Wang, B. Chen","doi":"10.1145/3233547.3233710","DOIUrl":"https://doi.org/10.1145/3233547.3233710","url":null,"abstract":"The adaptive immune system is a defense system against repeated infection. In order to trigger the immune response, antigen peptides from the infecting agent must first be recognized by the Major Histocompatibility Complex (MHC) proteins. Identifying peptides that bind to MHC class II is thus a critical step in vaccine development. We hypothesize that comparing individual subsites of the peptide binding groove could predict the individual amino acids of possible antigens. This modularized approach to individual subsites could reduce the amount of training data needed for accurate classification while also reducing computing times associated with molecular simulation and docking. To test this hypothesis, we evaluated the capability of two classification techniques and multiple modular representations of the MHC subsites to correctly classify the binding preference categories of P1 subsites of MHC class II structures. Our results show that the average accuracies are 0.87 for K-means and 0.95 for SVM with all feature vector configurations. Our results demonstrate that accurate predictions on individual binding subsites are possible, pointing to larger scale applications predicting whole-peptide preferences.","PeriodicalId":131906,"journal":{"name":"Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics","volume":"176 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121616889","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0