Proceedings of the ... SIAM International Conference on Data Mining. SIAM International Conference on Data Mining: Latest Articles

Sparse Representation for Prediction of HIV-1 Protease Drug Resistance.
Xiaxia Yu, Irene T Weber, Robert W Harrison
Abstract: HIV rapidly evolves drug resistance in response to antiviral drugs used in AIDS therapy. Estimating the specific resistance of a given strain of HIV to individual drugs from sequence data has important benefits for both the therapy of individual patients and the development of novel drugs. We have developed an accurate classification method based on sparse representation theory, and demonstrate that this method is highly effective with HIV-1 protease. The protease structure is represented using our newly proposed encoding method based on Delaunay triangulation, and combined with the mutated amino acid sequences of known drug-resistant strains to train a machine-learning algorithm both for classification and regression of drug-resistant mutations. An overall cross-validated classification accuracy of 97% is obtained when trained on a publicly available database of approximately 1.5×10^4 known sequences (Stanford HIV database http://hivdb.stanford.edu/cgi-bin/GenoPhenoDS.cgi). Resistance to four FDA-approved drugs is computed and comparisons with other algorithms demonstrate that our method shows significant improvements in classification accuracy.
Proceedings of the ... SIAM International Conference on Data Mining. SIAM International Conference on Data Mining, vol. 2013, pp. 342-349. Published 2013-01-01. DOI: https://doi.org/10.1137/1.9781611972832.38
Citations: 16
Sampling Strategies to Evaluate the Performance of Unknown Predictors.
Hamed Valizadegan, Saeed Amizadeh, Milos Hauskrecht
Abstract: The focus of this paper is on how to select a small sample of examples for labeling that can help us to evaluate many different classification models unknown at the time of sampling. We are particularly interested in studying the sampling strategies for problems in which the prevalence of the two classes is highly biased toward one of the classes. The evaluation measures of interest we want to estimate as accurately as possible are those obtained from the contingency table. We provide a careful theoretical analysis of sensitivity, specificity, and precision and show how sampling strategies should be adapted to the rate of skewness in data in order to effectively compute the three aforementioned evaluation measures.
Proceedings of the ... SIAM International Conference on Data Mining. SIAM International Conference on Data Mining, vol. 2012, pp. 494-505. Published 2012-01-01. DOI: https://doi.org/10.1137/1.9781611972825.43
Citations: 2
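For readers unfamiliar with the three contingency-table measures named in the abstract above, the following is a minimal Python sketch of how sensitivity, specificity, and precision are computed from binary labels. It does not reproduce the paper's sampling strategies; the function name `contingency_metrics` and the toy labels are assumptions made purely for illustration.

```python
def contingency_metrics(y_true, y_pred):
    """Compute sensitivity, specificity, and precision for binary labels (1 = positive).

    Hypothetical helper for illustration; not taken from the paper.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    nan = float("nan")
    return {
        # Fraction of actual positives that were predicted positive.
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        # Fraction of actual negatives that were predicted negative.
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
        # Fraction of predicted positives that are actually positive.
        "precision": tp / (tp + fp) if (tp + fp) else nan,
    }


# Toy skewed sample: 2 positives, 8 negatives.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
print(contingency_metrics(y_true, y_pred))
# sensitivity = 0.5, specificity = 0.875, precision = 0.5
```

With only two positives in the sample, a single mislabeled example moves sensitivity by 0.5, which illustrates why the rate of skew matters when choosing which examples to label.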
Revenue Generation in Hospital Foundations: Neural Network versus Regression Model Recommendations
M. Malliaris, M. Pappas
Abstract: This paper looks at revenue amounts generated by non-profit hospital foundations throughout the US. A number of inputs, including, among others, compensation, type of support given to the hospital, type of foundation expenditures, and hospital size, were used to develop models of foundation revenue. Both neural network and regression models were developed and compared in order to see which one gave a better model and to see how they ranked the relative value of the input variables. Though the generated value of revenue for both models correlates highly with actual revenue, the neural network shows smaller error. The order of variable importance for the models is very different. Each model would have different implications for foundations in planning their next round of revenue generating events.
Proceedings of the ... SIAM International Conference on Data Mining. SIAM International Conference on Data Mining, vol. 118 1, pp. 181-186. Published 2011-01-25. DOI: https://doi.org/10.19030/IJMIS.V15I1.1596
Citations: 4
Generalized and Heuristic-Free Feature Construction for Improved Accuracy.
Wei Fan, Erheng Zhong, Jing Peng, Olivier Verscheure, Kun Zhang, Jiangtao Ren, Rong Yan, Qiang Yang
Abstract: State-of-the-art learning algorithms accept data in feature vector format as input. Examples belonging to different classes may not always be easy to separate in the original feature space. One may ask: can transformation of existing features into a new space reveal significant discriminative information not obvious in the original space? First, since there can be an infinite number of ways to extend features, it is impractical to first enumerate and then perform feature selection. Second, evaluation of discriminative power on the complete dataset is not always optimal. This is because features highly discriminative on a subset of examples may not necessarily be significant when evaluated on the entire dataset. Third, feature construction ought to be automated and general, such that it does not require domain knowledge and its improved accuracy holds across a large number of classification algorithms. In this paper, we propose a framework to address these problems through the following steps: (1) divide and conquer to avoid exhaustive enumeration; (2) local feature construction and evaluation within subspaces of examples where local error is still high and the features constructed thus far still do not predict well; (3) weighting-rule-based search that is free of domain knowledge and has a provable performance guarantee. Empirical studies indicate that significant improvement (as much as 9% in accuracy and 28% in AUC) is achieved using the newly constructed features over a variety of inductive learners evaluated against a number of balanced, skewed and high-dimensional datasets. Software and datasets are available from the authors.
Proceedings of the ... SIAM International Conference on Data Mining. SIAM International Conference on Data Mining, vol. 2010, pp. 629-640. Published 2010-01-01. DOI: https://doi.org/10.1137/1.9781611972801.55
Citations: 28
Anomaly Detection Using the Dempster-Shafer Method
Qi Chen, U. Aickelin
Abstract: In this paper, we implement an anomaly detection system using the Dempster-Shafer method. Using two standard benchmark problems we show that by combining multiple signals it is possible to achieve better results than by using a single signal. We further show that by applying this approach to a real-world email dataset the algorithm works for email worm detection. Dempster-Shafer can be a promising method for anomaly detection problems with multiple features (data sources), and two or more classes.
Proceedings of the ... SIAM International Conference on Data Mining. SIAM International Conference on Data Mining, vol. 1 1, pp. 232-240. Published 2008-03-11. DOI: https://doi.org/10.2139/SSRN.2831339
Citations: 34
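The abstract above describes combining multiple signals with the Dempster-Shafer method. As a rough illustration of the general idea, here is a minimal Python sketch of Dempster's rule of combination for two mass functions over a two-class frame. It is not the authors' system: the function name `combine`, the class labels, and the mass values are all assumptions chosen for the example.

```python
from itertools import product


def combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset of hypotheses to its mass, and the
    masses in each function are assumed to sum to 1. Hypothetical helper
    for illustration only.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            # Agreeing evidence: mass goes to the intersection of the two sets.
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Contradictory evidence: accumulate the conflict mass.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; the rule is undefined")
    # Renormalize so the remaining masses sum to 1.
    return {k: v / (1.0 - conflict) for k, v in combined.items()}


# Assumed frame of discernment: a message is either normal or anomalous.
ANOMALY = frozenset({"anomaly"})
NORMAL = frozenset({"normal"})
EITHER = frozenset({"normal", "anomaly"})  # mass assigned to "don't know"

signal_1 = {ANOMALY: 0.6, EITHER: 0.4}
signal_2 = {ANOMALY: 0.5, NORMAL: 0.2, EITHER: 0.3}

for hypothesis, mass in combine(signal_1, signal_2).items():
    print(sorted(hypothesis), round(mass, 4))
```

In this toy case the combined mass on "anomaly" is higher than either signal assigns on its own, which is the kind of effect the abstract refers to when it reports better results from multiple signals than from a single one.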