Target-Focused Feature Selection Using Uncertainty Measurements in Healthcare Data

Impact factor: 8

ACM Transactions on Computing for Healthcare. Pub Date: 2020-05-30. DOI: 10.1145/3383685

Orpaz Goldstein, Mohammad Kachuee, Kimmo Kärkkäinen, M. Sarrafzadeh


Citations: 1

Abstract


Healthcare big data remains under-utilized due to various incompatibility issues between the domains of data analytics and healthcare. The lack of generalizable iterative feature acquisition methods under budget and machine learning models that allow reasoning with a model’s uncertainty are two examples. Meanwhile, a boost to the available data is currently under way with the rapid growth in the Internet of Things applications and personalized healthcare. For the healthcare domain to be able to adopt models that take advantage of this big data, machine learning models should be coupled with more informative, germane feature acquisition methods, consequently adding robustness to the model’s results. We introduce an approach to feature selection that is based on Bayesian learning, allowing us to report the level of uncertainty in the model, combined with false-positive and false-negative rates. In addition, measuring target-specific uncertainty lifts the restriction on feature selection being target agnostic, allowing for feature acquisition based on a target of focus. We show that acquiring features for a specific target is at least as good as deep learning feature selection methods and common linear feature selection approaches for small non-sparse datasets, and surpasses these when faced with real-world data that is larger in scale and sparseness.
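The abstract does not spell out the selection procedure, so the following is only a minimal sketch of the general idea of choosing a feature by how much uncertainty remains about one target class. It is not the authors' method. It assumes that an approximate Bayesian model can supply several sampled probabilities for the target class after each candidate feature is acquired; the function names, feature names, and sample values are all hypothetical.

```python
import math
from statistics import mean


def binary_entropy(p):
    """Entropy in bits of a Bernoulli(p) prediction; 0 when p is 0 or 1."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def target_uncertainty(samples):
    """Entropy of the mean sampled probability for the target class.

    `samples` are probabilities for the target class drawn from an
    approximate posterior (assumption: e.g. repeated stochastic passes).
    """
    return binary_entropy(mean(samples))


def pick_next_feature(candidates):
    """Return the feature whose acquisition leaves the least uncertainty
    about the target class. `candidates` maps a feature name to the
    sampled target-class probabilities expected after acquiring it."""
    return min(candidates, key=lambda name: target_uncertainty(candidates[name]))


# Hypothetical sampled probabilities, invented purely for illustration.
candidates = {
    "blood_pressure": [0.55, 0.60, 0.45, 0.50],
    "glucose": [0.90, 0.85, 0.95, 0.92],
    "age": [0.70, 0.65, 0.75, 0.72],
}
best = pick_next_feature(candidates)
```

In this toy setup the score depends only on the chosen target class, which is what makes the selection target-focused rather than target-agnostic. A fuller treatment would also weigh acquisition cost against a budget and track false-positive and false-negative rates, as the abstract mentions.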


Source journal

ACM Transactions on Computing for Healthcare

CiteScore

10.30

Self-citation rate

0.00%

Number of published articles