Orpaz Goldstein, Mohammad Kachuee, Kimmo Kärkkäinen, M. Sarrafzadeh
{"title":"在医疗保健数据中使用不确定性测量的以目标为中心的特征选择","authors":"Orpaz Goldstein, Mohammad Kachuee, Kimmo Kärkkäinen, M. Sarrafzadeh","doi":"10.1145/3383685","DOIUrl":null,"url":null,"abstract":"Healthcare big data remains under-utilized due to various incompatibility issues between the domains of data analytics and healthcare. The lack of generalizable iterative feature acquisition methods under budget and machine learning models that allow reasoning with a model’s uncertainty are two examples. Meanwhile, a boost to the available data is currently under way with the rapid growth in the Internet of Things applications and personalized healthcare. For the healthcare domain to be able to adopt models that take advantage of this big data, machine learning models should be coupled with more informative, germane feature acquisition methods, consequently adding robustness to the model’s results. We introduce an approach to feature selection that is based on Bayesian learning, allowing us to report the level of uncertainty in the model, combined with false-positive and false-negative rates. In addition, measuring target-specific uncertainty lifts the restriction on feature selection being target agnostic, allowing for feature acquisition based on a target of focus. We show that acquiring features for a specific target is at least as good as deep learning feature selection methods and common linear feature selection approaches for small non-sparse datasets, and surpasses these when faced with real-world data that is larger in scale and sparseness.","PeriodicalId":72043,"journal":{"name":"ACM transactions on computing for healthcare","volume":" ","pages":"1 - 17"},"PeriodicalIF":0.0000,"publicationDate":"2020-05-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1145/3383685","citationCount":"1","resultStr":"{\"title\":\"Target-Focused Feature Selection Using Uncertainty Measurements in Healthcare Data\",\"authors\":\"Orpaz Goldstein, Mohammad Kachuee, Kimmo Kärkkäinen, M. Sarrafzadeh\",\"doi\":\"10.1145/3383685\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Healthcare big data remains under-utilized due to various incompatibility issues between the domains of data analytics and healthcare. The lack of generalizable iterative feature acquisition methods under budget and machine learning models that allow reasoning with a model’s uncertainty are two examples. Meanwhile, a boost to the available data is currently under way with the rapid growth in the Internet of Things applications and personalized healthcare. For the healthcare domain to be able to adopt models that take advantage of this big data, machine learning models should be coupled with more informative, germane feature acquisition methods, consequently adding robustness to the model’s results. We introduce an approach to feature selection that is based on Bayesian learning, allowing us to report the level of uncertainty in the model, combined with false-positive and false-negative rates. In addition, measuring target-specific uncertainty lifts the restriction on feature selection being target agnostic, allowing for feature acquisition based on a target of focus. We show that acquiring features for a specific target is at least as good as deep learning feature selection methods and common linear feature selection approaches for small non-sparse datasets, and surpasses these when faced with real-world data that is larger in scale and sparseness.\",\"PeriodicalId\":72043,\"journal\":{\"name\":\"ACM transactions on computing for healthcare\",\"volume\":\" \",\"pages\":\"1 - 17\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-05-30\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://sci-hub-pdf.com/10.1145/3383685\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"ACM transactions on computing for healthcare\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3383685\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM transactions on computing for healthcare","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3383685","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Target-Focused Feature Selection Using Uncertainty Measurements in Healthcare Data
Healthcare big data remains under-utilized due to various incompatibility issues between the domains of data analytics and healthcare. The lack of generalizable iterative feature acquisition methods under budget and machine learning models that allow reasoning with a model’s uncertainty are two examples. Meanwhile, a boost to the available data is currently under way with the rapid growth in the Internet of Things applications and personalized healthcare. For the healthcare domain to be able to adopt models that take advantage of this big data, machine learning models should be coupled with more informative, germane feature acquisition methods, consequently adding robustness to the model’s results. We introduce an approach to feature selection that is based on Bayesian learning, allowing us to report the level of uncertainty in the model, combined with false-positive and false-negative rates. In addition, measuring target-specific uncertainty lifts the restriction on feature selection being target agnostic, allowing for feature acquisition based on a target of focus. We show that acquiring features for a specific target is at least as good as deep learning feature selection methods and common linear feature selection approaches for small non-sparse datasets, and surpasses these when faced with real-world data that is larger in scale and sparseness.