Application of recommender systems and time series models to monitor quality at HIV/AIDS health facilities

Impact factor: 1.8 · JCR: Q3 (Public Administration)
Data & Policy · Pub Date: 2022-07-11 · DOI: 10.1017/dap.2022.15
J. Friedman, Zola Allen, Allison Fox, Jose Webert, A. Devlin
Citations: 0

Abstract

The US government invests substantial sums to control the HIV/AIDS epidemic. To monitor progress toward epidemic control, the President's Emergency Plan for AIDS Relief (PEPFAR) oversees a data reporting system that includes standard indicators, reporting formats, information systems, and data warehouses. These data, reported quarterly, inform understanding of the global epidemic, resource allocation, and identification of trouble spots. PEPFAR has developed tools to assess the quality of the reported data. These tools have made important contributions but are limited in the methods they use to identify anomalous data points. The most advanced tool considers univariate probability distributions, whereas correlations between indicators suggest that a multivariate approach is better suited. For temporal analysis, the same tool compares values to the averages of preceding periods but does not consider underlying trends and seasonal factors. To address these limitations, we apply two methods to identify anomalous data points among routinely collected facility-level HIV/AIDS data. The first is recommender systems, an unsupervised machine learning method that captures relationships between users and items. We apply the approach in a novel way: we predict the reported values, compare predicted to reported values, and identify the greatest deviations. For a temporal perspective, we apply time series models that are flexible enough to include trend and seasonality. The results of both methods were validated against manual review (recommender systems: 95% agreement on non-anomalies, 56% agreement on anomalies; time series: 96% agreement on non-anomalies, 91% agreement on anomalies). This tool will apply greater methodological sophistication to monitoring data quality in an accelerated and standardized manner.
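The abstract does not say which recommender algorithm or time series model the authors used, so the following Python sketch only illustrates the general "predict, compare, rank deviations" idea. It rests on assumptions: a rank-1 truncated SVD stands in for a recommender-style matrix factorization of a facility-by-indicator table, and an ordinary least-squares fit of a linear trend plus quarterly intercepts stands in for a time series model. All function names, parameters, and data here are hypothetical and synthetic.

```python
import numpy as np


def low_rank_residuals(X, rank=1):
    """Standardize each indicator, rebuild the matrix from its top `rank`
    singular components, and return reported minus predicted (z-units).
    Assumption: truncated SVD as a stand-in for a recommender model."""
    X = np.asarray(X, dtype=float)
    mu, sd = X.mean(axis=0), X.std(axis=0)
    sd = np.where(sd == 0, 1.0, sd)            # guard against constant columns
    Z = (X - mu) / sd
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    Z_hat = (U[:, :rank] * s[:rank]) @ Vt[:rank]
    return Z - Z_hat


def trend_season_residuals(y, period=4):
    """Least-squares fit of a linear trend plus one intercept per season
    (period=4 for quarterly data); returns reported minus fitted.
    Assumption: OLS as a stand-in for a full time series model."""
    y = np.asarray(y, dtype=float)
    t = np.arange(len(y))
    design = np.column_stack([t, np.eye(period)[t % period]])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return y - design @ coef


def largest_deviations(residuals, top_k=1):
    """Index tuples of the top_k residuals ranked by absolute size."""
    order = np.argsort(np.abs(residuals), axis=None)[::-1][:top_k]
    return [tuple(int(i) for i in np.unravel_index(k, residuals.shape))
            for k in order]


# Synthetic table: 30 facilities x 4 indicators that scale with facility size.
size = 50 + 10 * np.arange(30)
X = np.outer(size, [1.0, 0.8, 0.6, 0.5])
X[7, 2] *= 10                                  # injected reporting error
print(largest_deviations(low_rank_residuals(X)))

# Synthetic series: 24 quarters with a trend and a repeating seasonal pattern.
t = np.arange(24)
y = 100 + 5 * t + np.array([0, 10, -5, 3])[t % 4]
y[13] += 80                                    # injected spike
print(largest_deviations(trend_season_residuals(y)))
```

Ranking by absolute residual mirrors the abstract's "identifying the greatest deviations". Because both fits include the anomalous point, a large error also shifts the predictions for neighbouring cells or quarters, which is one reason a real pipeline would likely need more robust fitting or manual review of the flagged values.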
Source journal metrics: CiteScore 3.10; self-citation rate 0.00%; articles published 0; review time 12 weeks