Grouping Digital Health Apps Based on Their Quality and User Ratings Using K-Medoids Clustering: Cross-Sectional Study
Maciej Marek Zych, Raymond Bond, Maurice Mulvenna, Lu Bai, Jorge Martinez-Carracedo, Simon Leigh
JMIR mHealth and uHealth, 2025;13:e57279. Published July 23, 2025. DOI: 10.2196/57279
Abstract
Background: Digital health apps allow for proactive rather than reactive health care and have the potential to take pressure off health care providers. With over 350,000 digital health apps available in app stores today, these apps need to be of sufficient quality to be safe to use. Discovering the typology of digital health apps regarding professional and clinical assurance (PCA), user experience (UX), data privacy (DP), and user ratings may help determine the areas where digital health apps can improve.
Objective: This study has two objectives: (1) to discover the types (clusters) of digital health apps with regard to their quality (scores) across 3 domains (PCA, UX, and DP) and user ratings and (2) to determine whether the National Institute for Health and Care Excellence (NICE) Evidence Standards Framework (ESF) tier, target users of the digital health apps, categories, or features have any association with this typology.
Methods: Data were obtained from 1402 digital health app assessments conducted using the Organisation for the Review of Care and Health Apps Baseline Review (OBR), evaluating PCA, UX, and DP. K-medoids clustering identified app typologies, with the optimal number of clusters determined using the elbow method. The Shapiro-Wilk test assessed normality of user ratings and OBR scores. Nonparametric Wilcoxon rank sum tests compared cluster differences in these metrics. Post hoc analysis examined the distribution of NICE ESF tiers, target users, categories, and features across clusters, using Fisher exact test with Bonferroni correction. Effect sizes were calculated using Cohen w.
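The abstract does not state which software was used for the analysis. As a rough illustration of the pipeline it describes (not the authors' actual code), the sketch below uses Python with the KMedoids implementation from scikit-learn-extra; the input file name, the DataFrame apps, and its column names ("pca", "ux", "dp", "rating") are hypothetical placeholders for the OBR domain scores and user ratings.

    # Minimal sketch of the clustering and comparison steps summarized above.
    import pandas as pd
    from sklearn.preprocessing import StandardScaler
    from sklearn_extra.cluster import KMedoids          # pip install scikit-learn-extra
    from scipy.stats import shapiro, mannwhitneyu

    apps = pd.read_csv("obr_assessments.csv")            # hypothetical input file
    features = ["pca", "ux", "dp", "rating"]              # OBR domain scores + user rating
    X = StandardScaler().fit_transform(apps[features])

    # Elbow method: total within-cluster distance (inertia) for candidate k values.
    inertias = {}
    for k in range(2, 11):
        km = KMedoids(n_clusters=k, metric="euclidean", random_state=42).fit(X)
        inertias[k] = km.inertia_
    # Inspect the inertia curve and choose k at the elbow; the study reports k = 4.

    model = KMedoids(n_clusters=4, metric="euclidean", random_state=42).fit(X)
    apps["cluster"] = model.labels_

    # Normality check, then a nonparametric comparison of two clusters on one metric
    # (the Mann-Whitney U test is the Wilcoxon rank sum test for independent samples).
    print(shapiro(apps["rating"]))
    ratings_a = apps.loc[apps["cluster"] == 0, "rating"]
    ratings_b = apps.loc[apps["cluster"] == 3, "rating"]
    print(mannwhitneyu(ratings_a, ratings_b))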
Results: A total of four distinct app clusters emerged: (1) apps with poor user ratings (220/1402, 15.7%), (2) apps with poor PCA and DP scores (252/1402, 18%), (3) apps with poor PCA scores (415/1402, 29.6%), and (4) higher quality apps with high user ratings and OBR scores (515/1402, 36.7%). Statistically significant associations between clusters and app characteristics were found for some NICE ESF tiers (2/3), categories (4/33), and features (6/19), but not for target users (0/14); all significant associations had small effect sizes (Cohen w<0.3). The strongest associations were for the "Service Signposting" feature (Cohen w=0.24) and NICE ESF tier B (Cohen w=0.19).
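For context on the reported effect sizes (and not as the authors' exact procedure), one way to test and size the association between membership in a single cluster and a single binary app characteristic is a 2x2 contingency table: a Fisher exact test with a Bonferroni-adjusted p value, and Cohen w computed as the square root of the chi-square statistic divided by the total sample size. All counts in the sketch below are invented for demonstration only.

    # Illustrative 2x2 association between one cluster and one binary app feature.
    import numpy as np
    from scipy.stats import fisher_exact, chi2_contingency

    #                  has feature   lacks feature
    table = np.array([[120,           395],             # apps in the cluster of interest
                      [150,           737]])            # apps in the remaining clusters

    _, p = fisher_exact(table)
    n_tests = 19                                         # e.g., one test per feature examined
    p_bonferroni = min(p * n_tests, 1.0)                 # Bonferroni correction

    chi2, _, _, _ = chi2_contingency(table, correction=False)
    cohen_w = np.sqrt(chi2 / table.sum())                # Cohen w = sqrt(chi-square / N)
    print(p_bonferroni, cohen_w)                         # w < 0.3 is conventionally "small"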
Conclusions: The largest cluster comprised high-quality apps with strong user ratings and OBR scores (515/1402, 36.7%). A significant proportion (415/1402, 29.6%) performed poorly in PCA despite performing well in other domains. Notably, user ratings did not consistently align with PCA scores; some apps scored highly with users but poorly in PCA and DP. The 4-cluster typology underscores areas needing improvement, particularly PCA. Findings suggest limited association between the examined app characteristics and quality clusters, indicating a need for further investigation into what factors truly influence app quality.
Journal Description:
JMIR mHealth and uHealth (JMU, ISSN 2291-5222) is a spin-off journal of JMIR, the leading eHealth journal (Impact Factor 2016: 5.175). JMIR mHealth and uHealth is indexed in PubMed, PubMed Central, and Science Citation Index Expanded (SCIE), and in June 2017 received an inaugural Impact Factor of 4.636.
The journal focuses on health and biomedical applications of mobile and tablet computing, pervasive and ubiquitous computing, wearable computing, and domotics.
JMIR mHealth and uHealth has been published since 2013 and was the first mHealth journal in PubMed. It offers faster publication and a broader scope than the Journal of Medical Internet Research, including papers that are more technical or more formative/developmental.