Using Shopping Data to Improve the Diagnosis of Ovarian Cancer: Computational Analysis of a Web-Based Survey
Elizabeth H Dolan, James Goulding, Laila J Tata, Alexandra R Lang
JMIR Cancer, volume 9, e37141. Published 2023-03-31. DOI: 10.2196/37141 (https://doi.org/10.2196/37141)
Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10131768/pdf/
Abstract
Background: Shopping data can be analyzed using machine learning techniques to study population health. It is unknown if the use of such methods can successfully investigate prediagnosis purchases linked to self-medication of symptoms of ovarian cancer.
Objective: The aims of this study were to gain new domain knowledge from women's experiences, understand how women's shopping behavior relates to their pathway to the diagnosis of ovarian cancer, and inform research on computational analysis of shopping data for population health.
Methods: A web-based survey on individuals' shopping patterns prior to an ovarian cancer diagnosis was analyzed to identify key knowledge about health care purchases. Logistic regression and random forest models were used to examine how purchases of products linked to potential symptoms were related to presentation to health care and timing of diagnosis.
Results: Of the 101 women with ovarian cancer who were surveyed, 58.4% (59/101) bought nonprescription health care products, including pain relief and abdominal products, for periods ranging up to more than a year prior to diagnosis. General practitioner advice was the primary reason for the purchases (23/59, 39%), with 51% (30/59) occurring because a participant's doctor believed their health problems were due to a condition other than ovarian cancer. Associations were shown between purchases made because a participant's doctor believed their health problems were due to a condition other than ovarian cancer and the following variables: health problems for longer than a year prior to diagnosis (odds ratio [OR] 7.33, 95% CI 1.58-33.97), buying health care products for more than 6 months to a year (OR 3.82, 95% CI 1.04-13.98) or for more than a year (OR 7.64, 95% CI 1.38-42.33), and the number of health care product types purchased (OR 1.54, 95% CI 1.13-2.11). Purchasing patterns were shown to be potentially predictive of a participant's doctor thinking their health problems were due to some condition other than ovarian cancer, with nested cross-validation of random forest classification models achieving an overall in-sample accuracy of 89.1% and an out-of-sample accuracy of 70.1%.
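The odds ratios and intervals reported above can be sanity-checked without access to the survey data. The sketch below assumes the intervals are Wald-type, that is, built as exp(log(OR) ± z·SE); the abstract does not state how they were computed, so this is an assumption. Under it, the geometric mean of the two bounds should reproduce the reported OR, and the width of the interval on the log scale gives the implied standard error. The function names are illustrative, not taken from the study.

```python
import math

# (label, reported OR, CI lower, CI upper), copied from the Results paragraph.
REPORTED = [
    ("health problems > 1 year before diagnosis", 7.33, 1.58, 33.97),
    ("buying products > 6 months to 1 year", 3.82, 1.04, 13.98),
    ("buying products > 1 year", 7.64, 1.38, 42.33),
    ("number of product types purchased", 1.54, 1.13, 2.11),
]

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def implied_log_or_and_se(lower, upper, z=Z_95):
    """Recover log(OR) and its standard error from a CI.

    ASSUMPTION: the interval is Wald-type, exp(log(OR) +/- z * SE),
    so it is symmetric around log(OR) on the log scale.
    """
    log_or = (math.log(lower) + math.log(upper)) / 2
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return log_or, se


def summarize(rows=REPORTED):
    """Return (label, reported OR, OR implied by the CI, implied SE) per row."""
    out = []
    for label, or_reported, lower, upper in rows:
        log_or, se = implied_log_or_and_se(lower, upper)
        out.append((label, or_reported, math.exp(log_or), se))
    return out


if __name__ == "__main__":
    for label, or_reported, or_implied, se in summarize():
        print(f"{label}: reported OR {or_reported:.2f}, "
              f"implied OR {or_implied:.2f}, SE(log OR) {se:.2f}")
```

The implied standard errors are large for the binary predictors, which is consistent with the wide intervals one would expect from a subgroup of 59 respondents.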
Conclusions: Women in the survey were 7 times more likely to have had a duration of more than a year of health problems prior to a diagnosis of ovarian cancer if they were self-medicating based on advice from a doctor rather than having made the decision to self-medicate independently. Predictive modeling indicates that women in such situations, who are self-medicating because their doctor believes their health problems may be due to a condition other than ovarian cancer, exhibit distinct shopping behaviors that may be identifiable within purchasing data. Through exploratory research that combines women's accounts of their behaviors prior to diagnosis with computational analysis of these data, this study demonstrates that women's shopping data could potentially be useful for early ovarian cancer detection.
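For readers unfamiliar with nested cross-validation, the procedure behind the in-sample and out-of-sample accuracy figures can be sketched as follows. This is a minimal illustration, not the authors' pipeline: to stay dependency-free it swaps in a k-nearest-neighbor classifier on a single feature in place of the study's random forest, and the data, fold counts, and hyperparameter grid are all synthetic assumptions. The structure is the point: an inner loop selects the hyperparameter using only the outer training folds, and the outer held-out fold is scored once with that choice.

```python
import random
from statistics import mean


def synthetic_data(n=100, seed=1):
    """SYNTHETIC (x, y) pairs for illustration only; not survey data."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        y = rng.randint(0, 1)
        x = rng.gauss(1.0 if y else 0.0, 1.0)
        data.append((x, y))
    return data


def make_folds(n, k, rng):
    """Shuffle indices 0..n-1 and deal them into k roughly equal folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


def knn_predict(train, x, k):
    """Majority vote among the k training points nearest to x."""
    nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > len(nearest) else 0


def accuracy(train, test, k):
    """Fraction of test rows classified correctly by k-NN fit on train."""
    return mean([1.0 if knn_predict(train, x, k) == y else 0.0
                 for x, y in test])


def split(data, held_out):
    """Return (train rows, test rows) for one held-out fold of indices."""
    held = set(held_out)
    train = [row for i, row in enumerate(data) if i not in held]
    test = [data[i] for i in held_out]
    return train, test


def cv_accuracy(data, k_neighbors, n_folds, rng):
    """Plain k-fold cross-validated accuracy for one hyperparameter value."""
    scores = []
    for held_out in make_folds(len(data), n_folds, rng):
        train, test = split(data, held_out)
        scores.append(accuracy(train, test, k_neighbors))
    return mean(scores)


def nested_cv(data, grid, outer_folds=5, inner_folds=3, seed=0):
    """Return (mean in-sample accuracy, mean out-of-sample accuracy)."""
    rng = random.Random(seed)
    in_sample, out_of_sample = [], []
    for held_out in make_folds(len(data), outer_folds, rng):
        train, test = split(data, held_out)
        # Inner loop: pick the hyperparameter from the outer training data only.
        best_k = max(grid, key=lambda k: cv_accuracy(train, k, inner_folds, rng))
        in_sample.append(accuracy(train, train, best_k))
        out_of_sample.append(accuracy(train, test, best_k))
    return mean(in_sample), mean(out_of_sample)


if __name__ == "__main__":
    in_acc, out_acc = nested_cv(synthetic_data(), grid=[1, 3, 5, 7])
    print(f"in-sample accuracy: {in_acc:.3f}, out-of-sample: {out_acc:.3f}")
```

Because the held-out fold never influences hyperparameter selection, the out-of-sample score is the more honest estimate of how a model would perform on new purchasing data. A gap between the two figures, such as the 89.1% versus 70.1% reported in the abstract, indicates how much the model fits idiosyncrasies of its training sample.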