Application of natural language processing to predict final recommendation of Brazilian health technology assessment reports

Impact factor 2.6; Zone 4 (Medicine); JCR Q2, Health Care Sciences & Services
Marilia Mastrocolla de Almeida Cardoso, Juliana Machado-Rugolo, Lehana Thabane, Naila Camila da Rocha, Abner Mácula Pacheco Barbosa, Denis Satoshi Komoda, Juliana Tereza Coneglian de Almeida, Daniel da Silva Pereira Curado, Silke Anna Theresa Weber, Luis Gustavo Modelli de Andrade
International Journal of Technology Assessment in Health Care. Journal article, published 2024-04-12. DOI: 10.1017/s0266462324000163 (https://doi.org/10.1017/s0266462324000163)
Citations: 0

Abstract

Introduction: Health technology assessment (HTA) plays a vital role in healthcare decision-making worldwide. Because HTA agencies face a significant workload, it is necessary to identify the key factors that influence evaluation outcomes.

Objectives: The aim of this study was to predict the approval status of evaluations conducted by the Brazilian Committee for Health Technology Incorporation (CONITEC) using natural language processing (NLP).

Methods: The data comprised CONITEC's official report summaries from 2012 to 2022. Textual data were tokenized for NLP analysis. Least Absolute Shrinkage and Selection Operator (LASSO), logistic regression, support vector machine, random forest, neural network, and extreme gradient boosting (XGBoost) models were evaluated for accuracy, area under the receiver operating characteristic curve (ROC AUC), precision, and recall. Cluster analysis using the k-modes algorithm categorized entries into two clusters (approved, rejected).

Results: The neural network model performed best (precision 0.815, accuracy 0.769, ROC AUC 0.871, recall 0.746), followed by the XGBoost model. The lexical analysis uncovered linguistic markers, such as references to international HTA agencies' experiences and the government as demandant, that potentially influence CONITEC's decisions. Cluster and XGBoost analyses emphasized that approved evaluations mainly concerned drug assessments, often government-initiated, while non-approved ones frequently evaluated drugs with the industry as the requester.

Conclusions: NLP models can predict health technology incorporation outcomes, opening avenues for future research using HTA reports from other agencies. Such a model has the potential to enhance HTA system efficiency by offering initial insights and decision-making criteria, thereby benefiting healthcare experts.
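To make the evaluation step concrete, here is a minimal sketch of how text can be tokenized and how the four reported metrics (accuracy, precision, recall, ROC AUC) are computed from a classifier's outputs. It is not the authors' code. The function names, the 0.5 decision threshold, and the toy labels and scores are all hypothetical, and the abstract does not specify the tokenizer or software used.

```python
import re

def tokenize(text):
    """Lowercase a report summary and split it into word tokens (assumed scheme)."""
    return re.findall(r"\w+", text.lower())

def classification_metrics(y_true, y_pred):
    """Accuracy, precision, and recall for binary labels (1 = approved)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }

def roc_auc(y_true, scores):
    """Probability that a random positive is scored above a random negative.

    Ties count as half a win. Both classes must be present in y_true.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy data: true outcomes and model scores for six reports.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(tokenize("Government as demandant."))
print(classification_metrics(y_true, y_pred))
print(roc_auc(y_true, scores))
```

Accuracy, precision, and recall depend on the chosen threshold. ROC AUC depends only on how the scores rank approved against rejected reports, which is why it is usually reported alongside the other three.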
Source journal
International Journal of Technology Assessment in Health Care (Medicine: Public, Environmental & Occupational Health)
CiteScore: 4.40
Self-citation rate: 15.60%
Articles published: 116
Review time: 6-12 weeks
About the journal: International Journal of Technology Assessment in Health Care serves as a forum for the wide range of health policy makers and professionals interested in the economic, social, ethical, medical and public health implications of health technology. It covers the development, evaluation, diffusion and use of health technology, as well as its impact on the organization and management of health care systems and public health. In addition to general essays and research reports, regular columns on technology assessment reports and thematic sections are published.