Implementing large language model and retrieval augmented generation to extract geographic locations of illicit transnational kidney trade
Authors: Zifu Wang, Meng-Hao Li, Patrick Baxter, Olzhas Zhorayev, Jiaxin Wei, Valerie Kovacs, Qiuhan Zhao, Chaowei Yang, Naoru Koizumi
Journal: International Journal of Health Geographics, 24(1), 10 (Journal Article; impact factor 3.0; JCR Q2, Public, Environmental & Occupational Health)
Published: 2025-04-28
DOI: https://doi.org/10.1186/s12942-025-00397-8
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12039186/pdf/
Citations: 0
Abstract
Background: Illicit kidney trade networks operate globally and involve intricate interactions among various players, most notably buyers, sellers, brokers, and surgeons. A comprehensive understanding of these networks is hindered, however, by the lack of systematically collected data for analysis. Further, extracting the geographic locations of buyers, sellers, brokers, transplant surgeons, and medical facilities from all relevant publications often requires extensive, time-consuming, and costly manual labelling. Although current techniques such as Named Entity Recognition (NER) tools can potentially automate the process, they are limited to identifying country names and often fail to associate each country with the role it played (i.e., the origin of buyers, sellers, or brokers, and/or the site of surgery).
Methods: This study employed two transformer-based language models, Bidirectional Encoder Representations from Transformers (BERT) and the generative pre-trained transformer (GPT) model Llama3.3 from Meta, to develop a kidney trade country database. We first extracted news articles reporting illicit kidney trade from the LexisNexis database (2000-2022). BERT and Llama3.3 with chain-of-thought prompt tuning strategies were then applied to these articles to determine their relevance to the illegal kidney trade and to identify the roles that different countries played in kidney trade cases over this 23-year period. The specific country classes recorded in the final kidney trade database were: a) countries of origin of kidney sellers; b) countries of origin of kidney buyers; c) countries performing illegal transplant surgeries; and d) countries of origin of organ trafficking brokers.
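The role-extraction step described above can be illustrated with a minimal sketch. It assumes the model is asked to reason step by step and then emit a JSON object on its final line; the prompt wording, the JSON output convention, and the names ROLES, build_prompt, and parse_roles are all hypothetical and are not taken from the study. No model is called here, and the example response is made up with placeholder country names.

```python
import json

# The four country classes named in the abstract.
ROLES = ("seller", "buyer", "surgery", "broker")


def build_prompt(article_text: str) -> str:
    """Assemble an illustrative chain-of-thought prompt (wording is an assumption)."""
    return (
        "You are reading a news article about the illicit kidney trade.\n"
        "Step 1: List every country mentioned in the article.\n"
        "Step 2: For each country, reason about whether it is the origin of "
        "kidney sellers, the origin of kidney buyers, the location of illegal "
        "transplant surgeries, or the origin of brokers.\n"
        "Step 3: On the final line, output only a JSON object with the keys "
        + ", ".join(ROLES)
        + ", each mapping to a list of country names.\n\n"
        "Article:\n" + article_text
    )


def parse_roles(response: str) -> dict[str, list[str]]:
    """Read the role assignments from the last line of a model response."""
    last_line = response.strip().splitlines()[-1]
    data = json.loads(last_line)
    # Keep only the expected keys; roles the model omitted become empty lists.
    return {role: sorted(set(data.get(role, []))) for role in ROLES}


# A made-up response with placeholder names, used only to exercise the parser.
example_response = (
    "Step 1: Countries mentioned: Country A, Country B, Country C.\n"
    "Step 2: Sellers come from Country A; the buyer is from Country B.\n"
    '{"seller": ["Country A"], "buyer": ["Country B"], "surgery": ["Country C"]}'
)
roles = parse_roles(example_response)
```

Keeping the reasoning steps and the final JSON line separate means the free-text reasoning can be logged for review while only the last line is parsed into the database.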
Results: The BERT classification model achieved an accuracy of 88.75%, ensuring that only relevant articles were analyzed. Additionally, the Llama3.3-70B model with chain-of-thought prompt tuning strategies extracted location-based roles with an accuracy of 86.30% for sellers, 88.89% for buyers, 93.33% for brokers, and 95.93% for surgeries, lending support to the patterns described below. We observed in the final database that kidney trade networks evolve dynamically: the primary role played by each country (as a host of sellers, buyers, or surgeries) changes over time. About half of the top 10 countries in each role are replaced by other countries within a decade. The final database also showed that developing countries were more likely to host kidney sellers, while developed countries were more likely to host kidney buyers.
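The top-10 replacement figure above is a turnover rate between two periods. The abstract does not say how countries were ranked, so the following is only a sketch that assumes ranking by number of mentions in a role; the function names and the placeholder labels A to D are hypothetical and carry no real data.

```python
from collections import Counter


def top_k(mentions: list[str], k: int) -> list[str]:
    """Return the k most frequently mentioned countries, most frequent first."""
    return [country for country, _ in Counter(mentions).most_common(k)]


def turnover(earlier: list[str], later: list[str], k: int) -> float:
    """Fraction of the earlier top-k countries that are absent from the later top-k."""
    top_earlier = set(top_k(earlier, k))
    top_later = set(top_k(later, k))
    if not top_earlier:
        return 0.0
    return len(top_earlier - top_later) / len(top_earlier)


# Placeholder mentions for one role in two consecutive periods.
earlier_period = ["A", "A", "A", "B", "B", "C"]
later_period = ["A", "A", "A", "D", "D", "B"]

rate = turnover(earlier_period, later_period, k=2)
```

With these placeholder lists, the earlier top 2 is A and B and the later top 2 is A and D, so one of two countries is replaced and the rate is 0.5.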
Conclusion: The current study developed a geospatial database describing transnational kidney trade country networks over the past two decades. The new approach to geographic location extraction is more precise than conventional NER and machine learning methods.
About the journal:
A leader in its field, International Journal of Health Geographics is an interdisciplinary, open access journal publishing internationally significant studies of geospatial information systems and science applications in health and healthcare. With an exceptional author satisfaction rate and a quick time to first decision, the journal caters to readers across an array of healthcare disciplines globally.
International Journal of Health Geographics welcomes novel studies in the health and healthcare context spanning from spatial data infrastructure and Web geospatial interoperability research, to research into real-time Geographic Information Systems (GIS)-enabled surveillance services, remote sensing applications, spatial epidemiology, spatio-temporal statistics, internet GIS and cyberspace mapping, participatory GIS and citizen sensing, geospatial big data, healthy smart cities and regions, and geospatial Internet of Things and blockchain.