Leandro Rodrigues da Silva Souza, Daniel Hilário da Silva, Caio Tonus Ribeiro, Daiane Alves da Silva, Slawomir J. Nasuto, Catherine M. Sweeney-Reed, Adriano de Oliveira Andrade, Adriano Alves Pereira
Software Impacts, vol. 24, Article 100766. Journal Article. Published 2025-04-29. DOI: 10.1016/j.simpa.2025.100766. https://www.sciencedirect.com/science/article/pii/S2665963825000260
PubMedMetaTool: Automated metadata extraction from PubMed using Python for bibliometric analysis
Bibliometric analyses often depend on extracting metadata from large scientific databases, a process that is still largely manual, repetitive, and error-prone. This paper presents PubMedMetaTool, an open-source Python-based solution that automates the retrieval and transformation of bibliographic metadata from PubMed, using either article titles or Digital Object Identifiers (DOIs) as input. The tool implements a modular pipeline that extracts metadata using NCBI’s Entrez programming utilities and transforms it into formats compatible with tools such as Bibliometrix, VOSviewer, and pyBibX. Designed to be transparent and configurable, the tool improves the efficiency, accuracy, and interoperability of bibliometric workflows.
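The lookup step described in the abstract can be illustrated with a minimal sketch. This is not PubMedMetaTool's actual code: it assumes only the public ESearch endpoint of NCBI's Entrez programming utilities and the Python standard library. The function names (`build_esearch_url`, `parse_pmids`, `find_pmids`) are hypothetical, as is the choice of the `[AID]` and `[Title]` search field tags for DOI and title input.

```python
from urllib.parse import urlencode
from urllib.request import urlopen
import xml.etree.ElementTree as ET

# Base URL of the NCBI Entrez programming utilities (E-utilities).
EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"

# Assumed mapping from input kind to a PubMed search field tag.
FIELD_TAGS = {"doi": "AID", "title": "Title"}


def build_esearch_url(query, by="doi"):
    """Build an ESearch URL that looks up a PubMed record by DOI or title."""
    tag = FIELD_TAGS[by]
    params = {"db": "pubmed", "term": f"{query}[{tag}]", "retmode": "xml"}
    return f"{EUTILS}/esearch.fcgi?{urlencode(params)}"


def parse_pmids(esearch_xml):
    """Extract the PubMed IDs listed in an ESearch XML response."""
    root = ET.fromstring(esearch_xml)
    return [node.text for node in root.findall("./IdList/Id")]


def find_pmids(query, by="doi"):
    """Hypothetical helper: run the search over the network and return PMIDs."""
    with urlopen(build_esearch_url(query, by), timeout=30) as response:
        return parse_pmids(response.read())
```

Calling `find_pmids("10.1016/j.simpa.2025.100766", by="doi")` would issue one request to NCBI. A fuller pipeline would pass the returned IDs to EFetch to retrieve the full metadata and then write it out in a format that the target bibliometric tool can read. URL construction is kept separate from parsing so that both can be checked without network access.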