Pytaxon: A Python software for resolving and correcting taxonomic names in biodiversity data.

IF 1.0; Zone 4, Environmental Science and Ecology; JCR Q3, Biodiversity Conservation
Biodiversity Data Journal Pub Date : 2025-01-08 eCollection Date: 2025-01-01 DOI:10.3897/BDJ.13.e138257
Marco A Proença Neto, Marcos P A De Sousa

Abstract

Background: The standardisation and correction of taxonomic names in large biodiversity databases remain persistent challenges for researchers, as errors in species names can compromise ecological analyses, land-use planning and conservation efforts, particularly when inaccurate data are shared on global biodiversity portals.

New information: We present pytaxon, a Python tool that resolves and corrects taxonomic names in biodiversity data. It queries the Global Names Verifier (GNV) API and applies fuzzy matching to suggest corrections for discrepancies and nomenclatural inconsistencies. pytaxon offers both a Command Line Interface (CLI) and a Graphical User Interface (GUI), making it accessible to users with different levels of computing expertise. Tests on spreadsheets derived from datasets published in the Global Biodiversity Information Facility (GBIF) demonstrated its effectiveness in identifying and resolving taxonomic errors. By limiting the propagation of inaccuracies from researchers' datasets to global biodiversity databases, pytaxon supports more reliable conservation decisions and more robust scientific investigations. Its contributions enhance data integrity and promote informed biodiversity management in a rapidly evolving global environment.
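
To make the fuzzy-matching idea concrete, here is a minimal offline sketch in Python. It is not pytaxon's code and does not call the GNV API: the `REFERENCE_NAMES` list, the `suggest_correction` and `check_column` helpers, and the 0.8 similarity cutoff are all assumptions chosen for illustration, with the small reference list standing in for names that a resolver service would return. The matching itself uses the standard library's `difflib`.

```python
from difflib import get_close_matches

# Hypothetical reference list; in a real workflow these names would come
# from a name-resolution service rather than being hard-coded.
REFERENCE_NAMES = [
    "Panthera onca",
    "Panthera leo",
    "Puma concolor",
    "Tapirus terrestris",
]


def suggest_correction(name, reference=REFERENCE_NAMES, cutoff=0.8):
    """Return (status, suggestion) for one name.

    status is "exact", "fuzzy", or "unresolved"; suggestion is the
    matched reference name, or None when nothing is similar enough.
    """
    # Collapse repeated whitespace and trim the ends before comparing.
    cleaned = " ".join(name.split())
    if cleaned in reference:
        return ("exact", cleaned)
    # get_close_matches ranks candidates by a similarity ratio and
    # keeps only those at or above the cutoff (an assumed threshold).
    matches = get_close_matches(cleaned, reference, n=1, cutoff=cutoff)
    if matches:
        return ("fuzzy", matches[0])
    return ("unresolved", None)


def check_column(names):
    """Apply suggest_correction to every name in a spreadsheet column."""
    return [(n, *suggest_correction(n)) for n in names]


if __name__ == "__main__":
    for row in check_column(["Panthera onca", "Pantera onca", "Homo sapiens"]):
        print(row)
```

A misspelling such as "Pantera onca" is flagged as a fuzzy match with a suggested correction, while a name absent from the reference list is reported as unresolved so that a person can review it rather than have it silently changed.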

Source journal
Biodiversity Data Journal (Agricultural and Biological Sciences: Ecology, Evolution, Behavior and Systematics)
CiteScore: 2.20
Self-citation rate: 7.70%
Articles published: 283
Review time: 6 weeks
About the journal: Biodiversity Data Journal (BDJ) is a community peer-reviewed, open-access, comprehensive online platform, designed to accelerate publishing, dissemination and sharing of biodiversity-related data of any kind. All structural elements of the articles (text, morphological descriptions, occurrences, data tables, etc.) will be treated and stored as DATA, in accordance with the Data Publishing Policies and Guidelines of Pensoft Publishers. The journal will publish papers in biodiversity science containing taxonomic, floristic/faunistic, morphological, genomic, phylogenetic, ecological or environmental data on any taxon of any geological age from any part of the world with no lower or upper limit to manuscript size.