Authors: Deborah Paul, Matthew Yoder
Journal: Biodiversity Information Science and Standards
Published: 2023-09-05
DOI: https://doi.org/10.3897/biss.7.112040
TaxonWorks in its 10th Year: What’s new, what’s next?
The Species File Group (SFG) endeavors to build tools and community structures that empower researchers and collections staff in their long-term collective efforts to gather, share, and learn from biodiversity data. One such tool is TaxonWorks, now in its 10th year of development. TaxonWorks provides a collaborative workbench where scientists, collection managers, students, and volunteers capture and build on the key data and concepts we use to Describe Life (the TaxonWorks motto). It provides a growing number of ways to share descriptions, from Darwin Core Archives, to NeXML-formatted observations and keys, to checklists and bibliographies.
What’s New?
We have expanded the data model of TaxonWorks, added new tools and functions, and added some Companion software, that is, new stand-alone code bases.
Two major additions, Unified Filters and Cached Maps, give developers and users (and users who are developers) the ability to run complex queries across TaxonWorks' rich data model and to display quickly computed maps for large datasets (100K or more specimen and literature-based records). For example, Cached Maps can superimpose the asserted distribution on georeferenced literature and specimen records to create interactive searchable maps (Fig. 1).
In TaxonWorks, we aim to empower those working with the data with tools that help them visualize and curate information. To be able to model taxon concept relationships over time to reflect different taxonomic opinions, we added RCC-5 (Region Connection Calculus; Thau et al. 2008), which will make it possible to visualize these relationships. Similarly, we built a new visual editor (Fig. 2) for displaying, editing, and citing biological associations as recorded among specimens or taxa (or both).
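RCC-5 distinguishes five possible relationships between two regions: congruence, proper inclusion in either direction, partial overlap, and disjointness. The following is a minimal Python sketch, assuming purely for illustration that each taxon concept is represented by the set of specimen identifiers it circumscribes. The function name, relation labels, and identifiers are hypothetical and are not TaxonWorks code.

```python
def rcc5_relation(a: frozenset, b: frozenset) -> str:
    """Return the RCC-5 relation of concept `a` to concept `b`.

    Both concepts are modeled as non-empty sets of member identifiers
    (an assumption made for this sketch only).
    """
    if not a or not b:
        raise ValueError("this sketch handles non-empty concepts only")
    if a == b:
        return "congruent"          # EQ: same circumscription
    if a < b:
        return "is included in"     # PP: a is a proper part of b
    if a > b:
        return "includes"           # PPi: b is a proper part of a
    if a & b:
        return "overlaps"           # PO: some shared and some unique members
    return "is disjoint from"       # DR: no shared members


# Two hypothetical opinions about the same name, published in different years.
concept_1990 = frozenset({"s1", "s2", "s3"})
concept_2020 = frozenset({"s1", "s2", "s3", "s4"})

print(rcc5_relation(concept_1990, concept_2020))
```

Because the five relations are mutually exclusive and exhaustive for non-empty sets, every pair of concepts receives exactly one label, which is what makes the relationships straightforward to draw.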
Querying and enhancing data in a given database can be complex. We have worked on harmonizing the look, feel, and function of the data filtering interfaces. With our Unified Filters, one can pass the results of one search to another filter (e.g., query for specimens in a given taxonomic group and then ask for the distinct collecting events for those specimens). Then, once you have filtered to a given dataset, you can use our new Stepwise tasks to enhance and edit that information en masse.
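The idea of passing one filter's results to another can be sketched in a few lines of Python. The records, field names, and helper functions below are hypothetical stand-ins, not the TaxonWorks filter API. The sketch only shows the chaining pattern from the example above (specimens for a taxonomic group, then their distinct collecting events).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Specimen:
    id: int
    taxon: str                # hypothetical: group the specimen is determined to
    collecting_event_id: int  # hypothetical: link to a collecting event


SPECIMENS = [
    Specimen(1, "Orthoptera", 10),
    Specimen(2, "Orthoptera", 10),
    Specimen(3, "Orthoptera", 11),
    Specimen(4, "Plecoptera", 12),
]


def filter_specimens(specimens, taxon):
    """First filter: keep specimens determined to the given group."""
    return [s for s in specimens if s.taxon == taxon]


def distinct_collecting_events(specimens):
    """Second filter: take a specimen result set, return its distinct event ids."""
    return sorted({s.collecting_event_id for s in specimens})


orthoptera = filter_specimens(SPECIMENS, "Orthoptera")
print(distinct_collecting_events(orthoptera))
```

The second function accepts whatever the first one returns, so further filters could be appended in the same way without either step knowing about the other.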
Companion code bases extend what one can do with the data in TaxonWorks, but are also available for use with other software. For example, using our new TaxonPages code, our users can produce their own web pages for taxa (Fig. 1). SFG groups will use TaxonPages to make well over 100K pages available this year. The pages include basic Bioschema integration, links to the JSON-formatted data behind every panel, and the option to download any occurrence data present, expressed as Darwin Core attributes and formatted as a CSV file. TaxonPages can be set up in minutes and served on resources like GitHub Pages, and our user community can customize the content.
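As an illustration of occurrence data expressed as Darwin Core attributes in a CSV file, the following Python sketch writes made-up records under real Darwin Core term names (occurrenceID, basisOfRecord, scientificName, eventDate, decimalLatitude, decimalLongitude). The record values and the function name are hypothetical, and the sketch does not reproduce the actual TaxonPages export.

```python
import csv
import io

# Darwin Core terms used as column headers.
DWC_FIELDS = [
    "occurrenceID",
    "basisOfRecord",
    "scientificName",
    "eventDate",
    "decimalLatitude",
    "decimalLongitude",
]


def occurrences_to_csv(records):
    """Serialize dict records to CSV text with Darwin Core column headers.

    Keys that are not in DWC_FIELDS are dropped; missing keys become empty cells.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=DWC_FIELDS,
        extrasaction="ignore",
        lineterminator="\n",
    )
    writer.writeheader()
    for record in records:
        writer.writerow(record)
    return buffer.getvalue()


# Hypothetical records: one specimen-based, one literature-based.
records = [
    {
        "occurrenceID": "example:1",
        "basisOfRecord": "PreservedSpecimen",
        "scientificName": "Aus bus",
        "eventDate": "2023-08-01",
        "decimalLatitude": 40.11,
        "decimalLongitude": -88.24,
    },
    {
        "occurrenceID": "example:2",
        "basisOfRecord": "MaterialCitation",
        "scientificName": "Aus bus",
        "eventDate": "1998-06-15",
    },
]

print(occurrences_to_csv(records))
```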
Finally, the TaxonWorks external API has added a huge number of new parameters across multiple new conceptual endpoints.
What’s Next?
After ten years of development, we see maturing functionality around the core concepts in TaxonWorks, such as observations (e.g., traits, phylogenetic data), biological associations (e.g., host-parasite relationships), images, sources (citation management), specimens, collecting events, and collection management.
Currently, we are focusing on integration with external services. We have produced multiple new API wrappers, notably Colrapi (wrapping the Catalogue of Life ChecklistBank API) and BellPepper (wrapping the new Biodiversity Enhanced Location Services (BELS) Georeference API). These wrappers and ongoing integration with the Global Names Framework give our users the power to improve data quality, e.g., by linking to external vocabularies, finding and updating out-of-date nomenclature, and visualizing what TaxonWorks collection object data look like in the context of external aggregators like the Global Biodiversity Information Facility (GBIF) using our gbifference tool (as in the "GBIF difference").
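Comparing a local record with an aggregator's version of it can be reduced to a field-by-field difference. The Python sketch below shows that idea only. It is not the gbifference tool, and the function name and record values are hypothetical.

```python
def record_difference(local: dict, remote: dict) -> dict:
    """Map each differing field to its (local, remote) pair of values.

    Fields missing on one side are reported with None for that side.
    """
    differences = {}
    for field in sorted(local.keys() | remote.keys()):
        local_value = local.get(field)
        remote_value = remote.get(field)
        if local_value != remote_value:
            differences[field] = (local_value, remote_value)
    return differences


# Hypothetical versions of the same collection object.
local_record = {
    "scientificName": "Aus bus",
    "country": "Peru",
    "recordedBy": "A. Collector",
}
remote_record = {
    "scientificName": "Aus cus",
    "country": "Peru",
}

for field, (ours, theirs) in record_difference(local_record, remote_record).items():
    print(f"{field}: local={ours!r} aggregator={theirs!r}")
```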
The TaxonWorks community continues to grow, and therefore so does the diversity of the projects using it. Some of this diversity reflects the stage of projects: new projects need to rapidly create and stub new records, mid-life projects need to seek and add diverse data from a wide range of external resources, and mature projects need tools to identify and resolve outliers. For these scenarios along the data continuum, we foresee Stepwise tasks customized to each stage of data maturity. Imagine capturing verbatim specimen determination data for medium-sized digitization projects and then parsing linkages to People, Times, and Taxa by the 10s, 100s, or 1000s at a time.
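One way to picture working through records "by the 10s, 100s, or 1000s" is to group records that share the same verbatim string, so that a single parsing decision can be applied to the whole group. The Python sketch below assumes that approach for illustration. The record layout and function name are hypothetical and do not describe how Stepwise tasks are implemented.

```python
from collections import defaultdict


def group_by_verbatim(records):
    """Group record ids by verbatim determination string, largest groups first."""
    groups = defaultdict(list)
    for record_id, verbatim in records:
        # Strip surrounding whitespace so trivially different strings group together.
        groups[verbatim.strip()].append(record_id)
    # Largest groups first; ties are broken alphabetically for a stable order.
    return sorted(groups.items(), key=lambda item: (-len(item[1]), item[0]))


# Hypothetical (record id, verbatim determination) pairs.
records = [
    (1, "Aus bus det. A. Collector 1999"),
    (2, "Aus bus det. A. Collector 1999"),
    (3, "Aus cus det. B. Curator 2005"),
    (4, "Aus bus det. A. Collector 1999 "),
]

for verbatim, ids in group_by_verbatim(records):
    print(f"{len(ids):>4} records: {verbatim}")
```

Sorting the largest groups first means the earliest decisions resolve the most records, which fits the batch sizes mentioned above.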
Some of the growing diversity behind the TaxonWorks community is a result of the end-of-life of similar tools. For example, the SFG was asked to look into moving data from Scratchpads into TaxonWorks. We are in the process of moving one Scratchpad instance and will make the scripts we use to do this publicly available for further development. In August 2023, we migrated 16 projects from legacy SFG software to TaxonWorks, bringing in new communities that can now combine their expertise with that of others. As we move forward, we continue to work on distilling, synchronizing, and sharing our experiences and knowledge via our community collective TaxonWorks Docs, embracing cultural change in support of the power of shared knowledge management.
Finally, TaxonWorks is committed to serving the needs of those describing species. We expect to see it produce new treatments based on extremely atomized, yet linked, data, in a form humans will recognize as the format that has served those in the field for over 200 years. Fully formatted nomenclatural histories, descriptions, material examined sections, keys, figures, and accompanying discussions are coming. They will be based on tens of thousands of data points, all of which may be linked back to the natural history collection data that serve as their basis (or are indeed managed alongside those collections). So too are parallel tools that serve collection management needs, as the two processes are highly intertwined.