Journal of Open Source Software: latest articles

PDF Entity Annotation Tool (PEAT).
Journal of Open Source Software. Pub Date: 2025-04-08. DOI: 10.21105/joss.05336
Christopher G Stahl, Kristan J Markey, Brian C Jewell, Dahnish Shams, Michele M Taylor, A Amina Wilkins, Sean Watford, Andy Shapiro, Michelle Angrish
Abstract: While different text mining approaches, including the use of Artificial Intelligence (AI) and other machine-based methods, continue to expand at a rapid pace, the tools used by researchers to create the labeled datasets required for training, modeling, and evaluation remain rudimentary. Labeled datasets contain the target attributes the machine is going to learn; for example, training an algorithm to distinguish between images of a car and a truck would generally require a set of images with a quantitative description of the underlying features of each vehicle type. Development of labeled textual data that can be used to build natural language machine learning models for scientific literature is not currently integrated into existing manual workflows used by domain experts. Published literature is rich with important information, such as different types of embedded text, plots, and tables, that can all be used as inputs to train ML/natural language processing (NLP) models when extracted and prepared in machine-readable formats. Currently, both normalized data extraction of use to domain experts and extraction to support development of ML/NLP models are labor-intensive and cumbersome manual processes. Automatic extraction of data and information is difficult for formats such as PDFs, which are optimized for layout and human readability, not machine readability. The PDF (Portable Document Format) Entity Annotation Tool (PEAT) was developed with the goal of allowing users to annotate publications within their current print format, while also allowing those annotations to be captured in a machine-readable format.
One of the main issues with traditional annotation tools is that they require transforming the PDF into plain text to facilitate the annotation process. While doing so lessens the technical challenges of annotating data, the user loses all structure and provenance that was inherent in the underlying PDF. Also, textual data extraction from PDFs can be an error-prone process. Challenges include identifying sequential blocks of text and handling a multitude of document formats (multiple columns, font encodings, etc.). As a result of these challenges, using existing tools for development of NLP/ML models directly from PDFs is difficult because the generated outputs are not interoperable. We created a system that allows annotations to be completed on the original PDF document structure, with no plain text extraction. The result is an application that allows for easier and more accurate annotations. In addition, by including a feature that lets the user easily create a schema, we have developed a system that can be used to annotate text for different domain-centric schemas of relevance to subject matter experts. Different knowledge domains require distinct schemas and annotation tags to support machine learning.
Volume/issue: 10 108. Pages: 5336. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12180754/pdf/
Citations: 0
LightLogR: Reproducible analysis of personal light exposure data.
Journal of Open Source Software. Pub Date: 2025-03-13. DOI: 10.21105/joss.07601
Johannes Zauner, Steffen Hartmeyer, Manuel Spitschan
Abstract: Light plays an important role in human health and well-being, which necessitates the study of the effects of personal light exposure in real-world settings, measured by means of wearable devices. A growing number of studies incorporate these kinds of data to assess associations between light and health outcomes. Yet with few or missing standards, guidelines, and frameworks, it is challenging to set up measurements, analyse the data, and compare outcomes between studies. Overall, time series data from wearable light loggers are significantly more complex than the controlled stimuli used in laboratory studies. In this paper, we introduce LightLogR, a novel resource to facilitate these research efforts. The package for R statistical software is open-source and permissively MIT-licenced. As part of a developing software ecosystem, LightLogR is built with common challenges of current and future datasets in mind. The package standardises many tasks for importing and processing personal light exposure data. It allows for quick as well as detailed insights into the datasets through summary and visualisation tools.
Furthermore, LightLogR incorporates major metrics commonly used in the field (61 metrics across 17 metric families), all while embracing an inherently hierarchical, participant-based data structure.
Volume/issue: 10 107. Pages: 7601. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7617517/pdf/
Citations: 0
BART-Survival: A Bayesian machine learning approach to survival analyses in Python.
Journal of Open Source Software. Pub Date: 2025-01-28. DOI: 10.21105/joss.07213
Jacob Tiegs, Julia Raykin, Ilia Rochlin
Abstract: BART-Survival is a Python package that allows time-to-event (survival) analyses in discrete time using the non-parametric machine learning algorithm Bayesian Additive Regression Trees (BART). BART-Survival combines the performance of the BART algorithm with the complementary data and model formatting required to complete the survival analyses. The library contains a convenient application programming interface (API) that allows a simple approach when using the library for survival analyses, while maintaining capabilities for added complexity when desired. The package is intended for analysts exploring the use of flexible non-parametric alternatives to traditional (semi-)parametric survival analyses.
Volume/issue: 10 105. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11848778/pdf/
Citations: 0
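The BART-Survival abstract above mentions discrete-time survival analysis and the data formatting it requires. A common way to set up such an analysis is to expand each subject into one row per interval at risk (a "person-period" table) and then convert per-interval event probabilities into a survival curve. The sketch below illustrates that general idea only; the function names are hypothetical, it does not use the BART-Survival API, and whether the package formats data in exactly this way is an assumption.

```python
from typing import List, Tuple


def to_person_period(times: List[int], events: List[int]) -> List[Tuple[int, int, int]]:
    """Expand (time, event) pairs into discrete-time person-period rows.

    Each subject contributes one row per interval in which it was at risk.
    The binary outcome is 1 only in the interval where the event occurred;
    censored subjects (event == 0) contribute only zeros.
    Returns rows of (subject_index, interval, outcome).
    """
    rows = []
    for subject, (t, e) in enumerate(zip(times, events)):
        for interval in range(1, t + 1):
            outcome = 1 if (e == 1 and interval == t) else 0
            rows.append((subject, interval, outcome))
    return rows


def survival_from_hazards(hazards: List[float]) -> List[float]:
    """Convert per-interval event probabilities h_k into S(t) = prod_{k<=t} (1 - h_k)."""
    survival, s = [], 1.0
    for h in hazards:
        s *= 1.0 - h
        survival.append(s)
    return survival


# Subject 0 has the event at time 2; subject 1 is censored at time 3.
rows = to_person_period(times=[2, 3], events=[1, 0])
curve = survival_from_hazards([0.5, 0.5])
```

In a workflow of this kind, a binary classifier would be fitted to the person-period rows to estimate the per-interval probabilities, which are then chained into the survival curve.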
IMPPY3D: Image Processing in Python for 3D Image Stacks.
Journal of Open Source Software. Pub Date: 2025-01-01. DOI: 10.21105/joss.07405
Newell H Moser, Alexander K Landauer, Orion L Kafka
Abstract: Image Processing in Python for 3D image stacks, or IMPPY3D, is a free and open-source software (FOSS) repository that simplifies post-processing and 3D shape characterization for grayscale image stacks, otherwise known as volumetric images, 3D images, or voxel models. While IMPPY3D, pronounced impee-three-dee, was originally created for post-processing image stacks generated from X-ray computed tomography (XCT) measurements, it can be applied generally in post-processing 2D and 3D images. IMPPY3D includes tools for segmenting volumetric images and characterizing the 3D shape of features or regions of interest. These functionalities have proven useful in 3D shape analysis of powder particles, porous polymers, concrete aggregates, internal pores/defects, and more (see the Research Applications section). IMPPY3D consists of a combination of original Python scripts, Cython extensions, and convenience wrappers for popular third-party libraries like SciKit-Image (Walt et al., 2014), OpenCV (Bradski, 2000), and PyVista (Sullivan & Kaszynski, 2019). Highlighted capabilities of IMPPY3D include: varying image processing parameters interactively, applying numerous 2D/3D image filters (e.g., blurring/sharpening, denoising, erosion/dilation), segmenting and labeling continuous 3D objects, precisely rotating and re-slicing an image stack in 3D, generating rotated bounding boxes fitted to voxelized features, converting image stacks into 3D voxel models, exporting 3D models as Visualization Toolkit (VTK) files for ParaView (Ayachit, 2015), and converting voxel models into smooth mesh-based models. Additional information and example scripts can be found in the included ReadMe files within the IMPPY3D GitHub repository (Moser, Landauer, et al., 2024).
As a visualized example, Figure 1 demonstrates the high-level steps to characterize powder particles using IMPPY3D. This workflow is also similar to how pores can be visualized and characterized in metal-based additive manufacturing. Additional research applications for IMPPY3D are discussed in a later section.
Volume/issue: 10 108. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11984349/pdf/
Citations: 0
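The IMPPY3D abstract above lists segmenting and labeling continuous 3D objects among its capabilities. As a rough illustration of what that step involves, the sketch below thresholds a small grayscale volume and labels its 6-connected foreground regions with a breadth-first search. It uses only the Python standard library; the function names and the choice of 6-connectivity are assumptions made for illustration and are not taken from IMPPY3D, which wraps third-party libraries for this purpose.

```python
from collections import deque
from typing import List, Tuple

Volume = List[List[List[int]]]  # indexed as volume[z][y][x]


def threshold(gray: Volume, level: int) -> Volume:
    """Segment a grayscale volume: 1 where the voxel value is >= level, else 0."""
    return [[[1 if v >= level else 0 for v in row] for row in plane] for plane in gray]


def label_components(binary: Volume) -> Tuple[Volume, int]:
    """Label 6-connected foreground regions; 0 marks background.

    Returns the label volume and the number of regions found.
    """
    nz, ny, nx = len(binary), len(binary[0]), len(binary[0][0])
    labels = [[[0] * nx for _ in range(ny)] for _ in range(nz)]
    neighbours = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    count = 0
    for z in range(nz):
        for y in range(ny):
            for x in range(nx):
                if not binary[z][y][x] or labels[z][y][x]:
                    continue  # background, or already assigned to a region
                count += 1
                labels[z][y][x] = count
                queue = deque([(z, y, x)])
                while queue:  # flood-fill the region from its first voxel
                    cz, cy, cx = queue.popleft()
                    for dz, dy, dx in neighbours:
                        pz, py, px = cz + dz, cy + dy, cx + dx
                        if (0 <= pz < nz and 0 <= py < ny and 0 <= px < nx
                                and binary[pz][py][px] and not labels[pz][py][px]):
                            labels[pz][py][px] = count
                            queue.append((pz, py, px))
    return labels, count


# A 2 x 2 x 3 toy volume containing two separate bright features.
gray = [[[200, 180, 10], [0, 5, 3]],
        [[2, 1, 0], [7, 9, 150]]]
labels, n_regions = label_components(threshold(gray, level=100))
```

Once regions are labeled, per-region measurements such as voxel counts or bounding boxes can be computed by iterating over each label.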
A MATLAB-based Instrument Control (MIC) package for fluorescence imaging.
Journal of Open Source Software. Pub Date: 2025-01-01. Epub Date: 2025-01-28. DOI: 10.21105/joss.07275
Sajjad A Khan, Sandeep Pallikkuth, David J Schodt, Marjolein B M Meddens, Hanieh Mazloom-Farsibaf, Michael J Wester, Sheng Liu, Ellyse Taylor, Mohamadreza Fazel, Farzin Farzam, Keith A Lidke
Abstract: MATLAB Instrument Control (MIC) is a software package designed to facilitate data collection for custom-built microscopes. Utilizing object-oriented programming, MIC provides a class for each low-level instrument. These classes inherit from a common MIC abstract class, ensuring a uniform interface across different instruments. Key components such as lasers, stages, power meters, and cameras are grouped under abstract subclasses, which standardize interfaces and simplify the development of control classes for new instruments. Both simple and complex systems can be built from these lower-level tools. Since the interoperation is developed by the end user, the modes or sequence of operations can be flexibly designed with interactive or automated data collection and integrated analysis.
MATLAB provides the ability to create GUIs, and therefore MIC allows both for rapid prototyping and for building custom, high-level user interfaces that can be used for production instruments.
Volume/issue: 10 105. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12176407/pdf/
Citations: 0
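The MIC abstract above describes a class hierarchy in which every instrument class inherits from a common abstract class, with abstract subclasses per component type. The sketch below shows that design pattern in Python rather than MATLAB. All class and method names (`Instrument`, `LightSource`, `SimulatedLaser`, `export_state`, and so on) are hypothetical and chosen for illustration; they are not the names used in MIC.

```python
from abc import ABC, abstractmethod


class Instrument(ABC):
    """Hypothetical common base class: every instrument exposes the same interface."""

    def __init__(self, name: str) -> None:
        self.name = name

    @abstractmethod
    def export_state(self) -> dict:
        """Return the current settings so they can be saved with the data."""

    @abstractmethod
    def shutdown(self) -> None:
        """Leave the hardware in a safe state."""


class LightSource(Instrument):
    """Hypothetical abstract subclass that standardizes laser-like devices."""

    def __init__(self, name: str, max_power_mw: float) -> None:
        super().__init__(name)
        self.max_power_mw = max_power_mw
        self.power_mw = 0.0
        self.is_on = False

    @abstractmethod
    def set_power(self, power_mw: float) -> None:
        """Device-specific command that sets the output power."""

    def on(self) -> None:
        self.is_on = True

    def off(self) -> None:
        self.is_on = False

    def shutdown(self) -> None:
        self.off()

    def export_state(self) -> dict:
        return {"name": self.name, "power_mw": self.power_mw, "is_on": self.is_on}


class SimulatedLaser(LightSource):
    """Concrete class for a new device: only the device-specific part is written."""

    def set_power(self, power_mw: float) -> None:
        if not 0.0 <= power_mw <= self.max_power_mw:
            raise ValueError("requested power is outside the supported range")
        self.power_mw = power_mw


laser = SimulatedLaser("sim-laser", max_power_mw=100.0)
laser.set_power(25.0)
laser.on()
state = laser.export_state()
```

With this layout, code that sequences an acquisition can treat any `LightSource` the same way, and supporting a new device means implementing only the abstract methods.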
harmonize-wq: Standardize, clean and wrangle Water Quality Portal data into more analytic-ready formats.
Journal of Open Source Software. Pub Date: 2024-10-22. DOI: 10.21105/joss.07305
Justin Bousquin, Cristina A Mullin
Volume/issue: 9 102. Pages: 7305. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11694891/pdf/
Citations: 0
primerForge: a Python program for identifying primer pairs capable of distinguishing groups of genomes from each other.
Journal of Open Source Software. Pub Date: 2024-09-16. DOI: 10.21105/joss.06850
Joseph S Wirth, Lee S Katz, Grant M Williams, Jessica C Chen
Abstract: In both molecular epidemiology and microbial ecology, it is useful to be able to categorize specific strains of microorganisms in either an ingroup or an outgroup in a given population, e.g. to distinguish a pathogenic strain of interest from its non-virulent relatives. An "ingroup" refers to a group of microbes that are the primary focus of study or interest. Conversely, an "outgroup" consists of microbes that are closely related to, but have evolved separately from, the ingroup. While whole genome sequencing and downstream phylogenetic analyses can be employed to do this, these techniques are often slow and can be resource intensive. Additionally, the laboratory would have to sequence the whole genome to use these tools to determine whether a new sample is part of the ingroup or the outgroup. Alternatively, polymerase chain reaction (PCR) can be used to amplify regions of genetic material that are specific to the strain(s) of interest.
PCR is faster, less expensive, and more accessible than whole genome sequencing, so having a PCR-based approach can accelerate the detection of specific strain(s) of microbes and facilitate diagnoses and/or population studies.
Volume/issue: 9 101. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11611387/pdf/
Citations: 0
small_gicp: Efficient and parallel algorithms for point cloud registration
Journal of Open Source Software. Pub Date: 2024-08-10. DOI: 10.21105/joss.06948
Kenji Koide
DOI URL: https://doi.org/10.21105/joss.06948. Volume/issue: 12 1.
Citations: 0
gollum: An intuitive programmatic and visual interface for precomputed synthetic spectral model grids
Journal of Open Source Software. Pub Date: 2024-08-09. DOI: 10.21105/joss.06601
Sujay Shankar, M. Gully-Santiago, Caroline V. Morley, Jiayi Cao, Kyle F. Kaplan, Karina Kimani-Stewart, Diana Gonzalez-Argúeta
DOI URL: https://doi.org/10.21105/joss.06601. Volume/issue: 2 9.
Citations: 0
regional-mom6: A Python package for automatic generation of regional configurations for the Modular Ocean Model 6
Journal of Open Source Software. Pub Date: 2024-08-09. DOI: 10.21105/joss.06857
Ashley J. Barnes, Navid C. Constantinou, A. Gibson, A. Kiss, Chris Chapman, John Reilly, Dhruv Bhagtani, Luwei Yang
DOI URL: https://doi.org/10.21105/joss.06857. Volume/issue: 48 49.
Citations: 0