Proceedings of the Python in Science Conference: Latest Articles

It's Time for the Atmospheric Science Community to ACT Together
Proceedings of the Python in Science Conference Pub Date : 1900-01-01 DOI: 10.25080/majora-1b6fd038-020
A. Theisen
Citations: 0
The Pandata Scalable Open-Source Analysis Stack
Proceedings of the Python in Science Conference Pub Date : 1900-01-01 DOI: 10.25080/gerudo-f2bc6f59-00b
James Bednar, Martin Durant
Abstract: As the scale of scientific data analysis continues to grow, traditional domain-specific tools often struggle with data of increasing size and complexity. These tools also face sustainability challenges due to a relatively narrow user base, a limited pool of contributors, and constrained funding sources. We introduce the Pandata open-source software stack as a solution, emphasizing the use of domain-independent tools at critical stages of the data life cycle, without compromising the depth of domain-specific analyses. This set of interoperable and compositional tools, including Dask, Xarray, Numba, hvPlot, Panel, and Jupyter, provides a versatile and sustainable model for data analysis and scientific computation. Collectively, the Pandata stack covers the landscape of data access, distributed computation, and interactive visualization across any domain or scale. See github.com/panstacks/pandata to get started using this stack or to help contribute to it.
Citations: 0
Multi-dimensional linked-data exploration with glue
Proceedings of the Python in Science Conference Pub Date : 1900-01-01 DOI: 10.25080/majora-a6455521-002
T. Robitaille
Citations: 0
Turning HPC Systems into Interactive Data Analysis Platforms using Jupyter and Dask
Proceedings of the Python in Science Conference Pub Date : 1900-01-01 DOI: 10.25080/MAJORA-7DDC1DD1-01E
Anderson Banihirwe, M. Rocklin, J. Hamman, Julia Kent, Kevin Paul
Citations: 0
pyhf: a pure Python statistical fitting library for High Energy Physics with tensors and autograd
Proceedings of the Python in Science Conference Pub Date : 1900-01-01 DOI: 10.25080/MAJORA-7DDC1DD1-019
M. Feickert, L. Heinrich, G. Stark, K. Cranmer
Citations: 0
Automated Annotation of Animal Vocalizations
Proceedings of the Python in Science Conference Pub Date : 1900-01-01 DOI: 10.25080/MAJORA-7DDC1DD1-024
D. Nicholson
Citations: 0
vak: a neural network framework for researchers studying animal acoustic communication
Proceedings of the Python in Science Conference Pub Date : 1900-01-01 DOI: 10.25080/gerudo-f2bc6f59-008
D. Nicholson, Y. Cohen
Abstract: How is speech like birdsong? What do we mean when we say an animal learns their vocalizations? Questions like these are answered by studying how animals communicate with sound. As in many other fields, the study of acoustic communication is being revolutionized by deep neural network models. These models enable answering questions that were previously impossible to address, in part because the models automate analysis of very large datasets. Acoustic communication researchers have developed multiple models for similar tasks, often implemented as research code with one of several libraries, such as Keras and PyTorch. This situation has created a real need for a framework that allows researchers to easily benchmark multiple models, and test new models, with their own data. To address this need, we developed vak (https://github.com/vocalpy/vak), a neural network framework designed for acoustic communication researchers. ("vak" is pronounced like "talk" or "squawk" and was chosen for its similarity to the Latin root voc, as in "vocal".) Here we describe the design of vak, and explain how the framework makes it easy for researchers to apply neural network models to their own data. We highlight enhancements made in version 1.0 that significantly improve user experience with the library. To provide researchers without expertise in deep learning access to these models, vak can be run via a command-line interface that uses configuration files. vak can also be used directly in scripts by scientist-coders.
To achieve this, vak adapts design patterns and an API from other domain-specific PyTorch libraries such as torchvision, with modules representing neural network operations, models, datasets, and transformations for pre- and post-processing. vak also leverages the Lightning library as a backend, so that vak developers and users can focus on the domain. We provide proof-of-concept results showing how vak can be used to test new models and compare existing models from multiple model families. In closing we discuss our roadmap for development and vision for the community.
Citations: 0
libyt: a Tool for Parallel In Situ Analysis with yt
Proceedings of the Python in Science Conference Pub Date : 1900-01-01 DOI: 10.25080/gerudo-f2bc6f59-011
Shin-Rong Tsai, Hsi-Yu Schive, Matthew Turk
Abstract: In the era of exascale computing, storage and analysis of large-scale data have become more important and difficult. We present libyt, an open source C++ library that allows researchers to analyze and visualize data using yt or other Python packages in parallel during simulation runtime. We describe the method for organizing the adaptive mesh refinement grid data structure and simulation data, handling data transfer between Python and the simulation with minimal memory overhead, and conducting analysis with no additional time penalty using the Python C API and NumPy C API. We demonstrate how it solves the problem in astrophysical simulations and increases disk usage efficiency. Finally, we conclude with a discussion of libyt.
Citations: 0
Using Blosc2 NDim As A Fast Explorer Of The Milky Way (Or Any Other NDim Dataset)
Proceedings of the Python in Science Conference Pub Date : 1900-01-01 DOI: 10.25080/gerudo-f2bc6f59-000
Project Blosc, Francesc Alted, Marta Iborra, Oscar Guiñón, David Ibáñez, S. Barrachina
Abstract: Large multidimensional datasets are widely used in various engineering and scientific applications. Prompt access to the subsets of these datasets is crucial for an efficient exploration experience. To facilitate this, we have added support for large dimensional datasets to Blosc2, a compression and format library. The extension enables effective support for large multidimensional datasets, with a special encoding of zeros that allows for efficient handling of sparse datasets. Additionally, the new two-level data partition used in Blosc2 reduces the need for decompressing unnecessary data, further accelerating slicing speed. The Blosc2 NDim layer enables the creation and reading of n-dimensional datasets in an extremely efficient manner. This is due to a completely general n-dim 2-level partitioning, which allows for slicing and dicing of arbitrarily large (and compressed) data in a more fine-grained way. Having a second partition provides greater flexibility to fit the different partitions at the different CPU cache levels, making compression even more efficient. Additionally, Blosc2 can make use of Btune, a library that automatically finds the optimal combination of compression parameters to suit user needs. Btune employs various techniques, such as a genetic algorithm and a neural network model, to discover the best parameters for a given dataset much more quickly. This approach is a significant improvement over the traditional trial-and-error method, which can take hours or even days to find the best parameters.
As an example, we will demonstrate how Blosc2 NDim enables fast exploration of the Milky Way using the Gaia DR3 dataset.
Citations: 0
aPhyloGeo-Covid: A Web Interface for Reproducible Phylogeographic Analysis of SARS-CoV-2 Variation using Neo4j and Snakemake
Proceedings of the Python in Science Conference Pub Date : 1900-01-01 DOI: 10.25080/gerudo-f2bc6f59-00f
Wanlin Li, Nadia Tahiri
Abstract: The gene sequencing data, along with the associated lineage tracing and research data generated throughout the Coronavirus disease 2019 (COVID-19) pandemic, constitute invaluable resources that profoundly empower phylogeography research. To optimize the utilization of these resources, we have developed an interactive analysis platform called aPhyloGeo-Covid, leveraging the capabilities of Neo4j, Snakemake, and Python. This platform enables researchers to explore and visualize diverse data sources specifically relevant to SARS-CoV-2 for phylogeographic analysis. The integrated Neo4j database acts as a comprehensive repository, consolidating COVID-19 pandemic-related sequence information, climate data, and demographic data obtained from public databases, facilitating efficient filtering and organization of input data for phylogeographical studies. Presently, the database encompasses over 113,774 nodes and 194,381 relationships. Additionally, aPhyloGeo-Covid provides a scalable and reproducible phylogeographic workflow for investigating the intricate relationship between geographic features and the patterns of variation in diverse SARS-CoV-2 variants.
The code repository of the platform is publicly accessible on GitHub (https://github.com/tahiri-lab/iPhyloGeo/tree/iPhylooGeo-neo4j), providing researchers with a valuable tool to analyze and explore the intricate dynamics of SARS-CoV-2 within a phylogeographic context.
Citations: 0