2017 New York Scientific Data Summit (NYSDS): Latest Papers

Searching for millions of objects in the BOSS spectroscopic survey data with H5Boss
2017 New York Scientific Data Summit (NYSDS) Pub Date : 2017-08-01 DOI: 10.1109/NYSDS.2017.8085044
Jialin Liu, D. Bard, Q. Koziol, Stephen Bailey, Prabhat
Abstract: The Baryon Oscillation Spectroscopic Survey (BOSS), part of the Sloan Digital Sky Survey (SDSS), typically produces a single data file in the FITS format for each object observed. FITS has been the default file format in this field of astronomy for many years. However, none of the FITS I/O libraries supports parallel I/O, which makes the format a poor fit for today’s high-performance computing. The problem grows more severe as the data size and the number of files keep increasing. In this paper, we introduce an alternative file format and build a parallel Python tool based on h5py. The resulting H5Boss library supports efficient file conversion, large-scale data queries, and parallel I/O. Given the typical analytics pattern, we are able to scale H5Boss to queries over millions of objects with minimal I/O and communication overhead. This study presents a clear picture of BOSS data analytics and data management with an HPC-friendly file format, HDF5.
Citations: 11
Robust and scalable deep learning for X-ray synchrotron image analysis
2017 New York Scientific Data Summit (NYSDS) Pub Date : 2017-08-01 DOI: 10.1109/NYSDS.2017.8085045
Nicole Meister, Ziqiao Guan, Jinzhen Wang, Ronald Lashley, Jiliang Liu, Julien Lhermitte, K. Yager, Hong Qin, Bo Sun, Dantong Yu
Abstract: X-ray scattering is a key technique at modern synchrotron facilities for material analysis and discovery via structural characterization at the molecular scale and nanoscale. Image classification and tagging play a crucial role in recognizing patterns, inferring meaningful physical properties from samples, and guiding subsequent experimental steps. We designed deep-learning-based image classification pipelines and gained significant improvements in accuracy and speed. Constrained by the available computing resources and optimization libraries, we need to make trade-offs among computational efficiency, input image size and volume, and the flexibility and stability of processing images with different levels of quality and artifacts. Consequently, our deep learning framework requires careful data preprocessing to down-sample images and extract true image signals. However, X-ray scattering images contain different levels of noise, numerous gaps, rotations, and defects arising from detector limitations, sample (mis)alignment, and experimental configuration. Traditional methods of healing X-ray scattering images make strong assumptions about these artifacts and require hand-crafted procedures and experiment metadata to de-noise, interpolate measured data to eliminate gaps, and rotate and translate images to align the center of the sample with the center of the image. These manual procedures are error-prone, experience-driven, and isolated from the intended image prediction, and consequently do not scale to the data rate of X-ray images from modern detectors.
We aim to explore deep-learning-based image classification techniques that are robust and capable of leveraging high-definition experimental images with rich variations, even in a production environment that is not defect-free, and ultimately to automate labor-intensive data preprocessing tasks and integrate them seamlessly into our TensorFlow-based experimental data analysis framework.
Citations: 6