{"title":"Data in Astrophysics and Geophysics: Novel Research and Applications","authors":"V. Srećković, Milan S. Dimitrijević, Z. Mijić","doi":"10.3390/data9020032","DOIUrl":"https://doi.org/10.3390/data9020032","url":null,"abstract":"Rapid development of communication technologies and constant technological improvements as a result of scientific discoveries require the establishment of specific databases [...]","PeriodicalId":502371,"journal":{"name":"Data","volume":" 2","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-02-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139791366","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jingjing Sun, Chong Xu, Liye Feng, Lei Li, Xuewei Zhang, Wentao Yang
{"title":"The Yinshan Mountains Record over 10,000 Landslides","authors":"Jingjing Sun, Chong Xu, Liye Feng, Lei Li, Xuewei Zhang, Wentao Yang","doi":"10.3390/data9020031","DOIUrl":"https://doi.org/10.3390/data9020031","url":null,"abstract":"China boasts a vast expanse of mountainous terrain, characterized by intricate geological conditions and structural features, resulting in frequent geological disasters. Among these, landslides, as prototypical geological hazards, pose significant threats to both lives and property. Consequently, conducting a comprehensive landslide inventory in mountainous regions is imperative for current research. This study concentrates on the Yinshan Mountains, an ancient fault-block mountain range spanning east–west in the central Inner Mongolia Autonomous Region, extending from Langshan Mountains in the west to Damaqun Mountains in the east, with the narrow sense Xiao–Yin Mountains District in between. Employing multi-temporal high-resolution remote sensing images from Google Earth, this study conducted visual interpretation, identifying 10,968 landslides in the Yinshan area, encompassing a total area of 308.94 km². The largest landslide occupies 2.95 km², while the smallest covers 84.47 m². Specifically, the Langshan area comprises 331 landslides with a total area of 11.96 km², the narrow sense Xiao–Yin Mountains include 3393 landslides covering 64.13 km², and the Manhan Mountains, Damaqun Mountains, and adjacent areas account for 7244 landslides over a total area of 232.85 km².
This research not only contributes to global landslide cataloging initiatives but also serves as a robust foundation for future geohazard prevention and management efforts.","PeriodicalId":502371,"journal":{"name":"Data","volume":"104 5","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-02-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139794279","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A. Khoruzhaya, T. Bobrovskaya, D. V. Kozlov, Dmitriy Kuligovskiy, Vladimir P. Novik, Kirill M. Arzamasov, E. I. Kremneva
{"title":"Expanded Brain CT Dataset for the Development of AI Systems for Intracranial Hemorrhage Detection and Classification","authors":"A. Khoruzhaya, T. Bobrovskaya, D. V. Kozlov, Dmitriy Kuligovskiy, Vladimir P. Novik, Kirill M. Arzamasov, E. I. Kremneva","doi":"10.3390/data9020030","DOIUrl":"https://doi.org/10.3390/data9020030","url":null,"abstract":"Intracranial hemorrhage (ICH) is a dangerous life-threatening condition leading to disability. Timely and high-quality diagnosis plays a huge role in the course and outcome of this disease. The gold standard in determining ICH is computed tomography. This method requires a prompt involvement of highly qualified personnel, which is not always possible, for example, in case of a staff shortage or increased workload. In such a situation, every minute counts, and time can be lost. The solution to this problem seems to be a set of diagnostic decisions, including the use of artificial intelligence, which will help to identify patients with ICH in a timely manner and provide prompt and quality medical care. However, the main obstacle to the development of artificial intelligence is a lack of high-quality datasets for training and testing. In this paper, we present a dataset including 800 brain CT scans consisting of multiple series of DICOM images with and without signs of ICH, enriched with clinical and technical parameters, as well as the methodology of its generation utilizing natural language processing tools. 
The dataset is publicly available, which contributes to increased competition in the development of artificial intelligence systems and their advancement and quality improvement.","PeriodicalId":502371,"journal":{"name":"Data","volume":"213 4","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-02-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139799614","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jordan Truman Paul Noel, Vinicius Prado da Fonseca, Amilcar Soares
{"title":"A Comprehensive Data Pipeline for Comparing the Effects of Momentum on Sports Leagues","authors":"Jordan Truman Paul Noel, Vinicius Prado da Fonseca, Amilcar Soares","doi":"10.3390/data9020029","DOIUrl":"https://doi.org/10.3390/data9020029","url":null,"abstract":"Momentum has been a consistently studied aspect of sports science for decades. Among the established literature, there has, at times, been a discrepancy between conclusions. However, if momentum is indeed an actual phenomenon, it would affect all aspects of sports, from player evaluation to pre-game prediction and betting. Therefore, using momentum-based features that quantify a team’s linear trend of play, we develop a data pipeline that uses a small sample of recent games to assess teams’ quality of play and measure the predictive power of momentum-based features versus the predictive power of more traditional frequency-based features across several leagues using several machine learning techniques. More precisely, we use our pipeline to determine the differences in the predictive power of momentum-based features and standard statistical features for the National Hockey League (NHL), National Basketball Association (NBA), and five major first-division European football leagues. Our findings show little evidence that momentum has superior predictive power in the NBA. Still, we found some instances of the effects of momentum on the NHL that produced better pre-game predictors, whereas we view a similar trend in European football/soccer. 
Our results indicate that momentum-based features combined with frequency-based features could improve pre-game prediction models and that, in the future, momentum should be studied more from a feature/performance indicator point-of-view and less from the view of the dependence of sequential outcomes, thus attempting to distance momentum from the binary view of winning and losing.","PeriodicalId":502371,"journal":{"name":"Data","volume":"257 20","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139821370","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Valerija Movcana, Arnis Strods, Karīna Narbute, Fēlikss Rūmnieks, Roberts Rimša, G. Mozolevskis, Maksims Ivanovs, Roberts Kadiķis, Karlis Zviedris, Laura Leja, Anastasija Zujeva, Tamāra Laimiņa, A. Abols
{"title":"Organ-On-A-Chip (OOC) Image Dataset for Machine Learning and Tissue Model Evaluation","authors":"Valerija Movcana, Arnis Strods, Karīna Narbute, Fēlikss Rūmnieks, Roberts Rimša, G. Mozolevskis, Maksims Ivanovs, Roberts Kadiķis, Karlis Zviedris, Laura Leja, Anastasija Zujeva, Tamāra Laimiņa, A. Abols","doi":"10.3390/data9020028","DOIUrl":"https://doi.org/10.3390/data9020028","url":null,"abstract":"Organ-on-a-chip (OOC) technology has emerged as a groundbreaking approach for emulating the physiological environment, revolutionizing biomedical research, drug development, and personalized medicine. OOC platforms offer more physiologically relevant microenvironments, enabling real-time monitoring of tissue, to develop functional tissue models. Imaging methods are the most common approach for daily monitoring of tissue development. Image-based machine learning serves as a valuable tool for enhancing and monitoring OOC models in real-time. This involves the classification of images generated through microscopy contributing to the refinement of model performance. This paper presents an image dataset, containing cell images generated from OOC setup with different cell types. There are 3072 images generated by an automated brightfield microscopy setup. For some images, parameters such as cell type, seeding density, time after seeding and flow rate are provided. These parameters along with predefined criteria can contribute to the evaluation of image quality and identification of potential artifacts. 
This dataset can be used as a basis for training machine learning classifiers for automated data analysis generated from an OOC setup providing more reliable tissue models, automated decision-making processes within the OOC framework and efficient research in the future.","PeriodicalId":502371,"journal":{"name":"Data","volume":"47 38","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139683915","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Henrik tom Wörden, Florian Spreckelsen, Stefan Luther, Ulrich Parlitz, A. Schlemmer
{"title":"Mapping Hierarchical File Structures to Semantic Data Models for Efficient Data Integration into Research Data Management Systems","authors":"Henrik tom Wörden, Florian Spreckelsen, Stefan Luther, Ulrich Parlitz, A. Schlemmer","doi":"10.3390/data9020024","DOIUrl":"https://doi.org/10.3390/data9020024","url":null,"abstract":"Although other methods exist to store and manage data in modern information technology, the standard solution is file systems. Therefore, keeping well-organized file structures and file system layouts can be key to a sustainable research data management infrastructure. However, file structures alone lack several important capabilities for FAIR data management: the two most significant being insufficient visualization of data and inadequate possibilities for searching and obtaining an overview. Research data management systems (RDMSs) can fill this gap, but many do not support the simultaneous use of the file system and RDMS. This simultaneous use can have many benefits, but keeping data in RDMS in synchrony with the file structure is challenging. Here, we present concepts that allow for keeping file structures and semantic data models (in RDMS) synchronous. Furthermore, we propose a specification in yaml format that allows for a structured and extensible declaration and implementation of a mapping between the file system and data models used in semantic research data management. Implementing these concepts will facilitate the re-use of specifications for multiple use cases. Furthermore, the specification can serve as a machine-readable and, at the same time, human-readable documentation of specific file system structures. 
We demonstrate our work using the Open Source RDMS LinkAhead (previously named “CaosDB”).","PeriodicalId":502371,"journal":{"name":"Data","volume":"77 8","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-01-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139593674","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}