{"title":"DeepSpaceYoloDataset: Annotated Astronomical Images Captured with Smart Telescopes","authors":"Olivier Parisot","doi":"10.3390/data9010012","DOIUrl":"https://doi.org/10.3390/data9010012","url":null,"abstract":"Recent smart telescopes allow the automatic collection of a large quantity of data for specific portions of the night sky—with the goal of capturing images of deep sky objects (nebula, galaxies, globular clusters). Nevertheless, human verification is still required afterwards to check whether celestial targets are effectively visible in the images produced by these instruments. Depending on the magnitude of deep sky objects, the observation conditions and the cumulative time of data acquisition, it is possible that only stars are present in the images. In addition, unfavorable external conditions (light pollution, bright moon, etc.) can make capture difficult. In this paper, we describe DeepSpaceYoloDataset, a set of 4696 RGB astronomical images captured by two smart telescopes and annotated with the positions of deep sky objects that are effectively in the images. This dataset can be used to train detection models on this type of image, enabling the better control of the duration of capture sessions, but also to detect unexpected celestial events such as supernova.","PeriodicalId":502371,"journal":{"name":"Data","volume":"2 11","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139439348","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ADAS Simulation Result Dataset Processing Based on Improved BP Neural Network","authors":"Songyan Zhao, Lingshan Chen, Yongchao Huang","doi":"10.3390/data9010011","DOIUrl":"https://doi.org/10.3390/data9010011","url":null,"abstract":"The autonomous driving simulation field lacks evaluation and forecasting systems for simulation results. The data obtained from the simulation of target algorithms and vehicle models cannot be reasonably estimated. This problem affects subsequent vehicle improvement and parameter calibration. The authors relied on the simulation results of the AEB algorithm. We selected the BP Neural Network as the basis and improved it with a genetic algorithm optimized via a roulette algorithm. The regression evaluation indicators of the prediction results show that the GA-BP neural network has better prediction accuracy and generalization ability than the original BP neural network and other optimized BP neural networks. This GA-BP neural network also fills the Gap in Evaluation and Prediction Systems.","PeriodicalId":502371,"journal":{"name":"Data","volume":"48 7","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-01-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139382185","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Juan Soto-Perdomo, E. Reyes-Vera, J. Montoya-Cardona, Pedro Torres
{"title":"Experimental Dataset of Tunable Mode Converter Based on Long-Period Fiber Gratings Written in Few-Mode Fiber: Impacts of Thermal, Wavelength, and Polarization Variations","authors":"Juan Soto-Perdomo, E. Reyes-Vera, J. Montoya-Cardona, Pedro Torres","doi":"10.3390/data9010010","DOIUrl":"https://doi.org/10.3390/data9010010","url":null,"abstract":"Mode division multiplexing (MDM) is currently one of the most attractive multiplexing techniques in optical communications, as it allows for an increase in the number of channels available for data transmission. Optical modal converters are one of the main devices used in this technique. Therefore, the characterization and improvement of these devices are of great current interest. In this work, we present a dataset of 49,736 near-field intensity images of a modal converter based on a long-period fiber grating (LPFG) written on a few-mode fiber (FMF). This characterization was performed experimentally at various wavelengths, polarizations, and temperature conditions when the device converted from LP01 mode to LP11 mode. The results show that the modal converter can be tuned by adjusting these parameters, and that its operation is optimal under specific circumstances which have a great impact on its performance. Additionally, the potential application of the database is validated in this work. A modal decomposition technique based on the particle swarm algorithm (PSO) was employed as a tool for determining the most effective combinations of modal weights and relative phases from the spatial distributions collected in the dataset. The proposed dataset can open up new opportunities for researchers working on image segmentation, detection, and classification problems related to MDM technology. In addition, we implement novel artificial intelligence techniques that can help in finding the optimal operating conditions for this type of device.","PeriodicalId":502371,"journal":{"name":"Data","volume":"96 5","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-12-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139131651","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Nikita Sharma, J. K. Brinke, L. M. A. B. Jansen, Paul J. M. Havinga, Duc V. Le
{"title":"Wi-Gitation: Replica Wi-Fi CSI Dataset for Physical Agitation Activity Recognition","authors":"Nikita Sharma, J. K. Brinke, L. M. A. B. Jansen, Paul J. M. Havinga, Duc V. Le","doi":"10.3390/data9010009","DOIUrl":"https://doi.org/10.3390/data9010009","url":null,"abstract":"Agitation is a commonly found behavioral condition in persons with advanced dementia. It requires continuous monitoring to gain insights into agitation levels to assist caregivers in delivering adequate care. The available monitoring techniques use cameras and wearables which are distressful and intrusive and are thus often rejected by older adults. To enable continuous monitoring in older adult care, unobtrusive Wi-Fi channel state information (CSI) can be leveraged to monitor physical activities related to agitation. However, to the best of our knowledge, there are no realistic CSI datasets available for facilitating the classification of physical activities demonstrated during agitation scenarios such as disturbed walking, repetitive sitting–getting up, tapping on a surface, hand wringing, rubbing on a surface, flipping objects, and kicking. Therefore, in this paper, we present a public dataset named Wi-Gitation. For Wi-Gitation, the Wi-Fi CSI data were collected with twenty-three healthy participants depicting the aforementioned agitation-related physical activities at two different locations in a one-bedroom apartment with multiple receivers placed at different distances (0.5–8 m) from the participants. The validation results on the Wi-Gitation dataset indicate higher accuracies (F1-Scores ≥0.95) when employing mixed-data analysis, where the training and testing data share the same distribution. Conversely, in scenarios where the training and testing data differ in distribution (i.e., leave-one-out), the accuracies experienced a notable decline (F1-Scores ≤0.21). This dataset can be used for fundamental research on CSI signals and in the evaluation of advanced algorithms developed for tackling domain invariance in CSI-based human activity recognition.","PeriodicalId":502371,"journal":{"name":"Data","volume":" 32","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-12-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139137980","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
P. Ajithkumar, Gregory Gimenez, P. Stockwell, Suzan N. Almomani, Sarah A Bowden, A. Leichter, Antonio Ahn, Sharon Pattison, Sebastian Schmeier, Frank A. Frizelle, Michael R. Eccles, R. Purcell, Euan J. Rodger, Aniruddha Chatterjee
{"title":"DNA Methylome and Transcriptome Maps of Primary Colorectal Cancer and Matched Liver Metastasis","authors":"P. Ajithkumar, Gregory Gimenez, P. Stockwell, Suzan N. Almomani, Sarah A Bowden, A. Leichter, Antonio Ahn, Sharon Pattison, Sebastian Schmeier, Frank A. Frizelle, Michael R. Eccles, R. Purcell, Euan J. Rodger, Aniruddha Chatterjee","doi":"10.3390/data9010008","DOIUrl":"https://doi.org/10.3390/data9010008","url":null,"abstract":"Sequencing-based genome-wide DNA methylation, gene expression studies and associated data on paired colorectal cancer (CRC) primary and liver metastasis are very limited. We have profiled the DNA methylome and transcriptome of matched primary CRC and liver metastasis samples from the same patients. Genome-scale methylation and expression levels were examined using Reduced Representation Bisulfite Sequencing (RRBS) and RNA-Seq, respectively. To investigate DNA methylation and expression patterns, we generated a total of 1.01 × 109 RRBS reads and 4.38 x 108 RNA-Seq reads from the matched cancer tissues. Here, we describe in detail the sample features, experimental design, methods and bioinformatic pipeline for these epigenetic data. We demonstrate the quality of both the samples and sequence data obtained from the paired samples. The sequencing data obtained from this study will serve as a valuable resource for studying underlying mechanisms of distant metastasis and the utility of epigenetic profiles in cancer metastasis.","PeriodicalId":502371,"journal":{"name":"Data","volume":" 13","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-12-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139143253","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Profit Maximization Model for Data Consumers with Data Providers’ Incentives in Personal Data Trading Market","authors":"Hyo-Jin Park, Hyeontaek Oh, Jun Kyun Choi","doi":"10.3390/data9010006","DOIUrl":"https://doi.org/10.3390/data9010006","url":null,"abstract":"This paper proposes a profit maximization model for a data consumer when it buys personal data from data providers (by obtaining consent) through data brokers and provides their new services to data providers (i.e., service consumers). To observe the behavioral models of data providers, the data consumer, and service consumers, this paper proposes the willingness-to-sell model of personal data of data providers (which is affected by data providers’ behavior related to explicit consent), the service quality model obtained by the collected personal data from the data consumer’s perspective, and the willingness-to-pay model of service consumers regarding provided new services from the data consumer. Particularly, this paper jointly considers the behavior of data providers and service users under a limited budget. With parameters inspired by real-world surveys on data providers, this paper shows various numerical results to check the feasibility of the proposed models.","PeriodicalId":502371,"journal":{"name":"Data","volume":"18 17","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-12-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139158392","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Karol J. Nava-Quiroz, J. Rojas-Serrano, G. Pérez-Rubio, I. Buendía-Roldán, M. Mejía, J. Fernández-López, E. Ramos-Martínez, L. A. López-Flores, Alma D. Del Ángel-Pablo, R. Falfán-Valencia
{"title":"Single-Nucleotide Variants in PADI2 and PADI4 and Ancestry Informative Markers in Interstitial Lung Disease and Rheumatoid Arthritis among a Mexican Mestizo Population","authors":"Karol J. Nava-Quiroz, J. Rojas-Serrano, G. Pérez-Rubio, I. Buendía-Roldán, M. Mejía, J. Fernández-López, E. Ramos-Martínez, L. A. López-Flores, Alma D. Del Ángel-Pablo, R. Falfán-Valencia","doi":"10.3390/data9010005","DOIUrl":"https://doi.org/10.3390/data9010005","url":null,"abstract":"Rheumatoid arthritis (RA) is an autoimmune disease mainly characterized by joint inflammation. It presents extra-articular manifestations, with the lungs being one of the affected areas. Among these, damage to the pulmonary interstitium (Interstitial Lung Disease—ILD) has been linked to proteins involved in the inflammatory process and related to extracellular matrix deposition and lung fibrosis establishment. Peptidyl arginine deiminase enzymes (PAD), which carry out protein citrullination, play a role in this context. A genetic association analysis was conducted on genes encoding two PAD isoforms: PAD2 and PAD4. This analysis also included ancestry informative markers and protein level determination in samples from patients with RA, RA-associated ILD, and clinically healthy controls. Significant single nucleotide variants (SNV) and one haplotype were identified as susceptibility factors for RA-ILD development. Elevated levels of PAD4 were found in RA-ILD cases, while PADI2 showed an association with RA susceptibility. This work presents data obtained from previously published research. Population variability has been noticed in genetic association studies. We present data for 14 SNVs that show geographical and genetic variation across the Mexican population, which provides highly informative content and greater intrapopulation genetic diversity. Further investigations in the field should be considered in addition to AIMs. The data presented in this study were analyzed in association with SNV genotypes in PADI2 and PADI4 to assess susceptibility to ILD in RA, as well as with changes in PAD2 and PAD4 protein levels according to carrier genotype, in addition to the use of covariates such as ancestry markers.","PeriodicalId":502371,"journal":{"name":"Data","volume":"2 3","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-12-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139157365","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Sergio Bemposta Rosende, David San José Gavilán, Javier Fernández-Andrés, Javier Sánchez-Soriano
{"title":"An Urban Traffic Dataset Composed of Visible Images and Their Semantic Segmentation Generated by the CARLA Simulator","authors":"Sergio Bemposta Rosende, David San José Gavilán, Javier Fernández-Andrés, Javier Sánchez-Soriano","doi":"10.3390/data9010004","DOIUrl":"https://doi.org/10.3390/data9010004","url":null,"abstract":"A dataset of aerial urban traffic images and their semantic segmentation is presented to be used to train computer vision algorithms, among which those based on convolutional neural networks stand out. This article explains the process of creating the complete dataset, which includes the acquisition of the images, the labeling of vehicles, pedestrians, and pedestrian crossings as well as a description of the structure and content of the dataset (which amounts to 8694 images including visible images and those corresponding to the semantic segmentation). The images were generated using the CARLA simulator (but were like those that could be obtained with fixed aerial cameras or by using multi-copter drones) in the field of intelligent transportation management. The presented dataset is available and accessible to improve the performance of vision and road traffic management systems, especially for the detection of incorrect or dangerous maneuvers.","PeriodicalId":502371,"journal":{"name":"Data","volume":"1983 5","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-12-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139160358","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Rasmus Bøgh Holmen, Nicolas Gavoille, Jaan Masso, Arūnas Burinskas
{"title":"Internationalization in the Baltic Regional Accounts: A NUTS 3 Region Dataset","authors":"Rasmus Bøgh Holmen, Nicolas Gavoille, Jaan Masso, Arūnas Burinskas","doi":"10.3390/data8120181","DOIUrl":"https://doi.org/10.3390/data8120181","url":null,"abstract":"Features of internationalization, such as trade, foreign direct investments, and international migration, are crucial for understanding the economic developments of small and open economies. However, studying internationalization at the country level may obscure significant heterogeneity in its relationship with economic growth and other economic and social outcomes. Regional accounts provide insights into the geography of internationalization, but collections of such disaggregated statistics are rarely provided by statistical bureaus. The purpose of this paper is twofold. First, we demonstrate how regional account data, including internationalization indicators, can be constructed to obtain consistent and homogeneous regional-level series using a combination of micro and macro data sources. Second, our aim is to foster spatial research on internationalization and the spatial economy in the Baltics by providing comprehensive data collection of socio-economic variables at the NUTS 3 regional level over time. This collection encompasses trade, FDI, and migration, enabling the study of internationalization and other features of the Baltic economy. We present a series of key features, revealing noticeable correlation patterns between regional development and internationalization.","PeriodicalId":502371,"journal":{"name":"Data","volume":" 29","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139207341","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Tourist-Based Framework for Developing Digital Marketing for Small and Medium-Sized Enterprises in the Tourism Sector in Saudi Arabia","authors":"Rishaa Alnajim, Bahjat Fakieh","doi":"10.3390/data8120179","DOIUrl":"https://doi.org/10.3390/data8120179","url":null,"abstract":"Social media has become an essential tool for travel planning, with tourists increasingly using it to research destinations, book accommodation, and make travel arrangements. However, little is known about how tourists use social media for travel planning and what factors influence their intentions to use social media for this purpose. This thesis aims to understand tourists’ intentions to use social media for travel planning. Specifically, it investigates the factors influencing tourists’ intentions to use social media for planning travel to Saudi Arabia. It develops a machine learning (ML) classification model to assist Saudi tourism SMEs in creating effective digital marketing strategies for social media platforms. A survey was conducted with 573 tourists interested in visiting Saudi Arabia, using the Design Science Research (DSR) approach. The findings support the tourist-based theoretical framework, showing that perceived usefulness (PU), perceived ease of use (PEOU), satisfaction (SAT), marketing-generated content (MGC), and user-generated content (UGC) significantly impact tourists’ intentions to use social media for travel planning. Tourists’ characteristics and visit characteristics influenced their intentions to use MGC but not UGC. The tourist-based ML classification model, developed using the LinearSVC algorithm, achieved an accuracy of 99% when evaluated using the K-Fold Cross-Validation (KF-CV) technique. The findings of this study have several implications for Saudi tourism SMEs. First, the results suggest that SMEs should focus on developing social media content that is perceived as useful, easy to use, and satisfying. Second, the findings suggest that SMEs should focus on using MGC in their social media marketing campaigns. Third, the results suggest that SMEs should tailor their social media marketing campaigns to the characteristics of their target tourists. This study contributes to the literature on tourism marketing and social media by providing a better understanding of how tourists use social media for travel planning. Saudi tourism SMEs can use the findings of this study to develop more effective digital marketing strategies for social media platforms.","PeriodicalId":502371,"journal":{"name":"Data","volume":"96 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139227470","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}