{"title":"DataPLAN: A Web-Based Data Management Plan Generator for the Plant Sciences","authors":"Xiao-Ran Zhou, Sebastian Beier, Dominik Brilhaus, Cristina Martins Rodrigues, Timo Mühlhaus, Dirk von Suchodoletz, Richard M. Twyman, Björn Usadel, Angela Kranz","doi":"10.3390/data8110159","DOIUrl":"https://doi.org/10.3390/data8110159","url":null,"abstract":"Research data management (RDM) combines a set of practices for the organization, storage and preservation of data from research projects. The RDM strategy of a project is usually formalized as a data management plan (DMP)—a document that sets out procedures to ensure data findability, accessibility, interoperability and reusability (FAIR-ness). Many aspects of RDM are standardized across disciplines so that data and metadata are reusable, but the components of DMPs in the plant sciences are often disconnected. The inability to reuse plant-specific DMP content across projects and funding sources requires additional time and effort to write unique DMPs for different settings. To address this issue, we developed DataPLAN—an open-source tool incorporating prewritten DMP content for the plant sciences that can be used online or offline to prepare multiple DMPs. The current version of DataPLAN supports Horizon 2020 and Horizon Europe projects, as well as projects funded by the German Research Foundation (DFG). Furthermore, DataPLAN offers the option for users to customize their own templates. Additional templates to accommodate other funding schemes will be added in the future. 
DataPLAN reduces the workload needed to create or update DMPs in the plant sciences by presenting standardized RDM practices optimized for different funding contexts.","PeriodicalId":36824,"journal":{"name":"Data","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135274085","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Panel Regression Modelling for COVID-19 Infections and Deaths in Tamil Nadu, India","authors":"Rajarathinam Arunachalam","doi":"10.3390/data8100158","DOIUrl":"https://doi.org/10.3390/data8100158","url":null,"abstract":"The impacts of the coronavirus disease 2019 (COVID-19) pandemic have been extremely severe, with both economic and health crises experienced worldwide. Using a panel regression model, this study examined the trends and correlations in the number of COVID-19-related deaths and the number of COVID-19-infected cases in all 37 regions of the Tamil Nadu state in India, in August 2020. The fixed effects model had the greatest R² value of 78% and exhibited significant results. The slope coefficient was also highly significant, showing a considerable variation in the relationship between new COVID-19 cases and deaths. Additionally, for every unit increase in COVID-19-infected cases, the death rate increased by 0.02%.","PeriodicalId":36824,"journal":{"name":"Data","volume":"36 6","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135414158","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Industrial Environment Multi-Sensor Dataset for Vehicle Indoor Tracking with Wi-Fi, Inertial and Odometry Data","authors":"Ivo Silva, Cristiano Pendão, Joaquín Torres-Sospedra, Adriano Moreira","doi":"10.3390/data8100157","DOIUrl":"https://doi.org/10.3390/data8100157","url":null,"abstract":"This paper describes a dataset collected in an industrial setting using a mobile unit resembling an industrial vehicle equipped with several sensors. Wi-Fi interfaces collect signals from available Access Points (APs), while motion sensors collect data regarding the mobile unit’s movement (orientation and displacement). The distinctive features of this dataset include synchronous data collection from multiple sensors, such as Wi-Fi data acquired from multiple interfaces (including a radio map), orientation provided by two low-cost Inertial Measurement Unit (IMU) sensors, and displacement (travelled distance) measured by an absolute encoder attached to the mobile unit’s wheel. Accurate ground-truth information was determined using a computer vision approach that recorded timestamps as the mobile unit passed through reference locations. We assessed the quality of the proposed dataset by applying baseline methods for dead reckoning and Wi-Fi fingerprinting. The average positioning error for simple dead reckoning, without using any other absolute positioning technique, is 8.25 m and 11.66 m for IMU1 and IMU2, respectively. The average positioning error for simple Wi-Fi fingerprinting is 2.19 m when combining the RSSI information from five Wi-Fi interfaces. 
This dataset contributes to the fields of Industry 4.0 and mobile sensing, providing researchers with a resource to develop, test, and evaluate indoor tracking solutions for industrial vehicles.","PeriodicalId":36824,"journal":{"name":"Data","volume":"39 5","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135412994","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Data-Driven Exploration of a New Islamic Fatwas Dataset for Arabic NLP Tasks","authors":"Ohoud Alyemny, Hend Al-Khalifa, Abdulrahman Mirza","doi":"10.3390/data8100155","DOIUrl":"https://doi.org/10.3390/data8100155","url":null,"abstract":"Islamic content is a broad and diverse domain that encompasses various sources, topics, and perspectives. However, there is a lack of comprehensive and reliable datasets that can facilitate conducting studies on Islamic content. In this paper, we present fatwaset, the first public Arabic dataset of Islamic fatwas. It contains Islamic fatwas that we collected from various trusted and authenticated sources in the Islamic fatwa domain, such as agencies, religious scholars, and websites. Fatwaset is a rich resource as it does not only contain fatwas but also includes a considerable set of their surrounding metadata. It can be used for many natural language processing (NLP) tasks, such as language modeling, question answering, author attribution, topic identification, text classification, and text summarization. It can also support other domains that are related to Islamic culture, such as philosophy and language art. We describe the methodology and criteria we used to select the content, as well as the challenges and limitations we faced. Additionally, we perform an Exploratory Data Analysis (EDA), which investigates the dataset from different perspectives. 
The results of the EDA reveal important information that greatly benefits researchers in this area.","PeriodicalId":36824,"journal":{"name":"Data","volume":"56 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135730705","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cybersecurity Risk Assessments within Critical Infrastructure Social Networks","authors":"Alimbubi Aktayeva, Yerkhan Makatov, Akku Kubigenova Tulegenovna, Aibek Dautov, Rozamgul Niyazova, Maxud Zhamankarin, Sergey Khan","doi":"10.3390/data8100156","DOIUrl":"https://doi.org/10.3390/data8100156","url":null,"abstract":"Cybersecurity social networking is a new scientific and engineering discipline that was interdisciplinary in its early days, but is now transdisciplinary. Reviewing and analyzing the principal tasks related to information collection, monitoring of social networks, assessment methods, and preventing and combating cybersecurity threats is, therefore, essential and pending. There is a need to design certain methods, models, and program complexes aimed at estimating risks related to the cyberspace of social networks and the support of their activities. This study considers a risk to be the combination of the consequences of a given event (or incident) with its likelihood of occurrence, while risk assessment is the overall process of identification, estimation, and evaluation of risk. The findings of the study show that the technique of cognitive modeling for risk assessment is part of a comprehensive cybersecurity approach included in the requirements of basic IT standards, including IT security risk management. The study presents a comprehensive approach in the field of cybersecurity in social networks that allows for consideration of all the elements that constitute cybersecurity as a complex, interconnected system. 
The ultimate goal of this approach to cybersecurity is the organization of an uninterrupted scheme of protection against any impacts related to physical, hardware, software, network, and human objects or resources of the critical infrastructure of social networks, as well as the integration of various levels and means of protection.","PeriodicalId":36824,"journal":{"name":"Data","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135666660","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Dataset of Non-Indigenous and Native Fish of the Volga and Kama Rivers (European Russia)","authors":"Dmitry P. Karabanov, Dmitry D. Pavlov, Yury Y. Dgebuadze, Mikhail I. Bazarov, Elena A. Borovikova, Yuriy V. Gerasimov, Yulia V. Kodukhova, Pavel B. Mikheev, Eduard V. Nikitin, Tatyana L. Opaleva, Yuri A. Severov, Rimma Z. Sabitova, Alexey K. Smirnov, Yury I. Solomatin, Igor A. Stolbunov, Alexander I. Tsvetkov, Stanislav A. Vlasenko, Irina S. Voroshilova, Wenjun Zhong, Xiaowei Zhang, Alexey A. Kotov","doi":"10.3390/data8100154","DOIUrl":"https://doi.org/10.3390/data8100154","url":null,"abstract":"Fish in the Volga-Kama River System (the largest river system in Europe) are important as a crucial food source for local populations; fish have the highest trophic level among hydrobionts. The purpose of this research is to describe the diversity of non-indigenous and native fish in the Volga and Kama Rivers, in the European part of Russia. This dataset encompasses data from June 2001 to September 2021 and comprises 1888 records (36,376 individual observations) for littoral and pelagic habitats from 143 sampling sites, representing 52 species from 42 genera in 22 families. The dataset follows the Darwin Core standard format and has been fully released in the Global Biodiversity Information Facility (GBIF) under a CC-BY 4.0 International license. The data are validated against several international databases such as FishBase, Eschmeyer’s Catalog of Fishes, the Barcode of Life Data System, and the SAS.Planet geoinformation system. Newly established populations have been found for several species belonging to the following Actinopteri families: Alosidae, Anguillidae, Cichlidae, Ehiravidae, Gobiidae, Odontobutidae, Syngnathidae, and Xenocyprididae. 
Therefore, this dataset can be used for species distribution analyses of particular taxa, which are especially important for non-indigenous species.","PeriodicalId":36824,"journal":{"name":"Data","volume":"70 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135823911","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"USC-DCT: A Collection of Diverse Classification Tasks","authors":"Adam M. Jones, Gozde Sahin, Zachary W. Murdock, Yunhao Ge, Ao Xu, Yuecheng Li, Di Wu, Shuo Ni, Po-Hsuan Huang, Kiran Lekkala, Laurent Itti","doi":"10.3390/data8100153","DOIUrl":"https://doi.org/10.3390/data8100153","url":null,"abstract":"Machine learning is a crucial tool for both academic and real-world applications. Classification problems are often used as the preferred showcase in this space, which has led to a wide variety of datasets being collected and utilized for a myriad of applications. Unfortunately, there is very little standardization in how these datasets are collected, processed, and disseminated. As new learning paradigms like lifelong or meta-learning become more popular, the demand for merging tasks for at-scale evaluation of algorithms has also increased. This paper provides a methodology for processing and cleaning datasets that can be applied to existing or new classification tasks as well as implements these practices in a collection of diverse classification tasks called USC-DCT. Constructed using 107 classification tasks collected from the internet, this collection provides a transparent and standardized pipeline that can be useful for many different applications and frameworks. While there are currently 107 tasks, USC-DCT is designed to enable future growth. 
Additional discussion provides explanations of applications in machine learning paradigms such as transfer, lifelong, or meta-learning, how revisions to the collection will be handled, and further tips for curating and using classification tasks at this scale.","PeriodicalId":36824,"journal":{"name":"Data","volume":"120 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136013300","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dataset of Contamination (2009–2022) Legacy Contaminants (PCB and DDT) in Zooplankton of Lake Maggiore (CIPAIS, International Commission for the Protection of Italian-Swiss Waters)","authors":"Roberta Bettinetti, Roberta Piscia, Marina Manca, Silvana Galassi, Silvia Quadroni, Carlo Dossi, Rossella Perna, Emanuela Boggio, Ginevra Boldrocchi, Michela Mazzoni, Benedetta Villa","doi":"10.3390/data8100152","DOIUrl":"https://doi.org/10.3390/data8100152","url":null,"abstract":"In this paper, we describe a 13-year (2009–2022) dataset of legacy POP concentrations (DDTtot and sumPCB14; from 2016 onward, the concentrations of individual isomers and congeners are also reported) in the planktonic crustaceans of Lake Maggiore (≥450 µm size fraction). The data were collected in the framework of a monitoring program aimed at assessing the presence of pollutants in the lake biota, including zooplankton organisms directly preyed upon by fish. The data report both the concentrations of DDTtot and sumPCB14 in the zooplankton and the standing stock density and biomass of the population in each season. The dataset allows for detecting changes in the concentration over the long term and within a year, thus providing evidence for the seasonal and multi-year variations in the presence of these pollutants in the lake. 
They also provide a basis for further studies aimed at modeling paths and the fate of persistent organic pollutants, for which the amount of toxicants stocked in the zooplankton compartment linked to fish is a crucial estimate.","PeriodicalId":36824,"journal":{"name":"Data","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136012934","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Tracking a Decade of Hydrogeological Emergencies in Italian Municipalities","authors":"Alessio Gatto, Stefano Clò, Federico Martellozzo, Samuele Segoni","doi":"10.3390/data8100151","DOIUrl":"https://doi.org/10.3390/data8100151","url":null,"abstract":"This dataset collects tabular and geographical information about all hydrogeological disasters (landslides and floods) that occurred in Italy from 2013 to 2022 that caused such severe impacts as to require the declaration of national-level emergencies. The severity and spatiotemporal extension of each emergency are characterized in terms of duration and timing, funds requested by local administrations, funds approved by the national government, and municipalities and provinces hit by the event (further subdivided between those included in the emergency and those not, depending on whether relevant impacts were ascertained). Italian exposure to hydrogeological risk is portrayed strikingly: in the covered period, 123 emergencies affected Italy, all regions were struck at least once, and some provinces were struck more than 10 times. Damage declared by local institutions adds up to EUR 11,000,000,000, while national recovery funds add up to EUR 1,000,000,000. The dataset may foster further research on risk assessment, econometric analysis, public policy support, and decision-making implementation. 
Moreover, it provides systematic evidence helpful in raising awareness about hydrogeological risks affecting Italy.","PeriodicalId":36824,"journal":{"name":"Data","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136057759","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Power-Flow Simulations for Integrating Renewable Distributed Generation from Biogas, Photovoltaic, and Small Wind Sources on an Underground Distribution Feeder","authors":"Welson Bassi, Igor Cordeiro, Ildo Luis Sauer","doi":"10.3390/data8100150","DOIUrl":"https://doi.org/10.3390/data8100150","url":null,"abstract":"The rapid expansion of distributed generation leads to the integration of an increasing number of energy generation sources. However, integrating these sources into electrical distribution networks presents specific challenges to ensure that the distribution networks can effectively accommodate the associated distributed energy and power. Thus, it is crucial to evaluate the electrical effects of power along the conductors, components, and loads. Power-flow analysis is a well-established numerical methodology for assessing parameters and quantities within power systems during steady-state operation. The University of São Paulo’s Cidade Universitária “Armando de Salles Oliveira” (CUASO) campus in São Paulo, Brazil, features an underground power distribution system. The Institute of Energy and Environment (IEE) leads the integration of several distributed generation (DG) sources, including a biogas plant, photovoltaic installations, and a small wind turbine, into one of CUASO’s feeders, referred to as “USP-105”. Load-flow simulations were conducted using the PowerWorld™ Simulator v.23, considering the interconnection of these sources. This dataset provides comprehensive information and computational files utilized in the simulations. 
It serves as a valuable resource for reanalysis, didactic purposes, and the dissemination of technical insights related to DG implementation.","PeriodicalId":36824,"journal":{"name":"Data","volume":"70 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135252246","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}