Data in Brief | Pub Date: 2025-09-16 | DOI: 10.1016/j.dib.2025.112057
Rafael Orlando Uceta-Acosta, Deyslen Mariano-Hernandez, Yeulis Rivas-Peña, Víctor S. Ocaña-Guevara, Miguel Aybar-Mejía, Máximo A. Domínguez-Garabitos
{"title":"A dataset of exogenous variables and historical electricity demand for short-term load forecasting of the national interconnected electric system (SENI) in the Dominican Republic from 2021 to 2024","authors":"Rafael Orlando Uceta-Acosta , Deyslen Mariano-Hernandez , Yeulis Rivas-Peña , Víctor S. Ocaña-Guevara , Miguel Aybar-Mejía , Máximo A. Domínguez-Garabitos","doi":"10.1016/j.dib.2025.112057","DOIUrl":"10.1016/j.dib.2025.112057","url":null,"abstract":"<div><div>This dataset contains historical records of electricity demand in the Dominican Republic from January 2021 to December 2024, with hourly resolution. It was compiled to support short-term load forecasting of the National Interconnected Electric System (SENI). The dataset includes the total system demand in megawatts (MW), along with a set of exogenous variables commonly used in forecasting models. These variables include weather data retrieved from Open-Meteo (such as temperature and humidity), time-lagged demand features, and calendar-based indicators (e.g., weekends, holidays, month, hour). All data were collected from open sources, including the official website of the electricity market and system operator, the Organismo Coordinador (OC), as well as public meteorological APIs.</div><div>The dataset is structured and cleaned to be directly usable for time series modeling applications. It can be reused by researchers, utility planners, and data scientists for benchmarking forecasting models, developing predictive tools, or supporting energy planning tasks in tropical, developing power systems. 
The data is provided in CSV format.</div></div>","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"63 ","pages":"Article 112057"},"PeriodicalIF":1.4,"publicationDate":"2025-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145157312","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Data in Brief | Pub Date: 2025-09-15 | DOI: 10.1016/j.dib.2025.112061
Silvia Cavagnoli, Claudia Fabiani, Chiara Chiatti, Anna Laura Pisello
{"title":"Surface roughness data of microsphere-based coatings developed to counter the urban heat island phenomenon","authors":"Silvia Cavagnoli , Claudia Fabiani , Chiara Chiatti , Anna Laura Pisello","doi":"10.1016/j.dib.2025.112061","DOIUrl":"10.1016/j.dib.2025.112061","url":null,"abstract":"<div><div>This work presents a dataset derived from the characterization and surface roughness analysis of coatings developed for the mitigation of the urban heat island phenomenon. In detail, these coatings were produced using a black-painted aluminium layer with the surface application of ceramic microspheres and chrome steel grit, both in three different sizes, i.e. a small, a medium, and a large one. The aim of the dataset is to demonstrate how the different types of applied material influence the surface topography and affect roughness in comparison to a reference sample. The analysis was carried out with the NANOVEA Jr25 profilometer, which allowed the collection of useful data for the evaluation of the main surface parameters and the precise measurement of the height and distribution of micro-irregularities. The adopted approach ensured the accurate representation of the surface microstructure, and the dataset obtained was processed with the Mountains8 software to obtain height and hybrid parameters. A detailed analysis of the dataset shows that the samples with ceramic microspheres have a lower roughness than those with steel grit. 
This reduction in roughness may contribute to improved optical and radiative properties of the surfaces, making ceramic microspheres suitable for radiative cooling applications and mitigation of the urban heat island effect.</div></div>","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"63 ","pages":"Article 112061"},"PeriodicalIF":1.4,"publicationDate":"2025-09-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145218687","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Money plant leaf (Epipremnum aureum): A comprehensive study of raw datasets with manual classification","authors":"Jameer kotwal , Aarju Jain , Ramgopal Kashyap , Anil Pise , Pramod Patil , Vinod Kimbahune","doi":"10.1016/j.dib.2025.112066","DOIUrl":"10.1016/j.dib.2025.112066","url":null,"abstract":"<div><div>Money plants are widely recognized for their significant spiritual and air-purifying benefits. Research has proven that daily interaction with these vibrant indoor plants effectively reduces anxiety and stress. This paper introduces a robust dataset of <strong>4302 healthy, unhealthy, combined, real, and college-premises</strong> images of money plants captured using smartphones. The dataset was collected at the educational hub, Dr. D. Y. Patil Institute of Technology, Pune campus, Maharashtra, India. Images were taken with a mobile device under controlled conditions to ensure consistency and quality, and were captured from different angles and against different backgrounds. The dataset was created to support researchers in the agricultural field and may be used for further research, investigation, and the training of artificial intelligence models.</div></div>","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"63 ","pages":"Article 112066"},"PeriodicalIF":1.4,"publicationDate":"2025-09-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145263599","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Data in Brief | Pub Date: 2025-09-15 | eCollection Date: 2025-10-01 | DOI: 10.1016/j.dib.2025.112055
Bruno Barbosa, Sandra Oliveira, Jorge Rocha
{"title":"Semi-automated multi-criteria filtering of building footprints for enhanced Wildland-Urban Interface mapping in mainland Portugal.","authors":"Bruno Barbosa, Sandra Oliveira, Jorge Rocha","doi":"10.1016/j.dib.2025.112055","DOIUrl":"10.1016/j.dib.2025.112055","url":null,"abstract":"<p><p>The expansion of the Wildland-Urban Interface (WUI) demands precise mapping to effectively mitigate wildfire risk. However, the absence of national building footprint databases presents a significant challenge. This study, focused on mainland Portugal, proposes a semi-automated, multi-criteria filtering framework to refine global open-source building datasets, specifically Microsoft's Global Building Footprints. The method integrates regional adaptability and spatial metrics such as area thresholds and proximity analyses, using Portugal's official Geographic Buildings Location Database as a reference. The framework prioritizes residential structures by excluding anomalies (such as industrial facilities, photovoltaic arrays, and transmission lines) through dynamically adjusted thresholds at various administrative levels (e.g., municipal and NUTS-2). The filtering process reduced the number of building footprints from approximately 5.6 million to around 3.0 million. We mapped the WUI across Portugal using both the original dataset (WUI_MSB) and the filtered dataset (WUI_MSB_F) to compare outcomes. The WUI was classified into <i>Intermix</i> and <i>Interface</i> types. Buildings that did not meet the minimum criteria to be considered part of the WUI were categorized based on their density: very low, low, medium, or high. The original WUI_MSB covered a total area of 13,177 km², representing approximately 15% of mainland Portugal. After applying the filtering framework, the WUI_MSB_F area was reduced by 49%, totaling 8,327 km². The workflow, implemented using Python scripting and ArcGIS Pro, is scalable for national-level applications. 
These experimental results highlight the importance of region-specific adjustments and demonstrate how this methodology can support policymakers in identifying and prioritizing context-specific exposed communities. By enhancing the reliability of open datasets, this approach offers a reproducible tool for wildfire resilience planning, particularly in data-scarce regions.</p>","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"62 ","pages":"112055"},"PeriodicalIF":1.4,"publicationDate":"2025-09-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12477925/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145198263","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Data in Brief | Pub Date: 2025-09-15 | eCollection Date: 2025-10-01 | DOI: 10.1016/j.dib.2025.112065
Najmul Alam, M A Rahman, Md Rashidul Islam
{"title":"DynaLiRD: A dataset for dynamic line rating of overhead transmission lines, utilizing meteorological data and grid parameters based on the IEEE 738-2012 standard.","authors":"Najmul Alam, M A Rahman, Md Rashidul Islam","doi":"10.1016/j.dib.2025.112065","DOIUrl":"https://doi.org/10.1016/j.dib.2025.112065","url":null,"abstract":"<p><p>This article presents DynaLiRD, a comprehensive dataset for dynamic line rating (DLR) of the Trang-Thap Cham 220 kV overhead transmission line. The DLR values are computed using the IEEE 738-2012 standard based on historical meteorological data such as ambient temperature, wind speed and direction, and global horizontal irradiance as well as detailed line parameters including conductor type, diameter, length, and elevation. To enhance the dataset's applicability in cybersecurity and machine learning research, adversarially perturbed data is included using the fast gradient sign method (FGSM) and basic iterative method (BIM) under varying perturbation intensities. This dataset is essential for DLR estimation, dynamic thermal rating (DTR) forecasting, renewable energy integration into the grid, machine learning (ML) applications, infrastructure planning, energy policy development, and cybersecurity vulnerability investigation. 
Its structured format and inclusion of both clean and adversarial data make it valuable for evaluating the resilience of data-driven energy systems.</p>","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"62 ","pages":"112065"},"PeriodicalIF":1.4,"publicationDate":"2025-09-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12478051/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145198786","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Data in Brief | Pub Date: 2025-09-13 | DOI: 10.1016/j.dib.2025.112063
John Barco-Jiménez, Daniel Rosero, Andrés Zambrano, Francisco Eraso-Checa, Miller Ruales, José Camilo Eraso
{"title":"Irradiance dataset in the south of Colombia from 2013 to 2023 in 5-minutes intervals","authors":"John Barco-Jiménez , Daniel Rosero , Andrés Zambrano , Francisco Eraso-Checa , Miller Ruales , José Camilo Eraso","doi":"10.1016/j.dib.2025.112063","DOIUrl":"10.1016/j.dib.2025.112063","url":null,"abstract":"<div><div>This article presents an extensive irradiance dataset collected in San Juan de Pasto, located in southern Colombia, using a Davis Vantage PRO 2 meteorological station. The dataset spans 11 years, covering the period from 2013 to 2023, with measurements taken at 5-minute intervals, resulting in approximately 603,495 irradiance records, each accompanied by a corresponding timestamp.</div><div>The construction of the dataset required a rigorous preprocessing stage. This stage included the removal of erroneous values (NaN) and outliers, the identification of missing entries, and the correction of inconsistencies in the date records. Missing values were addressed through gap-filling procedures based on averaged data, complemented by visual inspections using graphical representations. The cleaned dataset was exported after ensuring data integrity, accuracy, and consistency, which are essential for reliable analysis and subsequent modeling.</div><div>This dataset is valuable for building training datasets used as input for artificial intelligence models to perform short-, medium-, and long-term irradiance forecasting. For instance, Barco-Jiménez et al. (2021) utilized a portion of this dataset to develop multitemporal irradiance predictions. These predictive models can be applied in various domains, including energy management, grid optimization, and solar energy production planning. 
Furthermore, the dataset supports statistical analyses that provide insights for appropriately sizing photovoltaic systems through indicators such as Hours of Peak Sunlight (HPS), maximum and minimum irradiance values, average daily and monthly irradiance, and seasonal trends. These indicators play a fundamental role in the optimization of photovoltaic system performance, contributing to cost reduction and enhancing energy efficiency across rural, residential, and commercial applications.</div><div>This dataset supports photovoltaic system design and studies on solar energy variability and climate patterns in the region. Analysis of irradiance fluctuations over time provides insights into the influence of atmospheric conditions on solar energy availability. This information is essential for enhancing the reliability of solar power systems and effectively integrating renewable energy sources into existing power grids. The dataset can also be used in educational settings to teach data analysis techniques and renewable energy concepts, providing students and researchers with a practical resource for hands-on learning.</div></div>","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"63 ","pages":"Article 112063"},"PeriodicalIF":1.4,"publicationDate":"2025-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145263600","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Data in Brief | Pub Date: 2025-09-13 | DOI: 10.1016/j.dib.2025.112053
John Barco-Jiménez, Sixto Campaña, Álvaro Cervelión, Harold Cabrera, Carlos Tobar, Roberto Jaramillo, Andrés Diaz, Abel Méndez Porras
{"title":"Children's emotions dataset: Facial images as action units and valence scores","authors":"John Barco-Jiménez , Sixto Campaña , Álvaro Cervelión , Harold Cabrera , Carlos Tobar , Roberto Jaramillo , Andrés Diaz , Abel Méndez Porras","doi":"10.1016/j.dib.2025.112053","DOIUrl":"10.1016/j.dib.2025.112053","url":null,"abstract":"<div><div>The paper presents a dataset of emotions from children between 10 and 12 years old. This dataset was obtained from videos that are represented in time series of facial Action Units (AUs), and their corresponding valences were scored by professionals. The AUs are extracted from the videos using the <em>Deepface</em> library, and the valence series are obtained from expert observers who rate each video on a range from -1 to 1, covering the spectrum of negative to positive emotions. The dataset was evaluated by a total of 20 professional experts, comprising psychologists and psychology practitioners, with each video receiving an average of 10 reviews. The analysis encompassed a total of 57 videos, representing 22 students, culminating in the acquisition of a comprehensive set comprising 50 temporal series of action units and their associated weighted valence scores. This dataset is useful for training machine learning models in the process of identifying emotions to determine possible patterns of behaviour in classrooms. These patterns may reveal problematic academic attitudes or situations, or, conversely, the early identification of positive emotions that can empower leading students. 
In addition, it can assist education professionals in undertaking self-evaluations of their formative processes, with a focus on the emotions or attention exhibited by their students within the classroom environment during lessons.</div></div>","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"63 ","pages":"Article 112053"},"PeriodicalIF":1.4,"publicationDate":"2025-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145128356","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Data in Brief | Pub Date: 2025-09-13 | DOI: 10.1016/j.dib.2025.112062
Radek Hranický, Ondřej Ondryáš, Adam Horák, Petr Pouč, Kamil Jeřábek, Tomáš Ebert, Jan Polišenský
{"title":"A multi-dimensional DNS domain intelligence dataset for cybersecurity research","authors":"Radek Hranický, Ondřej Ondryáš, Adam Horák, Petr Pouč, Kamil Jeřábek, Tomáš Ebert, Jan Polišenský","doi":"10.1016/j.dib.2025.112062","DOIUrl":"10.1016/j.dib.2025.112062","url":null,"abstract":"<div><div>The escalating sophistication and frequency of cyber threats require advanced solutions in cybersecurity research. Particularly, phishing and malware detection have become increasingly reliant on data-driven approaches. This paper presents a unique dataset precisely curated to bolster research in network security, focusing on the classification and analysis of internet domains. This dataset contains information for over a million internet domains with detailed labels distinguishing between phishing, malware, and benign traffic.</div><div>Our dataset is distinctive due to its comprehensive compilation of metainformation derived from multiple sources, including DNS records, TLS handshakes and certificates, WHOIS and RDAP services, IP-related data, and geolocation details. Such rich, multi-dimensional data allows for a deeper analysis and understanding of domain characteristics that are critical in identifying and categorizing cyber threats. The integration of information from diverse sources enhances the dataset's utility, providing a holistic view of each domain's footprint and its potential security implications.</div><div>The data is formatted in JSON, ensuring versatility, accessibility for researchers, and easy integration into various analytical tools and platforms, facilitating ease of use in statistical analysis, machine learning, and other computational analyses. 
Our dataset's extensive volume and variety surpass any known publicly available resources in this field, making it an invaluable asset for both academic and practical development and testing of cybersecurity solutions.</div><div>This paper thoroughly describes the value of the data, details the comprehensive methodology employed in the collection process, and provides a clear description of the data structure. Such documentation is crucial for ensuring that the dataset can be effectively utilized and reapplied in a variety of research contexts. Its structured format and the broad range of included features are critical for developing robust cybersecurity solutions and can be adapted for emerging threats.</div></div>","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"62 ","pages":"Article 112062"},"PeriodicalIF":1.4,"publicationDate":"2025-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145117615","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Data in Brief | Pub Date: 2025-09-13 | eCollection Date: 2025-10-01 | DOI: 10.1016/j.dib.2025.112054
Ying Han, Sina Schriever, Philipp von Hartrott, Christian Rockenhäuser, Birgit Skrotzki
{"title":"Creep dataset for aluminum alloy EN AW-2618A in the T61 and overaged condition 1000 h/190 °C.","authors":"Ying Han, Sina Schriever, Philipp von Hartrott, Christian Rockenhäuser, Birgit Skrotzki","doi":"10.1016/j.dib.2025.112054","DOIUrl":"10.1016/j.dib.2025.112054","url":null,"abstract":"<p><p>The article presents creep data for the precipitation-hardened aluminum alloy EN AW-2618A, which was tested in its initial slightly underaged T61 condition and after being aged at 190 °C for 1000 h. The creep test temperatures ranged between 160 °C and 230 °C, and the applied initial stresses between 40 MPa and 290 MPa. The testing times reached up to 4700 h. 19 data sets are provided as creep strain vs. time series. A data schema originally developed to manage creep reference data of a single-crystalline nickel-based superalloy was adapted for the research data set of the polycrystalline aluminum alloy and used to enrich the test data with extensive metadata. The dataset can be used to calibrate and validate creep models. The creep data supplement the hardness and microstructure data that have been previously published.</p>","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"62 ","pages":"112054"},"PeriodicalIF":1.4,"publicationDate":"2025-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12477942/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145198833","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Data in Brief | Pub Date: 2025-09-12 | eCollection Date: 2025-10-01 | DOI: 10.1016/j.dib.2025.112067
Mouad Bensalah, Abdellatif Hair, Reda Rabie, Hatim Derrouz
{"title":"High-resolution smart meter load dataset collected from multiple cities in Morocco.","authors":"Mouad Bensalah, Abdellatif Hair, Reda Rabie, Hatim Derrouz","doi":"10.1016/j.dib.2025.112067","DOIUrl":"https://doi.org/10.1016/j.dib.2025.112067","url":null,"abstract":"<p><p>This dataset provides high-resolution electricity consumption data collected via smart meters installed across multiple Moroccan cities, with a special focus on 22 kV distribution transformers. The smart meters used in this study store electrical parameters, and the data were retrieved from residential and industrial areas in Laayoune, Boujdour, Marrakech, and Foum Eloued. The dataset, recorded at a 10-minute (30-minute) frequency in Marrakech, provides a granular foundation for precise load forecasting models tailored to the characteristics of each region. Both raw and processed versions of the dataset are available, making it a valuable resource for experts and researchers in energy management, load forecasting, and smart grid optimization. The data can be used to study consumption patterns, detect anomalies, and develop prediction models for optimal power distribution and grid stability. 
Visual representations of trends in electricity consumption by geographic area and time period are also provided to facilitate ease of interpretation and further research use.</p>","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"62 ","pages":"112067"},"PeriodicalIF":1.4,"publicationDate":"2025-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12495070/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145231749","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}