{"title":"Dataset on energy consumption in buildings within tropical climate based on design aspects of courtyards","authors":"Abdulbasit Almhafdy , Ashjan Al-Mutairi , Asma Al-Shargabi , Amal Al-Shargabi","doi":"10.1016/j.dib.2025.111834","DOIUrl":"10.1016/j.dib.2025.111834","url":null,"abstract":"<div><div>Sustainability and energy efficiency have become fundamental objectives for modern society. Green roofs and facades are increasingly recognized as innovative and sustainable strategies to improve the energy performance of buildings. This paper introduces a dataset on the thermal performance and energy consumption of buildings in a tropical climate as a function of the design features of adjacent enclosed outdoor courtyards with different architectural shapes (U, L, and O). The core data were collected in a public building in Kuala Lumpur, Malaysia, and then expanded using simulation. The only measured raw variable is temperature; the other data are simulated and/or calculated. The dataset includes detailed design features of courtyards such as plan aspect ratio, number of floors, and orientation. Measurement instruments were calibrated against real-world measurements to ensure accuracy and reliability. The simulated data were tested and validated against the statistical properties of the raw data using the Pearson correlation coefficient, with a value of 0.882. The dataset includes a total of 8,685 records across the different courtyard shapes. This dataset captures intricate relationships between architectural design parameters and energy consumption, making it a valuable resource for architects, engineers, and researchers interested in optimizing building designs for improved energy efficiency. 
It also allows in-depth analysis and potential reuse in studies related to sustainable architecture and urban planning.</div></div>","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"61 ","pages":"Article 111834"},"PeriodicalIF":1.0,"publicationDate":"2025-06-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144490683","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
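The validation step in the abstract above compares simulated values with measured ones through the Pearson correlation coefficient. A minimal sketch of that statistic follows, assuming two equal-length lists of paired readings; the function name `pearson_r` and the sample values are hypothetical and are not taken from the dataset.

```python
import math


def pearson_r(measured, simulated):
    """Pearson correlation coefficient between two paired samples."""
    if len(measured) != len(simulated) or len(measured) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    n = len(measured)
    mean_m = sum(measured) / n
    mean_s = sum(simulated) / n
    # Sum of cross-deviations (numerator) and sums of squared deviations.
    cross = sum((m - mean_m) * (s - mean_s) for m, s in zip(measured, simulated))
    ss_m = sum((m - mean_m) ** 2 for m in measured)
    ss_s = sum((s - mean_s) ** 2 for s in simulated)
    if ss_m == 0 or ss_s == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cross / math.sqrt(ss_m * ss_s)


# Hypothetical temperature readings in degrees Celsius, for illustration only.
measured_temps = [28.1, 29.4, 31.0, 32.2, 30.5]
simulated_temps = [27.8, 29.9, 30.6, 32.8, 30.1]
r = pearson_r(measured_temps, simulated_temps)
```

A value near 1 indicates that the simulated series tracks the measured one closely; the abstract reports 0.882 for the published data.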
Data in Brief | Pub Date: 2025-06-24 | DOI: 10.1016/j.dib.2025.111828
Diego Miranda , Carlos Escobedo , Dayana Palma , Rene Noel , Adrián Fernández , Cristian Cechinel , Jaime Godoy , Roberto Munoz
{"title":"A multimodal experimental dataset on agile software development team interactions","authors":"Diego Miranda , Carlos Escobedo , Dayana Palma , Rene Noel , Adrián Fernández , Cristian Cechinel , Jaime Godoy , Roberto Munoz","doi":"10.1016/j.dib.2025.111828","DOIUrl":"10.1016/j.dib.2025.111828","url":null,"abstract":"<div><div>Studying collaborative dynamics in agile development teams requires multimodal data that captures verbal and non-verbal communication. However, few experimental datasets provide this level of depth in real or simulated teamwork contexts. This article presents a multimodal dataset with experimental data collected during controlled sessions involving simulated agile development teams, each composed of four computer science students. A total of 19 groups (76 different participants) were organized, each participating in two collaborative activities: one without a coordination technique and another using the Planning Poker method. Three of these teams were designated as control groups. The resulting dataset includes audio recordings of verbal interactions and non-verbal behaviour data, such as body posture, facial expressions, visual attention, and gestures, captured using MediaPipe, YOLOv8, and DeepSort. It also contains time-aligned automatic transcriptions generated with WhisperX, attention logs, mimicry labels, and surveys on perceived equity in interactions. This resource aims to provide a comprehensive view of collaborative behaviour in agile contexts, supporting both qualitative analysis of interactions and the development of predictive models of group performance. The dataset explores how shared visual attention and behavioural synchrony influence team effectiveness and decision-making through this multimodal approach. 
This work contributes a unique dataset valuable to researchers across multiple fields of study.</div></div>","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"61 ","pages":"Article 111828"},"PeriodicalIF":1.0,"publicationDate":"2025-06-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144501818","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Data in Brief | Pub Date: 2025-06-24 | DOI: 10.1016/j.dib.2025.111825
Pushpa B․ R․ , Manohar N․ , N. Shobha Rani
{"title":"StarNet: Indian star gooseberries dataset for quality and maturity assessment","authors":"Pushpa B․ R․ , Manohar N․ , N. Shobha Rani","doi":"10.1016/j.dib.2025.111825","DOIUrl":"10.1016/j.dib.2025.111825","url":null,"abstract":"<div><div>Star gooseberry provides immense health benefits and is widely recognized in the Indian medicinal system. It holds significant importance in the food production, pharmaceuticals, and cosmetics industries owing to its therapeutic and pharmacological properties. Due to its beneficial properties, gooseberry fruit is widely used in treating various ailments. Therefore, cultivating these fruits presents an opportunity to generate revenue, benefiting both farmers and the agricultural sector. Post-harvest quality assessment typically segregates fruits by visual characteristics, a manual process that is tedious and prone to human error. Hence, there is a need to develop an automated computer vision model to assess the fruit quality more accurately. This study focuses on dataset collection, including image samples of both single and multiple star gooseberry fruits to automate fruit grading. This dataset has been specifically developed for research purposes, contributing to fruit detection, quality assessment, weight estimation, and classification of fruits at various ripeness stages. Further, it provides researchers with an opportunity to develop an automated system for detecting overlapping fruits and touching contours using machine learning, deep learning, and computer vision systems. Image samples of star gooseberry at different growth stages were collected from orchards in Mysuru, India. The dataset, named “AmlaNet”, comprises 792 image samples of star gooseberry, captured against a plain background from varying angles, sizes, brightness levels, and distances. 
The dataset is organized into four folders: single star gooseberry fruit, multiple fruits, overlapped fruits, and annotated samples of overlapped star gooseberry fruits, including fruit samples at different ripeness stages. This publicly accessible dataset is expected to benefit the research community, enabling advancement in computer vision and AI applications. It can be accessed at DOI: <span><span>10.17632/2255bdy9mm.1</span></span></div></div>","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"61 ","pages":"Article 111825"},"PeriodicalIF":1.0,"publicationDate":"2025-06-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144534812","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Data in Brief | Pub Date: 2025-06-23 | DOI: 10.1016/j.dib.2025.111821
Nurul Syahidah Mio Asni , Norazlan Mohmad Misnan , Ahmed Mediani , Ivana Nur Allisya Rozlan , Nurul Amalia Zahari , Syarul Nataqain Baharum , Nurkhalida Kamal
{"title":"Mass spectrometry dataset of conventional and organic tempe before and after in vitro digestion","authors":"Nurul Syahidah Mio Asni , Norazlan Mohmad Misnan , Ahmed Mediani , Ivana Nur Allisya Rozlan , Nurul Amalia Zahari , Syarul Nataqain Baharum , Nurkhalida Kamal","doi":"10.1016/j.dib.2025.111821","DOIUrl":"10.1016/j.dib.2025.111821","url":null,"abstract":"<div><div>Tempe is a superior plant-based protein source that provides a diverse array of nutritional benefits as a result of the presence of bioactive metabolites. Nevertheless, there is a scarcity of information comparing the metabolomic profiles of organic and conventional tempe and on the fate of these metabolites after <em>in vitro</em> digestion. This report examines the metabolomic profile of soybean as raw material and of tempe before and after the <em>in vitro</em> digestion process. We obtained a comprehensive set of metabolomic data using ultra-high-performance liquid chromatography coupled with high-resolution mass spectrometry (UHPLC-HRMS). The metabolomics dataset is organized into Excel sheets and structured according to polarity, mass-to-charge ratio (m/z), retention time, feature name, biological replicates, and controls. These data offer preliminary insights into the metabolite profile of tempe samples, encompassing the source material (soybean), tempe, and tempe digesta.</div></div>","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"61 ","pages":"Article 111821"},"PeriodicalIF":1.0,"publicationDate":"2025-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144490277","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Complete genome sequences of Vibrio parahaemolyticus strains L2171 and L2181 associated with AHPND in Penaeus vannamei postlarvae by hybrid sequencing","authors":"Guillermo Reyes , Betsy Andrade , Irma Betancourt , Bonny Bayot","doi":"10.1016/j.dib.2025.111819","DOIUrl":"10.1016/j.dib.2025.111819","url":null,"abstract":"<div><div><em>Vibrio parahaemolyticus</em> strains L2171 and L2181 were isolated from a <em>Penaeus vannamei</em> shrimp hatchery. Both strains carry the pVA plasmid harboring the <em>PirAB</em> genes encoding the binary PirAB toxins that cause acute hepatopancreatic necrosis disease (AHPND) in cultured shrimp. The strains also harbor multidrug resistance (MDR) and a repertoire of virulence factor genes. Our goal was to determine their complete genome sequences and perform a comprehensive analysis of their genetic characteristics. Therefore, the genomes of the two strains, which are highly virulent to shrimp, were sequenced on the Illumina and PacBio platforms. These data contribute to a better understanding of <em>V. parahaemolyticus</em> and its role as a pathogen in commercially important species such as farmed shrimp, providing valuable insights for disease management in aquaculture.</div></div>","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"61 ","pages":"Article 111819"},"PeriodicalIF":1.0,"publicationDate":"2025-06-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144501816","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Data in Brief | Pub Date: 2025-06-22 | DOI: 10.1016/j.dib.2025.111818
Jefferson Torres-Quezada , Atila Avila-Argudo
{"title":"Dataset of operational energy of forty Andean buildings from 1980 to 2020","authors":"Jefferson Torres-Quezada , Atila Avila-Argudo","doi":"10.1016/j.dib.2025.111818","DOIUrl":"10.1016/j.dib.2025.111818","url":null,"abstract":"<div><div>This dataset presents operational energy data from forty residential buildings constructed between 1980 and 2020 in Cuenca, a city in the Andean region of Ecuador. It includes energy consumption data related to heating, cooling, lighting, electrical appliances, domestic hot water and cooking. Ten sample houses from each decade were selected, representing typical construction practices of their respective periods. The study follows three main stages: (1) Analysis of operational energy consumption, showcasing the evolution of energy use across four decades; (2) Simulation and validation, where energy simulations and calculations are performed for each sample, followed by a validation process using <em>in-situ</em> measurements compared with simulated results; and (3) Data curation, where climate data is compiled and updated for further analysis. This dataset includes files and figures that enhance comprehension and support further research on energy efficiency, sustainable building design, and energy policy development in regions with moderate climates. 
It also enables comparisons with datasets from other geographic regions, contributing to a broader understanding of energy demand patterns in residential buildings.</div></div>","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"61 ","pages":"Article 111818"},"PeriodicalIF":1.0,"publicationDate":"2025-06-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144490088","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Data in Brief | Pub Date: 2025-06-21 | DOI: 10.1016/j.dib.2025.111823
Marcos Gabriel Mendes Lauande , Geraldo Braz Júnior , João Dallyson Sousa de Almeida , Vandecia Rejane Monteiro Fernandes , Anselmo Cardoso de Paiva , Rui Miguel Gil da Costa , Amanda Mara Teles , Leandro Lima da Silva , Haissa Oliveira Brito , Flávia Castello Branco Vidal
{"title":"PCPAm - A dataset of histopathological images of penile cancer for classification tasks","authors":"Marcos Gabriel Mendes Lauande , Geraldo Braz Júnior , João Dallyson Sousa de Almeida , Vandecia Rejane Monteiro Fernandes , Anselmo Cardoso de Paiva , Rui Miguel Gil da Costa , Amanda Mara Teles , Leandro Lima da Silva , Haissa Oliveira Brito , Flávia Castello Branco Vidal","doi":"10.1016/j.dib.2025.111823","DOIUrl":"10.1016/j.dib.2025.111823","url":null,"abstract":"<div><div>Penile cancer has an incidence strongly linked to sociocultural factors, being more common in underdeveloped countries like Brazil, where it represents approximately 2% of cancers affecting men. This dataset was created to address the scarcity of publicly available resources for classifying histopathological images in penile cancer research. The images were collected in 2021 from tissue samples obtained through biopsies of patients undergoing treatment for penile cancer. After staining with Hematoxylin and Eosin (H&E), the tissue samples were photographed using a Leica ICC50 HD camera attached to a bright-field microscope (Leica DM500). The dataset comprises 194 high-resolution images (2048 × 1536 pixels), categorized by magnification (40X and 100X) and pathological classification (Tumor or Non-Tumor). Metadata includes additional information such as histological grade and, for some images, HPV status. Although previous works have focused primarily on binary classification tasks, the dataset includes additional labels, such as histological grade and HPV (Human Papilloma Virus) presence, which provide opportunities for multi-label classification or other types of predictive modelling. These extended labels enhance the dataset’s versatility for more complex tasks in medical image analysis. 
The dataset holds significant reuse potential for machine learning tasks beyond binary classification, allowing researchers to explore additional layers of analysis, such as HPV detection and histological grading. It can also be used for model benchmarking and comparative studies in cancer research, contributing to developing new diagnostic tools. The dataset and metadata are available for further research and model development.</div></div>","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"61 ","pages":"Article 111823"},"PeriodicalIF":1.0,"publicationDate":"2025-06-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144490679","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Data in Brief | Pub Date: 2025-06-21 | DOI: 10.1016/j.dib.2025.111816
Amandine Cunty , Jessica Dittmer , Déborah Merda , Bruno Legendre , Benoit Remenant , Yannick Blanchard , Sophie Cesbron , Marie-Agnès Jacques , Pascal Gentit , Anne-Laure Boutigny
{"title":"Complete genome sequence data of Xylella fastidiosa subspecies multiplex ST88 and ST89 indicate distinct introductions in France","authors":"Amandine Cunty , Jessica Dittmer , Déborah Merda , Bruno Legendre , Benoit Remenant , Yannick Blanchard , Sophie Cesbron , Marie-Agnès Jacques , Pascal Gentit , Anne-Laure Boutigny","doi":"10.1016/j.dib.2025.111816","DOIUrl":"10.1016/j.dib.2025.111816","url":null,"abstract":"<div><div><em>Xylella fastidiosa</em> is a Gram-negative bacterium native to the Americas and classified as a priority pest under EU regulations. This xylem-limited plant pathogenic bacterium has a wide host range and is transmitted by insect vectors. Since 2013, <em>X. fastidiosa</em> has been identified in several European countries including Italy, France, Spain and Portugal, with different subspecies and sequence types (ST) detected. Since 2015, most strains identified in France are of the subspecies <em>multiplex,</em> specifically ST6 and ST7. Two new STs of <em>X. fastidiosa</em> subsp. <em>multiplex,</em> ST88 and ST89, were recently detected in the region Provence-Alpes-Côte d’Azur (PACA), and one strain of each ST has been isolated from infected plants. To investigate the phylogenetic relationships between the four STs present in France, a complete circular genome and a single-contig genome were assembled for the ST89 and ST88 strains, respectively, by combining PacBio and Illumina sequencing data. A phylogenomic analysis was performed to investigate the phylogenetic position and potential origin of these new strains. This data article contributes to improve our knowledge of the diversity and origin of <em>X. fastidiosa</em> subsp. 
<em>multiplex</em> in France and Europe.</div></div>","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"61 ","pages":"Article 111816"},"PeriodicalIF":1.0,"publicationDate":"2025-06-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144490680","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Data in Brief | Pub Date: 2025-06-21 | DOI: 10.1016/j.dib.2025.111815
Duc-Minh Nguyen, Tri-Nhan Nguyen, Trung-Quan Hoang, Cao Vu Bui
{"title":"ViCoW: A dataset for colorization and restoration of Vietnam War imagery","authors":"Duc-Minh Nguyen, Tri-Nhan Nguyen, Trung-Quan Hoang, Cao Vu Bui","doi":"10.1016/j.dib.2025.111815","DOIUrl":"10.1016/j.dib.2025.111815","url":null,"abstract":"<div><div>This dataset presents a curated collection of 1896 high-resolution image pairs extracted from four historically significant Vietnamese films set during the Vietnam War era. Each pair consists of an original color frame and its corresponding grayscale version, generated using the ITU-R BT.601 luminance formula. Designed to support research in historical image restoration and colorization, the dataset serves as a benchmark for evaluating AI-driven colorization techniques. Frames were systematically extracted at 3 s intervals from well-preserved archival footage, followed by manual selection to ensure visual diversity and contextual relevance. The dataset is organized into training, validation, and test sets, enabling researchers to train and assess deep learning models for restoring and colorizing historical imagery. In addition to addressing the challenges posed by aged film quality, temporal degradation, and complex visual content, this dataset contributes to digital heritage preservation by making grayscale historical visuals more accessible and engaging for modern audiences. 
Potential applications include the development of automated colorization systems, domain adaptation research, and AI-powered video restoration from static images.</div></div>","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"61 ","pages":"Article 111815"},"PeriodicalIF":1.0,"publicationDate":"2025-06-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144501821","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
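The abstract above states that each grayscale frame was generated with the ITU-R BT.601 luminance formula, which weights the red, green, and blue channels as Y = 0.299 R + 0.587 G + 0.114 B. The sketch below applies that weighting to 8-bit RGB pixels held in nested lists. The function names and the tiny sample image are hypothetical, and rounding to the nearest integer is an assumption about how the authors quantized the result.

```python
def bt601_luma(r, g, b):
    """ITU-R BT.601 luma of one RGB pixel (same scale as the inputs)."""
    return 0.299 * r + 0.587 * g + 0.114 * b


def to_grayscale(pixels):
    """Convert rows of (R, G, B) tuples into rows of 8-bit gray levels."""
    return [
        # Round to the nearest integer and clamp to the valid 0..255 range.
        [min(255, max(0, round(bt601_luma(r, g, b)))) for (r, g, b) in row]
        for row in pixels
    ]


# Hypothetical 2x2 image: white, black, pure red, pure green.
sample = [
    [(255, 255, 255), (0, 0, 0)],
    [(255, 0, 0), (0, 255, 0)],
]
gray = to_grayscale(sample)
```

Green contributes most to perceived brightness under this weighting, so a pure green pixel maps to a much lighter gray than a pure red or blue one.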
{"title":"High-frequency hydrodynamic and hydrochemical data from karst unsaturated zone","authors":"Leïla Serène , Guillaume Cinkus , Naomi Mazzilli , Jean-Baptiste Decitre , Christophe Emblanch , Franck Tison , Julien Dupont , Milanka Babic , Roland Simler , Matthieu Blanc","doi":"10.1016/j.dib.2025.111812","DOIUrl":"10.1016/j.dib.2025.111812","url":null,"abstract":"<div><div>This dataset provides continuous hydrological data for 12 unsaturated zone flows of the Fontaine de Vaucluse karst system (France). These flows are caused by faults, fractures, karst conduits, and slow flows from porous rock. This panel of unsaturated zone flows is unique. The dataset includes electrical conductivity, temperature and discharge, obtained through the artificial galleries of the Low Noise Underground Laboratory (LSBB, <span><span>https://lsbb.cnrs.fr</span></span>), which provide a unique window into the hydrological processes occurring in the unsaturated zone of karst aquifers. It also provides air pressure, temperature and humidity inside the galleries and weather parameters at the surface above the galleries (pluviometry, temperature, solar radiation, humidity, wind speed, wind direction, barometric pressure). The data are provided at an hourly time step from July 2022 to April 2024 and correspond to filtered data.</div><div>Discharge is calculated from a homemade flowmeter consisting of a Plexiglas tube that collects the water and has a pressure sensor at the bottom. The pressure is converted into a volume of water using a linear model, and then converted into discharge using the extreme values corresponding to the filling and emptying of the Plexiglas tube. 
All data have been filtered by removing erroneous values such as extremes due to probe malfunction or operator intervention, by checking the accuracy of the continuous measurement with point-in-time monitoring, and by removing values when the flow is inactive.</div><div>This dataset is original in that it monitors various types of unsaturated zone flow, which are rarely monitored due to a lack of speleological access, especially slow flows within the limestone matrix. These data can be used to assess pressure transfer using pluviometry, flow temperature, and discharge, as well as mass transfer using pluviometry and electrical conductivity. Weather data can also be used to improve climate models, as the nearest weather station is 10 km away.</div></div>","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"61 ","pages":"Article 111812"},"PeriodicalIF":1.0,"publicationDate":"2025-06-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144501822","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
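The flowmeter description above gives two steps: a linear model turns the pressure at the bottom of the tube into a water volume, and the extreme values of a filling and emptying cycle turn that volume into a discharge. One possible reading of those steps is sketched below. It is an illustration under stated assumptions, not the authors' processing code: the function names, the calibration coefficients `slope` and `intercept`, and the choice to divide the volume gained between the cycle minimum and maximum by the elapsed time are all hypothetical.

```python
def pressure_to_volume(pressure, slope, intercept):
    """Linear calibration from sensor pressure to stored water volume."""
    return slope * pressure + intercept


def cycle_discharge(p_min, p_max, t_min, t_max, slope, intercept):
    """Mean discharge over one filling phase of the tube.

    p_min and p_max are the pressures at the start and end of filling
    (the cycle extremes); t_min and t_max are the matching times.
    """
    if t_max <= t_min:
        raise ValueError("the end of filling must come after its start")
    v_min = pressure_to_volume(p_min, slope, intercept)
    v_max = pressure_to_volume(p_max, slope, intercept)
    # Volume gained during filling divided by the filling duration.
    return (v_max - v_min) / (t_max - t_min)


# Hypothetical calibration and cycle: pressure in hPa, time in seconds,
# volume in litres, so the result is in litres per second.
q = cycle_discharge(p_min=0.0, p_max=10.0, t_min=0.0, t_max=100.0,
                    slope=0.5, intercept=0.0)
```

With a linear calibration the intercept cancels in the volume difference, so only the slope and the two extremes affect the computed discharge.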