Data in Brief | Pub Date: 2025-09-19 | DOI: 10.1016/j.dib.2025.112087
Essaadia Tabarnoust, Mohammed Mghari, Youssef Zaz
{"title":"Moroccan used cars dataset: Insights into the used car market","authors":"Essaadia Tabarnoust, Mohammed Mghari, Youssef Zaz","doi":"10.1016/j.dib.2025.112087","DOIUrl":"10.1016/j.dib.2025.112087","url":null,"abstract":"<div><div>The Moroccan used car market is a key indicator of consumer preferences, economic trends, and market dynamics in North Africa. This paper introduces the Moroccan Used Cars Dataset (MUCars-2024), which documents the used car market in Morocco throughout 2024. The dataset offers a detailed view of vehicles listed for sale during this period, providing valuable insights into this rapidly evolving market. Data were collected from prominent online platforms dedicated to car sales using web scraping techniques, ensuring comprehensive coverage. Post-collection, data were rigorously preprocessed, which included removing unreliable features with excessive missing data and standardizing other attributes such as price, mileage, and vehicle characteristics. The final dataset contains 101,896 listings, offering a robust representation of the Moroccan used car market.</div><div>MUCars-2024 provides a rich set of features—including vehicle specifications (brand, model, year), condition (mileage, first-owner status), and technical details (fiscal power, gearbox)—that enable detailed analysis. As a versatile resource for disciplines like economics, artificial intelligence, and automotive studies, it allows researchers to develop price prediction models, perform clustering analyses, and conduct spatial studies on consumer demand.</div><div>The MUCars-2024 dataset provides a high-resolution snapshot of a critical year in Morocco's automotive market. It serves as a foundational baseline for future temporal studies and enables immediate cross-market comparisons. 
As a publicly accessible resource, it directly supports research reproducibility and fosters innovation by bridging the gap between raw market data and academic inquiry, offering a valuable tool for data-driven research and industry practice.</div></div>","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"63 ","pages":"Article 112087"},"PeriodicalIF":1.4,"publicationDate":"2025-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145156931","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DeepUS-ReconSeg: A multi-angle paired B-mode Ultrasound dataset for medical imaging reconstruction and segmentation","authors":"Imrus Salehin , Nazmul Huda Badhon , Md Tomal Ahmed Sajib , Nazmun Nessa Moon","doi":"10.1016/j.dib.2025.112083","DOIUrl":"10.1016/j.dib.2025.112083","url":null,"abstract":"<div><div>This article presents a curated dataset of 4,200 B-mode ultrasound images of human forearms, intended for use in deep learning-based image reconstruction and segmentation tasks. Data have been collected using the Verasonics Vantage 64LE system equipped with an L11-5v linear array transducer, known for its high spatial resolution. Each scanning session captures 100 frames per orientation across both arms of 14 healthy subjects, covering multiple anatomical views. The dataset provides grayscale B-mode images stored in MATLAB .mat format and is publicly available through Mendeley Data. This dataset is valuable for researchers in medical image analysis, especially those developing deep learning models for enhancing ultrasound imaging quality and anatomical structure segmentation.</div></div>","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"63 ","pages":"Article 112083"},"PeriodicalIF":1.4,"publicationDate":"2025-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145156930","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Data in Brief | Pub Date: 2025-09-19 | DOI: 10.1016/j.dib.2025.112091
Andreia Figueiredo, João Amaral, Marcos Mendes, Rodrigo Rosmaninho, Duarte Dias, Pedro Rito, Miguel Luís, Duarte Raposo, Susana Sargento
{"title":"Experimental dataset of video and radar detection for cooperative perception in urban environment","authors":"Andreia Figueiredo , João Amaral , Marcos Mendes , Rodrigo Rosmaninho , Duarte Dias , Pedro Rito , Miguel Luís , Duarte Raposo , Susana Sargento","doi":"10.1016/j.dib.2025.112091","DOIUrl":"10.1016/j.dib.2025.112091","url":null,"abstract":"<div><div>Cooperative perception is an emerging concept in intelligent transportation systems that enhances situational awareness by allowing vehicles and infrastructure nodes to share sensor information. By extending the sensing range beyond the line of sight of a single agent, cooperative perception enables safer and more informed decision-making in complex traffic situations. To support research in this area, especially from the perspective of infrastructure-based sensing, high-quality datasets are essential. This article presents a dataset that combines radar and camera-based object detection data in standardized Collective Perception Messages (CPMs), collected in a real vehicular environment. The dataset includes object-level information such as unique tracking identifiers, spatial position, speed, heading, and classification. In addition to the raw sensor detections, it provides message-level CPMs generated in real time by the infrastructure node, following the European Telecommunications Standards Institute (ETSI) Collective Perception Service (CPS) specification and applying its object inclusion rules. All data is timestamped and spatially referenced, enabling the reconstruction of object trajectories and behavior over time. The dataset is suitable for developing and evaluating cooperative perception algorithms, as well as applications like trajectory prediction, object classification refinement, and multi-sensor fusion benchmarking. 
Its accessibility aims to support the research community in advancing perception and prediction models for autonomous driving and intelligent transportation systems.</div></div>","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"63 ","pages":"Article 112091"},"PeriodicalIF":1.4,"publicationDate":"2025-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145128352","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Data in Brief | Pub Date: 2025-09-19 | DOI: 10.1016/j.dib.2025.112081
Mohammad Fraiwan, Ali Ibnian, Nishi Shahnaj Haider
{"title":"A dataset of heart sound regurgitation of patients with heart valve disorders","authors":"Mohammad Fraiwan , Ali Ibnian , Nishi Shahnaj Haider","doi":"10.1016/j.dib.2025.112081","DOIUrl":"10.1016/j.dib.2025.112081","url":null,"abstract":"<div><div>Heart regurgitation is a cardiac condition characterized by the backward flow of blood, producing audible murmur sounds detectable during auscultation. If left untreated, it can lead to serious complications affecting cardiac function. This article presents a comprehensive dataset of heart sound recordings, including aortic regurgitation (AR), mitral regurgitation (MR), tricuspid regurgitation (TR), and healthy heart sounds, collected from patients at a single hospital using an electronic stethoscope. For each participant, recordings were obtained from three standard chest locations, and all diagnoses were confirmed by an experienced cardiologist. The dataset provides high-quality, labeled recordings that capture the variability of regurgitation sounds across different types and locations. It is intended to support the development and evaluation of automated algorithms for detecting cardiac abnormalities, including machine learning and signal processing approaches. Additionally, this dataset offers an educational resource for medical students and trainee clinicians to practice auscultation skills, recognize different types of regurgitation murmurs, and improve diagnostic proficiency. 
By making these recordings publicly available, the dataset can serve as a benchmark resource for both research and clinical training in cardiac auscultation.</div></div>","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"63 ","pages":"Article 112081"},"PeriodicalIF":1.4,"publicationDate":"2025-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145128354","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Data in Brief | Pub Date: 2025-09-19 | DOI: 10.1016/j.dib.2025.112094
Suhana Binta Rashid, Bibhas Roy Chowdhury Piyas, Sadia Rahman, Bijoy Roy Chowdhury Preenon
{"title":"ALERT: A benchmark Bengali dataset for identifying and categorizing religiously aggressive texts","authors":"Suhana Binta Rashid , Bibhas Roy Chowdhury Piyas , Sadia Rahman , Bijoy Roy Chowdhury Preenon","doi":"10.1016/j.dib.2025.112094","DOIUrl":"10.1016/j.dib.2025.112094","url":null,"abstract":"<div><div>The widespread proliferation of religiously aggressive content on social media platforms poses significant threats to societal harmony and communal solidarity. It often incites religious animosity, provokes violence and disseminates life-threatening messages that intensify societal divisions and undermine social harmony. Despite significant advancements in identifying such content in high-resource languages like English, there is a notable scarcity of resources for regional languages like Bengali, which constrains the development of effective detection and prevention tools. To address this gap, we introduce ALERT (Analysis of Linguistic Extremism in Religious Texts), a newly developed Bengali dataset with English translations that includes 4027 annotated instances classified into four categories: hate speech (995), vandalism (909), atrocity (1117), and no aggression (1006). The dataset was sourced from many online platforms, including Facebook, YouTube, online news websites, blogs and group chats. Each instance was annotated by two annotators drawn from a pool of four with diverse religious, ethnic, geographical, and academic backgrounds. Any conflicts or disagreements between annotators during the annotation process were resolved through consultation with a domain expert. The preprocessing stages include the elimination of English words, duplicates and alphanumeric characters to ensure data integrity. The dataset attains a Cohen’s kappa score of 72 %, which signifies strong inter-annotator agreement, and a Jaccard similarity score between 16 % and 23 %, which reflects the degree of overlap between classes. 
Moreover, experiments with various machine learning, deep learning and transformer-based models yield promising results. ALERT serves as a benchmark dataset for religiously aggressive text classification that may contribute to the advancement of research in this field. The dataset is publicly accessible for research purposes to promote innovation and collaboration within the Bengali NLP community.</div></div>","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"63 ","pages":"Article 112094"},"PeriodicalIF":1.4,"publicationDate":"2025-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145218708","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
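The agreement statistic quoted in the ALERT abstract can be illustrated with a short sketch. It assumes the standard two-rater form of Cohen's kappa, (p_o - p_e) / (1 - p_e); the abstract does not say how the score was pooled across annotator pairs, so this is not necessarily the authors' procedure. The function name and the toy labels are hypothetical. Only the four class names and their counts come from the abstract.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Two-rater Cohen's kappa: (p_o - p_e) / (1 - p_e)."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Class counts as reported in the abstract; they sum to the stated 4027.
class_counts = {"hate speech": 995, "vandalism": 909,
                "atrocity": 1117, "no aggression": 1006}
total_instances = sum(class_counts.values())

# Hypothetical toy annotations, NOT taken from the ALERT dataset.
annotator_1 = ["hate speech", "hate speech", "vandalism", "atrocity",
               "no aggression", "no aggression", "atrocity", "vandalism"]
annotator_2 = ["hate speech", "vandalism", "vandalism", "atrocity",
               "no aggression", "hate speech", "atrocity", "vandalism"]
kappa = cohens_kappa(annotator_1, annotator_2)
```

On the toy labels the two annotators agree on 6 of 8 items (p_o = 0.75) with chance agreement p_e = 0.25, so kappa is 2/3.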
Data in Brief | Pub Date: 2025-09-19 | eCollection Date: 2025-10-01 | DOI: 10.1016/j.dib.2025.112093
Nur Azmira Alea Nurhazli, Jia Hao Tan, Mohd Farizal Kamaroddin, Mohd Shahir Shamsir, Amira Suriaty Yaakop, Kian Mau Goh
{"title":"Microbial Community Profiles of Biofilms from Hot Springs: 16S and 18S rRNA Amplicon Sequencing Data.","authors":"Nur Azmira Alea Nurhazli, Jia Hao Tan, Mohd Farizal Kamaroddin, Mohd Shahir Shamsir, Amira Suriaty Yaakop, Kian Mau Goh","doi":"10.1016/j.dib.2025.112093","DOIUrl":"10.1016/j.dib.2025.112093","url":null,"abstract":"<p><p>This article presents microbial diversity data from biofilms collected from the sides or outflows of several Malaysian hot springs, with temperatures ranging from 38 to 56 °C and pH values between 7.1 and 8.7. Genomic DNA was extracted from the biofilms and subjected to 16S V3-V4 and 18S V4 amplicon sequencing using the Illumina NovaSeq 6000 platform. Reads were processed with various bioinformatic tools including QIIME2, and eventually, amplicon sequence variants (ASVs) were identified. In almost all analyzed biofilms, approximately 50% of the total ASVs belonged to <i>Cyanobacteriota</i> and <i>Chloroflexota,</i> except for one biofilm, labeled DTO, which was dominated by <i>Pseudomonadota</i> and <i>Cyanobacteriota</i>. Besides bacteria, the data also suggest the presence of various eukaryotic organisms, including small animals such as nematodes, rotifers, and arthropods; fungi and fungus-like organisms such as <i>Ascomycota, Zoopagomycota, Oomycota,</i> and <i>Cryptomycota</i>; as well as photosynthetic eukaryotes from the <i>Viridiplantae</i> group. 
This dataset serves as a valuable resource for microbial ecology studies in hot spring biofilms and is openly available for reuse, providing a foundation for future research on microbial diversity and functional roles in geothermal ecosystems.</p>","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"62 ","pages":"112093"},"PeriodicalIF":1.4,"publicationDate":"2025-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12495045/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145231740","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Data in Brief | Pub Date: 2025-09-19 | DOI: 10.1016/j.dib.2025.112086
Hangang Li, Kai Mao, Xiaomin Chen, Hanpeng Li, Hanwen Xu, Shaolong Zhang, Boyu Hua, Qiuming Zhu
{"title":"Air-to-ground channel dataset via UAV-aided measurements in multiple scenarios","authors":"Hangang Li , Kai Mao , Xiaomin Chen , Hanpeng Li , Hanwen Xu , Shaolong Zhang , Boyu Hua , Qiuming Zhu","doi":"10.1016/j.dib.2025.112086","DOIUrl":"10.1016/j.dib.2025.112086","url":null,"abstract":"<div><div>Low-altitude communication networks are emerging to facilitate reliable connectivity for aerial platforms, where a deep understanding of the radio propagation channel is fundamental for the design and optimization of the communication systems. In this study, a self-developed air-to-ground (A2G) channel sounder is employed to conduct field measurements under three typical communication scenarios, i.e., sports field, farmland, and over-water. In each scenario, channel measurement data are collected at different flight heights of the unmanned aerial vehicle (UAV), i.e., 10 m and 15 m. A dataset of measured channel impulse responses (CIRs) is constructed accordingly. Under each scenario, hundreds of CIR snapshots are recorded, where each CIR snapshot has a size of 1 × 200. Each CIR snapshot is also labeled with corresponding global positioning system (GPS) locations and time, where time represents the seconds elapsed since the start of the measurement. All data are stored in .xls format. Unlike most existing A2G channel measurement datasets that are limited to a single scenario or UAV altitude, our dataset simultaneously encompasses diverse scenarios and multiple flight heights at 3.6 GHz, which enhances its value for reproducibility and broad applicability in future channel modeling and system-level evaluations. The dataset is validated through on-site monitoring and repeated measurements when anomalies were detected, and its practical utility is demonstrated through the analysis of power delay profiles and path loss. 
This dataset provides a realistic representation of height-dependent A2G channel characteristics in diverse environments and reveals both the propagation delay and attenuation of multipath components, offering direct insights into the fundamental behavior of A2G propagation. The dataset also offers a valuable reference for the design and optimization of A2G links in low-altitude communication systems.</div></div>","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"63 ","pages":"Article 112086"},"PeriodicalIF":1.4,"publicationDate":"2025-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145218710","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Data in Brief | Pub Date: 2025-09-19 | DOI: 10.1016/j.dib.2025.112085
Bruno Wichmann, Roberta Moreira Wichmann, Tiago Almeida de Oliveira, Crysttian Arantes Paixão
{"title":"A geocoded dataset of primary health care clinics in Brazil","authors":"Bruno Wichmann , Roberta Moreira Wichmann , Tiago Almeida de Oliveira , Crysttian Arantes Paixão","doi":"10.1016/j.dib.2025.112085","DOIUrl":"10.1016/j.dib.2025.112085","url":null,"abstract":"<div><div>We develop a geocoded dataset of primary health care clinics in Brazil. We merge data from three publicly available sources. The first is the National Registry of Healthcare Facilities (CNES-ST), which collects the location (state, municipality, and 8-digit postal code) of all health care facilities, public or private, operating in Brazil. The second is the National Registry of Addresses for Statistical Purposes (IBGE-CNEFE), which contains the geographic coordinates of all addresses in Brazil (including 8-digit postal codes) and serves as the basis for the Brazilian census. Our approach aggregates individual (address-level) coordinates to the 8-digit postal code, and assigns coordinates to primary care clinics based on each clinic’s postal code. Using data from a third source, the IBGE shapefiles, we estimate the area of postal codes to evaluate the precision of our geo-referencing method. The unique facility identification number (cnes number) can be used to merge our georeferenced data with other publicly available databases of the Brazilian Unified Health System. The final dataset is an unbalanced panel with monthly observations about 293,698 primary care clinics’ locations (i.e. 
coordinates), from January 2018 to December 2023, totalling 15,455,219 observations.</div></div>","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"63 ","pages":"Article 112085"},"PeriodicalIF":1.4,"publicationDate":"2025-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145218808","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
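The georeferencing step described in the abstract above (aggregate address-level coordinates to the 8-digit postal code, then assign them to clinics by postal code) can be sketched as follows. This is a minimal illustration under stated assumptions: the abstract does not specify the aggregation rule, so a simple arithmetic mean of latitude and longitude is assumed here, and the function names, record layout, postal codes, identifiers and coordinates are all hypothetical rather than taken from CNES or CNEFE.

```python
from collections import defaultdict
from statistics import mean


def postal_code_centroids(addresses):
    """Average address-level (lat, lon) pairs within each postal code.

    Assumption: a plain arithmetic mean stands in for the aggregation rule.
    """
    grouped = defaultdict(list)
    for postal_code, lat, lon in addresses:
        grouped[postal_code].append((lat, lon))
    return {
        code: (mean(lat for lat, _ in points), mean(lon for _, lon in points))
        for code, points in grouped.items()
    }


def geocode_clinics(clinics, centroids):
    """Map each clinic id to its postal-code centroid, or None if unmatched."""
    return {clinic_id: centroids.get(code) for clinic_id, code in clinics}


# Hypothetical toy records: (postal code, latitude, longitude).
addresses = [
    ("01000-000", -23.50, -46.60),
    ("01000-000", -23.52, -46.62),
    ("20000-000", -22.90, -43.20),
]
# Hypothetical clinics: (facility id, postal code).
clinics = [("0000001", "01000-000"), ("0000002", "99999-999")]

centroids = postal_code_centroids(addresses)
geocoded = geocode_clinics(clinics, centroids)
```

Returning `None` for an unmatched postal code keeps unlocated clinics visible instead of dropping them silently, which matters when the result is merged with other databases by facility id.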
Data in Brief | Pub Date: 2025-09-19 | DOI: 10.1016/j.dib.2025.112090
Cairan A. Van Rooyen, Tim Sharpe
{"title":"Survey data on ventilation provision and use in homes in Great Britain","authors":"Cairan A. Van Rooyen , Tim Sharpe","doi":"10.1016/j.dib.2025.112090","DOIUrl":"10.1016/j.dib.2025.112090","url":null,"abstract":"<div><div>This article describes data collected from an online questionnaire survey of 2,039 adults in England, Wales, and Scotland (Great Britain) to assess the provision and use of mechanical ventilation, trickle vents, and windows in their homes. The survey was deployed on the 17th of June 2022 and collected data across four categories: socioeconomic and demographic characteristics, dwelling features, ventilation practices, and other contextual factors. The questionnaire included 65 questions and had a median completion time of 12 minutes, with all respondents completing the survey.</div><div>The dataset is broadly representative of the British population and housing typologies. It provides valuable insights into the relationships between dwellings, occupants, their ventilation provision, and behaviours. The data is stored in a comma-separated values (.csv) file containing 396 variables, with responses formatted as binary, continuous, discrete, categorical, and free-text.</div><div>This dataset can be utilised by academics for indoor environmental quality and energy modelling to refine assumptions about ventilation provision and practices. Public health professionals can use the data to estimate the health impacts of exposure to indoor pollutants, which can result from poor ventilation provision and to develop targeted health information. Government can use this evidence to inform policies and strategies aimed at improving ventilation in existing homes. 
The dataset is accessible through the Mendeley Data repository.</div></div>","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"63 ","pages":"Article 112090"},"PeriodicalIF":1.4,"publicationDate":"2025-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145218704","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Data in Brief | Pub Date: 2025-09-19 | DOI: 10.1016/j.dib.2025.112098
Olayemi O. Akinnola, Abosede E. Samuel, Conrad A. Omonhinmin
{"title":"Dataset on characterisation of microbiome of prostate tissue and expressed prostatic secretions","authors":"Olayemi O. Akinnola , Abosede E. Samuel , Conrad A. Omonhinmin","doi":"10.1016/j.dib.2025.112098","DOIUrl":"10.1016/j.dib.2025.112098","url":null,"abstract":"<div><div>Prostate cancer (PCa) is the second most prevalent cancer in men, particularly affecting those of Black African descent. Nigeria currently has the fourth highest risk for PCa mortality in the world. The microbiome of the prostate has emerged as a critical factor in understanding the aetiology and progression of prostate diseases, such as prostate cancer (PCa), benign prostatic hyperplasia (BPH), benign stromal hyperplasia (BSH) and prostatitis (PRO). This study comparatively characterises the microbiome present in prostate tissue and expressed prostatic secretion (EPS) from 30 study subjects diagnosed with PCa, BPH, BSH and PRO and sampled from the urology clinic of Lagos State University Teaching Hospital Ikeja. Bacterial species community composition and diversity were analysed based on 16S rRNA metagenome nucleotide data to ensure the accuracy, reproducibility, and broader applicability of microbiological and genomic research. 
Data information allows for precise identification of organisms at the species or strain level, essential for verifying experimental results and comparisons of the isolated organism's genome with related strains, providing insights into genetic diversity, virulence factors, and metabolic pathways of the sample population microbiome.</div></div>","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"63 ","pages":"Article 112098"},"PeriodicalIF":1.4,"publicationDate":"2025-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145218707","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}