Data in Brief | Pub Date: 2025-03-26 | DOI: 10.1016/j.dib.2025.111510
Alexandre M.J.-C. Wadoux , Felix Stumpf , Thomas Scholten
{"title":"A catchment-scale dataset of soil properties and their mid-infrared spectra","authors":"Alexandre M.J.-C. Wadoux , Felix Stumpf , Thomas Scholten","doi":"10.1016/j.dib.2025.111510","DOIUrl":"10.1016/j.dib.2025.111510","url":null,"abstract":"<div><div>The dataset presents information on soil properties and their associated mid-infrared spectra for a drainage basin of 4.2 km<sup>2</sup>, referred to as the Upper Badong catchment (31°1′24′′N, 110°20′35′′E) in Hubei province, China. Data were collected for topsoil in a highly diverse terrace catchment composed of woodland, cropland and small farm buildings. Soil properties included in this dataset are pH, texture (i.e. clay, silt and sand content), total carbon, organic carbon and CaCO<sub>3</sub>. In addition, the soil samples were scanned in the mid-infrared range. The data collection process involved three field campaigns during 2013 and 2014, in which topsoil samples were collected in a standardized way across all sites, and laboratory soil analyses following standard procedures. The dataset offers insights into the spatial variation of soil properties in a highly diverse catchment of central China. 
Researchers interested in soil research can use this dataset for various purposes, including building digital soil mapping models or soil spectroscopic models, benchmarking of models with other datasets, and research in soil erosion modelling.</div></div>","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"60 ","pages":"Article 111510"},"PeriodicalIF":1.0,"publicationDate":"2025-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143748417","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Data in Brief | Pub Date: 2025-03-24 | DOI: 10.1016/j.dib.2025.111464
Francisco E. Apen , Sean P. Gaynor , Blair Schoene
{"title":"High-precision U-Pb data and reference age for Emerald Lake apatite","authors":"Francisco E. Apen , Sean P. Gaynor , Blair Schoene","doi":"10.1016/j.dib.2025.111464","DOIUrl":"10.1016/j.dib.2025.111464","url":null,"abstract":"<div><div>New isotope dilution thermal ionization mass spectrometry U-Pb data for Emerald Lake apatite demonstrate its potential as a reference material for geochronology. A three-dimensional <sup>238</sup>U/<sup>206</sup>Pb-<sup>207</sup>Pb/<sup>206</sup>Pb-<sup>204</sup>Pb/<sup>206</sup>Pb isochron produces a 95.2 ± 1.1 Ma date with an initial Pb isotopic composition of <sup>206</sup>Pb/<sup>204</sup>Pb = 18.85 ± 0.19 and <sup>207</sup>Pb/<sup>204</sup>Pb = 15.68 ± 0.10 (n = 5, MSWD = 9.5). These data yield a weighted mean initial Pb-corrected <sup>206</sup>Pb/<sup>238</sup>U date of 95.18 ± 0.10 Ma (n = 5, MSWD = 1.5) and a weighted mean initial Pb-corrected <sup>207</sup>Pb/<sup>235</sup>U date of 95.20 ± 0.17 Ma (n = 5, MSWD = 0.5). The new high-precision U-Pb age of Emerald Lake apatite further enables its utility as a reference material for <em>in situ</em> U-Pb apatite geochronology. Aliquots of Emerald Lake apatite are available for distribution for use in future studies.</div></div>","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"60 ","pages":"Article 111464"},"PeriodicalIF":1.0,"publicationDate":"2025-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143748362","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Soil suction dataset from a lime & cement treated embankment, from 2010 to 2023","authors":"Yasmina Boussafir , Dimitri Mercadier , Christophe Piquet","doi":"10.1016/j.dib.2025.111506","DOIUrl":"10.1016/j.dib.2025.111506","url":null,"abstract":"<div><div>An experimental embankment was built in 2010 within the framework of the TerDOUEST project [1]. Treated silt and clay from the site near Héricourt, France, were used in four sections approximately 50 meters long and 5 meters high. The first meter at the base of the embankment is buried below ground level to study soil-water table interaction. At this site, the water table is located at approximately 1 to 2 meters depth and is recorded with a piezometric probe. The remaining four meters of the Héricourt embankment are above ground level. The slope of this earth structure is V1:H2. One side of the embankment was built with low-plasticity silt, classified A2 according to the French standard NF P11-300 [2]; one third was treated with 2% quicklime, another third with 3% cement (previously CEMII), and the last third was not treated. The other side of the embankment used clay, considered a plastic clay and classified A4 [2]; one third was treated with 4% quicklime, another third with 2% quicklime and 3% cement, and the last third was not treated. In each of the treated sections, sensors were buried at 0.25, 0.50 and 0.75 m depth in the slope, recording volumetric water content and suction for soil-atmosphere interaction studies. Other sensors recorded the volumetric water content only, in the core of the embankment, its base and the platform [3]. All data are available from 2010 to 2023. 
A weather station recorded precise meteorological data from 2010 to 2013.</div><div>The goal of this full-scale embankment was to evaluate the sustainability of treated soils used in earth structures and to test the re-use of very plastic clay thanks to adapted soil treatment. After more than 10 years, the structure is stable, even if intrinsic characteristics may evolve [4].</div></div>","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"60 ","pages":"Article 111506"},"PeriodicalIF":1.0,"publicationDate":"2025-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143748421","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Data in Brief | Pub Date: 2025-03-24 | DOI: 10.1016/j.dib.2025.111516
Alexandre Almeida Del Savio , Ana Luna Torres , Daniel Cárdenas-Salas , Mónica Vergara Olivera , Gianella Urday Ibarra
{"title":"Manually classified dataset of leaning and standing personnel images for construction site monitoring and neural network training","authors":"Alexandre Almeida Del Savio , Ana Luna Torres , Daniel Cárdenas-Salas , Mónica Vergara Olivera , Gianella Urday Ibarra","doi":"10.1016/j.dib.2025.111516","DOIUrl":"10.1016/j.dib.2025.111516","url":null,"abstract":"<div><div>This data paper presents a manually labeled dataset of 1,214 images of personnel captured from a construction site using four static cameras. There are two classes: standing and leaning. The class labels and bounding box coordinates for every image are stored in accompanying text files. The compilation was done to support the development and validation of computer vision and AI models for construction site monitoring. This dataset addresses the challenges of finding personnel in different poses within complex construction environments. The resource will enhance construction site safety monitoring and personnel activity analysis by allowing more precise neural network training. The dataset is stored in a public repository, making it openly accessible for academic and industrial purposes regarding computer vision, civil engineering, and workplace safety.</div></div>","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"60 ","pages":"Article 111516"},"PeriodicalIF":1.0,"publicationDate":"2025-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143715853","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Teachers' perspectives on the current state of the development of Vietnamese junior secondary school students’ digital competence","authors":"Thi Phuong Dang, Thi Thao Bui, Dieu Quynh Bui, Quoc Anh Vuong, Thu Linh Kieu","doi":"10.1016/j.dib.2025.111507","DOIUrl":"10.1016/j.dib.2025.111507","url":null,"abstract":"<div><div>This dataset provides detailed information about lower secondary school teachers' perceptions and assessments of the current state of digital competency development for lower secondary school students in Vietnam. The dataset encompasses five main aspects from teachers' perspectives: (a) General teacher information, including province name, region, gender, age, years of service, educational qualifications, and subjects currently taught at school; (b) Teachers' understanding of digital competency and their assessment of the role of digital competency for students; (c) Conditions necessary to implement educational activities aimed at developing students' digital competency; (d) Current activities of teachers and schools (regarding the use of equipment, IT infrastructure, teaching methods, assessment activities); (e) Effectiveness of activities (regarding teaching methods, testing, assessment) aimed at developing digital competency for students. Data collection was conducted online via Google Forms from March 1-20, 2024, with the participation of 7415 lower secondary school teachers from thirteen provinces in Vietnam. This dataset aims to provide valuable detailed information for policymakers and educational administrators about the current context in schools regarding educational activities aimed at developing digital competency for lower secondary school students. 
Additionally, this dataset serves as a basis for proposing solutions to develop students' digital competency, while helping lower secondary schools strengthen support for teachers to participate in training and professional development aimed at developing digital competency for learners. Educational administrators and researchers can use this data to better understand pedagogical requirements such as teacher training and policy development related to students' digital competency development. Furthermore, this dataset can help educational technology developers understand the needs, readiness, and use of equipment and IT infrastructure in teaching to develop students' digital competency across subjects. Overall, this dataset gives school leaders and educational policymakers an overview, from teachers' perspectives, of the current state of teaching aimed at developing digital competency for students in lower secondary schools, supporting the development of strategies, policies and guidelines that are appropriate to reality in training future human resources to achieve proficiency in digital skills.</div></div>","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"60 ","pages":"Article 111507"},"PeriodicalIF":1.0,"publicationDate":"2025-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143725453","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Antimicrobial data set and occurrence of acute kidney injury in patients admitted to a hospital in Western Pará, Brazil","authors":"Hiago Sousa Pinheiro , Camila Castilho Moraes , Géssica Aleane Moraes Esquerdo , Elenn Suzany Pereira Aranha , Luige Pinho Moraes , Tânia Mara Pires Moraes , Waldiney Pires Moraes","doi":"10.1016/j.dib.2025.111498","DOIUrl":"10.1016/j.dib.2025.111498","url":null,"abstract":"<div><div>In hospital units, the evaluation and pharmaceutical follow-up of medical prescriptions is an important source for pharmaceutical care and pharmaceutical clinical services. One common problem with high in-hospital incidence is acute kidney injury (AKI). Pharmacovigilance, among other activities, is implemented in hospitals for the purpose of receiving and monitoring reports of adverse effects related to medications administered to patients. The survey evaluated the incidence of acute kidney injury in hospitalized patients exposed to antimicrobials in a public hospital in the state of Pará, Brazil. A prospective, observational cohort study was carried out, with the outcome of interest being the occurrence of AKI in patients admitted to the hospital between October 2018 and January 2019. The data were recorded and stored in a database, and descriptive analysis was then performed in GraphPad Prism 6.0. Quantitative variables were expressed as mean and standard deviation (SD), and the number of cases as a percentage. We collected data from 70 patients who were admitted to the hospital and needed to use any of the selected antimicrobials during hospital treatment in the observation period. The survey results showed that patients were mostly male (64.29%; <em>n =</em> 45). Age ranged from 19 to 96 years, with a mean of 52.49 years (SD = 20.31). The patients included were mostly from the oncology clinic (34.29%; <em>n =</em> 24) or had undergone surgery (27.14%; <em>n =</em> 19). 
Most critically ill patients admitted to the adult ICU (26.47%; <em>n =</em> 9) developed AKI. Regarding the number of medications used by patients, there was a variation from 5 to 17, with a mean of 10.26 (SD = 2.90) medications prescribed per patient. In the data regarding the antimicrobials, most patients took ceftriaxone (<em>n =</em> 29), cefepime (<em>n =</em> 27) and piperacillin/tazobactam (<em>n =</em> 23). In terms of the number of antimicrobials prescribed per patient, 60% (<em>n =</em> 42) of patients took only one, 30% (<em>n =</em> 21) took two and 10% (<em>n =</em> 7) took three or more antimicrobials for treatment of infections. The plasma concentrations of vancomycin ranged from 3.0 µg/mL to 22.5 µg/mL. Of the 10 samples collected, 10.0% (<em>n =</em> 1) was above the therapeutic range established by the literature (between 10 and 20 µg/mL), 30.0% (<em>n =</em> 3) were within the reference values and 60.0% (<em>n =</em> 6) of the patients had values below the reference values. Patients who developed AKI (60.0%; <em>n =</em> 6) during vancomycin use had concentration values between 3 µg/mL and 15.9 µg/mL, most of whom had values below the recommended therapeutic range. Of these patients with AKI, 83.33% (<em>n =</em> 5) used more than one nephrotoxic antimicrobial during hospital treatment. 
The concentrations of patients who were not diagnosed with AKI (40.0%; <em>n =</em> 4) ranged from 3.0 µg/mL to 22.5 µg/mL.</div></div>","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"60 ","pages":"Article 111498"},"PeriodicalIF":1.0,"publicationDate":"2025-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143725544","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"From archives to AI: Residential property data across three decades in Brunei Darussalam","authors":"Haziq Jamil , Amira Barizah Noorosmawie , Hafeezul Waezz Rabu , Lutfi Abdul Razak","doi":"10.1016/j.dib.2025.111505","DOIUrl":"10.1016/j.dib.2025.111505","url":null,"abstract":"<div><div>This article introduces the first publicly available data set for analysing the Brunei housing market, covering more than 30,000 property listings from 1993 to early 2025. The data set, curated from property advertisements in newspapers and online platforms, includes key attributes such as price, location, property type, and physical characteristics, enriched with area-level spatial information. Comprehensive and historical, it complements the Brunei Darussalam Central Bank's Residential Property Price Index (RPPI), addressing the limitations of restricted access to raw RPPI data and its relatively short timeline since its inception in 2015. Data collection involved manual transcription from archival sources and automated web scraping using programmatic techniques, supported by innovative processing with Large Language Models (LLMs) to codify unstructured text. The data set enables spatial and temporal analysis, with potential applications in economics, urban planning, and real estate research. 
Although listing prices are only a proxy for market values and may deviate from actual sale prices due to negotiation dynamics and other factors, this data set still provides a valuable resource for quantitative analyses of housing market trends and for informing policy decisions.</div></div>","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"60 ","pages":"Article 111505"},"PeriodicalIF":1.0,"publicationDate":"2025-03-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143748361","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Data in Brief | Pub Date: 2025-03-23 | DOI: 10.1016/j.dib.2025.111514
Md Tahsin , Muhammad Ibrahim , Anika Tabassum Nafisa , Maksura Binte Rabbani Nuha , Mehrab Islam Arnab , Md. Hasanul Ferdaus , Mohammad Manzurul Islam , Mohammad Rifat Ahmmad Rashid , Taskeed Jabid , Md. Sawkat Ali , Nishat Tasnim Niloy
{"title":"PaddyVarietyBD: Classifying paddy variations of Bangladesh with a novel image dataset","authors":"Md Tahsin , Muhammad Ibrahim , Anika Tabassum Nafisa , Maksura Binte Rabbani Nuha , Mehrab Islam Arnab , Md. Hasanul Ferdaus , Mohammad Manzurul Islam , Mohammad Rifat Ahmmad Rashid , Taskeed Jabid , Md. Sawkat Ali , Nishat Tasnim Niloy","doi":"10.1016/j.dib.2025.111514","DOIUrl":"10.1016/j.dib.2025.111514","url":null,"abstract":"<div><div>Among countless crop varieties produced worldwide, the staple food of most of Asia, some parts of Europe, and North America is rice. Being an essential food item, rice makes an integral contribution to the economy of countries like China, India, Bangladesh, Pakistan, Indonesia, and so on. Scientists have long been working on developing new and improved rice species to battle different environmental hindrances and natural calamities. Although numerous studies have been conducted on this diverse crop, artificial intelligence, and machine learning in particular, has not been applied in this field to its full potential. The key factors behind this lag include the unavailability of standard and ready-to-use datasets. Intending to mitigate this drawback, this paper proposes an image dataset of paddy species to assist researchers and scientists in classifying, analyzing, and evaluating paddy classes. To the best of our knowledge, this is the first standard and open dataset of paddy varieties in Bangladesh. The rice samples were collected from two places, namely the Bangladesh Institute of Nuclear Agriculture (BINA) and the Bangladesh Rice Research Institute (BRRI), where agrarian scientists work on developing new or improving existing paddy species. The dataset contains 14,000 RGB microscopic images of individual paddy kernels. The enormity and inclusivity of the dataset make it useful for global research purposes. 
The dataset can be a useful resource not only in the area of artificial intelligence, but also in agricultural, botanical, and economic research.</div></div>","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"60 ","pages":"Article 111514"},"PeriodicalIF":1.0,"publicationDate":"2025-03-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143735313","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Data in Brief | Pub Date: 2025-03-23 | DOI: 10.1016/j.dib.2025.111512
Heather Margaret Logan, Maggie Ziggie Søndergaard, Valentina Rossi, Kamilla Kastrup Hansen, Anders Damgaard
{"title":"Nordic textile anatomy database: Composition of garments available in the nordic retail mass market and post-consumer textile waste market","authors":"Heather Margaret Logan, Maggie Ziggie Søndergaard, Valentina Rossi, Kamilla Kastrup Hansen, Anders Damgaard","doi":"10.1016/j.dib.2025.111512","DOIUrl":"10.1016/j.dib.2025.111512","url":null,"abstract":"<div><div>Textiles are complex materials made of multiple and blended resources, often assembled in unique configurations imbuing each textile with its own anatomy. These <strong><em>textile anatomies</em></strong> make the identification, separation, sorting, and recycling of post-consumer textiles especially difficult. While textile anatomy data is often retrievable for pre- and post-industrial textiles (off-cuts or rejects from manufacturing), it is often difficult to retrieve for textiles which have reached the consumer. The lack of data available on the textile anatomies of pre- and post-consumer textile waste skews predictions and market forecasts for the expected yields, capacities, and qualities of post-consumer textile sorting and recycling activities. This disrupts planning for sorting, removal of findings, and scaling of recycling technologies. To better plan for the innovation needs, market capacity, and policy levers needed to improve the efficiency of sorting and recycling activities, there is an urgent need for data on the unique anatomies of pre- and post-consumer textiles. This is especially important as the EU mandates that all member states must separately collect and treat post-consumer textiles beginning in 2025.</div><div>Therefore, this database contains two datasets offering textile anatomies for more than 5000 separate garment samples from the post-industrial, pre-consumer retail mass market (RMM) and the Post-Consumer Textile Waste Market (PCTWM). This database contains crucial data on each garment's fibre composition, finding presence, and layer presence. 
The two datasets are the results of two separate waste composition campaigns conducted in the Nordic Region in 2022: One focused on the textile anatomies of the RMM (4,495 samples) and the other on the PCTWM (1,248 samples). The RMM data was collected by sampling garments across mass market retailers in the Copenhagen municipality of DK during the spring/summer seasons of 2022. The PCTWM data was collected by sampling post-consumer textile waste bales from pre- and post-sorting lines at the SIPTEX sorting facility in Malmo, SE in the winter of 2022. In both datasets, surveys deployed via webapp were utilized to streamline sampling and ensure consistent recording of the fibre blends, number of findings present, and layers present. In the PCTWM dataset additional data is provided on the fibre composition of layers, as well as the placement and type of findings present.</div><div>Each dataset in this database can be used by industrial ecologists, economists, and textile engineers to better forecast, map, and analyse the potential treatment of expected post-consumer textiles. Moreover, the methodology and approach to data gathering can be used as a blueprint for future regionalized databases throughout the European Union. The use of this database can be particularly useful to analyse the economic, environmental, and resource impacts of common garments as well as inform te","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"60 ","pages":"Article 111512"},"PeriodicalIF":1.0,"publicationDate":"2025-03-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143748363","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Data in Brief | Pub Date: 2025-03-22 | DOI: 10.1016/j.dib.2025.111518
Jamal Hussain Shah , Maira Afzal , Samia Riaz , Mussarat Yasmin , Seifedine Kadry , Fahad Ahmed Khokhar
{"title":"A comprehensive dataset of soccer event images for advancing automatic recognition systems","authors":"Jamal Hussain Shah , Maira Afzal , Samia Riaz , Mussarat Yasmin , Seifedine Kadry , Fahad Ahmed Khokhar","doi":"10.1016/j.dib.2025.111518","DOIUrl":"10.1016/j.dib.2025.111518","url":null,"abstract":"<div><div>This article presents a detailed overview of a dataset created for in-depth analysis of soccer events. This dataset will serve as a foundation for researchers and practitioners in the field, providing a perspective on different soccer events under various views. This soccer dataset is designed to categorize soccer matches into various events and contains 187,151 instances divided across 14 groups. For simplicity, it is separated into two main datasets. The first dataset is known as the “View-Based Dataset,” which is divided into four categories: Long view, Medium view, Short view, and Outer view, for a total of 137,196 images. The second dataset is the “Event-Based Dataset,” which has 10 separate classes that highlight multiple soccer events: Red card, Spectator, Yellow card, Plenty stock, Player celebration, Offside, Goal attempt, Goal, and Free kick, for a total of 38,728 images. Each class in both datasets helps to provide a full understanding of soccer events. 
This dataset can serve as a foundation for future video analysis studies, promoting progress in soccer analytics and related domains.</div></div>","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"60 ","pages":"Article 111518"},"PeriodicalIF":1.0,"publicationDate":"2025-03-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143748366","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}