Data in Brief | Pub Date: 2025-04-25 | DOI: 10.1016/j.dib.2025.111574
Meaghan E. Spedden, George C. O’Neill, Timothy O. West, Tim M. Tierney, Stephanie Mellor, Nicholas A. Alexander, Robert Seymour, Jesper Lundbye-Jensen, Jens Bo Nielsen, Simon F. Farmer, Sven Bestmann, Gareth R. Barnes
{"title":"Wearable MEG data recorded during human stepping","authors":"Meaghan E. Spedden , George C. O’Neill , Timothy O. West , Tim M. Tierney , Stephanie Mellor , Nicholas A. Alexander , Robert Seymour , Jesper Lundbye-Jensen , Jens Bo Nielsen , Simon F. Farmer , Sven Bestmann , Gareth R. Barnes","doi":"10.1016/j.dib.2025.111574","DOIUrl":"10.1016/j.dib.2025.111574","url":null,"abstract":"<div><div>Non-invasive spatiotemporal imaging of brain activity during large-scale, whole body movement is a significant methodological challenge for the field of movement neuroscience. Here, we present a dataset recorded using a new imaging modality – optically-pumped magnetoencephalography (OP-MEG) – to record brain activity during human stepping. Participants (n=3) performed a visually guided stepping task requiring precise foot placement while dual-axis and triaxial OP-MEG and leg muscle activity (electromyography, EMG) were recorded. The dataset also includes a structural MRI for each participant and foot kinematics. This multimodal dataset offers a resource for methodological development and testing for OPM data (e.g., movement-related interference rejection), within-subject analyses, and exploratory analyses to generate hypotheses for further work on the neural control of human stepping.</div></div>","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"60 ","pages":"Article 111574"},"PeriodicalIF":1.0,"publicationDate":"2025-04-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143941533","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Data on Swiss consumers’ support for different policies aiming to increase sustainability in food consumption and assessment of actor responsibility","authors":"Jeanine Ammann , Andreia Arbenz , Gabriele Mack , Michael Siegrist","doi":"10.1016/j.dib.2025.111570","DOIUrl":"10.1016/j.dib.2025.111570","url":null,"abstract":"<div><div>We present survey data from 453 Swiss consumers. Data were collected in the German-speaking parts of Switzerland in February 2023 using an online panel provider. The survey included seven distinct parts. In part one, personal data including political orientation and consumption behaviour were collected. In part two, participants assessed the current consumption in Switzerland regarding sustainability. Participants’ food sustainability knowledge was assessed in part three of the survey. In part four, participants rated their acceptance of 19 policy measures for sustainable consumption. Part five included four questions to assess participants’ health consciousness. In part six, we measured participants’ environmental attitudes. In part seven, participants answered questions on who they thought was responsible for taking action to increase sustainability in consumption and how much trust they had in these actors to do this successfully. 
The research design was approved by the Ethics Committee of ETH Zurich (approval number: EK 2023-N-04).</div></div>","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"60 ","pages":"Article 111570"},"PeriodicalIF":1.0,"publicationDate":"2025-04-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143900047","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Data in Brief | Pub Date: 2025-04-23 | DOI: 10.1016/j.dib.2025.111578
Ashutosh Ashutosh, Sai Chand
{"title":"Dataset on fatal road traffic crash attributes extracted via natural language processing of online media articles in India","authors":"Ashutosh Ashutosh, Sai Chand","doi":"10.1016/j.dib.2025.111578","DOIUrl":"10.1016/j.dib.2025.111578","url":null,"abstract":"<div><div>Road traffic crashes are among the leading causes of death globally, resulting in substantial social and economic impacts. Online media is a key source of public information on road safety. Understanding how crashes are reported is crucial for detecting potential reporting biases and enhancing safety awareness. Hence, to address the issue of the lack of high-quality, media-reported fatal crash data, fatal crash reports were extracted for 2022–2023 from The Times of India, a prominent Indian news outlet. The resulting dataset comprised 2898 fatal crashes, 6584 fatalities and 7812 injuries, including 16 detailed crash attributes. This dataset was developed using web scraping and natural language processing (NLP) techniques. Automated tools such as Selenium and BeautifulSoup were employed to extract raw data from the news source. NLP algorithms were then applied to identify key crash attributes, including crash date, location, vehicles involved and number of fatalities. This study provides a replicable framework for constructing robust datasets from media sources, enabling multidisciplinary research on transportation safety, media reporting and public perception of crashes. 
The dataset is expected to serve as a valuable resource for analysing how the media shapes road safety narratives and for investigations on identifying high-fatality crash locations.</div></div>","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"60 ","pages":"Article 111578"},"PeriodicalIF":1.0,"publicationDate":"2025-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143895726","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Data in Brief | Pub Date: 2025-04-22 | DOI: 10.1016/j.dib.2025.111573
Samson Mugisha, Shreyas Labhsetwar, Devam Dave, Richard Klemke, Jay S. Desgrosellier
{"title":"A dataset of chronic nicotine-induced genes in breast cancer cells","authors":"Samson Mugisha, Shreyas Labhsetwar, Devam Dave, Richard Klemke, Jay S. Desgrosellier","doi":"10.1016/j.dib.2025.111573","DOIUrl":"10.1016/j.dib.2025.111573","url":null,"abstract":"<div><div>These data show the differentially expressed genes (DEG) from HCC38 breast cancer cell line chronically exposed to nicotine versus vehicle control. Additional data is also provided from dynamic trajectory analysis, identifying the most dynamic genes due to chronic nicotine treatment. To produce this dataset, we first performed single cell RNA sequencing from HCC38 cells chronically treated with vehicle or nicotine, followed by scanpy analysis to yield 6 discrete cell clusters at conservative resolution. We then evaluated differential gene expression between chronic nicotine and control cells for each individual cluster or in the whole sample using PyDESeq2. For dynamic trajectory analysis, Velocyto (0.6) was used to estimate the spliced and unspliced counts for each gene between chronic nicotine-treated cells and vehicle, allowing computation of gene velocities. These data are useful for analysing the expression of individual genes or gene velocities either in the whole sample or in the different clusters identified. Since the HCC38 cell line used in these experiments is heterogeneous, including cells with features of stem-like, luminal progenitor-like and more differentiated cells, this dataset allows examination of the conserved as well as disparate gene expression effects of nicotine in different breast cancer cell types. 
Our dataset has a great potential for re-use given the recent surge in interest surrounding the role tobacco-use plays in breast cancer progression.</div></div>","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"60 ","pages":"Article 111573"},"PeriodicalIF":1.0,"publicationDate":"2025-04-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143887348","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SUMediPose: A 2D-3D pose estimation dataset","authors":"Chris-Mari Schreuder , Oloff Bergh , Lizé Steyn , Rensu P. Theart","doi":"10.1016/j.dib.2025.111579","DOIUrl":"10.1016/j.dib.2025.111579","url":null,"abstract":"<div><div>Biomechanical movement analysis is crucial in medical and sports contexts, yet the technology remains expensive and inaccessible to many. Recent advancements in machine learning and computer vision, particularly in Pose Estimation (PE), offer promising alternatives. PE models detect key points on the human body to estimate its pose in either 2D or 3D space, enabling markerless motion capture. This approach facilitates more natural and flexible movement tracking without the need for physical markers. However, markerless systems generally lack the accuracy of marker-based methods and require extensive annotated data for training, which is often anatomically inaccurate. Additionally, current 3D pose estimation techniques face practical challenges, including complex hardware setups, intricate camera calibrations, and a shortage of reliable ground truth 2D-3D datasets.</div><div>To address these challenges, we introduce a multimodal dataset comprising 3,444 recordings, 2,896,943 image frames, and 3,804,413 corresponding 3D and 2D marker-based motion capture keypoint coordinates. The dataset includes 28 participants performing eight strength and conditioning actions at three different speeds, with full image and keypoint data available for 26 participants, while two participants have only keypoint data without accompanying image data. Video and image data were captured using a custom-developed multi-RGB-camera system, while the marker-based 3D data was acquired using the Vicon system and subsequently projected into each camera’s internal coordinate system, represented in both 3D space and 2D image space. 
The multi-RGB-camera system consists of six cameras arranged in a circular formation around the subject, offering a full 360° view of the scene from the same height and resulting in a diverse set of viewing angles. The recording setup was designed to allow both capture systems to record participants' movements simultaneously, synchronizing the data to provide ground truth 3D data, which was then back-projected to generate 2D-pixel keypoint data for each corresponding image frame. This design enables the dataset to support both 2D and 3D pose estimation tasks. To ensure anatomical accuracy, a professional placed an extensive array of markers on each participant, adhering to industry standards.</div><div>The dataset also includes all intrinsic and extrinsic camera parameters, as well as origin axis data, necessary for performing any 3D or 2D projections. This allows the dataset to be adjusted and tailored to meet specific research or application needs.</div></div>","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"60 ","pages":"Article 111579"},"PeriodicalIF":1.0,"publicationDate":"2025-04-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143890576","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
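The back-projection step mentioned in the SUMediPose abstract (3D marker positions mapped into each camera's 2D image space using intrinsic and extrinsic parameters) can be illustrated with an ideal pinhole camera model. This is only a minimal sketch under stated assumptions: the function name `project_point`, the parameter layout, and the numeric values are hypothetical, lens distortion is ignored, and the dataset's actual file format and projection conventions are not described in the abstract.

```python
from typing import Sequence, Tuple


def project_point(
    point_world: Sequence[float],
    rotation: Sequence[Sequence[float]],
    translation: Sequence[float],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> Tuple[float, float]:
    """Project one 3D world point to pixel coordinates (ideal pinhole, no distortion).

    rotation (3x3) and translation (3,) are the extrinsics (world -> camera);
    fx, fy, cx, cy are the intrinsics. All names are illustrative assumptions.
    """
    # Extrinsics: X_cam = R @ X_world + t
    x_cam, y_cam, z_cam = (
        sum(rotation[i][j] * point_world[j] for j in range(3)) + translation[i]
        for i in range(3)
    )
    if z_cam <= 0:
        raise ValueError("point is behind the camera and cannot be projected")
    # Intrinsics: perspective divide, then scale and shift to pixel coordinates
    u = fx * x_cam / z_cam + cx
    v = fy * y_cam / z_cam + cy
    return (u, v)


# Hypothetical camera: identity pose, 1000 px focal length, 1920x1080 image centre
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
pixel = project_point((0.5, -0.25, 2.0), identity, (0.0, 0.0, 0.0),
                      fx=1000.0, fy=1000.0, cx=960.0, cy=540.0)
```

With real calibration data, a distortion model would typically be applied between the perspective divide and the pixel mapping.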
Data in Brief | Pub Date: 2025-04-21 | DOI: 10.1016/j.dib.2025.111569
Nehal M. El-Deeb, Tamer A.E. Ahmed, Amal D. Premarathna, Vitalijs Rjabovs, Rando Tuvikene, Riadh Hammami, Martine Boulianne, Maxwell T. Hincke
{"title":"Experimental datasets on the extraction of functional ingredients from seaweeds for controlling bacterial infection","authors":"Nehal M. El-Deeb , Tamer A.E. Ahmed , Amal D. Premarathna , Vitalijs Rjabovs , Rando Tuvikene , Riadh Hammami , Martine Boulianne , Maxwell T. Hincke","doi":"10.1016/j.dib.2025.111569","DOIUrl":"10.1016/j.dib.2025.111569","url":null,"abstract":"<div><div>Seaweeds are gaining significant attention for their bioactive compounds, which hold great potential for use in food, cosmetics, and pharmaceuticals [<span><span>1</span></span>]. To avoid the use of toxic substances in the extraction process, there is a need for innovative and eco-friendly methods to exploit the highly potent raw seaweed biomass. Described herein are datasets showing how reducing seaweed particle size enhanced the efficacy of green extraction and increased the extraction yields of seaweed bioactive compounds.</div><div>Different green extraction approaches were used to accumulate different seaweed particle sizes that were collected via grinding and sieving [<span><span>2</span></span>]. The total yields of carbohydrates, glucuronic acids, proteins, phenolics and flavonoids were quantified to evaluate the efficacy of the extraction strategies. 
The efficacy and safety of the extracts were assessed using different pathogenic bacterial strains and human cell lines, respectively.</div></div>","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"60 ","pages":"Article 111569"},"PeriodicalIF":1.0,"publicationDate":"2025-04-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143895724","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Data in Brief | Pub Date: 2025-04-21 | DOI: 10.1016/j.dib.2025.111576
Sendy Brammadi, Poerbandono, Roberto Mayerle
{"title":"Cross-sectional ocean current data from vessel mounted acoustic Doppler current profiler (ADCP) survey in the Alas Strait, Indonesia","authors":"Sendy Brammadi , Poerbandono , Roberto Mayerle","doi":"10.1016/j.dib.2025.111576","DOIUrl":"10.1016/j.dib.2025.111576","url":null,"abstract":"<div><div>In this article, we present three-dimensional current vectors obtained with a 600 kHz vessel-mounted Teledyne Workhorse Acoustic Doppler Current Profiler (ADCP). The data were collected during a field measurement campaign from 25 to 27 September 2022 (spring tide) in the Alas Strait, Indonesia. The data presented are the first in-situ measurements in this region to collect ocean current data from a vessel moving across multiple cross-sectional transects. The raw ADCP data were converted to ASCII format, reformatted into a more organized tabulation, and quality-controlled to remove bad data caused by biases, errors, interference, and/or noise. Tidal and depth data are also provided. Thirteen cross-sectional transects and one along-strait transect are presented, covering water depths from 1.71 m to approximately 60 m. These current, tide, and depth datasets provide an opportunity to characterize current behaviour at specific tidal phases and locations. 
The cross-sectional observation could be used to calibrate and validate results from hydrodynamic models.</div></div>","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"60 ","pages":"Article 111576"},"PeriodicalIF":1.0,"publicationDate":"2025-04-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143890567","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ID-SMSA: Indonesian stock market dataset for sentiment analysis","authors":"Jason Hartanto , Timothy Liundi , Rhio Sutoyo , Esther Widhi Andangsari","doi":"10.1016/j.dib.2025.111571","DOIUrl":"10.1016/j.dib.2025.111571","url":null,"abstract":"<div><div>Social media has impacted daily life, affecting people’s habits regarding accessing and sharing information. Among the platforms, X (formerly Twitter) gives users the freedom of speech to express their subjects and topics. Hence, users express their opinions on every topic, from light-hearted to heavy topics such as politics and the economy. This vast opinion from users creates a valuable resource for research. This paper presents the Indonesian Stock Market Dataset for Sentiment Analysis (ID-SMSA), a collection of 3288 tweets discussing the top 10 largest market caps in the Indonesian stock market as of March 2023. The dataset is in Indonesian and an English translated version is provided, making it the first Indonesian-language dataset discussing the Indonesian stock market. Human annotators labelled each tweet as positive, neutral, or negative based on baseline annotation characteristics criteria created and reviewed by an expert in clinical psychology. A voting system determines which tweets to include in the dataset. This creates a consistent dataset that reflects clear and agreed-upon sentiments and removes ambiguous and contradictory data. The voted tweets include 2339 positive, 999 neutral, and 1025 negative sentiments. 
This dataset supports research into Indonesian stock market growth and the role of social media in financial discussions.</div></div>","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"60 ","pages":"Article 111571"},"PeriodicalIF":1.0,"publicationDate":"2025-04-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143902251","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
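The ID-SMSA abstract says a voting system decides which tweets enter the dataset, but it does not spell out the rule. One common approach, shown here purely as an assumption, is to keep a tweet only when a strict majority of annotators agree on its label. The function name `majority_label`, the `min_agreement` threshold, and the example votes are all hypothetical and are not taken from the paper.

```python
from collections import Counter
from typing import Optional, Sequence


def majority_label(votes: Sequence[str], min_agreement: float = 0.5) -> Optional[str]:
    """Return the label backed by more than `min_agreement` of the votes, else None.

    A None result marks the item as ambiguous, so it would be excluded.
    This rule is an illustrative assumption, not the authors' documented procedure.
    """
    if not votes:
        return None
    label, count = Counter(votes).most_common(1)[0]
    return label if count / len(votes) > min_agreement else None


# Hypothetical annotations from three annotators per tweet
annotations = {
    "tweet_a": ["positive", "positive", "neutral"],   # kept as "positive"
    "tweet_b": ["positive", "neutral", "negative"],   # no majority, dropped
}
kept = {
    tweet_id: label
    for tweet_id, votes in annotations.items()
    if (label := majority_label(votes)) is not None
}
```

Raising `min_agreement` (for example to require unanimity) trades dataset size for label consistency.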
Data in Brief | Pub Date: 2025-04-21 | DOI: 10.1016/j.dib.2025.111562
Charles Jones, Donald D. Lucas, Allison Bagley, Callum Thompson
{"title":"A 30-yr high-resolution weather research and forecasting model downscaling data over California and Nevada","authors":"Charles Jones , Donald D. Lucas , Allison Bagley , Callum Thompson","doi":"10.1016/j.dib.2025.111562","DOIUrl":"10.1016/j.dib.2025.111562","url":null,"abstract":"<div><div>This dataset presents a 30-year high-resolution meteorological dataset obtained using the WRF model (Advanced Research WRF, version 4.4). We used WRF with European Centre for Medium-Range Weather Forecasts Reanalysis v5 as initial and boundary conditions to generate gridded meteorological variables. A large number of surface weather stations were used for model validation. A multi-physics analysis was first performed to identify a suitable physics suite; it covered 6 November 00 UTC to 10 November 23 UTC, 2018, a period that included the Camp Fire in northern California. Based on the best physics suite, the downscaling dataset extends from 1 December to 28 February, 1990–2021, and the horizontal domain has 1.5 km grid spacing covering the entire states of California and Nevada in the United States. Comparisons between hourly surface observations and WRF simulations of air temperature, relative humidity and wind speed show mean absolute errors on the order of 1.6–2.0 °C, 10 % and 1.2–1.5 m s<sup>-1</sup>, respectively.</div></div>","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"60 ","pages":"Article 111562"},"PeriodicalIF":1.0,"publicationDate":"2025-04-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143906592","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
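The validation metric quoted in the WRF abstract above, mean absolute error between hourly station observations and simulated values, is the average of the absolute differences over paired samples. The sketch below shows that calculation only; the function name, the handling of missing values as `None`, and the sample temperatures are assumptions for illustration and are not drawn from the dataset.

```python
from typing import Optional, Sequence


def mean_absolute_error(
    observed: Sequence[Optional[float]],
    simulated: Sequence[Optional[float]],
) -> float:
    """Mean absolute error over time steps where both values are present.

    Missing values are represented as None and skipped (an assumption;
    real station files may flag gaps differently).
    """
    if len(observed) != len(simulated):
        raise ValueError("observed and simulated series must have the same length")
    pairs = [(o, s) for o, s in zip(observed, simulated) if o is not None and s is not None]
    if not pairs:
        raise ValueError("no overlapping valid samples")
    return sum(abs(o - s) for o, s in pairs) / len(pairs)


# Made-up hourly 2 m air temperatures in degrees Celsius (not from the dataset)
station_obs = [10.0, 12.0, None, 14.0]
wrf_sim = [11.0, 12.0, 13.0, 11.0]
mae = mean_absolute_error(station_obs, wrf_sim)  # (1 + 0 + 3) / 3
```

The same function applies unchanged to relative humidity or wind speed series, provided the two inputs share units and time stamps.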
Data in Brief | Pub Date: 2025-04-21 | DOI: 10.1016/j.dib.2025.111575
Juha-Pekka Soininen, Gabriella Laatikainen
{"title":"What is a data space—Logical architecture model","authors":"Juha-Pekka Soininen , Gabriella Laatikainen","doi":"10.1016/j.dib.2025.111575","DOIUrl":"10.1016/j.dib.2025.111575","url":null,"abstract":"<div><div>The paper presents a use case model and a logical architecture model of a data space system. The models view the data space system from the user and operator perspectives and describe the needed functionalities and their connections on an abstract level. The core features in our data space model are collaboration networks and contract-based data sharing. The models are meant as a simple and explainable first introduction to what a data space system is, and they provide an implementation technology-independent basis for creating data space implementation specifications. The model is validated by comparing it with existing data space reference models from IDSA, GAIA-X, and the Data Space Support Centre. The modelled core features map onto the technical specifications of the system. Our study enables data space system developers to think outside the box and create innovative solutions.</div></div>","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"60 ","pages":"Article 111575"},"PeriodicalIF":1.0,"publicationDate":"2025-04-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143887416","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}