{"title":"The Effect: An Introduction to Research Design and Causality","authors":"Y. Wang","doi":"10.1080/26941899.2023.2167433","DOIUrl":"https://doi.org/10.1080/26941899.2023.2167433","url":null,"abstract":"","PeriodicalId":72770,"journal":{"name":"Data science in science","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-02-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44730273","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hybrid Forecasting for Functional Time Series of Dissolved Oxygen Profiles","authors":"Luke Durell, J. Scott, A. Hering","doi":"10.1080/26941899.2022.2152401","DOIUrl":"https://doi.org/10.1080/26941899.2022.2152401","url":null,"abstract":"","PeriodicalId":72770,"journal":{"name":"Data science in science","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-02-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49001985","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learning from Lending in the Interbank Network","authors":"P. Laux, Wei Qian, Haici Zhang","doi":"10.1080/26941899.2022.2151949","DOIUrl":"https://doi.org/10.1080/26941899.2022.2151949","url":null,"abstract":"","PeriodicalId":72770,"journal":{"name":"Data science in science","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42790583","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Data science in science, Pub Date: 2023-01-01, Epub Date: 2023-02-28, DOI: 10.1080/26941899.2023.2166624
{"title":"Learning Financial Networks with High-frequency Trade Data.","authors":"Kara Karpman, Sumanta Basu, David Easley, Sanghee Kim","doi":"10.1080/26941899.2023.2166624","DOIUrl":"10.1080/26941899.2023.2166624","url":null,"abstract":"Financial networks are typically estimated by applying standard time series analyses to price-based economic variables collected at low-frequency (e.g., daily or monthly stock returns or realized volatility). These networks are used for risk monitoring and for studying information flows in financial markets. High-frequency intraday trade data sets may provide additional insights into network linkages by leveraging high-resolution information. However, such data sets pose significant modeling challenges due to their asynchronous nature, complex dynamics, and nonstationarity. To tackle these challenges, we estimate financial networks using random forests, a state-of-the-art machine learning algorithm which offers excellent prediction accuracy without expensive hyperparameter optimization. The edges in our network are determined by using microstructure measures of one firm to forecast the sign of the change in a market measure such as the realized volatility of another firm. We first investigate the evolution of network connectivity in the period leading up to the U.S. financial crisis of 2007-09. We find that the networks have the highest density in 2007, with high degree connectivity associated with Lehman Brothers in 2006. A second analysis into the nature of linkages among firms suggests that larger firms tend to offer better predictive power than smaller firms, a finding qualitatively consistent with prior works in the market microstructure literature.","PeriodicalId":72770,"journal":{"name":"Data science in science","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10798789/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47504522","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Functional Stochastic Volatility in Financial Option Surfaces","authors":"Phillip A. Jang, Michael Jauch, D. Matteson","doi":"10.1080/26941899.2022.2152764","DOIUrl":"https://doi.org/10.1080/26941899.2022.2152764","url":null,"abstract":"","PeriodicalId":72770,"journal":{"name":"Data science in science","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-12-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42815626","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Non-Fungible Token Transactions: Data and Challenges","authors":"Jason B. Cho, Sven Serneels, D. Matteson","doi":"10.1080/26941899.2022.2151950","DOIUrl":"https://doi.org/10.1080/26941899.2022.2151950","url":null,"abstract":"Non-fungible tokens (NFT) have recently emerged as a novel blockchain hosted financial asset class that has attracted major transaction volumes. Investment decisions rely on data and adequate preprocessing and application of analytics to them. Both owing to the non-fungible nature of the tokens and to a blockchain being the primary data source, NFT transaction data pose several challenges not commonly encountered in traditional financial data. Using data that consist of the transaction history of eight highly valued NFT collections, a selection of such challenges is illustrated. These are: price differentiation by token traits, the possible existence of lateral swaps and wash trades in the transaction history and finally, severe volatility. While this paper merely scratches the surface of how data analytics can be applied in this context, the data and challenges laid out here may present opportunities for future research on the topic.","PeriodicalId":72770,"journal":{"name":"Data science in science","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41828060","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Regularized Predictive Models for Beef Eating Quality of Individual Meals","authors":"G. Tarr, I. Wilms","doi":"10.1080/26941899.2022.2151948","DOIUrl":"https://doi.org/10.1080/26941899.2022.2151948","url":null,"abstract":"Faced with changing markets and evolving consumer demands, beef industries are investing in grading systems to maximise value extraction throughout their entire supply chain. The Meat Standards Australia (MSA) system is a customer-oriented total quality management system that stands out internationally by predicting quality grades of specific muscles processed by a designated cooking method. The model currently underpinning the MSA system requires laborious effort to estimate and its prediction performance may be less accurate in the presence of unbalanced data sets where many \"muscle x cook\" combinations have few observations and/or few predictors of palatability are available. This paper proposes a novel predictive method for beef eating quality that bridges a spectrum of muscle x cook-specific models. At one extreme, each muscle x cook combination is modelled independently; at the other extreme a pooled predictive model is obtained across all muscle x cook combinations. Via a data-driven regularization method, we cover all muscle x cook-specific models along this spectrum. We demonstrate that the proposed predictive method attains considerable accuracy improvements relative to independent or pooled approaches on unique MSA data sets.","PeriodicalId":72770,"journal":{"name":"Data science in science","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-07-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46893809","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Data Science in Science: Special Issue on Data Science in Environmental and Climate Sciences","authors":"Marina Friedrich, E. Mahieu, Stephan Smeekes, Jakob Raymaekers, I. Wilms, D. Matteson","doi":"10.1080/26941899.2022.2081002","DOIUrl":"https://doi.org/10.1080/26941899.2022.2081002","url":null,"abstract":"","PeriodicalId":72770,"journal":{"name":"Data science in science","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41456847","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Data Science in Science: A New Journal with a Radically Collaborative Mission","authors":"D. Matteson","doi":"10.1080/26941899.2022.2043137","DOIUrl":"https://doi.org/10.1080/26941899.2022.2043137","url":null,"abstract":"","PeriodicalId":72770,"journal":{"name":"Data science in science","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-04-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41370299","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Comparison of CYGNSS and Jason-3 Wind Speed Measurements via Gaussian Processes","authors":"William Bekerman, J. Guinness","doi":"10.1080/26941899.2023.2194349","DOIUrl":"https://doi.org/10.1080/26941899.2023.2194349","url":null,"abstract":"Wind is a critical component of the Earth system and has unmistakable impacts on everyday life. The CYGNSS satellite mission improves observational coverage of ocean winds via a fleet of eight micro-satellites that use reflected GNSS signals to infer surface wind speed. We present analyses characterizing variability in wind speed measurements among the eight CYGNSS satellites and between antennas. In particular, we use a carefully constructed Gaussian process model that leverages comparisons between CYGNSS and Jason-3 during a one-year period from September 2019 to September 2020. The CYGNSS sensors exhibit a range of biases, most of them between -1.0 m/s and +0.2 m/s with respect to Jason-3, indicating that some CYGNSS sensors are biased with respect to one another and with respect to Jason-3. The biases between the starboard and port antennas within a CYGNSS satellite are smaller. Our results are consistent with, yet sharper than, a more traditional paired comparison analysis. We also explore the possibility that the bias depends on wind speed, finding some evidence that CYGNSS satellites have positive biases with respect to Jason-3 at low wind speeds. However, we argue that there are subtle issues associated with estimating wind speed-dependent biases, so additional careful statistical modeling and analysis is warranted.","PeriodicalId":72770,"journal":{"name":"Data science in science","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-03-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43811000","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}