TXED: The Texas Earthquake Dataset for AI
Yangkang Chen, Alexandras Savvaidis, O.M. Saad, Guo-Chin Dino Huang, Daniel Siervo, Vincent O’Sullivan, Cooper McCabe, Bede Uku, Preston Fleck, Grace Burke, Natalie L. Alvarez, Jessica Domino, I. Grigoratos
Seismological Research Letters, published 2024-01-24. DOI: https://doi.org/10.1785/0220230327
Abstract
Machine-learning (ML) seismology relies on large datasets with high-fidelity, human-generated labels to train generalized models. Among the seismological applications of ML, earthquake detection and P- and S-wave arrival picking are the most widely studied, with capabilities that can exceed those of human analysts. Here, we present TXED, a regional artificial intelligence (AI) earthquake dataset compiled for the state of Texas. TXED is composed of earthquake signals with manually picked P- and S-wave arrival times, together with manually picked noise waveforms, corresponding to more than 20,000 earthquake events spanning from the beginning of the Texas seismological network (TexNet) (1 January 2017) to date. These data supplement the existing worldwide open-access seismological AI datasets and represent the signal and noise characteristics of Texas. Direct applications of TXED include improving the performance of a global picking model in Texas through transfer learning on the new dataset. TXED will also serve as a benchmark for fundamental AI research, such as designing seismology-oriented deep-learning architectures. We plan to continue expanding TXED as TexNet analysts make more observations.