Knowledge discovery in an earthquake text database: correlation between significant earthquakes and the time of day
J. Goldman, D. S. Parker, W. Chu
Proceedings. Ninth International Conference on Scientific and Statistical Database Management (Cat. No.97TB100150), 1997-08-11
DOI: https://doi.org/10.1109/SSDM.1997.621144
Citations: 4
Abstract
The authors take a real-world application from a text database and present a case history. The techniques ultimately led to a discovery contradicting an accepted paradigm in seismology. Using simple, tailored keyword extraction, they examined a text collection of earthquake data. A discovery was made when an unusual pattern emerged from the text. They then tested a more comprehensive numerical database, treating the text discovery as a hypothesis. It was verified using a standard χ² statistic. The hypothesis was that significant earthquakes in the longitude regions that include California occur more often in the morning hours than at any other time of day.
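The kind of check the abstract describes can be illustrated with a Pearson χ² goodness-of-fit test against a uniform time-of-day distribution. The sketch below is a minimal illustration, not the authors' procedure: the six-hour bins, the counts, and the helper name `chi_square_uniform` are all hypothetical, and the abstract does not say how the hours were binned or which significance level was used.

```python
def chi_square_uniform(observed):
    """Pearson chi-square statistic against a uniform expectation."""
    total = sum(observed)
    expected = total / len(observed)  # same expected count in every bin
    return sum((o - expected) ** 2 / expected for o in observed)


# Hypothetical counts of significant earthquakes per six-hour local-time bin.
# These numbers are illustrative only; they are NOT data from the paper.
counts = {"00-06": 30, "06-12": 50, "12-18": 20, "18-24": 20}

statistic = chi_square_uniform(list(counts.values()))

# Critical value of the chi-square distribution for 3 degrees of freedom
# (4 bins - 1) at a significance level of 0.05.
CRITICAL_0_05_DF3 = 7.815

reject_uniform = statistic > CRITICAL_0_05_DF3
print(f"chi2 = {statistic:.2f}, reject uniform: {reject_uniform}")
```

If the statistic exceeds the critical value, the assumption that earthquakes are spread evenly across the day is rejected at that level. Which bin drives the excess still has to be read from the per-bin counts.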