T. Petersen, M. A. Suryani, C. Beth, Hardik Patel, K. Wallmann, M. Renz
Geo-Quantities: A Framework for Automatic Extraction of Measurements and Spatial Context from Scientific Documents
17th International Symposium on Spatial and Temporal Databases, 2021-08-18
DOI: 10.1145/3469830.3470911 (https://doi.org/10.1145/3469830.3470911)
Citations: 2
Abstract
Quantitative information derived from scientific documents is an important source of data for studies in almost all domains; however, manual extraction of this information is very time-consuming. In this paper, we introduce Geo-Quantities, a system that supports the automatic extraction of quantitative, spatial, and temporal information about a given measurement entity from scientific literature using text mining techniques. Automatic measurement recognition is difficult mainly because measurements are expressed in diverse ways across papers. Geo-Quantities offers an interactive interface for visualizing the extracted user-defined information, in particular its spatial and temporal context. In our demonstration, we showcase the capabilities of our system by retrieving measurements such as “mass accumulation rates” and “sedimentation rates” from scientific publications in the field of marine geology, which could have a high impact on studies that build global mass accumulation rate maps. For training and evaluation of Geo-Quantities, we use a corpus of domain-relevant papers.
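The abstract does not describe the extraction method itself. As a rough illustration of the task only, and not of the Geo-Quantities implementation, the following minimal Python sketch pairs a measurement entity with numeric values and units found in the same sentence. The entity list, the unit pattern, the function name `extract_measurements`, and the sample text are all assumptions made for this example.

```python
import re

# Measurement entities to look for (the two examples named in the abstract).
ENTITIES = ("mass accumulation rate", "sedimentation rate")

# Hypothetical quantity pattern: a decimal number followed by a slash-separated
# unit such as "cm/kyr" or "g/cm2/kyr". Real papers use many more notations.
QUANTITY = re.compile(
    r"(?P<value>\d+(?:\.\d+)?)\s*(?P<unit>(?:g|mg|cm|mm)(?:/[A-Za-z0-9]+)+)"
)


def extract_measurements(text):
    """Return entity/value/unit records for sentences that mention an entity."""
    results = []
    # Naive sentence split on whitespace that follows ., ! or ?
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        lowered = sentence.lower()
        for entity in ENTITIES:
            if entity not in lowered:
                continue
            for match in QUANTITY.finditer(sentence):
                results.append(
                    {
                        "entity": entity,
                        "value": float(match.group("value")),
                        "unit": match.group("unit"),
                        "sentence": sentence,
                    }
                )
    return results


# Invented sample text, not taken from any real publication.
sample = (
    "The sedimentation rate at the site was 2.5 cm/kyr. "
    "Mass accumulation rates reached 0.8 g/cm2/kyr in the upper unit. "
    "The core was 12 m long."
)

for record in extract_measurements(sample):
    print(record["entity"], record["value"], record["unit"])
```

A fixed pattern like this only covers one notation, which is the kind of expression diversity the abstract identifies as the main difficulty. It also ignores the spatial and temporal context that the system is described as extracting.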