{"title":"FinSBD-2021: The 3rd Shared Task on Structure Boundary Detection in Unstructured Text in the Financial Domain","authors":"Willy Au, Abderrahim Ait-Azzi, Juyeon Kang","doi":"10.1145/3442442.3451378","DOIUrl":null,"url":null,"abstract":"Document processing is a foundational pre-processing task in natural language application applied in the financial domain. In this paper, we present the result of FinSBD-3, the 3rd shared task on Structure Boundary Detection in unstructured text in the financial domain. The shared task is organized as part of the 1st Workshop on Financial Technology on the Web. Participants were asked to create system detecting the boundaries of elements in unstructured text extracted from financial PDF. This edition extends the previous shared tasks by adding boundaries of visual elements such as tables, figures, page headers and page footers; on top of sentences, lists and list items which were already present in previous edition of the shared tasks.","PeriodicalId":129420,"journal":{"name":"Companion Proceedings of the Web Conference 2021","volume":"21 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Companion Proceedings of the Web Conference 2021","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3442442.3451378","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 4
Abstract
Document processing is a foundational pre-processing task in natural language application applied in the financial domain. In this paper, we present the result of FinSBD-3, the 3rd shared task on Structure Boundary Detection in unstructured text in the financial domain. The shared task is organized as part of the 1st Workshop on Financial Technology on the Web. Participants were asked to create system detecting the boundaries of elements in unstructured text extracted from financial PDF. This edition extends the previous shared tasks by adding boundaries of visual elements such as tables, figures, page headers and page footers; on top of sentences, lists and list items which were already present in previous edition of the shared tasks.