M. Garcia-Constantino, Katie Atkinson, Danushka Bollegala, Karl Chapman, Frans Coenen, Claire Roberts, Katy Robson
Proceedings of the 16th edition of the International Conference on Artificial Intelligence and Law, 2017-06-12. DOI: 10.1145/3086512.3086520 (https://doi.org/10.1145/3086512.3086520)
CLIEL: context-based information extraction from commercial law documents
The effectiveness of document Information Extraction (IE) is greatly affected by the structure and layout of the documents being considered. In the case of legal documents relating to commercial law, an additional challenge is the wide variety of formats, structures and layouts used. In this paper, we present CLIEL (Commercial Law Information Extraction based on Layout), a flexible and scalable IE environment for commercial law documentation that allows layout rules to be derived and then utilised to support IE. The proposed CLIEL environment uses NLP (Natural Language Processing) techniques, JAPE (Java Annotation Patterns Engine) rules and some GATE (General Architecture for Text Engineering) modules. The system is fully described and evaluated using a commercial law document corpus. The results demonstrate that considering the layout is beneficial for extracting data point instances from legal document collections.
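To make the idea of a layout rule concrete, the following is a minimal Python sketch and not the CLIEL implementation, which the abstract describes as built on JAPE rules and GATE modules. It assumes two hypothetical layout conventions (numbered clause headings, and "Label: value" lines beneath them); the regular expressions, the function name extract_data_points and the sample text are all invented for illustration.

```python
import re

# Hypothetical layout rule: a numbered clause heading such as "2. Term".
CLAUSE_HEADING = re.compile(r"^(?P<number>\d+)\.\s+(?P<title>[A-Z][A-Za-z ]+)$")

# Hypothetical layout rule: a label at the start of a line, a colon,
# and the data point value on the rest of the line.
LABELLED_LINE = re.compile(r"^(?P<label>[A-Z][A-Za-z ]+):\s*(?P<value>.+)$")


def extract_data_points(text):
    """Return (section, label, value) triples found by the two layout rules."""
    section = None  # title of the most recent clause heading, if any
    found = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        heading = CLAUSE_HEADING.match(line)
        if heading:
            # The heading sets the context for the labelled lines that follow.
            section = heading.group("title")
            continue
        labelled = LABELLED_LINE.match(line)
        if labelled:
            found.append((section, labelled.group("label"), labelled.group("value")))
    return found


# Made-up sample text, not taken from the paper's corpus.
SAMPLE = """1. Parties
Landlord: Example Holdings Ltd
Tenant: Sample Trading Ltd
2. Term
Commencement Date: 1 January 2017
"""

if __name__ == "__main__":
    for triple in extract_data_points(SAMPLE):
        print(triple)
```

The sketch uses the position of a line relative to a heading as context for the extracted value, which is the general sense in which layout can support IE; real commercial law documents would need many more rules than these two.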