Zara Nasar, Shahmin Sharafat, Muhammad Azhar, S. W. Jaffry
Title: Civil Data Mining using Machine Learning
Venue: 2022 International Conference on Engineering and Emerging Technologies (ICEET)
Published: 2022-10-27
DOI: 10.1109/ICEET56468.2022.10007237 (https://doi.org/10.1109/ICEET56468.2022.10007237)
Abstract
Ever-growing digitalization and the availability of massive data have transformed many fields, and modern AI techniques are now routinely used to process this data automatically. Legal data is growing as well, since new verdicts are passed every day. These verdicts and case records are the primary source of information for judges and lawyers, which leaves considerable room for research that better serves legal stakeholders and the public. This study addresses automatic information mining from such data: Information Extraction is applied to extract potential entities from five hundred reported civil judgments of the Lahore High Court, Pakistan. The experiments use a variety of algorithms, including statistical sequence labeling techniques (Hidden Markov Models, Maximum Entropy Models, and Conditional Random Fields (CRF)) as well as state-of-the-art deep learning systems (hybrid deep architectures and transformers). In addition, experiments are carried out using two widely used annotation schemes. The experiments achieve an F1 score of more than 95 percent without using domain-specific features.
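To make the sequence labeling setup and the reported F1 metric concrete, the following is a minimal Python sketch, not the authors' implementation. The abstract does not name the two annotation schemes or say whether F1 is computed per token or per entity, so this sketch assumes the common BIO scheme and exact-match entity-level F1. The entity labels `JUDGE` and `COURT`, the function names, and the example sequences are hypothetical.

```python
from typing import List, Optional, Set, Tuple

# (entity type, start index, end index exclusive)
Span = Tuple[str, int, int]


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from a BIO-tagged sequence (assumed scheme).

    An I- tag that does not continue an open span of the same type is
    treated like O, which is one of several possible conventions.
    """
    spans: Set[Span] = set()
    start: Optional[int] = None
    label: Optional[str] = None
    # A trailing "O" sentinel closes any span still open at the end.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and label == tag[2:]
        if continues:
            continue
        if label is not None and start is not None:
            spans.add((label, start, i))
        if tag.startswith("B-"):
            start, label = i, tag[2:]
        else:
            start, label = None, None
    return spans


def entity_f1(gold: List[str], pred: List[str]) -> float:
    """Exact-match entity-level F1 between gold and predicted tags."""
    gold_spans = bio_to_spans(gold)
    pred_spans = bio_to_spans(pred)
    true_positives = len(gold_spans & pred_spans)
    precision = true_positives / len(pred_spans) if pred_spans else 0.0
    recall = true_positives / len(gold_spans) if gold_spans else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical tags for a six-token sentence.
    gold = ["B-JUDGE", "I-JUDGE", "O", "B-COURT", "I-COURT", "I-COURT"]
    pred = ["B-JUDGE", "I-JUDGE", "O", "B-COURT", "I-COURT", "O"]
    print(sorted(bio_to_spans(gold)))
    print(entity_f1(gold, pred))
```

In this example the predicted `COURT` span stops one token early, so only one of the two entities matches exactly. Token-level scoring would be more lenient, which is why it matters which variant a reported F1 refers to.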