Lawsuit category prediction based on machine learning
Yuru Xu, Mingming Zhang, Shaowu Wu, Junfeng Hu
2019 IEEE International Conference on Intelligence and Security Informatics (ISI), 2019-07-01
DOI: 10.1109/ISI.2019.8823328 (https://doi.org/10.1109/ISI.2019.8823328)
Citations: 0
Abstract
Using comprehensive company information, this paper extracts and mines 612 features and builds two models for predicting lawsuit categories. The first is a combinatorial prediction model, which recasts the classification problem as single-category regression problems, one per category. After Laplace smoothing is applied to the training labels, a LightGBM model is trained with 5-fold cross-validation for each category. The final combined model reaches a Top 1 accuracy of 40.868% and a Top 2 accuracy of 21.826%. The second is an artificial neural network (ANN) model, which treats the task directly as a classification problem. A five-layer ANN is used to classify and predict lawsuit categories; its Top 1 accuracy is 40.803% and its Top 2 accuracy is 21.243%. Although the accuracy is not ideal, the method is feasible and can serve as a reference. Finally, the paper analyzes the misclassified lawsuit categories in detail.
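As a rough illustration of the first model's label handling and ranking, the sketch below smooths one-hot labels into per-category regression targets and then ranks per-category scores to obtain first and second predictions. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not give the smoothing formula, so standard additive smoothing with a hypothetical parameter `alpha` is assumed; the per-category scores are made-up numbers standing in for the outputs of the per-category LightGBM regressors; and reading "Top 2 accuracy" as the hit rate of the second-ranked prediction is only one possible interpretation. All function names are hypothetical.

```python
from typing import List, Sequence


def smooth_one_hot(label: int, num_classes: int, alpha: float = 1.0) -> List[float]:
    """Additive (Laplace) smoothing of a one-hot label (assumed formula).

    Each entry becomes (y_k + alpha) / (1 + alpha * num_classes), so every
    target is positive and the targets still sum to 1.
    """
    denom = 1.0 + alpha * num_classes
    return [((1.0 if k == label else 0.0) + alpha) / denom
            for k in range(num_classes)]


def top_k(scores: Sequence[float], k: int) -> List[int]:
    """Indices of the k highest scores, best first (ties go to the lower index)."""
    order = sorted(range(len(scores)), key=lambda i: (-scores[i], i))
    return order[:k]


def rank_hit_rate(all_scores: Sequence[Sequence[float]],
                  labels: Sequence[int],
                  rank: int) -> float:
    """Fraction of samples whose true label sits exactly at 1-based position `rank`."""
    hits = sum(1 for scores, y in zip(all_scores, labels)
               if top_k(scores, rank)[rank - 1] == y)
    return hits / len(labels)


# Regression targets for a sample of category 1 out of 3 (alpha is hypothetical).
targets = smooth_one_hot(label=1, num_classes=3, alpha=0.1)

# Hypothetical per-category scores for four samples; in the setup described
# above, each column would come from one per-category regressor.
scores = [
    [0.1, 0.7, 0.2],
    [0.5, 0.3, 0.2],
    [0.2, 0.3, 0.5],
    [0.6, 0.1, 0.3],
]
labels = [1, 1, 2, 2]

first_rate = rank_hit_rate(scores, labels, rank=1)
second_rate = rank_hit_rate(scores, labels, rank=2)
```

Combining the per-category regressors by ranking their scores is what turns the separate regression outputs back into a category prediction; the smoothing keeps the regression targets away from exact 0 and 1.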