The Corporation Lawsuit Prediction based on Guiding Learning and Collaborative Filtering Recommendation

2019 IEEE International Conference on Intelligence and Security Informatics (ISI). Pub Date: 2019-07-01. DOI: 10.1109/ISI.2019.8823537

Zhenyu Wu, Guangda Chen, Jingjing Yao


Citations: 1

Abstract

Using data mining to predict the types of lawsuits a company may face can help enterprises avoid litigation risk. We therefore propose a corporation lawsuit prediction algorithm based on guiding learning and collaborative filtering recommendation. First, we use the adaptive synthetic sampling approach (ADASYN) to generate more synthetic data for the different minority classes according to how difficult each is to learn, so that training focuses on the minority classes that are difficult to learn and the learning bias introduced by the imbalanced data distribution is reduced. Second, insufficient samples make it difficult for the model to learn enough knowledge, which leads to large fluctuations in the final scores during training and poor model stability. To address this, we use guiding learning to integrate basic knowledge of all the lawsuit types a company may receive in the future, obtained from a multi-label classification model, into the training of the TOP-1 and TOP-2 predictive models. Finally, to further improve prediction accuracy, we use the collaborative filtering recommendation algorithm (CFRA) to select, for each test sample, the most similar sample in the training set, and the lawsuit type of the selected sample is used directly as the predicted lawsuit type of that test sample. The experimental results show that the proposed algorithm can effectively predict the two most probable lawsuit types (TOP-2) for corporations.
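The final step described in the abstract, assigning each test sample the lawsuit type of its most similar training sample, can be illustrated with a minimal sketch. The abstract does not say which similarity measure or feature representation the authors use, so cosine similarity, the function names, the toy feature vectors, and the lawsuit-type labels below are all assumptions made for illustration only, not the paper's implementation.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict_by_most_similar(
    train_features: List[Sequence[float]],
    train_labels: List[str],
    test_features: List[Sequence[float]],
) -> List[str]:
    """For each test sample, copy the label of the most similar training sample.

    Assumption: similarity is cosine similarity; the paper may use another measure.
    """
    predictions = []
    for test_vec in test_features:
        # Index of the training sample with the highest similarity to test_vec.
        best_index = max(
            range(len(train_features)),
            key=lambda i: cosine_similarity(train_features[i], test_vec),
        )
        predictions.append(train_labels[best_index])
    return predictions


# Hypothetical toy data: three companies with made-up feature vectors and labels.
train_features = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
train_labels = ["contract dispute", "labor dispute", "IP infringement"]
test_features = [[0.9, 0.1, 0.0], [0.0, 0.2, 0.8]]

predictions = predict_by_most_similar(train_features, train_labels, test_features)
print(predictions)
```

In this toy setup each test vector lies closest to one training vector, so it simply inherits that sample's label. How this step is combined with the TOP-1 and TOP-2 model outputs is not specified in the abstract and is not modeled here.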




Source journal

2019 IEEE International Conference on Intelligence and Security Informatics (ISI)

Self-citation rate

0.00%

Publication count