Anais do X Symposium on Knowledge Discovery, Mining and Learning (KDMiLe 2022): Latest Articles, Page 2

What is the Profile of American Inmate Misconduct Perpetrators? A Machine Learning Analysis

Anais do X Symposium on Knowledge Discovery, Mining and Learning (KDMiLe 2022) Pub Date : 2022-11-28 DOI: 10.5753/kdmile.2022.227777

F. M. de Oliveira, M. Balbino, Luis E. Zárate, C. Nobre

Abstract: Correctional institutions often develop rehabilitation programs to reduce the likelihood of inmates committing internal offenses and of criminal recidivism after release. It is therefore necessary to identify the profile of each offender, both to indicate an appropriate rehabilitation program and to determine the level of internal security to which he must be submitted. In this context, this work aims to discover, using Machine Learning methods and the SHAP approach, which characteristics are most significant in predicting misconduct by prisoners. For this purpose, the study used a database produced in 2004 through the Survey of Inmates in State and Federal Correctional Facilities in the United States of America, which provides nationally representative data on prisoners in state and federal facilities. The predictive model based on Random Forest had the best performance; therefore, SHAP was applied to it to interpret the results. The attributes related to the type of crime committed, age at first arrest, drug use, mental or emotional health problems, having children, and being abused before arrest were the most relevant in predicting internal misconduct. Thus, this work is expected to contribute to the prior classification of inmates and to the timely use of programs and practices that aim to improve the lives of offenders and their reintegration into society and, consequently, to reduce criminal recidivism.

Citations: 0
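
The SHAP approach named in the abstract above builds on Shapley values, which credit each attribute with its average marginal contribution to a prediction. The sketch below computes exact Shapley values for a toy scoring function by enumerating every ordering of the attributes. It is only an illustration of the underlying idea: it is not the authors' pipeline and does not use the `shap` library or a Random Forest, and the attribute names and the `toy_risk` function are hypothetical.

```python
from itertools import permutations


def shapley_values(features, value):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        coalition = frozenset()
        for f in order:
            with_f = coalition | {f}
            totals[f] += value(with_f) - value(coalition)
            coalition = with_f
    return {f: t / len(orderings) for f, t in totals.items()}


def toy_risk(coalition):
    """Hypothetical score given the set of attributes that are 'switched on'."""
    score = 0.0
    if "drug_use" in coalition:
        score += 0.2
    if "age_first_arrest" in coalition:
        score += 0.1
    if {"drug_use", "age_first_arrest"} <= coalition:
        score += 0.1  # interaction term, shared equally by the two attributes
    return score


features = ["drug_use", "age_first_arrest", "has_children"]
phi = shapley_values(features, toy_risk)
for name in features:
    print(name, round(phi[name], 3))
```

In this toy setup the attributions sum to the score of the full attribute set, and an attribute that never changes the score receives zero. Exact enumeration grows factorially with the number of attributes, which is why practical tools rely on approximations or model-specific algorithms.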

Tractable Classification with Non-Ignorable Missing Data Using Generative Random Forests

Anais do X Symposium on Knowledge Discovery, Mining and Learning (KDMiLe 2022) Pub Date : 2022-11-28 DOI: 10.5753/kdmile.2022.227969

Julissa Villanueva, D. Mauá

Citations: 1

Fault Location in Transmission Lines based on LSTM Model

Anais do X Symposium on Knowledge Discovery, Mining and Learning (KDMiLe 2022) Pub Date : 2022-11-28 DOI: 10.5753/kdmile.2022.227805

L. A. Ensina, P. E. M. Karvat, E. C. de Almeida, L. E. S. de Oliveira

Abstract: Transmission lines are fundamental components of the electric power system and demand special attention from the protection system because of their vulnerability. This paper presents a method for fault location in transmission lines that uses data from a single terminal and does not require explicit feature engineering by a domain expert. The fault location task provides an approximate position of the point on the line where the failure occurred, which operators use to dispatch maintenance staff to that location and reclose the transmission line with better reliability and safety. In our method, we extract two post-fault cycles of the three-phase current and voltage signals to serve as input to a model based on the LSTM algorithm. We defined the model's architecture through empirical experiments, searching for the best structure to estimate the fault distance. For this purpose, we used a dataset with diversified failure events that is also available to the scientific community. The results demonstrate the effectiveness of the proposed method, with a mean error of 0.1309 km ± 0.4897 km, representing 0.0316% ± 0.1183% of the transmission line extension.

Citations: 1
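
The abstract above reports the error both in kilometres and as a percentage of the line extension, so the two figures can be cross-checked with simple arithmetic. The sketch below derives the line length implied by each pair of numbers. The function name is an assumption made for illustration, and the resulting length is an inference from the reported figures, not a value stated in the abstract.

```python
def implied_line_length_km(error_km, error_percent):
    """Line length for which error_km equals error_percent of the line."""
    return error_km / (error_percent / 100.0)


# Figures taken from the abstract: mean error and its spread.
from_mean = implied_line_length_km(0.1309, 0.0316)
from_spread = implied_line_length_km(0.4897, 0.1183)

print(f"implied by mean error:   {from_mean:.1f} km")
print(f"implied by error spread: {from_spread:.1f} km")
```

Both pairs point to a line of roughly 414 km, so the absolute and relative error figures are mutually consistent.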

A Textual Representation Based on Bag-of-Concepts and Thesaurus for Legal Information Retrieval

Anais do X Symposium on Knowledge Discovery, Mining and Learning (KDMiLe 2022) Pub Date : 2022-11-28 DOI: 10.5753/kdmile.2022.227779

Wagner M. Costa, G. V. Pedrosa

Abstract: The retrieval of similar textual documents is a challenging task in the legal area because of its peculiar language with unique characteristics. This paper presents a new approach, called BoC-Th, for representing legal documents. It is based on the Bag-of-Concepts (BoC) approach, which generates concepts by clustering word vectors produced by a basic neural network model and computes the frequencies of these concept clusters to represent document vectors. The novel contribution of BoC-Th is to generate weighted histograms of concepts, with weights defined from the distance of each word to its respective similar term within a thesaurus. The idea is to emphasize the words that have more significance for the context, thus generating more discriminative vectors. Experimental evaluations compared the proposed approach with the traditional BoW and BoC approaches, both popular techniques for document representation. The proposed method obtained the best result among the evaluated techniques for retrieving judgments and jurisprudence documents. BoC-Th increased the mAP (mean Average Precision) by 51% compared to the traditional BoC approach, while being up to 3.4 times faster than the traditional BoW representation.

Citations: 0
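
The difference between a plain Bag-of-Concepts histogram and the weighted histogram described in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the BoC-Th implementation: the word-to-concept mapping stands in for the clustering of word vectors, and the per-word weights are hypothetical inputs, since the abstract does not give the formula that turns thesaurus distance into a weight.

```python
from collections import defaultdict


def concept_histogram(tokens, word_to_concept, word_weight=None):
    """Sum one weight per token into its concept; no weights gives plain BoC counts."""
    hist = defaultdict(float)
    for tok in tokens:
        concept = word_to_concept.get(tok)
        if concept is None:
            continue  # token is not assigned to any concept cluster
        hist[concept] += 1.0 if word_weight is None else word_weight.get(tok, 1.0)
    return dict(hist)


# Hypothetical clustering result: concept 0 = appeals, concept 1 = courts.
word_to_concept = {"appeal": 0, "recourse": 0, "court": 1, "tribunal": 1}
# Hypothetical weights; in BoC-Th these would come from a thesaurus.
word_weight = {"appeal": 1.0, "recourse": 0.5, "court": 1.0, "tribunal": 0.5}

tokens = ["appeal", "recourse", "court", "the", "tribunal", "court"]
plain = concept_histogram(tokens, word_to_concept)
weighted = concept_histogram(tokens, word_to_concept, word_weight)
print(plain)
print(weighted)
```

With these toy inputs the weighted histogram shrinks the contribution of the down-weighted words, while the tokens outside every cluster are ignored in both variants.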

Forgetting on Evolving Graphs for Accurate and Diverse Stream-Based Recommendation

Anais do X Symposium on Knowledge Discovery, Mining and Learning (KDMiLe 2022) Pub Date : 2022-11-28 DOI: 10.5753/kdmile.2022.227804

Murilo F. L. Schmitt, E. Spinosa

Citations: 0

Market Movement Prediction Algorithm Selection by Metalearning

Anais do X Symposium on Knowledge Discovery, Mining and Learning (KDMiLe 2022) Pub Date : 2022-11-28 DOI: 10.5753/kdmile.2022.227947

A. V. P. M. Bandeira, G. M. Ferracioli, M. R. dos Santos, A. C. P. L. F. de Carvalho

Citations: 0

Automatic identification of similar judicial precedents

Anais do X Symposium on Knowledge Discovery, Mining and Learning (KDMiLe 2022) Pub Date : 2022-11-28 DOI: 10.5753/kdmile.2022.227943

Igor Stemler, M. Ladeira, T. P. Faleiros

Abstract: The Brazilian Code of Civil Procedure was reformulated in 2015 and created new institutes of judicial precedents that allow the Courts of Appeal to decide similar cases based on one main case, which is considered the paradigm for the similar cases that remain suspended. This mechanism aims to avoid legal uncertainty in the lower courts, but the uncertainty can be carried over to the Courts of Appeal, since different courts can judge similar legal matters in opposite ways. Identifying similar judicial cases is hard because the Courts of Appeal work independently and the number of cases is high. We propose the use of computational intelligence techniques to automatically identify similar judicial precedents. Our hypothesis is that algorithms based on semantic approaches, such as Latent Semantic Indexing and Latent Dirichlet Allocation, perform better than those that use only a syntactic approach, such as the (Okapi) BM25 ranking function. The best-performing model is extended with named entities to verify whether its performance increases. The performance of the models is evaluated using similarity metrics and with the assistance of a specialist. We test this approach on the database of judicial precedents of the National Council of Justice. Our approach correctly grouped more than 90% of judicial precedents and found similar precedents with divergent decisions, as well as precedents that should be suspended because appeals on the same subject matter exist in superior courts. Models based on the syntactic approach presented the best results, as they required lower computational cost and less parameter tuning than the others.

Citations: 0
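
The (Okapi) BM25 ranking function mentioned in the abstract above scores a document by combining term frequency, inverse document frequency, and document-length normalization. The sketch below implements one common variant of the formula in plain Python. The parameter defaults `k1=1.5` and `b=0.75`, the smoothed IDF, and the three toy documents are assumptions chosen for illustration, not settings reported by the authors.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document against a tokenized query with Okapi BM25."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Number of documents that contain each term at least once.
    doc_freq = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Smoothed IDF variant that stays non-negative.
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1.0)
            norm = tf[term] + k1 * (1.0 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1.0) / norm
        scores.append(score)
    return scores


# Hypothetical, already tokenized case summaries.
docs = [
    ["tax", "appeal", "suspended", "tax"],
    ["labor", "appeal", "granted"],
    ["contract", "dispute", "settled"],
]
scores = bm25_scores(["tax", "appeal"], docs)
ranking = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
print(ranking)
```

Because BM25 depends only on term overlap, it needs no training and has just two tuning parameters, which fits the abstract's remark about lower computational cost and less parameter tuning.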

Genetic Programming-based AutoML for EEG Signal Classification - A Comparative Study

Anais do X Symposium on Knowledge Discovery, Mining and Learning (KDMiLe 2022) Pub Date : 2022-11-28 DOI: 10.5753/kdmile.2022.227815

I. M. Miranda, C. Aranha, A. P. L. de Carvalho, L. P. F. Garcia

Abstract: End-to-end Machine Learning (ML) applications using complex data often need to investigate several alternatives for the data modeling pipeline before a good solution is found. This process, which is time-consuming and subjective, can benefit from automated solution design through Automated Machine Learning (AutoML). End-to-end AutoML allows automated data preparation, modeling, and evaluation of ML pipelines, increasing the chances of arriving at a good solution. AutoML can implement this optimization with different strategies. Among them, Genetic Programming (GP) stands out for its ability to create pipelines of arbitrary format, allowing high interpretability and the customization of information from the data context. This paper proposes and compares two approaches to end-to-end AutoML optimized with GP for a time series classification problem, the classification of Electroencephalogram (EEG) signals. We selected this dataset because of the signals' high complexity, spatial and temporal covariance, and nonstationarity. For the AutoML experiments, four different domain-based data characterization measures are evaluated. The analysis of these measures shows that using only spectral or time-domain features does not lead to pipelines with good predictive performance. Our experimental results also reveal how AutoML can generate more accurate and interpretable solutions than the complex and ad hoc models in the literature. The proposed approach makes it easier to analyze dimensionality reduction through fitness convergence, tree depth, and extracted features.

Citations: 0

New State-of-the-Art for Question Answering on Portuguese SQuAD v1.1

Anais do X Symposium on Knowledge Discovery, Mining and Learning (KDMiLe 2022) Pub Date : 2022-11-28 DOI: 10.5753/kdmile.2022.227787

E. H. M. Da Silva, J. Laterza, T. P. Faleiros

Citations: 0

One-Class Recommendation through Unsupervised Graph Neural Networks for Link Prediction

Anais do X Symposium on Knowledge Discovery, Mining and Learning (KDMiLe 2022) Pub Date : 2022-11-28 DOI: 10.5753/kdmile.2022.227810

M. Gôlo, Leonardo G. Moraes, R. Goularte, R. Marcacini

Abstract: Recommender systems play a key role in every online platform by providing users with a better experience. Many classic recommendation approaches may face difficulties, mainly in modeling user relations. Graphs can naturally model these relations, since we can connect users who interact with items. On the other hand, when we model user-item relations through graphs, we do not have interactions between all users and items. In addition, there are few non-recommendation interactions, which makes it challenging to cover this scope. Also, the scope of what will not be recommended to the user is greater than the scope of what will be recommended. An alternative is One-Class Learning (OCL), which can decide whether or not to recommend an item to a user while training only on recommendations, mitigating the need to cover the scope of non-recommendations. However, OCL and recommender systems need appropriate, adequate, and robust representations to perform the recommendations in the best possible way. Therefore, we propose one-class recommendation via representations learned by unsupervised graph neural networks (GNNs) for link prediction, to generate a more robust and meaningful representation of users and items. In the results, our GNNs for link prediction outperform other methods for representing the users and items in one-class recommendation. Furthermore, our proposal also outperforms a GNN for link prediction. Thus, our proposal recommended better and learned more robust representations.

Citations: 1