Anais do X Symposium on Knowledge Discovery, Mining and Learning (KDMiLe 2022): Latest Articles

What is the Profile of American Inmate Misconduct Perpetrators? A Machine Learning Analysis
Anais do X Symposium on Knowledge Discovery, Mining and Learning (KDMiLe 2022) Pub Date : 2022-11-28 DOI: 10.5753/kdmile.2022.227777
F. M. de Oliveira, M. Balbino, Luis E. Zárate, C. Nobre
Abstract: Correctional institutions often develop rehabilitation programs to reduce the likelihood of inmates committing internal offenses and of criminal recidivism after release. It is therefore necessary to identify the profile of each offender, both to indicate an appropriate rehabilitation program and to determine the level of internal security to which the offender must be subjected. In this context, this work aims to discover, using Machine Learning methods and the SHAP approach, which characteristics are most significant in predicting misconduct by prisoners. For this, a database produced in 2004 through the Survey of Inmates in State and Federal Correctional Facilities in the United States of America, which provides nationally representative data on prisoners in state and federal facilities, was used. The predictive model based on Random Forest had the best performance; therefore, SHAP was applied to it to interpret the results. The attributes related to the type of crime committed, age at first arrest, drug use, mental or emotional health problems, having children, and being abused before arrest were the most relevant in predicting internal misconduct. This work is thus expected to contribute to the prior classification of inmates and to the timely use of programs and practices that aim to improve the lives of offenders, their reintegration into society, and, consequently, the reduction of criminal recidivism.
Citations: 0
Tractable Classification with Non-Ignorable Missing Data Using Generative Random Forests
Anais do X Symposium on Knowledge Discovery, Mining and Learning (KDMiLe 2022) Pub Date : 2022-11-28 DOI: 10.5753/kdmile.2022.227969
Julissa Villanueva, D. Mauá
Abstract: Missing data is abundant in predictive tasks. Typical approaches assume that the missingness process is ignorable or non-informative and handle missing data either by marginalization or heuristically. Yet data is often missing in a non-ignorable way, which introduces bias in prediction. In this paper, we develop a new method to perform tractable predictive inference under non-ignorable missing data using probabilistic circuits derived from Decision Tree Classifiers and a partially specified response model of missingness. We show empirically that our method delivers less biased (probabilistic) classifications than approaches that assume data is missing at random, and is more determinate than similar existing overcautious approaches.
Citations: 1
Fault Location in Transmission Lines based on LSTM Model
Anais do X Symposium on Knowledge Discovery, Mining and Learning (KDMiLe 2022) Pub Date : 2022-11-28 DOI: 10.5753/kdmile.2022.227805
L. A. Ensina, P. E. M. Karvat, E. C. de Almeida, L. E. S. de Oliveira
Abstract: Transmission lines are fundamental components of the electric power system, demanding special attention from the protection system due to the vulnerability of these lines. This paper presents a method for fault location in transmission lines using data from a single terminal, without requiring explicit feature engineering by a domain expert. The fault location task provides an approximate position of the point on the line where the failure occurred, giving operators the information needed to dispatch maintenance staff to this location and reclose the transmission line with better reliability and safety. In our method, we extract two post-fault cycles of the three-phase current and voltage signals to serve as input to a model based on the LSTM algorithm. We defined the model's architecture through empirical experiments, searching for the best structure to estimate the fault distance. For this purpose, we used a dataset with diversified failure events, which is also available to the scientific community. The results demonstrate the effectiveness of the proposed method, with a mean error of 0.1309 km ± 0.4897 km, representing 0.0316% ± 0.1183% of the transmission line extension.
Citations: 1
A Textual Representation Based on Bag-of-Concepts and Thesaurus for Legal Information Retrieval
Anais do X Symposium on Knowledge Discovery, Mining and Learning (KDMiLe 2022) Pub Date : 2022-11-28 DOI: 10.5753/kdmile.2022.227779
Wagner M. Costa, G. V. Pedrosa
Abstract: The retrieval of similar textual documents is a challenging task in the legal area due to its peculiar language with unique characteristics. This paper presents a new approach, called BoC-Th, proposed to represent legal documents based on the Bag-of-Concepts (BoC) approach, which generates concepts by clustering word vectors produced by a basic neural network model and computes the frequencies of these concept clusters to represent document vectors. The novel contribution of BoC-Th is to generate weighted histograms of concepts, with weights defined from the distance of each word to its respective similar term within a thesaurus. The idea is to emphasize those words that have more significance for the context, thus generating more discriminative vectors. Experimental evaluations were performed by comparing the proposed approach with the traditional BoW and BoC approaches, both popular techniques for document representation. The proposed method obtained the best result among the evaluated techniques for retrieving judgments and jurisprudence documents. BoC-Th increased the mAP (mean Average Precision) by 51% compared to the traditional BoC approach, while being up to 3.4 times faster than the traditional BoW representation.
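The weighted concept histogram described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the clustering step is assumed to have already produced a word-to-concept mapping, and the per-word weights (which the paper derives from thesaurus distances) are supplied here as a hypothetical precomputed dictionary. The names `boc_histogram`, `word_to_concept`, and `word_weight` are illustrative only.

```python
def boc_histogram(tokens, word_to_concept, word_weight=None, n_concepts=None):
    """Return a normalized histogram over concept ids for one document.

    tokens          : list of words in the document
    word_to_concept : dict word -> concept (cluster) id, assumed precomputed
    word_weight     : optional dict word -> weight (hypothetical stand-in for
                      the thesaurus-based weights); missing words weigh 1.0
    """
    if n_concepts is None:
        n_concepts = max(word_to_concept.values()) + 1
    hist = [0.0] * n_concepts
    for tok in tokens:
        concept = word_to_concept.get(tok)
        if concept is None:
            continue  # out-of-vocabulary words are skipped
        weight = 1.0 if word_weight is None else word_weight.get(tok, 1.0)
        hist[concept] += weight
    total = sum(hist)
    # Normalize so documents of different lengths are comparable.
    return [h / total for h in hist] if total > 0 else hist


# Toy data: two concepts, one word emphasized by a hypothetical weight.
word_to_concept = {"appeal": 0, "court": 0, "tax": 1, "levy": 1}
doc = ["appeal", "court", "tax"]
plain = boc_histogram(doc, word_to_concept)
weighted = boc_histogram(doc, word_to_concept, {"appeal": 2.0})
```

With unit weights this reduces to the plain BoC frequency vector; raising the weight of a word shifts mass toward its concept, which is the effect the abstract attributes to the thesaurus-based weighting.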
Citations: 0
Forgetting on Evolving Graphs for Accurate and Diverse Stream-Based Recommendation
Anais do X Symposium on Knowledge Discovery, Mining and Learning (KDMiLe 2022) Pub Date : 2022-11-28 DOI: 10.5753/kdmile.2022.227804
Murilo F. L. Schmitt, E. Spinosa
Abstract: Stream-based recommender systems are an active research field, relying on incremental algorithms that update models by incorporating new data in a single pass and discarding such data after processing. A limitation of solely including new data is the accumulation of obsolete concepts, which eventually raises accuracy and scalability concerns. In this work, we propose a gradual forgetting technique for incremental neighborhood-based methods that locally forgets items based on recency and popularity, decreasing the importance of item neighborhoods for every incoming observation to emphasize more recent and reinforced ones. The technique includes parameters to increase diversity, by retaining less popular yet relevant items, and scalability, by pruning obsolete connections not reinforced by new data. Experiments conducted by extending a recent incremental graph-based approach highlight the effectiveness of the proposed technique, as its application improved scalability and diversity, outperforming baselines.
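As a rough illustration of gradual forgetting with pruning on an item graph, the sketch below applies a simple exponential decay to one item's neighborhood on each incoming observation. This is an assumption-laden simplification, not the paper's technique: it models recency only (the popularity component and diversity parameters are omitted), and the names `observe`, `decay`, and `prune_below` are hypothetical.

```python
def observe(graph, item, neighbors, decay=0.9, prune_below=0.05):
    """Update one item's neighborhood for a single incoming observation.

    graph       : dict item -> dict neighbor -> connection weight
    neighbors   : items co-occurring with `item` in the new observation
    decay       : multiplicative forgetting factor (assumed, 0 < decay <= 1)
    prune_below : connections weaker than this are removed (assumed)
    """
    edges = graph.setdefault(item, {})
    # Local forgetting: only this item's existing connections fade.
    for n in list(edges):
        edges[n] *= decay
    # Reinforce connections supported by the new observation.
    for n in neighbors:
        edges[n] = edges.get(n, 0.0) + 1.0
    # Prune obsolete connections that were not reinforced.
    for n in [n for n, w in edges.items() if w < prune_below]:
        del edges[n]
    return graph


graph = {}
observe(graph, "a", ["b"])
observe(graph, "a", ["c"], decay=0.5, prune_below=0.6)
```

After the second call the connection to "b" has decayed to 0.5 and falls under the pruning threshold, so only the reinforced connection to "c" remains. Pruning keeps the neighborhood size bounded, which is the scalability effect the abstract refers to.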
Citations: 0
Market Movement Prediction Algorithm Selection by Metalearning
Anais do X Symposium on Knowledge Discovery, Mining and Learning (KDMiLe 2022) Pub Date : 2022-11-28 DOI: 10.5753/kdmile.2022.227947
A. V. P. M. Bandeira, G. M. Ferracioli, M. R. dos Santos, A. C. P. L. F. de Carvalho
Abstract: The prediction of market price movement is an essential tool for decision-making in trading scenarios. However, there are several candidate methods for this task. Metalearning can be an important ally for the automatic selection of methods, which here are machine learning algorithms for classification tasks, referred to as classification algorithms. In this work, we present an empirical evaluation of the application of metalearning to the selection of classification algorithms in the market movement prediction task. Different setups and metrics were evaluated for the meta-target selection. Cumulative return was the metric that achieved the best meta-level and base-level results. According to the experimental results, metalearning was a competitive selection strategy for predicting market price movement.
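For readers unfamiliar with the cumulative return metric mentioned in this abstract, a minimal sketch follows. The compounding formula is standard; the trading rule that turns up/down predictions into returns is not specified in the abstract, so the long-or-flat rule in `strategy_returns` is purely an assumption for illustration.

```python
def cumulative_return(period_returns):
    """Compound per-period simple returns: prod(1 + r_i) - 1."""
    growth = 1.0
    for r in period_returns:
        growth *= 1.0 + r
    return growth - 1.0


def strategy_returns(predicted_up, asset_returns):
    """Assumed rule: hold the asset when 'up' is predicted, stay flat otherwise."""
    return [r if up else 0.0 for up, r in zip(predicted_up, asset_returns)]


asset = [0.10, -0.10, 0.05]
buy_and_hold = cumulative_return(asset)
classifier = cumulative_return(strategy_returns([True, False, True], asset))
```

Under this assumed rule, a classifier that correctly avoids the losing period ends with a higher cumulative return than buy-and-hold, which is why the metric can rank classifiers differently from plain accuracy.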
Citations: 0
Automatic identification of similar judicial precedents
Anais do X Symposium on Knowledge Discovery, Mining and Learning (KDMiLe 2022) Pub Date : 2022-11-28 DOI: 10.5753/kdmile.2022.227943
Igor Stemler, M. Ladeira, T. P. Faleiros
Abstract: The Brazilian Code of Civil Procedure was reformulated in 2015, creating new institutes of judicial precedents that allow the Courts of Appeal to decide similar cases based on one main case, which is considered the paradigm for similar cases that remain suspended. This mechanism aims to avoid legal uncertainty in the lower courts, but uncertainty can be carried over to the Courts of Appeal, since different courts can judge similar legal matters in opposite ways. The identification of similar judicial cases is hard because Courts of Appeal work independently and the number of cases is high. We propose the use of computational intelligence techniques to automatically identify similar judicial precedents. Our hypothesis is that algorithms based on semantic approaches, such as Latent Semantic Indexing and Latent Dirichlet Allocation, perform better than those that use only a syntactic approach, such as the (Okapi) BM25 ranking function. The best-performing model is extended with named entities to verify whether its performance increases. The performance of the models is evaluated using similarity metrics and with the assistance of a specialist. We test this approach with the database of judicial precedents of the National Council of Justice. Our approach correctly grouped more than 90% of judicial precedents and found similar precedents with divergent decisions, as well as precedents that should be suspended due to the existence of appeals in superior courts on the same subject matter. Models based on the syntactic approach presented the best results, as they required lower computational cost and less parameter tuning compared to the others.
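The (Okapi) BM25 ranking function named in this abstract can be sketched in a few lines. This is a generic textbook formulation, not the authors' pipeline: it uses the common non-negative IDF variant `log(1 + (N - df + 0.5) / (df + 0.5))`, the defaults `k1=1.5` and `b=0.75` are conventional choices rather than values from the paper, and documents are assumed to be pre-tokenized and non-empty.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document against a tokenized query with BM25."""
    n_docs = len(docs)
    avgdl = sum(len(d) for d in docs) / n_docs  # average document length
    df = Counter()  # document frequency of each term
    for d in docs:
        df.update(set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            idf = math.log(1.0 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1.0 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1.0) / norm
        scores.append(score)
    return scores


docs = [
    ["appeal", "tax", "court"],
    ["contract", "breach", "damages"],
    ["tax", "tax", "levy"],
]
scores = bm25_scores(["tax"], docs)
```

Because BM25 matches surface terms only, it needs no training and has just two parameters, which is consistent with the low computational cost and limited tuning the abstract reports for the syntactic approach.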
Citations: 0
Genetic Programming-based AutoML for EEG Signal Classification - A Comparative Study
Anais do X Symposium on Knowledge Discovery, Mining and Learning (KDMiLe 2022) Pub Date : 2022-11-28 DOI: 10.5753/kdmile.2022.227815
I. M. Miranda, C. Aranha, A. P. L. de Carvalho, L. P. F. Garcia
Abstract: End-to-end Machine Learning (ML) applications using complex data often need to investigate several alternatives for the data modeling pipeline before a good solution is found. This process, which is time-consuming and subjective, can benefit from automated solution design using Automated Machine Learning (AutoML). End-to-end AutoML allows automated data preparation, modeling, and evaluation of ML pipelines, increasing the chances of arriving at a good solution. AutoML can implement this optimization with different strategies. Among them, Genetic Programming (GP) stands out for its ability to create pipelines of arbitrary format, allowing high interpretability and the customization of information from the data context. This paper proposes and compares two approaches to end-to-end AutoML optimized with GP for a time series classification problem, the classification of Electroencephalogram (EEG) signals. We selected this dataset because of the signals' high complexity, spatial and temporal covariance, and nonstationarity. For the AutoML experiments, four different domain-based data characterization measures are evaluated. The analysis of the data characterization measures shows that using only spectral or only time-domain features does not lead to pipelines with good predictive performance. Our experimental results also reveal how AutoML can generate more accurate and interpretable solutions than the complex and ad hoc models found in the literature. The proposed approach makes it easier to analyze dimensionality reduction through fitness convergence, tree depth, and extracted features.
Citations: 0
New State-of-the-Art for Question Answering on Portuguese SQuAD v1.1
Anais do X Symposium on Knowledge Discovery, Mining and Learning (KDMiLe 2022) Pub Date : 2022-11-28 DOI: 10.5753/kdmile.2022.227787
E. H. M. Da Silva, J. Laterza, T. P. Faleiros
Abstract: In the field of Natural Language Processing (NLP), Machine Reading Comprehension (MRC), which involves teaching computers to read a text and understand its meaning, has been a major research goal over the last few decades. A natural way to evaluate whether a computer can fully understand a piece of text or, in other words, to test a machine's reading comprehension, is to require it to answer questions about the text. In this sense, Question Answering (QA) has received increasing attention among NLP tasks. For this study, we fine-tuned BERT Portuguese language models (BERTimbau Base and BERTimbau Large) on SQuAD-BR (the SQuAD v1.1 dataset translated to Portuguese by the Deep Learning Brazil group) for the extractive QA task, in order to achieve better performance than other existing models trained on the dataset. As a result, we accomplished our objective, establishing a new state of the art on the SQuAD-BR dataset using the fine-tuned BERTimbau Large model.
Citations: 0
One-Class Recommendation through Unsupervised Graph Neural Networks for Link Prediction
Anais do X Symposium on Knowledge Discovery, Mining and Learning (KDMiLe 2022) Pub Date : 2022-11-28 DOI: 10.5753/kdmile.2022.227810
M. Gôlo, Leonardo G. Moraes, R. Goularte, R. Marcacini
Abstract: Recommender systems play a key role in every online platform by providing users with a better experience. Many classic recommendation approaches may run into difficulties, mainly in modeling user relations. Graphs can naturally model these relations, since we can connect users interacting with items. On the other hand, when we model user-item relations through graphs, we do not have interactions between all users and items. In addition, there are few non-recommendation interactions, which makes it challenging to cover this scope. Also, the scope of what will not be recommended to the user is greater than what will be recommended. An alternative is One-Class Learning (OCL), which can decide whether or not to recommend an item to a user while training only on recommendations, mitigating the need to cover the scope of non-recommendations. However, OCL and recommender systems need appropriate, adequate, and robust representations to perform the recommendations in the best possible way. Therefore, we propose one-class recommendation via representations learned by unsupervised graph neural networks (GNNs) for link prediction, to generate a more robust and meaningful representation of users and items. In the results, our GNNs for link prediction outperform other methods of representing users and items in the one-class recommendation. Furthermore, our proposal also outperforms a GNN for link prediction. Thus, our proposal recommended better and learned more robust representations.
Citations: 1