Anais do X Symposium on Knowledge Discovery, Mining and Learning (KDMiLe 2022): Latest Articles

A Machine Learning with an Inlier/Outlier Separation Approach for the Prediction of Wagon Maintenance Times
Anais do X Symposium on Knowledge Discovery, Mining and Learning (KDMiLe 2022) Pub Date: 2022-11-28 DOI: 10.5753/kdmile.2022.227789
Josemar Coelho Felix, Vanessa Miranda Oliveira, Rodrigo Silva
Abstract: Time spent on wagon maintenance consumes a significant part of a rail freight company's budget. Knowing how much time a maintenance procedure will take is therefore critical for management and planning. A common approach to predicting these time expenditures is the so-called chronoanalysis. Despite its widespread use, it may be inaccurate in some scenarios. In this paper, we try to replace it with machine learning models, which did not work at first. We then propose a methodology that uses the chronoanalysis to divide the maintenance procedures into outliers and inliers, which allowed us to create independent models for each class. With this approach, the average mean absolute error was reduced from about 6 man-hours to a little above 2 man-hours. The best tested configuration presented an average mean absolute error of 0.417 man-hours, compared with 4.490 man-hours from the chronoanalysis.
Citations: 0
Tourism Recommendation System using complex network approaches
Anais do X Symposium on Knowledge Discovery, Mining and Learning (KDMiLe 2022) Pub Date: 2022-11-28 DOI: 10.5753/kdmile.2022.227941
A. P. S. Alves, Lucas G. S. Félix, C. M. Barbosa, V. D. F. Vieira, C. R. Xavier
Abstract: The amount of data available on the web has grown exponentially, mostly due to the emergence of the Collaborative Internet in mid-2006, which has made obtaining information a hard task. Several computational techniques have therefore been used to automate the exploitation and analysis of data: Text Mining techniques such as Topic Modeling (TM), which establishes relationships between text documents and discussion topics through the words present, and Sentiment Analysis (SA), whose objective is to identify the polarity of sentences; Complex Network modeling, which seeks to capture the dynamics of complex systems such as those present in social networks; and Recommendation Systems, which assist with decision making by suggesting items that a user has not yet evaluated, such as traveling to a new place or trying another meal from a menu. The tourism scenario is also part of this context of massive data generation and of advances in techniques to deal with it. In this case, specialized travel platforms such as Tripadvisor play a major role, since they concentrate a large amount of data about users and their experiences at Points-of-Interest (POI). This work therefore proposes a new approach to a predictive model for POI recommendation systems, based on the construction of a Complex Network and the use of specific techniques for its structural analysis. The city chosen to validate the approach was Tiradentes, Minas Gerais, whose geographic proximity and tourism-oriented economy make it a good choice. The results show that a predictive model based on Complex Networks does not achieve a lower error than the baseline algorithms; however, it yields a good rank correlation between the predicted and the real results, which makes it a good option for recommendation systems.
Citations: 0
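The closing claim of the abstract above, that a model can have a higher error and still rank items well, can be illustrated with a small sketch. The abstract does not say which rank correlation was used; Spearman's rho (without tie handling) is assumed here, and the scores are made-up illustration values, not results from the paper.

```python
from typing import List


def mean_absolute_error(predicted: List[float], actual: List[float]) -> float:
    """Average absolute difference between predicted and actual scores."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def ranks(values: List[float]) -> List[int]:
    """Rank positions (1 = smallest). Assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(x: List[float], y: List[float]) -> float:
    """Spearman rank correlation for tie-free data with at least two items."""
    n = len(x)
    squared_diffs = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * squared_diffs / (n * (n * n - 1))


# Hypothetical POI ratings: every prediction is off by 2 points,
# yet the ordering of the three POIs is reproduced exactly.
predicted_scores = [3.0, 4.0, 5.0]
actual_scores = [1.0, 2.0, 3.0]
mae = mean_absolute_error(predicted_scores, actual_scores)
rho = spearman_rho(predicted_scores, actual_scores)
```

In this toy case the error is large while the rank correlation is perfect, which is the kind of trade-off that matters when the output is an ordered list of recommendations rather than a rating estimate.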
Successful Youtube video identification using multimodal deep learning
Anais do X Symposium on Knowledge Discovery, Mining and Learning (KDMiLe 2022) Pub Date: 2022-11-28 DOI: 10.5753/kdmile.2022.227792
Lucas de Souza Rodrigues, K. Sakiyama, Leozitor Floro de Souza, E. Matsubara, B. Nogueira
Abstract: Text from titles and audio transcriptions, image thumbnails, and the number of likes, dislikes, and views are examples of the data available for a YouTube video. Despite this variety, most standard Deep Learning models use only one type of data, and the simultaneous use of multiple data sources for such problems is still rare. To shed light on these problems, we empirically evaluate eight different multimodal fusion operations on embeddings extracted from the image thumbnails and titles of YouTube videos with standard Deep Learning models: a ResNet-based SE-Net for image feature extraction and BERT for NLP. Experimental results show that simple operations such as summing or subtracting embeddings can improve model accuracy. The multimodal fusion operations achieved 81.3% accuracy on this dataset, outperforming the unimodal models by 3.86% (text) and 5.79% (video).
Citations: 0
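The simple fusion operations mentioned in the abstract above (summing or subtracting embeddings) could look roughly like the sketch below. The abstract names only sum and subtraction among its eight operations; the concatenation variant, the function names, and the toy vectors are assumptions added for illustration. The element-wise operations also assume that the image and text embeddings have already been projected to the same dimension.

```python
from typing import Callable, Dict, List

Vector = List[float]


def fuse_sum(image_emb: Vector, text_emb: Vector) -> Vector:
    """Element-wise sum of two equally sized embeddings."""
    return [a + b for a, b in zip(image_emb, text_emb)]


def fuse_subtract(image_emb: Vector, text_emb: Vector) -> Vector:
    """Element-wise difference (image minus text)."""
    return [a - b for a, b in zip(image_emb, text_emb)]


def fuse_concat(image_emb: Vector, text_emb: Vector) -> Vector:
    """Concatenation; an illustrative extra, not named in the abstract."""
    return image_emb + text_emb


FUSION_OPS: Dict[str, Callable[[Vector, Vector], Vector]] = {
    "sum": fuse_sum,
    "subtract": fuse_subtract,
    "concat": fuse_concat,
}


def fuse(name: str, image_emb: Vector, text_emb: Vector) -> Vector:
    """Apply the named fusion operation to one image/text embedding pair."""
    if name != "concat" and len(image_emb) != len(text_emb):
        raise ValueError("element-wise fusion needs embeddings of equal size")
    return FUSION_OPS[name](image_emb, text_emb)


# Toy 2-dimensional embeddings standing in for SE-Net and BERT outputs.
thumbnail_emb = [1.0, 2.0]
title_emb = [0.5, 0.5]
fused_sum = fuse("sum", thumbnail_emb, title_emb)
fused_diff = fuse("subtract", thumbnail_emb, title_emb)
```

The fused vector would then be passed to a classifier head in place of a single-modality embedding.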
Threshold Feature Selection PCA
Anais do X Symposium on Knowledge Discovery, Mining and Learning (KDMiLe 2022) Pub Date: 2022-11-28 DOI: 10.5753/kdmile.2022.227718
Felipe de Melo Battisti, T. B. A. de Carvalho
Abstract: Classification algorithms encounter learning difficulties when data has non-discriminant features. Dimensionality reduction techniques such as PCA are commonly applied in this situation. However, PCA has the disadvantage of being an unsupervised method, so it ignores relevant class information in the data. This paper therefore proposes the Threshold Feature Selector (TFS), a new supervised dimensionality reduction method that employs class thresholds to select more relevant features. We also present the Threshold PCA (TPCA), a combination of our supervised technique with standard PCA. In experiments, TFS achieved higher accuracy than the original data in 90% of the datasets. The second proposed technique, TPCA, outperformed standard PCA in accuracy gain in 70% of the datasets.
Citations: 0
Value Estimation of Properties Administered by the Brazilian Army Using Machine Learning and Spatial Components
Anais do X Symposium on Knowledge Discovery, Mining and Learning (KDMiLe 2022) Pub Date: 2022-11-28 DOI: 10.5753/kdmile.2022.227798
José Nilo Alves de Sousa Neto, M. Ladeira
Abstract: The valuation of an institution's patrimony is a necessary condition for the efficient management of its assets. The execution and analysis of real estate appraisal reports are essential to achieving some strategic objectives of the Brazilian Army, but they are also quite costly in terms of time, labor and financial resources. Sometimes great effort goes into these steps and the market value finally obtained is inconsistent with what the authorities initially imagined, so that the technical study is not effectively used by the organization in negotiations. This work proposes the development of predictive models capable of estimating real estate values, so that managers' formal requests for the execution and analysis of appraisal reports can be made with this information as an initial input. Relying on linear and nonlinear approaches and on machine learning techniques, the models reach a reasonable level of accuracy and national geographic coverage when generating estimated market values of Union real estate assets. Variables intrinsic and extrinsic to the properties were considered, including tests of aggregating spatial components on some of them. Because the interpretability of the proposed solution is an important requirement in both the linear and nonlinear approaches, the Shapley value was adopted as a tool to support explainability, and a PLS-SEM conceptual model was built to select attributes in a reasoned manner. These two considerations, combined with the modeling of real estate prices at a national level, represent an innovation of this work in relation to the scientific literature analyzed.
Citations: 0
Differentiable Planning with Indefinite Horizon
Anais do X Symposium on Knowledge Discovery, Mining and Learning (KDMiLe 2022) Pub Date: 2022-11-28 DOI: 10.5753/kdmile.2022.227974
Daniel B. Dias, Leliane N. de Barros, Karina V. Delgado, D. Mauá
Abstract: With the recent advances in automated planning based on deep-learning techniques, Deep Reactive Policies (DRPs) have been shown to be a powerful framework for solving Markov Decision Processes (MDPs) with a certain degree of complexity, such as MDPs with continuous action-state spaces and exogenous events. Some differentiable planning algorithms can learn these policies through policy-gradient techniques, considering a finite-horizon MDP. However, for certain domains we do not know the ideal horizon size needed to find an optimal solution, even when we have a description of the planning goal, which can be either a simple reachability goal or a complex goal involving path optimization. This work aims to solve a continuous MDP through differentiable planning, treating the problem horizon as a hyperparameter that can be adjusted for a DRP training process. This preliminary investigation shows that it is possible to find better policies by choosing a horizon that encompasses the planning goal.
Citations: 0
Named Entity Recognition Approaches Applied to Legal Document Segmentation
Anais do X Symposium on Knowledge Discovery, Mining and Learning (KDMiLe 2022) Pub Date: 2022-11-28 DOI: 10.5753/kdmile.2022.227949
F. X. B. da Silva, G. M. C. Guimarães, R. Marcacini, A. L. Queiroz, V. R. P. Borges, T. P. Faleiros, L. P. F. Garcia
Abstract: Document Segmentation is a method of dividing a document into smaller parts, known as segments, which share similarities that allow machines to distinguish between them. It can be useful to classify these segments, which makes it a problem with two steps: (I) the extraction of the segments; and (II) the annotation of these segments. The goal of the Named Entity Recognition (NER) problem is to identify and classify entities within a text, so it also has to deal with those two questions: extraction and classification. In this study, we tackle the problem of Document Segmentation and the annotation of the resulting segments through NER approaches, using CRF, CNN-CNN-LSTM and CNN-biLSTM-CRF models. The study focuses on Brazilian legal documents and proposes a data set of 127 annotated Portuguese texts from the Official Gazette of the Federal District, published between 2001 and 2015. The experiments used word-based and sentence-based models, with the sentence-based CRF model showing the best results.
Citations: 0
Meta-Learning Approach for Noise Filter Algorithm Recommendation
Anais do X Symposium on Knowledge Discovery, Mining and Learning (KDMiLe 2022) Pub Date: 2022-11-28 DOI: 10.5753/kdmile.2022.227958
P. B. Pio, L. P. F. Garcia, A. Rivolli
Abstract: Preprocessing techniques can increase the quality of Machine Learning algorithms, or even enable them. However, it is not simple to identify which preprocessing algorithms we should apply. This work proposes a methodology to recommend a noise filtering algorithm based on Meta-Learning, predicting which algorithm should be chosen from a set of features calculated from a dataset. From synthetic datasets, we created the meta-data from an extracted set of meta-features and the F1-score performance metric calculated from the DT, KNN, and RF classifiers. To perform the suggestion, we used a meta-ranker that returns the rank of the best algorithms. We selected three noise filtering algorithms: HARF, GE, and ORBoost. To predict the F1-score, we used the PCT, RF, and KNN algorithms as meta-rankers. Our results indicate that the proposed solution achieved over 60% and 80% accuracy under the top-1 and top-2 approaches, respectively. They also show that the meta-rankers, when compared with a random choice and with single algorithms as baselines, provided an overall performance gain for the Machine Learning algorithm.
Citations: 1
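The top-1 and top-2 accuracy figures in the abstract above can be read as the fraction of datasets where the truly best noise filter appears among the first one or two entries of the meta-ranker's output. The sketch below makes that reading explicit; the abstract does not give the exact definition, so this hit criterion is an assumption, and the rankings are invented for illustration (only the filter names HARF, GE, and ORBoost come from the abstract).

```python
from typing import List, Sequence


def top_k_accuracy(
    predicted_rankings: Sequence[Sequence[str]],
    best_algorithms: Sequence[str],
    k: int,
) -> float:
    """Fraction of datasets whose best algorithm is in the top-k predictions."""
    hits = sum(
        1
        for ranking, best in zip(predicted_rankings, best_algorithms)
        if best in ranking[:k]
    )
    return hits / len(best_algorithms)


# Hypothetical meta-ranker output for four datasets (best guess first).
rankings: List[List[str]] = [
    ["HARF", "GE", "ORBoost"],
    ["GE", "ORBoost", "HARF"],
    ["ORBoost", "HARF", "GE"],
    ["GE", "HARF", "ORBoost"],
]
# Hypothetical ground truth: the filter that actually performed best.
truly_best = ["HARF", "ORBoost", "GE", "HARF"]

top1 = top_k_accuracy(rankings, truly_best, k=1)
top2 = top_k_accuracy(rankings, truly_best, k=2)
```

Allowing two candidates instead of one can only keep or raise the score, which matches the pattern of the reported top-2 figure being higher than the top-1 figure.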
A Link Prediction-Based Method Towards Lead Management
Anais do X Symposium on Knowledge Discovery, Mining and Learning (KDMiLe 2022) Pub Date: 2022-11-28 DOI: 10.5753/kdmile.2022.227800
G. P. Brugalli, A. Gonçalves, A. Bordin, L. S. Artese
Abstract: Lead management is an essential part of the customer acquisition and retention stages. As the number of leads increases, data-driven management automation becomes critical for better customer acquisition and retention. In this context, the present work proposes a method that supports lead management by identifying, and recommending to the sales team, the future interests of leads that already exist in an organization's database, in order to acquire or retain customers. To fulfill this objective, network representation learning and link prediction models are explored. A case study is presented to demonstrate the effectiveness of the proposed method. All generated models reached a ROC-AUC value between 0.873 and 0.998. However, the prediction models showed low coefficient values, far from 1, the ideal value. Nevertheless, the method shows enough promise to be investigated in practice. For future work, a deeper understanding of the technical capabilities of network learning is suggested in order to obtain better results from link prediction models.
Citations: 0
An Interpretable Classification Model for Identifying Individuals with Attention Deficit Hyperactivity Disorder
Anais do X Symposium on Knowledge Discovery, Mining and Learning (KDMiLe 2022) Pub Date: 2022-11-28 DOI: 10.5753/kdmile.2022.227962
N. Ventura, P. Loures, C. Nicola, D. M. Oliveira, M. Romano, D. M. Miranda, A. C. Silva, G. Pappa, W. Meira Jr.
Abstract: Attention Deficit Hyperactivity Disorder (ADHD) is a psychiatric condition that affects around 5% of children around the world. The primary attention procedure is traditionally based on the analysis of ratings collected in questionnaires called psychometrics. This work aims to investigate interpretable classification models capable not only of accurately identifying individuals with ADHD, but also of explaining the outcome by providing the evidence that leads to it. We compare the performance of the Explainable Boosting Machine (EBM) with 3 other classical decision tree-based models and observe similar results, with the distinction that EBM is a more interpretable model. We also assess the explanations quantitatively and qualitatively, demonstrating how they may actually help psychiatrists in their practice.
Citations: 0