Machine learning and knowledge extraction: latest articles

Detection of Temporal Shifts in Semantics Using Local Graph Clustering
Machine learning and knowledge extraction Pub Date : 2023-01-13 DOI: 10.3390/make5010008
N. Hwang, S. Chatterjee, Yanming Di, Sharmodeep Bhattacharyya
Abstract: Many changes in our digital corpus have been brought about by the interplay between rapid advances in digital communication and the current environment characterized by pandemics, political polarization, and social unrest. One such change is the pace with which new words enter the mass vocabulary and the frequency at which meanings, perceptions, and interpretations of existing expressions change. The current state-of-the-art algorithms do not allow for an intuitive and rigorous detection of these changes in word meanings over time. We propose a dynamic graph-theoretic approach to inferring the semantics of words and phrases (“terms”) and detecting temporal shifts. Our approach represents each term as a stochastic time-evolving set of contextual words and is a count-based distributional semantic model in nature. We use local clustering techniques to assess the structural changes in a given word’s contextual words. We demonstrate the efficacy of our method by investigating the changes in the semantics of the phrase “Chinavirus”. We conclude that the term took on a much more pejorative meaning when the White House used the term in the second half of March 2020, although the effect appears to have been temporary. We make both the dataset and the code used to generate this paper’s results available.
Pages: 128-143
Citations: 1
E2H Distance-Weighted Minimum Reference Set for Numerical and Categorical Mixture Data and a Bayesian Swap Feature Selection Algorithm
Machine learning and knowledge extraction Pub Date : 2023-01-11 DOI: 10.3390/make5010007
Yuto Omae, Masaya Mori
Abstract: Generally, when developing classification models using supervised learning methods (e.g., support vector machine, neural network, and decision tree), feature selection, as a pre-processing step, is essential to reduce calculation costs and improve the generalization scores. In this regard, the minimum reference set (MRS), which is a feature selection algorithm, can be used. The original MRS considers a feature subset as effective if it leads to the correct classification of all samples by using the 1-nearest neighbor algorithm based on small samples. However, the original MRS is only applicable to numerical features, and the distances between different classes cannot be considered. Therefore, herein, we propose a novel feature subset evaluation algorithm, referred to as the “E2H distance-weighted MRS,” which can be used for a mixture of numerical and categorical features and considers the distances between different classes in the evaluation. Moreover, a Bayesian swap feature selection algorithm, which is used to identify an effective feature subset, is also proposed. The effectiveness of the proposed methods is verified based on experiments conducted using artificially generated data comprising a mixture of numerical and categorical features.
Pages: 109-127
Citations: 1
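Illustration for the entry above: the abstract says the original MRS treats a feature subset as effective when a small reference set classifies every sample correctly under the 1-nearest neighbor rule. The sketch below shows only that check, for numerical features and squared Euclidean distance. The function names, the toy data, and the choice of reference samples are assumptions made for illustration and do not come from the paper; the E2H distance weighting, the handling of categorical features, and the Bayesian swap search are not reproduced.

```python
def nearest_reference_label(x, references, ref_labels, features):
    """Return the label of the reference sample closest to x,
    measuring squared Euclidean distance on the chosen features only."""
    def sq_dist(a, b):
        return sum((a[f] - b[f]) ** 2 for f in features)

    best = min(range(len(references)), key=lambda i: sq_dist(x, references[i]))
    return ref_labels[best]


def classifies_all(samples, labels, ref_idx, features):
    """True if 1-NN against the reference samples (given by index)
    assigns every sample its own label when restricted to `features`."""
    refs = [samples[i] for i in ref_idx]
    ref_labels = [labels[i] for i in ref_idx]
    return all(
        nearest_reference_label(x, refs, ref_labels, features) == y
        for x, y in zip(samples, labels)
    )


# Hypothetical toy data: feature 0 separates the classes, feature 1 does not.
samples = [(0.0, 5.0), (0.2, 1.0), (1.0, 5.2), (1.1, 0.9)]
labels = [0, 0, 1, 1]
reference = [0, 2]  # one reference sample per class

print(classifies_all(samples, labels, reference, features=[0]))
print(classifies_all(samples, labels, reference, features=[1]))
```

With this toy data, a two-sample reference set is enough when feature 0 is used but not when feature 1 is used, which is the sense in which the size of the reference set can rank feature subsets.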
XAIR: A Systematic Metareview of Explainable AI (XAI) Aligned to the Software Development Process
Machine learning and knowledge extraction Pub Date : 2023-01-11 DOI: 10.3390/make5010006
Tobias Clement, Nils Kemmerzell, Mohamed Abdelaal, M. Amberg
Abstract: Currently, explainability represents a major barrier that Artificial Intelligence (AI) is facing in regard to its practical implementation in various application domains. To combat the lack of understanding of AI-based systems, Explainable AI (XAI) aims to make black-box AI models more transparent and comprehensible for humans. Fortunately, plenty of XAI methods have been introduced to tackle the explainability problem from different perspectives. However, due to the vast search space, it is challenging for ML practitioners and data scientists to start with the development of XAI software and to optimally select the most suitable XAI methods. To tackle this challenge, we introduce XAIR, a novel systematic metareview of the most promising XAI methods and tools. XAIR differentiates itself from existing reviews by aligning its results to the five steps of the software development process, including requirement analysis, design, implementation, evaluation, and deployment. Through this mapping, we aim to create a better understanding of the individual steps of developing XAI software and to foster the creation of real-world AI applications that incorporate explainability. Finally, we conclude with highlighting new directions for future research.
Pages: 78-108
Citations: 11
Learning Sentence-Level Representations with Predictive Coding
Machine learning and knowledge extraction Pub Date : 2023-01-09 DOI: 10.3390/make5010005
Vladimir Araujo, M. Moens, Álvaro Soto
Abstract: Learning sentence representations is an essential and challenging topic in the deep learning and natural language processing communities. Recent methods pre-train big models on a massive text corpus, focusing mainly on learning the representation of contextualized words. As a result, these models cannot generate informative sentence embeddings since they do not explicitly exploit the structure and discourse relationships existing in contiguous sentences. Drawing inspiration from human language processing, this work explores how to improve sentence-level representations of pre-trained models by borrowing ideas from predictive coding theory. Specifically, we extend BERT-style models with bottom-up and top-down computation to predict future sentences in latent space at each intermediate layer in the networks. We conduct extensive experimentation with various benchmarks for the English and Spanish languages, designed to assess sentence- and discourse-level representations and pragmatics-focused assessments. Our results show that our approach improves sentence representations consistently for both languages. Furthermore, the experiments also indicate that our models capture discourse and pragmatics knowledge. In addition, to validate the proposed method, we carried out an ablation study and a qualitative study with which we verified that the predictive mechanism helps to improve the quality of the representations.
Pages: 59-77
Citations: 1
IPPT4KRL: Iterative Post-Processing Transfer for Knowledge Representation Learning
Machine learning and knowledge extraction Pub Date : 2023-01-06 DOI: 10.3390/make5010004
Weihang Zhang, O. Șerban, Jiahao Sun, Yike Guo
Abstract: Knowledge Graphs (KGs), a structural way to model human knowledge, have been a critical component of many artificial intelligence applications. Many KG-based tasks are built using knowledge representation learning, which embeds KG entities and relations into a low-dimensional semantic space. However, the quality of representation learning is often limited by the heterogeneity and sparsity of real-world KGs. Multi-KG representation learning, which utilizes KGs from different sources collaboratively, presents one promising solution. In this paper, we propose a simple, but effective iterative method that post-processes pre-trained knowledge graph embedding (IPPT4KRL) on individual KGs to maximize the knowledge transfer from another KG when a small portion of alignment information is introduced. Specifically, additional triples are iteratively included in the post-processing based on their adjacencies to the cross-KG alignments to refine the pre-trained embedding space of individual KGs. We also provide the benchmarking results of existing multi-KG representation learning methods on several generated and well-known datasets. The empirical results of the link prediction task on these datasets show that the proposed IPPT4KRL method achieved comparable and even superior results when compared against more complex methods in multi-KG representation learning.
Pages: 43-58
Citations: 0
Detecting Arabic Cyberbullying Tweets Using Machine Learning
Machine learning and knowledge extraction Pub Date : 2023-01-05 DOI: 10.3390/make5010003
Alanoud Mohammed Alduailaj, A. Belghith
Abstract: The advancement of technology has paved the way for a new type of bullying, which often leads to negative stigma in the social setting. Cyberbullying is a cybercrime wherein one individual becomes the target of harassment and hatred. It has recently become more prevalent due to a rise in the usage of social media platforms, and, in some severe situations, it has even led to victims’ suicides. In the literature, several cyberbullying detection methods are proposed, but they are mainly focused on word-based data and user account attributes. Furthermore, most of them are related to the English language. Meanwhile, only a few papers have studied cyberbullying detection in Arabic social media platforms. This paper, therefore, aims to use machine learning in the Arabic language for automatic cyberbullying detection. The proposed mechanism identifies cyberbullying using the Support Vector Machine (SVM) classifier algorithm by using a real dataset obtained from YouTube and Twitter to train and test the classifier. Moreover, we include the Farasa tool to overcome text limitations and improve the detection of bullying attacks.
Pages: 29-42
Citations: 8
Machine Learning and Knowledge Extraction: 7th IFIP TC 5, TC 12, WG 8.4, WG 8.9, WG 12.9 International Cross-Domain Conference, CD-MAKE 2023, Benevento, Italy, August 29 – September 1, 2023, Proceedings
Machine learning and knowledge extraction Pub Date : 2023-01-01 DOI: 10.1007/978-3-031-40837-3
Citations: 0
Skew Class-balanced Re-weighting for Unbiased Scene Graph Generation
Machine learning and knowledge extraction Pub Date : 2023-01-01 DOI: 10.3390/make5010018
Haeyong Kang, C. D. Yoo
Abstract: An unbiased scene graph generation (SGG) algorithm referred to as Skew Class-Balanced Re-Weighting (SCR) is proposed for considering the unbiased predicate prediction caused by the long-tailed distribution. The prior works focus mainly on alleviating the deteriorating performances of the minority predicate predictions, showing drastic dropping recall scores, i.e., losing the majority predicate performances. It has not yet correctly analyzed the trade-off between majority and minority predicate performances in the limited SGG datasets. In this paper, to alleviate the issue, the Skew Class-Balanced Re-Weighting (SCR) loss function is considered for the unbiased SGG models. Leveraged by the skewness of biased predicate predictions, the SCR estimates the target predicate weight coefficient and then re-weights more to the biased predicates for better trading-off between the majority predicates and the minority ones. Extensive experiments conducted on the standard Visual Genome dataset and Open Image V4 and V6 show the performances and generality of the SCR with the traditional SGG models.
Pages: 287-303
Citations: 4
Synthetic Data Generation for Visual Detection of Flattened PET Bottles
Machine learning and knowledge extraction Pub Date : 2022-12-29 DOI: 10.3390/make5010002
Vitālijs Feščenko, Jānis Ārents, R. Kadikis
Abstract: Polyethylene terephthalate (PET) bottle recycling is a highly automated task; however, manual quality control is required due to inefficiencies of the process. In this paper, we explore automation of the quality control sub-task, namely visual bottle detection, using convolutional neural network (CNN)-based methods and synthetic generation of labelled training data. We propose a synthetic generation pipeline tailored for transparent and crushed PET bottle detection; however, it can also be applied to undeformed bottles if the viewpoint is set from above. We conduct various experiments on CNNs to compare the quality of real and synthetic data, show that synthetic data can reduce the amount of real data required and experiment with the combination of both datasets in multiple ways to obtain the best performance.
Pages: 14-28
Citations: 2
Multimodal AutoML via Representation Evolution
Machine learning and knowledge extraction Pub Date : 2022-12-23 DOI: 10.3390/make5010001
Blaž Škrlj, Matej Bevec, Nadine Lavrac
Abstract: With the increasing amounts of available data, learning simultaneously from different types of inputs is becoming necessary to obtain robust and well-performing models. With the advent of representation learning in recent years, lower-dimensional vector-based representations have become available for both images and texts, while automating simultaneous learning from multiple modalities remains a challenging problem. This paper presents an AutoML (automated machine learning) approach to automated machine learning model configuration identification for data composed of two modalities: texts and images. The approach is based on the idea of representation evolution, the process of automatically amplifying heterogeneous representations across several modalities, optimized jointly with a collection of fast, well-regularized linear models. The proposed approach is benchmarked against 11 unimodal and multimodal (texts and images) approaches on four real-life benchmark datasets from different domains. It achieves competitive performance with minimal human effort and low computing requirements, enabling learning from multiple modalities in automated manner for a wider community of researchers.
Pages: 1-13
Citations: 0