Katsiaryna Akhramovich, Estefanía Serral, Carlos Cetina
{"title":"A systematic literature review on the application of process mining to Industry 4.0","authors":"Katsiaryna Akhramovich, Estefanía Serral, Carlos Cetina","doi":"10.1007/s10115-023-02042-x","DOIUrl":"https://doi.org/10.1007/s10115-023-02042-x","url":null,"abstract":"<p>The transition to Industry 4.0 marks a new era in manufacturing with a new level of production automation, human-to-machine cooperation and product customization. It provides many benefits and opportunities to both enterprises and consumers and allows for a fundamentally new level of cooperation. At the same time, the complexity of business processes and the large volume and complex structure of the data generated and processed by different Industry 4.0 technologies create serious challenges for Business Process Management. Process mining (PM) can tackle these challenges. PM is a relatively young discipline that is positioned between process-centric and data-centric approaches and focuses on the discovery, conformance checking and enhancement of end-to-end business processes. Moreover, new types of PM deal with performance analysis, comparative analysis of several processes, making predictions and triggering improvement actions. This systematic literature review studies the applicability of PM in Industry 4.0 and the benefits that PM can provide to each of the four aspects of Industry 4.0: smart factories, smart products, new business models and new customer services. The PM approaches proposed in the selected studies are analysed and classified according to the two dimensions of the study: PM and Industry 4.0. 
The research gaps identified while performing the systematic literature review show possible directions for further research in the area.</p>","PeriodicalId":54749,"journal":{"name":"Knowledge and Information Systems","volume":"25 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-01-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139481814","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An automated approach for binary classification on imbalanced data","authors":"Pedro Marques Vieira, Fátima Rodrigues","doi":"10.1007/s10115-023-02046-7","DOIUrl":"https://doi.org/10.1007/s10115-023-02046-7","url":null,"abstract":"<p>Imbalanced data are present in various business sectors and must be handled with the proper resampling methods and classification algorithms. To handle imbalanced data, there are numerous resampling and learning method combinations; nonetheless, their effective use necessitates specialised knowledge. In this paper, several approaches, ranging from more accessible to more advanced in the domain of data resampling techniques, will be considered to handle imbalanced data. The application developed delivers recommendations of the most suitable combinations of techniques for a specific dataset by extracting and comparing dataset meta-feature values recorded in a knowledge base. It facilitates effortless classification and automates part of the machine learning pipeline with comparable or better results than state-of-the-art solutions and with a much smaller execution time.</p>","PeriodicalId":54749,"journal":{"name":"Knowledge and Information Systems","volume":"16 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139461632","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Simple knowledge graph completion model based on PU learning and prompt learning","authors":"","doi":"10.1007/s10115-023-02040-z","DOIUrl":"https://doi.org/10.1007/s10115-023-02040-z","url":null,"abstract":"<p>Knowledge graphs (KGs) are important resources for many artificial intelligence tasks but usually suffer from incompleteness, which has prompted scholars to put forward the task of knowledge graph completion (KGC). Embedding-based methods, which use the structural information of the KG for inference completion, are mainstream for this task. These methods, however, cannot perform inference for entities that do not appear in the KG and are also constrained by the structural information. To address these issues, scholars have proposed text-based methods. This type of method improves the reasoning ability of the model by utilizing pre-trained language models (PLMs) to learn textual information from the knowledge graph data. However, the performance of text-based methods lags behind that of embedding-based methods. We identify expensive negative sampling as the key reason. Positive unlabeled (PU) learning is introduced to help collect negative samples with high confidence from a small number of samples, and prompt learning is introduced to produce good training results. The proposed PLM-based KGC model outperforms earlier text-based methods and rivals earlier embedding-based approaches on several benchmark datasets. 
By exploiting the structural information of KGs, the proposed model also has a satisfactory performance in inference speed.</p>","PeriodicalId":54749,"journal":{"name":"Knowledge and Information Systems","volume":"30 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139465446","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Donghui Shi, Zhigang Li, Jozef Zurada, Andrew Manikas, Jian Guan, Pawel Weichbroth
{"title":"Ontology-based text convolution neural network (TextCNN) for prediction of construction accidents","authors":"Donghui Shi, Zhigang Li, Jozef Zurada, Andrew Manikas, Jian Guan, Pawel Weichbroth","doi":"10.1007/s10115-023-02036-9","DOIUrl":"https://doi.org/10.1007/s10115-023-02036-9","url":null,"abstract":"<p>The construction industry suffers from workplace accidents, including injuries and fatalities, which represent a significant economic and social burden for employers, workers, and society as a whole. The existing research on construction accidents relies heavily on expert evaluations, which often suffer from issues such as low efficiency, insufficient intelligence, and subjectivity. However, expert opinions provided in construction accident reports offer a valuable source of knowledge that can be extracted and utilized to enhance safety management. Today this resource can be mined, as the advent of artificial intelligence has opened up significant opportunities to advance construction site safety. Ontology offers an attractive representation scheme. Though ontology has been used in construction safety to solve the problem of information heterogeneity using formal conceptual specifications, ontologies built from construction accident reports are still at an early stage of development and require further improvement. Moreover, research on incorporating deep learning methodologies into construction safety ontologies to predict construction safety incidents is relatively limited. This paper describes a novel approach to improving the performance of accident prediction models by incorporating ontology into a deep learning model. First, a domain word discovery algorithm, based on mutual information and adjacency entropy, is used to analyze the causes of accidents mentioned in construction reports. 
This analysis is then combined with technical specifications and the literature in the field of construction safety to build an ontology encompassing unsafe factors related to construction accidents. By employing a Translating on Hyperplane (TransH) model, the reports are transformed into conceptual vectors using the constructed ontology. Building on this foundation, we propose a Text Convolutional Neural Network (TextCNN) model that incorporates the ontology specifically designed for construction accidents. We compared the performance of the TextCNN model against five traditional machine learning models, namely Naive Bayes, support vector machine, logistic regression, random forest, and multilayer perceptron, using three different data sets: One-Hot encoding, word vector, and conceptual vectors. The results indicate that the TextCNN model integrated with the ontology outperformed the other models, achieving an accuracy of 88% and an AUC of 0.92.</p>","PeriodicalId":54749,"journal":{"name":"Knowledge and Information Systems","volume":"209 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-01-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139421719","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A label propagation community discovery algorithm combining seed node influence and neighborhood similarity","authors":"Miaomiao Liu, Jinyun Yang, Jingfeng Guo, Jing Chen","doi":"10.1007/s10115-023-02035-w","DOIUrl":"https://doi.org/10.1007/s10115-023-02035-w","url":null,"abstract":"<p>To address the problem of poor stability and low accuracy of community division caused by the randomness in the traditional label propagation algorithm (LPA), a community discovery algorithm that combines seed node influence and neighborhood similarity is proposed. Firstly, the K-shell values of neighbor nodes are combined with clustering coefficients to define node influence, the initial seed set is filtered by a threshold, and the less influential one in adjacent node pairs is removed to obtain the final seed set. Secondly, the connection strengths between non-seed nodes and seed nodes are defined based on their own weights, distance weights, and common neighbor weights. The labels of non-seed nodes are updated to the labels of seed nodes with which they have the maximum connection strength. Further, for the case that the connection strengths between a non-seed node and multiple seed nodes are the same, a new neighborhood similarity combining the information between the two types of nodes and their neighbors is proposed, thus avoiding the instability caused by randomly selecting the labels of seed nodes. Experiments are conducted on six classic real networks and eight artificial datasets with different complexities. 
A comparison with dozens of related algorithms shows that the proposed algorithm effectively improves execution efficiency and that its community division results are stable and more accurate, with a maximum improvement in modularity of about 87.64% and 47.04% over the LPA on real and artificial datasets, respectively.</p>","PeriodicalId":54749,"journal":{"name":"Knowledge and Information Systems","volume":"12 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139411471","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On-grid and off-grid photovoltaic systems forecasting using a hybrid meta-learning method","authors":"Simona-Vasilica Oprea, Adela Bâra","doi":"10.1007/s10115-023-02037-8","DOIUrl":"https://doi.org/10.1007/s10115-023-02037-8","url":null,"abstract":"<p>In this paper, we investigate two types of photovoltaic (PV) systems (on-grid and off-grid) of different sizes and propose a reliable PV forecasting method. The novelty of our research consists in a weather data-driven feature engineering considering the operation of the PV systems in similar conditions and merging the results of deterministic and stochastic models, namely Machine Learning algorithms (Random Forest—RF, eXtreme Gradient Boost—XGB) and Deep Learning algorithms (Deep Neural Networks—DNN, Gated Recurrent Unit—GRU) into a Hybrid Meta-learning Forecasting method. It combines the estimations of the above-mentioned algorithms with relevant features to predict the PV output using a Long Short-Term Memory model. To design the PV forecast for off-grid systems, that are equally important for prosumers, and approximate the potential power of these systems, the level of load and charging state of the batteries are considered. In this context, feature engineering using the weather and PV output data, including PV characteristics, is relevant to obtaining a performant and robust PV forecast for various use cases taking into account the size and connectivity of the PV systems. 
On average, the Mean Absolute Error and Mean Absolute Percentage Error are half the values obtained with deterministic methods and 25% lower than those of the stochastic models.</p>","PeriodicalId":54749,"journal":{"name":"Knowledge and Information Systems","volume":"53 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139411259","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An efficient automatic modulation recognition using time–frequency information based on hybrid deep learning and bagging approach","authors":"","doi":"10.1007/s10115-023-02041-y","DOIUrl":"https://doi.org/10.1007/s10115-023-02041-y","url":null,"abstract":"<p>Determining the type of modulation is an important task in military communications, satellite communications systems, and submarine communications. In this study, a new digital modulation classification model is presented for detecting various types of modulated signals. In the first step, the continuous wavelet transform is used to create a visual representation of the spectral density of the frequencies of the modulation signals in a scalogram image. In the second step, a deep convolutional neural network extracts features from the scalogram images. In the third step, the best features are chosen using the MRMR algorithm. The MRMR algorithm increases classification speed and the interpretability of the classification model by reducing the dimensionality of the features. In the fourth step, the modulations are classified using the group learning technique. In the simulations, modulated signals with different amounts of noise with SNR from 0 to 25 dB are considered. Then, accuracy, precision, recall, and F1-score are used to evaluate the performance of the proposed method. 
The simulation results show that the proposed model achieves above 99.9% accuracy, performs well in the presence of different amounts of noise, and provides better performance than previous studies.</p>","PeriodicalId":54749,"journal":{"name":"Knowledge and Information Systems","volume":"38 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139411507","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Logical assessment formula and its principles for evaluations with inaccurate ground-truth labels","authors":"Yongquan Yang","doi":"10.1007/s10115-023-02047-6","DOIUrl":"https://doi.org/10.1007/s10115-023-02047-6","url":null,"abstract":"<p>Evaluations with accurate ground-truth labels (AGTLs) have been widely employed to assess predictive models for artificial intelligence applications. However, in some specific fields, such as medical histopathology whole slide image analysis, it is quite common that AGTLs are difficult to define precisely or do not exist at all. To alleviate this situation, we propose the logical assessment formula (LAF) and reveal its principles for evaluations with inaccurate ground-truth labels (IAGTLs) via logical reasoning under uncertainty. From the revealed principles of LAF, we summarize the practicability of LAF: (1) LAF can be applied for evaluations with IAGTLs on a more difficult task, able to act like usual strategies for evaluations with AGTLs reasonably; (2) LAF can be applied for evaluations with IAGTLs from the logical perspective on an easier task, unable to act like usual strategies for evaluations with AGTLs confidently.</p>","PeriodicalId":54749,"journal":{"name":"Knowledge and Information Systems","volume":"36 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-01-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139373100","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Noha Ahmed Bayomy, Ayman E. Khedr, Laila A. Abd-Elmegid
{"title":"A configurable mining approach for enhancing the business processes' performance","authors":"Noha Ahmed Bayomy, Ayman E. Khedr, Laila A. Abd-Elmegid","doi":"10.1007/s10115-023-02011-4","DOIUrl":"https://doi.org/10.1007/s10115-023-02011-4","url":null,"abstract":"<p>Business is a war to get the attention you deserve from your enemies, and many competitors strive to gain a prominent position. Organizations are constantly seeking innovative ways of working to stay in a competitive business environment. Business process reengineering (BPR) is one of the management approaches adopted by many organizations to achieve a dramatic increase in performance and reduction in cost. Since the risks involved in BPR projects and their failure rates are very high, it is necessary to find ways to support the success of BPR systematically. The main aim of this article is to implement the proposed model for reengineering business processes (BPs) successfully by integrating critical success factors (CSFs) of BPR with BPs' performance. The model is designed to detect inefficiencies and bottlenecks in business processes, decrease cost and time, increase quality, enhance the financial environment and make business process performance effective and efficient. It also applies association rule mining to examine the link between CSFs and several BPs, and measures business process performance (intended BPR success) by process time, cycle time, quality and cost before and after reengineering. It thereby provides a way to select the appropriate model for each business process. The proposed model is implemented in a real-world case study of the Egyptian tax authority in order to demonstrate its usefulness and efficiency. 
Then, inferred CSFs were applied according to each process, which proved the validity and success of the proposed model.</p>","PeriodicalId":54749,"journal":{"name":"Knowledge and Information Systems","volume":"37 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139078524","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimized neural attention mechanism for aspect-based sentiment analysis framework with optimal polarity-based weighted features","authors":"","doi":"10.1007/s10115-023-01998-0","DOIUrl":"https://doi.org/10.1007/s10115-023-01998-0","url":null,"abstract":"<p>In recent years, sentiment analysis has been broadly investigated as a way to identify whether extracted information is positive, negative or neutral. Sentiment analysis can be performed on social media content, survey responses and reviews. Still, it faces issues while detecting and analyzing social media content. Moreover, a social media network contains indirect sentiments, and natural language ambiguities make it complicated to classify words. Thus, aspect-based sentiment analysis (ABSA) has emerged to develop explicating extraction methods that use syntactic parsers to exploit the relation between sentiment and aspect terms. Along with this extraction method, word embedding is performed through Word2Vec methods to attain a low-dimensional vector representation of text, which may fail to capture valuable information. Thus, this work aims to design a novel ABSA model using an optimized neural network along with optimal text feature extraction. Initially, data collected from the benchmark dataset are passed to pre-processing. This stage applies techniques such as stemming, stop word removal and punctuation removal. The preprocessed data are then passed to the feature extraction phase to extract the aspects. Next comes a deep feature extraction stage, in which a text convolutional neural network and GloVe embedding are used to obtain the deep features. 
Further, the features are concatenated, and the polarity-based weighted features are optimized by an enhanced hybrid optimization algorithm called hybrid Chameleon rat swarm optimization (HCRSO) to improve sentiment analysis performance. HCRSO selects the optimal features and provides the polarity-based weighted features: it separates the polarities, and the weighted features are obtained by multiplying the weights by the polarities. In particular, the polarity-based weighted features, as well as the number of epochs and the hidden neuron count of the neural attention mechanism-based long short-term network (NAM-LSTM), are optimized using the HCRSO algorithm. The weighted features are used with the NAM-LSTM and the proposed HCRSO algorithm to improve model efficiency. Empirically, the proposed method achieves 94% accuracy and 93% specificity. The experimental outcomes thus demonstrate the efficiency of the proposed ABSA model when compared with conventional approaches.</p>","PeriodicalId":54749,"journal":{"name":"Knowledge and Information Systems","volume":"138 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139373247","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}