Satya Samudrala, N. Behl, Vikrant Shimpi, Deepa Vaidyanathan, M. Natu
{"title":"Predicting batch process to prevent business outages","authors":"Satya Samudrala, N. Behl, Vikrant Shimpi, Deepa Vaidyanathan, M. Natu","doi":"10.1109/DSAA53316.2021.9564237","DOIUrl":"https://doi.org/10.1109/DSAA53316.2021.9564237","url":null,"abstract":"Today's enterprises heavily rely on their batch systems to ensure smooth business operation. Any delay in these processes has direct impact on sales, revenue, customer experience, and brand image. In this paper, we present an approach to predict these batch processes and generate ahead-of-time notifications of potential batch-induced outages. We present several case-studies to demonstrate the effectiveness of the proposed solution in various real-world scenarios.","PeriodicalId":129612,"journal":{"name":"2021 IEEE 8th International Conference on Data Science and Advanced Analytics (DSAA)","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124233327","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Safa Boudabous, S. Clémençon, H. Labiod, Julian Garbiso
{"title":"Dynamic Graph Convolutional LSTM application for traffic flow estimation from error-prone measurements: results and transferability analysis","authors":"Safa Boudabous, S. Clémençon, H. Labiod, Julian Garbiso","doi":"10.1109/DSAA53316.2021.9564245","DOIUrl":"https://doi.org/10.1109/DSAA53316.2021.9564245","url":null,"abstract":"Technological advances in the transportation and automotive industry have led to the use of new types of sensing systems that are more cost-effective and better adapted to large-scale dense deployment. These sensing techniques allow traffic measurement time series to be gathered continuously at different geospatial locations. The accuracy of the raw measurements is often hindered by factors related to the sensing environment and the sensing process itself, so the measurements fail to capture the short-term traffic variations crucial for real-time traffic monitoring. In this paper, we propose the DGC-LSTM model for area-wide traffic estimation from error-prone measurement time series. The backbone of the DGC-LSTM model is a graph convolutional Long Short Term Memory model with a dynamic adjacency matrix. The adjacency matrix is learned and optimized during the model training. The adjacency matrix values are estimated from the set of contextual features that impact the dynamicity of the dependencies in both the spatial and temporal dimensions. A realistic synthetic labelled Bluetooth counts dataset is used for model evaluation. Lastly, we highlight the importance of transfer learning methods to improve the model applicability by ensuring model adaptation to the new deployment site while avoiding the extensive data-labelling effort.","PeriodicalId":129612,"journal":{"name":"2021 IEEE 8th International Conference on Data Science and Advanced Analytics (DSAA)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121273374","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Venkata Sai Vivek Uddagiri, Shankaralingam Ramalingam, M. Rahat, P. Mashhadi
{"title":"Predicting hybrid vehicles' fuel and electric consumption using multitask learning","authors":"Venkata Sai Vivek Uddagiri, Shankaralingam Ramalingam, M. Rahat, P. Mashhadi","doi":"10.1109/DSAA53316.2021.9564121","DOIUrl":"https://doi.org/10.1109/DSAA53316.2021.9564121","url":null,"abstract":"Predicting the energy (fuel and electric) consumption of hybrid vehicles is important at several levels: for the vehicle industry as a whole and for individuals, and it can also pave the way towards a more sustainable future. Despite its importance, providing accurate predictions is quite a challenging task. Many essential factors impact energy consumption, including travel time and average speed; needless to say, these features are not available beforehand. However, these factors are available in our dataset. To use these factors effectively, in this paper, we propose including them as different tasks in a multitask setting to help our main problem of predicting energy consumption. The promise of this approach is that since these tasks are relevant, learning them together would provide a common feature space sharing information about all tasks. More importantly, this shared feature space would carry important information helping energy consumption prediction in particular. In multitask learning, two important issues are task dominance and conflicting gradients of different tasks. Different studies have addressed these two separately. In this paper, we propose a method tackling these two problems simultaneously. We show experimentally the success of this method in comparison to the state of the art.","PeriodicalId":129612,"journal":{"name":"2021 IEEE 8th International Conference on Data Science and Advanced Analytics (DSAA)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116882296","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Leilani H. Gilpin, Vishnu Penubarthi, Lalana Kagal
{"title":"Explaining Multimodal Errors in Autonomous Vehicles","authors":"Leilani H. Gilpin, Vishnu Penubarthi, Lalana Kagal","doi":"10.1109/DSAA53316.2021.9564178","DOIUrl":"https://doi.org/10.1109/DSAA53316.2021.9564178","url":null,"abstract":"Complex machines, such as autonomous vehicles, are unable to reconcile conflicting behaviors between their underlying subsystems, which leads to accidents and other negative consequences. Existing approaches to error and anomaly detection are not equipped to detect and mitigate inconsistencies among parts. In this paper, we present “Anomaly Detection through Explanations” or ADE, a multimodal monitoring architecture to reconcile critical discrepancies under uncertainty. ADE uses symbolic explanations as a debugging language, by examining underlying reasons for those decisions. Further, when decisions conflict, our method uses a synthesizer, along with a priority hierarchy, to process subsystem outputs along with their underlying reasons and transparently judges the conflicts. We show the accuracy and performance of ADE on autonomous vehicle scenarios and data, and discuss other error evaluations for future work.","PeriodicalId":129612,"journal":{"name":"2021 IEEE 8th International Conference on Data Science and Advanced Analytics (DSAA)","volume":"46 2","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132982448","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Explainable Artificial Intelligence for Data Science on Customer Churn","authors":"C. Leung, Adam G. M. Pazdor, Joglas Souza","doi":"10.1109/DSAA53316.2021.9564166","DOIUrl":"https://doi.org/10.1109/DSAA53316.2021.9564166","url":null,"abstract":"Machine learning, as a tool, has become critical for decision-making mechanisms in the modern world. It has applications in a wide range of areas, including finance, healthcare, justice, and transportation. Unfortunately, machine learning is often considered as a “black box”. As such, recommendations made by machine learning techniques, as well as the reasoning behind those recommendations, are not easily understood by humans. In this paper, we present an explainable artificial intelligence (XAI) solution that integrates and enhances state-of-the-art techniques to produce understandable and practical explanations to end-users. To evaluate the effectiveness of our XAI solution for data science, we conduct a case study on applying our solution to explaining a random forest-based predictive model on customer churn. Results show the practicality and usefulness of our XAI solution in practical applications such as data science on customer churn.","PeriodicalId":129612,"journal":{"name":"2021 IEEE 8th International Conference on Data Science and Advanced Analytics (DSAA)","volume":"100 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133306817","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Narjes Davari, Bruno Veloso, Rita P. Ribeiro, P. Pereira, J. Gama
{"title":"Predictive maintenance based on anomaly detection using deep learning for air production unit in the railway industry","authors":"Narjes Davari, Bruno Veloso, Rita P. Ribeiro, P. Pereira, J. Gama","doi":"10.1109/DSAA53316.2021.9564181","DOIUrl":"https://doi.org/10.1109/DSAA53316.2021.9564181","url":null,"abstract":"Predictive maintenance methods assist early detection of failures and errors in machinery before they reach critical stages. This study proposes a data-driven predictive maintenance framework for the air production unit (APU) system of a train of Metro do Porto by deep learning based on a sparse autoencoder (SAE) network that efficiently detects abnormal data and considerably reduces the false alarm rate. Several analog and digital sensors installed on the APU system allow the detection of behavioral changes and deviations from the normal pattern by analyzing the collected data. We implemented two versions of the SAE network in which we inputted analog sensors data and digital sensors data, and the experimental results show that the failures due to air leakage problems are predicted by analog sensors data while other types of failures are identified by digital sensors data. A low pass filter is applied to the output of the SAE network, and a sequence of abnormal data is used as an alarm for the APU system failure. Performance indicators of the SAE network with digital sensors data, in terms of F1 Score, Recall, and Precision, are respectively, about 33.6%, 42%, and 28% better than those of the SAE network with analog sensors data. For comparison purposes, we also implemented a variational autoencoder (VAE). The results show that SAE performance is better than that of VAE by 14%, 77%, and 37% respectively, for Recall, Precision and F1 Score.","PeriodicalId":129612,"journal":{"name":"2021 IEEE 8th International Conference on Data Science and Advanced Analytics (DSAA)","volume":"89 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132964990","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mayukh Bhattacharjee, Hema Sri Kambhampati, Paula Branco, L. Torgo
{"title":"Active Learning for Imbalanced Domains: the ALOD and ALOD-RE Algorithms","authors":"Mayukh Bhattacharjee, Hema Sri Kambhampati, Paula Branco, L. Torgo","doi":"10.1109/DSAA53316.2021.9564145","DOIUrl":"https://doi.org/10.1109/DSAA53316.2021.9564145","url":null,"abstract":"Active learning strategies are used to acquire an enlarged labelled training set that allows the learner to achieve a better performance. To this end, unlabelled instances are carefully selected for labelling by a human expert in order to achieve the best performance with the smallest number of questions to this oracle. Several techniques exist to select the most informative samples within active learning strategies. However, the effectiveness of these methods when applied to problems with imbalanced classes has not been studied before. In this paper, we focus on the improvement of instance selection strategies for active learning techniques in the presence of class imbalance. In an imbalanced setting, learning algorithms have difficulty focusing on the minority class due to its under-representation. Still, this is typically the class of interest for the end user. This mismatch between the class distribution and the goals of the end user is known as the class imbalance problem. In an active learning setting for class imbalance problems, this becomes an even more challenging issue. We propose two novel active learning algorithms, ALOD and ALOD-RE, centered around selecting the most informative samples to be labelled, while considering the selection of possible minority class cases and the generation of synthetic minority class examples to improve the learner's performance. To this end, our two proposed solutions combine active learning procedures, resampling strategies, and anomaly detection methods. Through an extensive set of experiments we show that the incorporation of outlier detection and resampling techniques in the active learning procedure benefits the learner's performance on imbalanced domains. The performance advantages of our proposed ALOD and ALOD-RE algorithms are clearly supported by our experimental results.","PeriodicalId":129612,"journal":{"name":"2021 IEEE 8th International Conference on Data Science and Advanced Analytics (DSAA)","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133303126","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
J. Jakubowski, P. Stanisz, Szymon Bobek, G. J. Nalepa
{"title":"Explainable anomaly detection for Hot-rolling industrial process*","authors":"J. Jakubowski, P. Stanisz, Szymon Bobek, G. J. Nalepa","doi":"10.1109/DSAA53316.2021.9564228","DOIUrl":"https://doi.org/10.1109/DSAA53316.2021.9564228","url":null,"abstract":"Anomaly detection is an emerging trend in manufacturing processes and may be considered part of the Industry 4.0 revolution. It can serve both as a diagnostic tool in predictive maintenance tasks and as a trace-back mechanism for assessing the quality of production or services. In this paper we describe an approach for explainable anomaly detection in industrial data which contains sequential and static features. We based our solution on a modified autoencoder architecture with Long Short-Term Memory layers. To address the problem of explainability in deep learning and find the origin of the anomalies we have engaged the SHAP method, which gives both local and global explanations of the model. Analysis of SHAP explanations allowed us to determine the source of the majority of anomalies detected by the deep learning model. We demonstrated the feasibility of our approach on a synthetic, reproducible dataset and on real-life data gathered from a hot-rolling industrial process.","PeriodicalId":129612,"journal":{"name":"2021 IEEE 8th International Conference on Data Science and Advanced Analytics (DSAA)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133322858","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Opportunistic Resource Allocation driven by pattern recognition to attain High Availability in Vision based Edge Compute Systems","authors":"Muhammad Sibtain Hamayun, O. Ivina","doi":"10.1109/DSAA53316.2021.9564215","DOIUrl":"https://doi.org/10.1109/DSAA53316.2021.9564215","url":null,"abstract":"Managing High Availability (HA) among Computer Vision Systems deployed on Edge devices requires expensive hardware redundancy. The cost of scaling such systems becomes prohibitive. We present an alternative approach for managing high availability by building this capacity at an application layer. To this end, we propose the use of pattern recognition components to make the application context aware. This intelligent application is then itself able to decide which Edge modules are running redundant or less significant processes. For example, if the module is processing uneventful video streams, the application may decide to make this compute available to some other module. In the event of hardware failures, this application can then provision workloads on idle hardware (or hardware running less significant workload).","PeriodicalId":129612,"journal":{"name":"2021 IEEE 8th International Conference on Data Science and Advanced Analytics (DSAA)","volume":"22 6S 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122814706","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Data-Driven Approach based on Tensor Completion for Replacing “Physical Sensors” with “Virtual Sensors”","authors":"Noorali Raeeji Yanehsari, Hadi Fanaee-T, M. Rahat","doi":"10.1109/DSAA53316.2021.9564118","DOIUrl":"https://doi.org/10.1109/DSAA53316.2021.9564118","url":null,"abstract":"Sensors are being used in many industrial applications for equipment health monitoring and anomaly detection. However, the operation and maintenance of these sensors are sometimes costly. Thus companies are interested in reducing the number of required sensors as much as possible. The straightforward solution is to check the prediction power of sensors and eliminate those sensors with limited prediction capabilities. However, this is not an optimal solution, because if we discard the identified sensors, their historical data will no longer be utilized either. Yet such historical data can typically help improve the remaining sensors' signal power, and abolishing them does not seem to be the right solution. Therefore, we propose the first data-driven approach based on tensor completion for re-utilizing data of removed sensors and the remaining sensors to create virtual sensors. We applied the proposed method on vibration sensors of high-speed separators, operating with five sensors. The producer company was interested in reducing the sensors to two. But with the aid of tensor completion-based virtual sensors, we show that we can safely keep only one sensor and use four virtual sensors that give detection power almost equal to that obtained when we keep only two physical sensors.","PeriodicalId":129612,"journal":{"name":"2021 IEEE 8th International Conference on Data Science and Advanced Analytics (DSAA)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121085831","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}