Doruk Tıktıklar, Gürsel Baltaoğlu, Efsa Çakır, Zeynep Küçük, M. Aktaş
On the Comparative Analysis of Sequence Mining Algorithms: Case Study in Telecommunications
2021 6th International Conference on Computer Science and Engineering (UBMK), 2021-09-15
DOI: 10.1109/UBMK52708.2021.9558935
Abstract
This paper examines existing sequence mining algorithms, which are used in many domains, including cyber-security, telecommunications, user behaviour analysis, and air quality pattern detection. We outline the underlying principles of representative sequence mining algorithms and introduce a methodology for comparing them. To test the methodology, we provide a prototype testing framework. We conduct a comprehensive experimental study on publicly available data sets, a real-life telecommunications data set, and data sets produced by a data generator. We compare the GSP, PrefixSpan, and CMRules algorithms and conclude that which of the three is fastest depends on the data set. Furthermore, we investigate situations in which sequential pattern mining algorithms can be used in place of sequential rule mining algorithms.
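To make the notion of a frequent sequential pattern concrete, the following is a minimal sketch of the prefix-projection idea behind PrefixSpan. It is not the implementation evaluated in the paper. It assumes each sequence is a list of single items rather than itemsets, and the function name `prefixspan`, the `min_support` parameter, and the toy data are illustrative choices.

```python
from collections import defaultdict


def prefixspan(sequences, min_support):
    """Return {pattern: support} for every frequent sequential pattern.

    Simplifying assumption: each sequence is a list of single items,
    not a list of itemsets as in the general formulation.
    """
    results = {}

    def mine(prefix, projected):
        # projected holds (sequence index, start position) pairs, i.e. the
        # suffixes that remain after matching the current prefix.
        counts = defaultdict(int)
        for seq_idx, start in projected:
            # Count each item at most once per sequence.
            for item in set(sequences[seq_idx][start:]):
                counts[item] += 1

        for item, support in sorted(counts.items()):
            if support < min_support:
                continue
            pattern = prefix + (item,)
            results[pattern] = support

            # Project on the first occurrence of item in each suffix.
            new_projected = []
            for seq_idx, start in projected:
                try:
                    pos = sequences[seq_idx].index(item, start)
                except ValueError:
                    continue  # item does not occur in this suffix
                new_projected.append((seq_idx, pos + 1))
            mine(pattern, new_projected)

    mine((), [(i, 0) for i in range(len(sequences))])
    return results


# Hypothetical toy database of four event sequences.
toy_sequences = [
    ["a", "b", "c"],
    ["a", "c"],
    ["b", "a", "c"],
    ["a", "b"],
]
toy_patterns = prefixspan(toy_sequences, min_support=2)
```

On this toy database the pattern `("a", "c")` is supported by three of the four sequences, while `("a", "b", "c")` occurs in only one and is pruned. GSP would reach the same set of patterns by level-wise candidate generation rather than projection, which is one reason relative running times can vary with the shape of the data.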