{"title":"Problems of detecting fuzzy duplicates","authors":"S. Brimzhanova, S. K. Atanov, Moldamurat Khuralay, D. Kalmanova, T. Tabys","doi":"10.1145/3330431.3330455","DOIUrl":"https://doi.org/10.1145/3330431.3330455","url":null,"abstract":"This article discusses the problem of detecting fuzzy duplicates. Recently, much attention has been paid to methods that reduce the computational complexity of detection algorithms by choosing various heuristics. With approximate approaches, a decrease in the duplicate detection rate is observed. An important factor affecting the accuracy and completeness of duplicate detection in comparison problems is the selection of the substantive part. Another key requirement for the quality of fuzzy-duplicate detection algorithms is their resistance to \"small\" data changes and the ability to process them. Among the first studies in the field of finding fuzzy duplicates are the works of U. Manber and N. Heintze. In these works, sequences of adjacent letters are used to construct the sample. The dactogram includes all text substrings of a fixed length. Completeness, accuracy and F-measure were chosen as the main indicators of algorithm quality. The intent was to compare the algorithms by these parameters, and also to determine their mutual correlation and the joint coverage of the initial set of fuzzy-duplicate pairs by different combinations of algorithms.","PeriodicalId":196960,"journal":{"name":"Proceedings of the 5th International Conference on Engineering and MIS","volume":"83 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126178140","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A novel approach for unsupervised learning of software components","authors":"C. Srinivas, C. V. Rao","doi":"10.1145/3330431.3330461","DOIUrl":"https://doi.org/10.1145/3330431.3330461","url":null,"abstract":"Clustering and classification are two important tasks in data mining and machine learning. These tasks have various applications in other related areas of research such as software engineering, text mining, image processing, and bio-informatics. Clustering is an NP-hard problem, i.e. there is no known polynomial-time algorithm that can cluster a given set of input instances. However, approaches for evaluating cluster quality exist in the literature. This paper gives a new approach for software component learning by introducing an incremental learning approach for component clustering. Experiments are conducted by applying the proposed approach to a synthetic dataset, and the results show the importance of the proposed approach in terms of execution time and memory consumed.","PeriodicalId":196960,"journal":{"name":"Proceedings of the 5th International Conference on Engineering and MIS","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121310934","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A novel approach for unsupervised learning of transaction data","authors":"M. PhridviRaj, C. V. Rao","doi":"10.1145/3330431.3330464","DOIUrl":"https://doi.org/10.1145/3330431.3330464","url":null,"abstract":"Incremental clustering is a technique that can be applied when the dataset is not constant and keeps updating. Normally, when k-means clustering is applied and the dataset is modified, the clustering must be redone from the start. Similarly, for the maximum capture procedure proposed in our previous research, the clustering task must be carried out from the start. In this paper, we propose an incremental approach for clustering transaction data which can be used for customer segmentation and other related applications. Experiments are conducted and three approaches are compared in terms of CPU utilization. It is observed that the incremental approach required less CPU utilization.","PeriodicalId":196960,"journal":{"name":"Proceedings of the 5th International Conference on Engineering and MIS","volume":"88 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124376388","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Tree based data fusion approach for mining temporal patterns","authors":"V. Radhakrishna, Shadi A. Aljawarneh, P. Kumar, V. Janaki, Aravind Cheruvu","doi":"10.1145/3330431.3330463","DOIUrl":"https://doi.org/10.1145/3330431.3330463","url":null,"abstract":"Discovering time-profiled temporal patterns from time-stamped transaction datasets was addressed in our previous research, which proposed new support estimation techniques and similarity measures for computing similarity between temporal patterns. This paper proposes a novel approach for discovering temporal patterns by introducing the concept of data fusion with respect to the temporal pattern tree. A tree is generated for each timeslot, and the trees obtained for individual timeslots are then merged, or fused, to get the overall tree for the entire dataset. The concept of tree-based data fusion helps to prune elements efficiently and early in the pattern mining process. A pruning function is also introduced in this paper to prune invalid temporal patterns.","PeriodicalId":196960,"journal":{"name":"Proceedings of the 5th International Conference on Engineering and MIS","volume":"77 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131013300","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cross-platform compilation of programming language Golang for raspberry pi","authors":"S. Brimzhanova, S. K. Atanov, Moldamurat Khuralay, K. S. Kobelekov, L. Gagarina","doi":"10.1145/3330431.3330441","DOIUrl":"https://doi.org/10.1145/3330431.3330441","url":null,"abstract":"This article describes the creation of a cross-platform compilation of the Golang programming language for Raspberry Pi. Golang, or Go, supports type safety and the ability to dynamically enter data, and also contains a rich standard library of functions and built-in data types such as dynamically sized arrays and associative arrays. With the help of multi-threading mechanisms, Go simplifies the distribution of computations and network interactions, while modern data types open up to the programmer a world of flexible and modular code. Programs compile quickly, and the language provides a garbage collector and supports reflection. Golang is a fast, statically typed, compiled language. Working with it gives the impression of using a dynamically typed, interpreted language. A cross-platform compilation of the Golang programming language was created for Raspberry Pi, a single-board computer.","PeriodicalId":196960,"journal":{"name":"Proceedings of the 5th International Conference on Engineering and MIS","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125884801","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fundamentals of innovation in chemical engineering","authors":"N. Kystaubayeva, R. Sharipov, M. T. Gabdullin, B. Utelbayev, E. Suleimenov","doi":"10.1145/3330431.3330467","DOIUrl":"https://doi.org/10.1145/3330431.3330467","url":null,"abstract":"In his work, M. Faraday noted that in all physical and chemical processes there is an analogy of energy phenomena (heat exchange and electric current between material objects, combustion, phase transitions, etc.). For example, in the history of chemistry there was a theory of heat transfer from one material object to another with the help of some liquid, which was called \"heat\" or \"phlogiston\". Phlogiston meant a hypothetical \"ultra-thin substance\", allegedly filling all combustible substances and released from them during combustion. Phlogiston was presented as a weightless liquid evaporating from substances during combustion. Some experiments with heated bodies were, to some extent, so well described in the framework of this \"phlogiston theory\" that it was even possible to predict the results of the process if the initial conditions were known. The reason for the rejection of these views was experiments in which it was found that the \"amount of heat\" is not conserved. That is, by performing work, external forces can produce \"heat\" in arbitrary quantities. In the 1770s, the theory of \"heat\" was refuted by the work of Antoine Lavoisier. However, speaking of the theory of phlogiston, Soddy said: \"the spirit of chemistry pushed her to pure materialism. Later, the defenders of the theory of phlogiston made a fatal mistake, materializing it. With the ascension of the Scales and weighing as a criterion of material existence, phlogiston, as a material substance, was rejected, and the theory itself fell into a completely undeserved disgrace. In this paper, there are very remarkable circumstances: the heat carrier as a material substance was rejected with the addition of the science of Weights and weighing - in other words, an attempt was made to put an experiment to determine the weight of \"caloric content\"; when mixing water, it was believed that it does not undergo any physical changes, since it was dominated by the theory of Arrhenius and the structure of water was considered continuous. We have clearly shown that water has a molecular structure, i.e. when the water is shaken, the triboelectric effects come to the fore, which cause a change in water temperature when shaken. And the value of triboelectric effects depends on the \"spent\"... power or energy.\" The use of the postulates of M. Faraday, the classical equations of thermodynamics, the works of D. Mendeleev and other classics of science allows us to improve the theoretical foundations of chemical engineering.","PeriodicalId":196960,"journal":{"name":"Proceedings of the 5th International Conference on Engineering and MIS","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128414297","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"F0 contour prediction for the Kazakh language","authors":"A. Kaliyev, Yuri N. Matveev, E. Lyakso, S. Rybin","doi":"10.1145/3330431.3330436","DOIUrl":"https://doi.org/10.1145/3330431.3330436","url":null,"abstract":"The article presents work on predicting the fundamental frequency (F0) values for the Kazakh language. The fundamental frequency plays one of the most important roles in the perception of speech, and at the same time modelling continuous F0 is one of the most difficult tasks in the development of intonational speech synthesis systems. The main and obvious difficulty is that a person is able to say the same sentence with different intonations and with different tones. In this work, we used deep neural networks for accurate, high-quality prediction of F0 values as close as possible to the natural sounding of Kazakh speech.","PeriodicalId":196960,"journal":{"name":"Proceedings of the 5th International Conference on Engineering and MIS","volume":"21 3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126035981","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"High dimensional document classification using novel similarity function","authors":"K. Kumar, R. Srinivasan, Elijah Blessing Singh","doi":"10.1145/3330431.3330462","DOIUrl":"https://doi.org/10.1145/3330431.3330462","url":null,"abstract":"Document dimensionality is a major concern when high-dimensionality documents are used for classification. Reducing the dimensionality can have both positive and negative effects. If the dimensionality reduction is not appropriate, then classification performed on the reduced-dimensionality documents may not give good results. Our previous research focused on addressing dimensionality reduction using a novel similarity function, but it did not address text classification. This paper addresses the classification task performed by applying the proposed similarity function. Experimental results show that classifier performance with dimensionality reduction is better than the performance without dimensionality reduction.","PeriodicalId":196960,"journal":{"name":"Proceedings of the 5th International Conference on Engineering and MIS","volume":"106 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121415508","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Environmental monitoring system for analysis of climatic and ecological changes using LoRa technology","authors":"K. S. Duisebekova, Zhibek N. Sarsenova, Viktor T. Pyagay, Zamira N. Tuyakova, N. Duzbayev, A. Aitmagambetov, S. Amanzholova","doi":"10.1145/3330431.3330446","DOIUrl":"https://doi.org/10.1145/3330431.3330446","url":null,"abstract":"In this article, the problem of monitoring of climatic and ecological condition of the region is considered. The problem of environmental pollution in large cities is very significant, and modern monitoring systems have a number of significant drawbacks: low speed of deployment, large size of stations, and high cost of maintenance. The authors propose a new approach to the construction of such systems using the technologies of the \"Internet of things\". This will make it possible to create easily scalable low-cost systems with high energy efficiency through the use of modern communication technologies.","PeriodicalId":196960,"journal":{"name":"Proceedings of the 5th International Conference on Engineering and MIS","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130028353","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A recent survey on challenges in security and privacy in internet of things","authors":"Shadi A. Aljawarneh, V. Radhakrishna, G. R. Kumar","doi":"10.1145/3330431.3330457","DOIUrl":"https://doi.org/10.1145/3330431.3330457","url":null,"abstract":"The computing environment in IoT (Internet of Things) is surrounded with huge amounts of heterogeneous data fulfilling many services in everyone's daily life. Communication in IoT takes place using different devices such as smart phones, sensors, mobile devices, household devices, and embedded equipment. With the use of this variety of devices, the exchange of data in an open internet environment is prone to vulnerabilities. The main cause of these vulnerabilities is weaknesses in the design of software and hardware components. Bridging communication gaps in the IoT is a complex process, as the data comes from heterogeneous sources. An effort is made in this paper to discuss various challenges that are being faced in the security and privacy of data. This will be helpful for researchers who want to pursue research in this area.","PeriodicalId":196960,"journal":{"name":"Proceedings of the 5th International Conference on Engineering and MIS","volume":"195 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116224880","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}