{"title":"Extractive Question Answering for Kazakh Language","authors":"Magzhan Shymbayev, Yermek Alimzhanov","doi":"10.1109/SIST58284.2023.10223508","DOIUrl":"https://doi.org/10.1109/SIST58284.2023.10223508","url":null,"abstract":"This article presents the research and development of an extractive question answering system for the Kazakh language based on a BERT-like model. Developing an extractive question answering system requires large training datasets of tens of thousands of annotated question-answer pairs. Such datasets are not available for the majority of languages, including Kazakh. To address this issue, the Kazakh Question Answering Dataset (KazQA) is introduced, which is based on the Stanford Question Answering Dataset (SQuAD) and generated through machine translation using the Google Cloud Translation API. Large pretrained contextual language models, ALBERT and multilingual BERT, are used as baselines and compared with the newly trained monolingual Kazakh model KazBERT. The results demonstrate that the proposed approach can effectively produce question answering systems for the low-resource Kazakh language.","PeriodicalId":367406,"journal":{"name":"2023 IEEE International Conference on Smart Information Systems and Technologies (SIST)","volume":"148 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123423428","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Voice Recognition Methods and Modules for the Development of an Intelligent Virtual Consultant Integrated with WEB-ERP","authors":"A. Abdildayeva, D. Zhyilyssova, G. Nazar","doi":"10.1109/SIST58284.2023.10223552","DOIUrl":"https://doi.org/10.1109/SIST58284.2023.10223552","url":null,"abstract":"This article explores voice recognition methods and modules for the development of an intelligent virtual consultant, which plays an important role as an auxiliary tool when working with business process information systems, and more precisely with ERP systems. The study is intended to characterize an intelligent virtual consultant by evaluating the usefulness and effectiveness of specific functions in a simulated enterprise resource planning (ERP) business process software system. The intelligent virtual consultant is integrated with the Web-ERP prototype. The Mel-frequency cepstral coefficient algorithm was used in conjunction with the Levenberg-Marquardt (LM) and Broyden-Fletcher-Goldfarb-Shanno (BFGS) methods as the basis for the analysis and extraction of features. A comparative analysis of these methods is also given to assess their effectiveness. A major benefit of this virtual consultant is voice input for long text data fields when working in Web-ERP. The system is relevant because it significantly expands the search for business workflows. The developed system makes it possible to convert speech into text by extracting information system instructions. These texts are passed to an ontological database, in which a term-based query is executed to obtain the set of commands available for execution. Implementing an intelligent virtual consultant increases the capabilities of the application: it supports not only user navigation through the system but also navigation and explanation of data using speech synthesis. The prototype developed in this research will be used in a combination of an intelligent virtual consultant and data analysis.","PeriodicalId":367406,"journal":{"name":"2023 IEEE International Conference on Smart Information Systems and Technologies (SIST)","volume":"92 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125330764","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Preservation of Confidentiality Based on Homomorphic Encryption Library for IoT","authors":"Zh. E. Temirbekova, Nurdaulet Kabdygaliyev, Z. Abdiakhmetova, Gulzat Turken","doi":"10.1109/SIST58284.2023.10223494","DOIUrl":"https://doi.org/10.1109/SIST58284.2023.10223494","url":null,"abstract":"The Internet of Things (IoT) is the term used to describe the connectivity of different physical objects, including sensors, appliances, vehicles, and other devices, which are equipped with electronics, software, and sensors to gather and share data. These devices produce a significant volume of confidential information that requires safeguarding against unauthorized access or disclosure. Homomorphic encryption can play a significant role in preserving confidentiality in IoT environments. It enables data to be processed without revealing the underlying data or the results of the computation. This means that sensitive data generated by IoT devices can be encrypted using homomorphic encryption, and computations can be performed on the encrypted data, while maintaining confidentiality. A homomorphic encryption library designed for IoT can allow IoT devices to securely and efficiently perform computations on encrypted data, without exposing the data to any third party or intermediary. This can help prevent data breaches and unauthorized access to sensitive IoT data, which is crucial in many industries, such as healthcare, finance, and transportation. The number of smart devices is growing at an incredible rate, as is the interest of people in these devices. Devices send data wirelessly and are good targets for hackers. To prevent such situations, you must first protect the microcontroller built into the devices. For this purpose, the fully homomorphic encryption library «HomomorphicControllerVersion_01» was developed. The article proposes the architecture of the library of fully homomorphic operations on integers. The library supports basic homomorphic operations (addition, subtraction, multiplication, division) on integers. On the basis of the proposed method of homomorphic division and the architecture of the library, the library of homomorphic operations on integers was implemented. The article also provides measurements of the time required to perform certain operations on encrypted data and analyzes the efficiency of the developed implementation of the library.","PeriodicalId":367406,"journal":{"name":"2023 IEEE International Conference on Smart Information Systems and Technologies (SIST)","volume":"95 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115082631","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Method of Allocation of Labor Resources for IT Project Based on Expert Assessements of Delphi","authors":"M. Gladka, Oleksandr Kuchanskyi, Mykola Kostikov, Rostyslav Lisnevskyi","doi":"10.1109/SIST58284.2023.10223549","DOIUrl":"https://doi.org/10.1109/SIST58284.2023.10223549","url":null,"abstract":"An essential component of the development and implementation of information systems is a clear definition of the project team, which is formed from the available company employees or freelancers under conditions of limited company resources. Using the Delphi method of expert evaluation, the study examines the qualifications and characteristics that are crucial for carrying out project work on the development and implementation of IT projects. The level of significance of each parameter's impact on the project as a whole is assessed; the parameters are ranked according to their degree of importance for project implementation; and the critical parameters that entail the most significant project risks are identified. The study substantiates the importance of selecting project team members whose levels of competence match the qualifications required for their roles, and of appointing responsible executors for project work according to their suitability. The qualification rank of each employee is taken into account when determining the type of work to which that employee will be assigned, for the optimal distribution of all project work. The scientific novelty lies in substantiating and evaluating the comparative importance of the factors that limit the appointment of each developer to perform design work, using the Delphi method of group expert evaluation.","PeriodicalId":367406,"journal":{"name":"2023 IEEE International Conference on Smart Information Systems and Technologies (SIST)","volume":"231 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122675904","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Digital Twin for Additive Manufacturing: Challenges and Future Research Direction","authors":"Nursultan Jyeniskhan, Karina Shaimergenova, Md. Hazrat Ali, E. Shehab","doi":"10.1109/SIST58284.2023.10223556","DOIUrl":"https://doi.org/10.1109/SIST58284.2023.10223556","url":null,"abstract":"Digital twin (DT) and additive manufacturing (AM), also known as 3D printing, are among the most important practices in Industry 4.0. 3D printers are the best candidate for manufacturing geometrically challenging products, given the increase in customized products. However, there are limitations and issues regarding product quality and process optimization. Owing to digital twin technology's ability to provide maximum benefits to the manufacturing field, especially additive manufacturing, it is considered one of the suitable technologies to integrate with it. In recent years, digital twins have received growing attention from both academia and industry. However, implementing digital twin technology poses challenges, so identifying and understanding these challenges is important. In this paper, many challenges are mapped out from research papers and academic work through a narrative literature review. The identified challenges are classified into eight key categories to formulate future research directions. It is important to investigate the identified challenges and provide possible solutions to elevate the functionality of the digital twin model, improve additive manufacturing productivity and efficiency, and ultimately achieve smart manufacturing.","PeriodicalId":367406,"journal":{"name":"2023 IEEE International Conference on Smart Information Systems and Technologies (SIST)","volume":"313 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122932928","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Topic Modeling for Recent Newspaper Publications on Blockchain Technology in Kazakhstan","authors":"Aigerim Mansurova, Talgat Baimaganbetov, A. Nugumanova","doi":"10.1109/SIST58284.2023.10223489","DOIUrl":"https://doi.org/10.1109/SIST58284.2023.10223489","url":null,"abstract":"This paper aims to investigate the recent discussions and trends related to blockchain technology in Kazakhstan through topic modeling of newspaper publications. Utilizing advanced text analytics techniques, the study provides an in-depth analysis of the themes and topics discussed in the Kazakhstani media regarding blockchain and its potential applications in various sectors. The results of the study will shed light on the current state of blockchain awareness and adoption in Kazakhstan, as well as provide insights into the areas that require further attention and investment. The paper aims to serve as a valuable resource for policymakers, businesses, and technology enthusiasts looking to understand the impact of blockchain on Kazakhstan's economy and society.","PeriodicalId":367406,"journal":{"name":"2023 IEEE International Conference on Smart Information Systems and Technologies (SIST)","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121888480","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fundamental Laws and Space-Time Measurements of Ecological Urban-Planning Systems as the Basis of Advanced Analytics for the Sustainable Urbanized Territories Development","authors":"I. Ustinova","doi":"10.1109/SIST58284.2023.10223517","DOIUrl":"https://doi.org/10.1109/SIST58284.2023.10223517","url":null,"abstract":"The results of a study of the methodological foundations of the sustainable development of urbanized territories as ecological urban-planning systems “population ↔ environment” are presented. The study established that these systems are subject to the fundamental law of sustainable development of open systems, the law of conservation of capacity (Lagrange, 1788; Maxwell, 1855), which in the investigated plane is revealed by the ecosystem self-regulation law. Changes in the main condition parameters of these systems (territory, population, demographic capacity, their ratio and the dynamics of changes) obtain certain electromagnetic properties, which made it possible to translate the measurements of these parameters into the language of universal space-time physical quantities (the territory has an area measurement, population size – measure of mass, demographic capacity – measure of power, population density – measure of acceleration and others).","PeriodicalId":367406,"journal":{"name":"2023 IEEE International Conference on Smart Information Systems and Technologies (SIST)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122068987","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Recycling for Recycling: RoI Cropping by Recycling a Pre-Trained Attention Mechanism for Accurate Classification of Recyclables","authors":"Yeonghyeon Park, Myung Jin Kim, Wonseok Park, Juneho Yi","doi":"10.1109/SIST58284.2023.10223525","DOIUrl":"https://doi.org/10.1109/SIST58284.2023.10223525","url":null,"abstract":"Automated classification of recyclable waste is necessary to process a huge amount of recyclables for reuse. This research features recycling a pre-trained attention mechanism for cropping region of interest (RoI) for efficient classification of recyclable waste. We report that an attention mechanism pre-trained with the MNIST dataset, followed by simple morphological operations, successfully provides a bounding box for a recyclable object to be fed into object recognition models such as ResNet50 and EffNetB0. This way, we avoid the cost of annotating large datasets to train state-of-the-art object detection models such as YOLO and R-CNN. Experimental results using the Recyclable Solid Waste Dataset (RSWD) show that our attention-based RoI cropping method is effective enough to separate an object for recognition to achieve accurate classification of recyclables.","PeriodicalId":367406,"journal":{"name":"2023 IEEE International Conference on Smart Information Systems and Technologies (SIST)","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116823323","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improving the Classification Result of Rice Varieties Using Gradient Boosting Methods","authors":"Amalia Utamima, Alexander Alangghya, Tarisa A. Hakim, Aryageraldi Pajung","doi":"10.1109/SIST58284.2023.10223511","DOIUrl":"https://doi.org/10.1109/SIST58284.2023.10223511","url":null,"abstract":"Accurate identification of rice grains is crucial for classifying rice varieties. This study classifies five distinct rice types that share morphological characteristics using four different machine learning methods. A total of seventy-five thousand records, fifteen thousand for each variety of rice grain, are collected from previous research. The machine learning methods used in this study are the Gradient Boosting method and its variants. The experimental results show that Light Gradient Boosting Machine achieved the highest classification success rate among the compared methods, with an accuracy of 98.14%.","PeriodicalId":367406,"journal":{"name":"2023 IEEE International Conference on Smart Information Systems and Technologies (SIST)","volume":"115 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117263175","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Research on Creating a Data Warehouse Based on E-Commerce","authors":"Gulzat Turken, Van Pey, Z. Abdiakhmetova, Zh. E. Temirbekova","doi":"10.1109/SIST58284.2023.10223542","DOIUrl":"https://doi.org/10.1109/SIST58284.2023.10223542","url":null,"abstract":"With the popularization of the internet and the rapid development of science and technology, “online shopping” has become the norm in people's lives, and the e-commerce industry is booming. In addition, it has led to an increase in logistics. In today's business wars, many companies strive to develop better than enterprises of the same type and continue to improve their information capabilities and level. This paper aims to solve problems such as the growth of massive e-commerce logistics data and the phenomenon of data isolation in various business systems. The overall data warehouse is designed and constructed in a Hadoop cluster environment, and the data warehouse tool Hive is used to process data. For the extraction stage of ETL, the Sqoop and Flume tools are used to retrieve business data and log data; for other aspects of ETL, we use Scala and Java to process and filter data and upload it to HDFS. The data warehouse is divided into levels and subject areas to simplify data management. Under the design of the entire system and data warehouse architecture, the conceptual, logical, and physical models of the data warehouse are developed, and the star model is selected as the dimensional model. Finally, the application and implementation of a data warehouse based on e-commerce logistics is demonstrated. The development of a data warehouse based on e-commerce logistics not only ensures that e-commerce companies receive logistics information in a timely manner, but also enables decision makers to adjust logistics strategies in a timely manner based on data, which can also improve user satisfaction and experience and reduce costs.","PeriodicalId":367406,"journal":{"name":"2023 IEEE International Conference on Smart Information Systems and Technologies (SIST)","volume":"75 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128387833","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}