{"title":"Parallel Processing Framework for Efficient Computation of Analyst Consensus Estimates and Measurement of Forecast Accuracy","authors":"Kheng Kua, A. Ignjatović","doi":"10.1109/ICoDSA55874.2022.9862846","DOIUrl":"https://doi.org/10.1109/ICoDSA55874.2022.9862846","url":null,"abstract":"Forecasting of earnings is an integral component in the valuation of companies. Financial analysts provide such forecasts in the form of earnings estimates. Academic study has shown analyst forecasts to be more accurate than time-series forecasts. Historically this has been based on a consensus forecast computed as the mean of analyst forecasts. In our research we consider alternative methods of aggregating consensus forecasts. We take inspiration from iterative filtering methods from Physics, as applied to other fields such as the aggregation of sensor readings and online reviews. In this paper we discuss the challenges of adapting iterative filtering algorithms to the aggregation of analyst earnings estimates. This encompasses modelling as well as technological challenges. We present our solution to the aforementioned challenges and develop a general framework for the systematic assessment of consensus aggregation algorithms. We show that a naïve implementation of this computation takes approximately 4 days to complete. Our framework performs the same computation in approximately 2 hours. We then apply this framework to the assessment of iterative filtering algorithms in the context of aggregating consensus earnings estimates. 
We present preliminary results of our study of the application of iterative filtering algorithms against a simple mean consensus.","PeriodicalId":339135,"journal":{"name":"2022 International Conference on Data Science and Its Applications (ICoDSA)","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128032392","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Effect of p-value on LEACH Protocol Performance for Wireless Sensor Networks","authors":"M. Fauzan, R. Munadi, S. Sumaryo, H. Nuha","doi":"10.1109/ICoDSA55874.2022.9862887","DOIUrl":"https://doi.org/10.1109/ICoDSA55874.2022.9862887","url":null,"abstract":"Today's technology can provide monitoring systems for arable land, in which sensors are arranged to report field conditions so that predictions can be made for better environmental conditions. The larger the cultivated area, the more sensors are needed; as the number of sensors, the volume of data processing, and the computing load increase, problems arise in the network's energy consumption and performance. The use of sensors and the Internet of Things (IoT) is key to moving the world's agriculture onto a more productive and sustainable path. Recent advances in IoT, Wireless Sensor Networks (WSN), and Information and Communication Technology (ICT) have the potential to address some of the environmental, economic, and technical challenges and opportunities. Computing that used to be done in the cloud can now be done at the edge (close to the objects, without data needing to be sent to and processed in the cloud). Monitoring, control, sensors, and actuators can be defined locally, and data transmission and further analysis can continue from the edge to the cloud. The Low Energy Adaptive Clustering Hierarchy (LEACH) protocol has been widely used for WSN. Therefore, the authors present the effect of the p-value in LEACH on WSN performance. 
The experimental results show that the main effect of the parameter is given by p=0.05, which produces the best performance for 100 nodes.","PeriodicalId":339135,"journal":{"name":"2022 International Conference on Data Science and Its Applications (ICoDSA)","volume":"291 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127492007","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Object Detection Analysis Study in Images based on Deep Learning Algorithm","authors":"Christian Hary, Satria Mandala","doi":"10.1109/ICoDSA55874.2022.9862922","DOIUrl":"https://doi.org/10.1109/ICoDSA55874.2022.9862922","url":null,"abstract":"Deep learning is a subfield of machine learning. Computer vision is one of the technological advances that utilizes deep learning in image processing, object classification, and object detection. In object detection, many models have been developed to detect objects with different characteristics, and with so many models available, it takes longer to determine which one suits the needs of a project because the models must be compared with one another. In this study, an analysis was conducted by comparing three models that utilize deep learning to detect car and bus objects, namely Faster-RCNN with ResNet50, SSD with MobileNet, and EfficientDet with D0. Each model is run using TensorFlow Object Detection. The models are trained for 3000 steps on a custom dataset of 52 images. 
Based on experiments, it is known that from the comparison of mAP, Faster-RCNN ResNet50 has the highest score of 0.453, and the lowest is EfficientDet D0 with 0.274; for the comparison of Average Recall, Faster-RCNN ResNet50 has the highest score with 0.337, and the lowest is EfficientDet D0 with 0.190, as well as for model size comparison, EfficientDet D0 has the smallest size with 290 MB, and the largest is Faster-RCNN ResNet50 with 1280 MB.","PeriodicalId":339135,"journal":{"name":"2022 International Conference on Data Science and Its Applications (ICoDSA)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123613382","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analyzing the Impact of Age and Gender for Targeted Advertisements Prediction Model","authors":"Angeline Karen, Michael Christopher, Vania Natalie Aherman, Nunung Nurul Qomariyah, Maria Seraphina Astriani","doi":"10.1109/ICoDSA55874.2022.9862531","DOIUrl":"https://doi.org/10.1109/ICoDSA55874.2022.9862531","url":null,"abstract":"The practice of targeted advertisements has been gaining popularity, especially in this digital era. There are a lot of aspects to take into consideration when creating an efficiently targeted advertisement, such as advertisement details and user backgrounds. Using this information can increase the likelihood of sending the right advertisements to the right demographic. In this paper, we will explore which features have an influence towards the click-through rate of these targeted advertisements. The best models in our experiment are LightGBM and XGBoost with the ROC-AUC score of 0.76 for LightGBM and 0.78 for XGboost. Adding age and gender can improve the results. Our experiment can be insightful for making a better marketing strategy to reach more segmented users in display advertisements.","PeriodicalId":339135,"journal":{"name":"2022 International Conference on Data Science and Its Applications (ICoDSA)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134427676","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Convolutional Neural Network Model for Credit Card Fraud Detection","authors":"Muhammad Liman Gambo, A. Zainal, Mohamad Nizam Kassim","doi":"10.1109/ICoDSA55874.2022.9862930","DOIUrl":"https://doi.org/10.1109/ICoDSA55874.2022.9862930","url":null,"abstract":"Nowadays, online transactions through various ecommerce platforms are becoming more prevalent, and Credit Card (CC) is significantly used in various online transactions. However, Credit Card Fraud (CCF) strategies continue to evolve with the business transformation, causing customers as well as the financial institutions to lose billions of dollars annually. Hence, effective detection of fraudulent transactions initiated by fraudsters from the voluminous array of normal transactions is ever necessary. Hence, a Convolutional Neural Network (CNN) model for credit card fraud detection is proposed in this study using Adaptive Synthetic (ADASYN) sampling technique to address the imbalance dataset. The proposed model has achieved 0.9982, 0.9965, and 0.9999, accuracy, precision, and recall, respectively compared to other existing studies.","PeriodicalId":339135,"journal":{"name":"2022 International Conference on Data Science and Its Applications (ICoDSA)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132699692","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Classification of CDK2 Inhibitor as Anti-Cancer Agent by Using Simulated Annealing-Support Vector Machine Methods","authors":"Riva Yudisa Ikhsanurahman, N. Ikhsan, I. Kurniawan","doi":"10.1109/ICoDSA55874.2022.9862929","DOIUrl":"https://doi.org/10.1109/ICoDSA55874.2022.9862929","url":null,"abstract":"Cancer is a disease that occurs when normal cells divide uncontrollably and attack healthy tissue. This disease is one of the leading causes of death worldwide. There are 10 million cases of cancer deaths based on data from the World Health Organization (WHO). Chemotherapy as a cancer treatment began in 1940 and has been successful since its inception. However, this treatment can harm the body in the long term, so new drug designs are needed to overcome these effects. Generally, anti-cancer drugs can be developed by considering Cyclin-Dependent Kinase 2 (CDK2) as the target. In designing a new drug, one method that can be used to accelerate the process is the quantitative structure-activity relationship (QSAR) method. This study aims to build a QSAR model for classifying anti-cancer agents from CDK2 inhibitors by using the simulated annealing (SA) and support vector machine (SVM) methods. The SA method was used for feature selection, while SVM was used for model prediction. We used a data set obtained from the ChEMBL database with a total of 1,554 samples. 
Based on the results, we found that the best prediction model is obtained from SVM with linear and polynomial kernels, with accuracy and F1 score of 0.986 and 0.987, respectively.","PeriodicalId":339135,"journal":{"name":"2022 International Conference on Data Science and Its Applications (ICoDSA)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134142370","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Designing Green Hospital Non-Medical Waste Management System Based on ERP","authors":"Annisa Fitriani, A. Ridwan, Lutfia Septiningrum","doi":"10.1109/ICoDSA55874.2022.9862867","DOIUrl":"https://doi.org/10.1109/ICoDSA55874.2022.9862867","url":null,"abstract":"The waste produced by hospitals is broadly divided into medical waste and non-medical waste. Non-medical waste is generated from activities in hospitals outside of medical, which comes from kitchens, offices, parks, and yards that can be reused or destroyed. In addition to being a health facility, hospitals can also cause negative impacts, one of which is the waste they produce. These negative impacts include the place of disease transmission, causing environmental pollution and health problems. Many general hospitals still do not have an integrated non-medical waste management system that supports green hospitals. The non-medical waste management process, from the preparation to reporting stages, is still done manually. This manual process can cause several problems, such as non-medical waste monitoring, which cannot be done regularly, and the recording of the waste generated and the waste that has been managed is prone to errors. To support green hospitals in public hospitals, it is necessary to manage the waste generated by hospitals in an integrated manner to facilitate monitoring, reporting, and evaluation. This research will focus on designing and developing a non-medical waste management module using an ERP system that will integrate it with other elements, such as inventory, to make it easier for public hospitals to classify processed waste in storage warehouses. 
Not only that, but this system will simplify the preparation process for management, help determine the amount of waste generated, and can be monitored through non-medical waste indicators, which are then reported in the form of documents, making it easier for companies to analyze results and assist in further decision making.","PeriodicalId":339135,"journal":{"name":"2022 International Conference on Data Science and Its Applications (ICoDSA)","volume":"224 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132356924","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Noise Reduction and Speech Enhancement Using Wiener Filter","authors":"H. Nuha, Ahmad Abo Absa","doi":"10.1109/ICoDSA55874.2022.9862912","DOIUrl":"https://doi.org/10.1109/ICoDSA55874.2022.9862912","url":null,"abstract":"Digital data transmission rates may reach over 2.5 Tb/s using orthogonal frequency division multiplexing (OFDM). Digital speech enhancement is crucial during the pandemic era because most information exchange and communication is performed online. However, not everyone has a private room for digital communication, so indoor background noise may distort the speech during recording. Speech denoising has many benefits, for instance in voice communication or voice recognition, where fast denoising is needed. This paper evaluates the use of the Wiener filter for noise reduction. Enhancing speech distorted by additive noise from only a single observation has been studied and remains a challenging problem. We add noise to a sample of clean speech to obtain noisy speech. We generate noise levels for SNRs from 0 to 0.5 dB in increments of 0.01 dB. We choose low SNRs to represent high additive noise. We then apply Wiener noise reduction to the noisy speech to obtain filtered speech. Finally, we compare the Mean Square Error (MSE) between the filtered speech and the original speech at every noise level. The results show that the noise has been decreased. The non-speech parts now appear better since the noisy parts have been suppressed. 
Our experiment shows that the proposed technique successfully improves the speech in noisy environment up to order of .","PeriodicalId":339135,"journal":{"name":"2022 International Conference on Data Science and Its Applications (ICoDSA)","volume":"62 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130262520","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Aspect Extraction on Restaurant Reviews using Domain-Specific Word Embedding","authors":"Ahmad Satriamulya, A. Romadhony","doi":"10.1109/ICoDSA55874.2022.9862856","DOIUrl":"https://doi.org/10.1109/ICoDSA55874.2022.9862856","url":null,"abstract":"Reviews on the internet can be an important part of a business and can influence the decision making of owners and consumers. Easy access to information in the form of opinions, experiences, and feedback from others can be used as a reference for taking an action. For businesses in the food and beverage sector, consumers usually provide reviews with negative or positive sentiments based on several aspects of the business. The taste of the food, atmosphere, price, and service are examples of aspects that are commonly written about in a review. In this work, aspect extraction is carried out on consumer reviews of restaurants in Indonesia. Reviews on the internet usually contain words that are informal and very domain specific. This is where domain-specific word embedding can be used to reduce the number of out-of-vocabulary (OOV) words and give the model more context about the review text. The model is a deep learning model with a Recurrent Neural Network architecture that uses domain-specific embedding as its word embedding, combined with several attempts to reduce out-of-vocabulary words. 
The model reduces OOV from 17.16% (based on previous research) to 3.62% and achieves an F1-score of 79.54% using the Bi-LSTM model.","PeriodicalId":339135,"journal":{"name":"2022 International Conference on Data Science and Its Applications (ICoDSA)","volume":"62 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126278647","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"AdaBoost Algorithm for Marketplace Product Similarity Detection","authors":"Alief Muhsin M, Dedy Rahman Wijaya, Elis Hernawati, Asti Widayanti","doi":"10.1109/ICoDSA55874.2022.9862816","DOIUrl":"https://doi.org/10.1109/ICoDSA55874.2022.9862816","url":null,"abstract":"A marketplace is a platform that acts as an intermediary between sellers and buyers of products through an online transaction process. A marketplace website therefore acts only as a third party that handles product transactions, covering product ordering and the online payment methods it provides. Examples include Shopee, Lazada, and Tokopedia. These marketplaces carry a wide range of products, such as clothing, staple foods, and electronic devices. With so many products in a marketplace, many listings look the same, yet buyers often do not know that one product and several others are the same. In this study, the author uses a product similarity dataset and the AdaBoost algorithm to obtain high classification results. For classification, the author uses the product titles and images in the dataset to distinguish one product from another. 
Using the AdaBoost algorithm, a classification accuracy of 91.81% is obtained, which indicates that the model developed by the author performs very well in detecting product similarities based on product titles and images in a marketplace.","PeriodicalId":339135,"journal":{"name":"2022 International Conference on Data Science and Its Applications (ICoDSA)","volume":"193 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121020012","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}