2021 the 5th International Conference on Information System and Data Mining: Latest Literature

Weighted Ensemble of Neural and Probabilistic Graphical Models for Click Prediction
2021 the 5th International Conference on Information System and Data Mining Pub Date : 2021-05-27 DOI: 10.1145/3471287.3471307
Kritarth Bisht, Seba Susan
Abstract: Predicting user behavior in web mining is an important problem with commercial implications. The user response to search engine results is crucial for understanding the relative popularity of websites and market trends. The most popular way of understanding user interests is via click models, which predict whether a user will click on a search engine result based on past observations. There are two main categories of click models: neural network based models and probabilistic graphical models. In this paper, we combine the strengths of both approaches by presenting a weighted ensemble of both types of models. The weighted sum of softmax scores integrates the predictions of the individual models. Assigning higher weights to the neural models is found to improve the performance of the ensemble. The AUC and perplexity scores of our weighted ensemble model improve on the state of the art, as shown by experiments on the benchmark Tiangong-ST dataset.
Citations: 3
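The weighted sum of softmax scores described in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the two-class [no-click, click] layout, the example logits, and the 0.7/0.3 weights are all assumptions chosen for the example.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_ensemble(model_logits, weights):
    """Combine per-model softmax scores with a normalized weighted sum."""
    if len(model_logits) != len(weights):
        raise ValueError("need exactly one weight per model")
    total_weight = sum(weights)
    combined = [0.0] * len(model_logits[0])
    for logits, weight in zip(model_logits, weights):
        for i, prob in enumerate(softmax(logits)):
            combined[i] += (weight / total_weight) * prob
    return combined

# Hypothetical [no-click, click] logits from one neural click model and one
# probabilistic graphical model (values invented for illustration).
neural_logits = [0.2, 1.4]
pgm_logits = [0.9, 0.3]

# The neural model gets the larger weight, as the abstract reports works better.
click_scores = weighted_ensemble([neural_logits, pgm_logits], weights=[0.7, 0.3])
predicted_click = click_scores[1] > 0.5
```

Because each model's softmax output sums to 1 and the weights are normalized, the combined scores also sum to 1 and can be read as an ensemble click probability.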
Characterization of the organizational climate in public schools from the teacher's perception using the Estanones scale
2021 the 5th International Conference on Information System and Data Mining Pub Date : 2021-05-27 DOI: 10.1145/3471287.3471310
Karim Roca, Belinda Navarro, Hector Carlos, Edwin Delgado, M. Ore
Abstract: The objective of this work is to determine the degree, direction, and significance of the relationship between transformational leadership and organizational climate among teachers in publicly managed educational institutions. The stratified random sample consisted of 120 teachers. The research followed a quantitative approach with a correlational, cross-sectional design. The information was collected with the Transformational Leadership Scale and the Organizational Climate Scale. The content validity and reliability of the instruments were corroborated according to the standards of the scientific community with Aiken's validity coefficient, Cronbach's alpha coefficient, and the Kuder-Richardson coefficient (KR-20), respectively. The statistical analysis used the Estanones scale to describe the qualitative levels of the variables and the parametric Pearson correlation coefficient (r) for the hypothesis test. The results showed direct correlations of moderate intensity, while the dimension of inspirational communication showed a low direct correlation with the organizational climate. Finally, the findings were statistically significant at a probability level of 0.05.
Citations: 0
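The two statistical tools named in the abstract above can be illustrated with a short sketch. It assumes the commonly cited form of the Estanones scale, in which the low/medium/high cut points sit at the mean minus and plus 0.75 standard deviations; the abstract does not state the constant, so `k=0.75` is an assumption, and the scores below are invented.

```python
import math
import statistics

def estanones_cutoffs(scores, k=0.75):
    """Return (low_cut, high_cut) as mean -/+ k * sample standard deviation.

    k = 0.75 is the commonly cited constant and is an assumption here.
    """
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)
    return mean - k * sd, mean + k * sd

def estanones_level(score, low_cut, high_cut):
    """Map a raw score to a qualitative level."""
    if score < low_cut:
        return "low"
    if score > high_cut:
        return "high"
    return "medium"

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cross / (spread_x * spread_y)

# Invented scores for six teachers (not data from the study).
leadership = [10, 12, 14, 16, 18, 20]
climate = [30, 28, 35, 40, 38, 45]

low_cut, high_cut = estanones_cutoffs(leadership)
levels = [estanones_level(s, low_cut, high_cut) for s in leadership]
r = pearson_r(leadership, climate)
```

A positive `r` corresponds to what the abstract calls a direct correlation; its magnitude is what the authors grade as low or moderate.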
An Application of Generative Adversarial Networks for Robust Inference in Computational Fluid Dynamics
2021 the 5th International Conference on Information System and Data Mining Pub Date : 2021-05-27 DOI: 10.1145/3471287.3471304
Chaity Banerjee, Chad Lilian, D. Reasor, E. Pasiliao, Tathagata Mukherjee
Abstract: In this paper we propose a robust learning pipeline for inference in computational fluid dynamics (CFD) systems in the presence of faulty sensor data. The standard methods for handling faulty sensor data involve outlier detection techniques, which assume that the faulty data is generated from the tail regions of the underlying data distribution and hence can be eliminated by modeling the high-probability regions of the distribution. However, this assumption is not always true: subtle faults in sensors can lead to the recording of faulty data that can be thought of as being generated from a subtly perturbed version of the underlying distribution. Methods based on outlier detection will fail under these settings, and hence novel approaches are required for eliminating faulty data in such systems. In this work we explore the use of a Generative Adversarial Network (GAN) for this purpose. We train the generator network of the GAN to generate "fake" sensor data that mimics the distribution of the real data, albeit a slightly perturbed one. We use this to train a discriminator network, which learns to distinguish between the "real" data and the "fake" data produced by the generator. This discriminator is then used to filter out faulty sensor data generated from a perturbed version of the distribution generating the real data. We also build a simple regressor that uses the trained discriminator to perform robust regression on the CFD data after eliminating faulty sensor data. We tested the robust regression pipeline with CFD data for predicting fluid flow characteristics (specifically the angle of attack (AoA)) over a 2D foil. Our discriminator trained in a GAN framework could eliminate faulty sensor data, generated using the trained generator, with approximately 100% efficiency. The filtered data is then used for inference of the fluid flow parameters using the regressor.
Citations: 1
The Application of Machine Learning in Bitcoin Ransomware Family Prediction
2021 the 5th International Conference on Information System and Data Mining Pub Date : 2021-05-27 DOI: 10.1145/3471287.3471300
Shengyun Xu
Abstract: In recent years, ransomware attacks have become increasingly rampant, causing heavy losses for many large companies and financial institutions. Bitcoin is a means of payment demanded by ransomware families. By comparing and analyzing the characteristics of Bitcoin transactions, we can predict the ransomware family involved. In this paper, machine learning algorithms are therefore used to build a prediction method for ransomware families, with the aim of helping attacked institutions avoid extortion. Traditionally, identifying a ransomware family has relied on human experience and subjective judgment rather than on accurate, batch analysis of Bitcoin transactions. In this paper, a large number of known data sets of Bitcoin transaction features are used for analysis and modeling. First, we carried out descriptive statistical analysis to explore how ransomware families differ in their Bitcoin trading behavior. Next, we used a series of machine learning models to build a prediction model that identifies and classifies ransomware families, so as to help avoid financial losses from ransomware. Finally, we found that the ransomware family was most significantly affected by year. In addition, the Boosting model achieved the highest accuracy, with a test error of only about 3%.
Citations: 2
Selection and Verification of Privacy Parameters for Local Differentially Private Data Aggregation
2021 the 5th International Conference on Information System and Data Mining Pub Date : 2021-05-27 DOI: 10.1145/3471287.3471306
Snehkumar Shahani, Abraham Jibi, R. Venkateswaran
Abstract: Acquiring and aggregating data from a group of individuals is crucial for studying their general behavior. Differentially Private (DP) techniques, characterized by the parameter ϵ, help to protect the Individually Identifiable Data (IID) of individuals participating in such data collection. However, such techniques reduce the usefulness of the data, leading to a trade-off between usefulness and privacy and making the selection of ϵ an important problem before data acquisition. In this work, we use a mathematical formalism to estimate usefulness and privacy for the sum query as the aggregate analysis under the local model of privacy. The mathematical relation enables the application of a variety of optimization techniques, discussed in the work, to select an optimal value of ϵ. Existing methods for selecting ϵ are based on financial parameters, but they rely heavily on past data and domain knowledge, which may not be available in many cases. To address this, we provide knee-point based recommendations along with a selection criterion for choosing the method of recommendation depending on the availability of information. This allows analysts to make informed decisions while negotiating the value of ϵ. Our experiments on synthetic and real-world datasets unambiguously demonstrate the strength of the mathematical model and the recommended values.
Citations: 1
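The usefulness versus privacy trade-off for a sum query under the local model can be made concrete with a small sketch. The abstract does not say which mechanism the authors use, so the Laplace mechanism below is an assumption, as are the value range [0, 10], the sample size, and all function names.

```python
import math
import random

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def local_report(value, epsilon, lo, hi, rng):
    """Each user clips their own value and adds noise before sending it."""
    clipped = min(max(value, lo), hi)
    return clipped + laplace_noise((hi - lo) / epsilon, rng)

def private_sum(values, epsilon, lo, hi, rng):
    """The aggregator only ever sees the noisy reports and adds them up."""
    return sum(local_report(v, epsilon, lo, hi, rng) for v in values)

def noise_std_of_sum(n, epsilon, lo, hi):
    """Standard deviation of the total noise: sqrt(2n) * (hi - lo) / epsilon."""
    return math.sqrt(2 * n) * (hi - lo) / epsilon

rng = random.Random(0)
values = [rng.uniform(0.0, 10.0) for _ in range(1000)]  # synthetic data
true_sum = sum(values)

# Smaller epsilon means stronger privacy and a noisier (less useful) sum.
for epsilon in (0.5, 1.0, 2.0):
    estimate = private_sum(values, epsilon, 0.0, 10.0, rng)
    expected_std = noise_std_of_sum(len(values), epsilon, 0.0, 10.0)
    print(epsilon, round(abs(estimate - true_sum), 1), round(expected_std, 1))
```

The closed-form `noise_std_of_sum` is the kind of relation that makes ϵ selectable by optimization: the error falls as 1/ϵ, so a knee point on that curve marks where extra privacy loss stops buying much accuracy.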
Development of Virtual Skill Trainers and Their Validation Study Analysis Using Machine Learning
2021 the 5th International Conference on Information System and Data Mining Pub Date : 2021-05-27 DOI: 10.1145/3471287.3471296
Seema Shedage, Jake Farmer, Doga Demirel, Tansel Halic, S. Kockara, V. Arikatla, K. Sexton, Shahryar Ahmadi
Abstract: Minimally invasive skills assessment is important in developing competent surgical simulators and executing reliable skills evaluation [9]. Arthroscopic and laparoscopic surgeries are considered Minimally Invasive Surgeries (MIS). In MIS, the surgeon operates through small incisions with specialized narrow instruments, fiberoptic lights, and a monitor. Arthroscopic surgery is used to diagnose and treat joint problems, and laparoscopic procedures are performed on the abdominal cavity. Due to non-natural hand-eye coordination, a narrow field of view, and limited instrument control, MIS is challenging to master. We analyze data from two simulators, the Virtual Arthroscopic Tear Diagnosis and Evaluation Platform (VATDEP) and the Gentleness Simulator. Both simulators went through validation studies with human subjects. We recorded simulation data during the validation studies, such as tool motion, position, and task time. The recorded data went through preprocessing; after data cleaning, we extracted features from the recorded data and normalized them. The normalized features were used as input to various machine learning algorithms, including K-nearest neighbor (KNN), support vector machine (SVM), and logistic regression (LR). The average accuracy was evaluated through k-fold cross-validation. The proposed methods were validated using 10 subjects (5 experts, 5 novices) for the VATDEP simulator and 23 subjects (4 experts, 19 novices) for the Gentleness Simulator. The results show a significant difference between the expert and novice populations, with p < 0.05 using the Mann-Whitney U-test. The average accuracy of the classification algorithms is 74% for the VATDEP simulator and 80% for the Gentleness Simulator. The results show that the normalized features with KNN, SVM, and LR classifiers can provide accurate classification of experts and novices. The evaluation technique proposed in this study can advance surgical training by providing appropriate feedback to trainees to evaluate proficiency.
Citations: 0
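The pipeline in the abstract above (normalize features, classify, score with k-fold cross-validation) can be sketched in plain Python for the KNN case. Everything concrete here is an assumption for illustration: the two features (task time and tool path length), the ten invented subjects, and the choice of min-max normalization, which the abstract does not specify.

```python
import math

def min_max_normalize(rows):
    """Rescale every feature column to the [0, 1] range."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    spans = [(max(col) - min(col)) or 1.0 for col in columns]
    return [[(v - lo) / span for v, lo, span in zip(row, lows, spans)]
            for row in rows]

def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training points closest to the query."""
    nearest = sorted(zip(train_x, train_y),
                     key=lambda pair: math.dist(pair[0], query))[:k]
    votes = [label for _, label in nearest]
    return max(set(votes), key=votes.count)

def k_fold_accuracy(xs, ys, folds=5, k=3):
    """Average accuracy over folds; fold f tests on indices f, f+folds, ..."""
    accuracies = []
    for f in range(folds):
        test_idx = set(range(f, len(xs), folds))
        train_x = [x for i, x in enumerate(xs) if i not in test_idx]
        train_y = [y for i, y in enumerate(ys) if i not in test_idx]
        correct = sum(knn_predict(train_x, train_y, xs[i], k) == ys[i]
                      for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / folds

# Invented [task_time_s, path_length_cm] features for 5 experts and 5 novices.
features = [[42, 110], [95, 260], [38, 120], [88, 240], [45, 105],
            [102, 300], [40, 115], [91, 255], [47, 125], [99, 280]]
labels = ["expert", "novice"] * 5

accuracy = k_fold_accuracy(min_max_normalize(features), labels, folds=5, k=3)
```

The toy data is deliberately well separated, so it says nothing about the accuracies reported in the abstract. Note also that normalizing before splitting lets test-fold ranges influence the scaling; a stricter setup would fit the normalization on each training fold only.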
Nepal Stock Market Movement Prediction with Machine Learning
2021 the 5th International Conference on Information System and Data Mining Pub Date : 2021-05-27 DOI: 10.1145/3471287.3471289
Shu-Fei Zhao
Abstract: Financial market prediction has been a popular research theme in recent years. However, the majority of previous studies focus on markets in large countries such as China and the United States, while smaller countries have drawn less attention. To fill this gap in the current literature, we use and compare 17 types of machine learning models to predict the Nepal stock market. Based on stock prices, 10 technical indicators were computed as input features. In addition, we added emotional factors extracted from financial news to improve the prediction performance, which was evaluated by accuracy and F1 score. We predicted whether the closing price would rise or fall over three horizons: 1-day, 15-day, and 30-day movement. From our experimental results, we found that linear SVM and XGBoost perform best and are the best options for further consideration in the trading process.
Citations: 0
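The prediction target described in the abstract above, whether the close is higher after a given horizon, can be sketched as a labeling function. The abstract does not name its 10 technical indicators, so the simple moving average below is only an assumed example of one, and the prices are invented.

```python
def movement_labels(closes, horizon):
    """Label day t with 1 if the close `horizon` days later is higher, else 0."""
    return [1 if closes[t + horizon] > closes[t] else 0
            for t in range(len(closes) - horizon)]

def simple_moving_average(closes, window):
    """Mean of the last `window` closes, starting once a full window exists."""
    return [sum(closes[t - window + 1:t + 1]) / window
            for t in range(window - 1, len(closes))]

# Invented closing prices (not Nepal market data).
closes = [100, 102, 101, 105, 104, 108]

one_day = movement_labels(closes, horizon=1)   # [1, 0, 1, 0, 1]
two_day = movement_labels(closes, horizon=2)   # [1, 1, 1, 1]
sma3 = simple_moving_average(closes, window=3)
```

With real data the same function would be called with horizons of 1, 15, and 30 trading days; the last `horizon` days have no label because their future close is not yet known.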
Recommender System: Personalizing User Experience or Scientifically Deceiving Users?
2021 the 5th International Conference on Information System and Data Mining Pub Date : 2021-05-27 DOI: 10.1145/3471287.3471303
Ramachandran Trichur Narayanan
Abstract: Recommender systems are taking the lead among the many things the digital world offers today to every customer visiting online portals for any service. Since gaining popularity at the time of the Netflix competition, recommender systems have become a more visible and important marketing and sales tool for corporations augmenting their online offers. Ongoing research initiatives in recommender systems, large datasets available for users across the globe, and corporate collaborations have led to improved algorithms and reduced errors in estimating predictions. Software and hardware tools that enable easy gathering of implicit and explicit data have helped recommender systems adapt quickly to the needs of users. Against this background, the possibility that a recommender system steers customers toward pre-determined items by presenting fabricated predictions, as if they resulted from scientific principles, needs to be considered. In this paper, we give an overview of the recommender system, discuss how its various components may be manipulated to lure unsuspecting customers with false ratings, and discuss the importance of engaging stakeholders to develop a robust recommender system.
Citations: 1
Email Clustering & Generating Email Templates Based on Their Topics
2021 the 5th International Conference on Information System and Data Mining Pub Date : 2021-05-27 DOI: 10.1145/3471287.3471298
Fatih Coşkun, C. Gezer, V. C. Gungor
Abstract: Email templates have a significant impact on user productivity. A well-produced email template conveys the main information and leaves a considerable impression. While previous studies focused on email generation from text differences in the content of the emails, templates generated from email topics can provide better productivity for companies. This article proposes a system in which user emails are clustered according to their topics, and introduces an email template generation system that uses the sample emails belonging to the resulting clusters. For this purpose, the Enron email dataset was used, and the performance of different text preprocessing and topic modeling algorithms, such as DMM, GPU-DMM, GPU-PDMM, LF-DMM, LDA, LF-LDA, BTM, WNTM, PTM, and SATM, was investigated and compared to determine the most efficient one. After obtaining the email topics, the system shows examples of the emails representing the selected topics and enables authorized users to create templates that generalize these topics.
Citations: 0
LASTD: A Manually Annotated and Tested Large Arabic Sentiment Tweets Dataset
2021 the 5th International Conference on Information System and Data Mining Pub Date : 2021-05-27 DOI: 10.1145/3471287.3471293
Kariman Elshakankery, M. Fayek, Mona Farouk
Abstract: With the growing attention towards Arabic Sentiment Analysis (SA), the availability of annotated datasets has increased. Although acquiring data from social media platforms, microblogs, and similar sources is an easy task, annotation is the hard part. Dataset annotation requires a lot of tedious manual work, which stands as a major problem. In addition, some datasets are built in house and are not available for public access. This paper introduces LASTD, a manually annotated dataset for Arabic tweet sentiment analysis, along with insight into its statistics and benchmarks. It consists of more than 15K Arabic tweets annotated as positive, negative, or neutral. Using 10-fold cross-validation, three different classifiers were trained and tested on the 3-class and 2-class classification problems. The support vector machine (SVM) classifier tends to have the highest accuracy. LASTD is made public for academic research.
Citations: 4