Proceedings of the 7th ACM IKDD CoDS and 25th COMAD: Latest Articles

Robust Learning of Multi-Label Classifiers under Label Noise

Proceedings of the 7th ACM IKDD CoDS and 25th COMAD Pub Date : 2020-01-05 DOI: 10.1145/3371158.3371169

Himanshu Kumar, Naresh Manwani, P. Sastry

Citations: 6

A Unified System for Aggression Identification in English Code-Mixed and Uni-Lingual Texts

Proceedings of the 7th ACM IKDD CoDS and 25th COMAD Pub Date : 2020-01-05 DOI: 10.1145/3371158.3371165

Anant Khandelwal, Niraj Kumar

Abstract: Wide usage of social media platforms has increased the risk of aggression, which causes mental stress and harms people's lives through psychological agony, fighting behavior, and disrespect to others. The majority of such conversations contain code-mixed languages [28]. Additionally, communication style changes from one social media platform to another (e.g., communication styles differ between Twitter and Facebook). Together, these factors increase the complexity of the problem. To address them, we introduce a unified and robust multi-modal deep learning architecture that works for both English code-mixed and uni-lingual English datasets. The system uses psycho-linguistic features and very basic linguistic features. Our multi-modal deep learning architecture contains Deep Pyramid CNN, Pooled BiLSTM, and Disconnected RNN (with both GloVe and FastText embeddings). Finally, the system makes its decision by model averaging. We evaluated our system on the English code-mixed TRAC 2018 dataset and a uni-lingual English dataset obtained from Kaggle. Experimental results show that our proposed system outperforms all previous approaches on both the English code-mixed and uni-lingual English datasets.
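The abstract above says the final decision is based on model averaging but gives no detail. A minimal sketch of one common form follows, assuming each model emits a class-probability vector and the vectors are averaged with equal weight; the function name, the number of models, and the probabilities are hypothetical and not taken from the paper.

```python
def average_models(per_model_probs):
    """Average class probabilities across models and pick the top class.

    per_model_probs: one probability vector per model, all of equal length.
    Equal weighting is an assumption; the paper may weight models differently.
    """
    n_models = len(per_model_probs)
    n_classes = len(per_model_probs[0])
    averaged = [
        sum(probs[c] for probs in per_model_probs) / n_models
        for c in range(n_classes)
    ]
    # Index of the class with the highest averaged probability.
    label = max(range(n_classes), key=averaged.__getitem__)
    return label, averaged


# Hypothetical outputs of four models for one text, over three classes.
model_outputs = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.3, 0.4, 0.3],
    [0.5, 0.3, 0.2],
]
label, averaged = average_models(model_outputs)
print(label, averaged)
```

Averaging probabilities rather than taking a majority vote lets a confident model outweigh several marginal ones, which is why the second and third models above do not carry the decision.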

Citations: 7

Active Learning for Air Quality Station Location Recommendation

Proceedings of the 7th ACM IKDD CoDS and 25th COMAD Pub Date : 2020-01-05 DOI: 10.1145/3371158.3371208

S. Deepak Narayanan, Apoorv Agnihotri, Nipun Batra

Abstract:
Motivation: Recent years have seen a decline in air quality across the planet, with studies suggesting that a significant proportion of the global population has had its life expectancy reduced by up to 4 years [1, 2, 5]. To tackle this growth in air pollution and its adverse effects, governments across the world have set up air quality monitoring stations that measure concentrations of pollutants such as NO2, SO2 and PM2.5, of which PM2.5 in particular has a significant health impact and is used for measuring air quality. One major issue with deploying these stations is the massive cost involved. Owing to the high installation and maintenance costs, the spatial resolution of air quality monitoring is generally poor. In this work, we propose active learning methods to choose the next location at which to install an air quality monitor, motivated by sparse spatial air quality monitoring and expensive sensing equipment.
Related Work: Previous work has predominantly focused on interpolation and forecasting of air quality [7, 8]. Work on air quality station location recommendation has been limited [4]. Previous work [4, 7, 8] has shown that installing air quality stations uniformly to maximize spatial coverage does not work well in practice, which is a major motivation for our work.
Problem Statement: Given a set S of air quality monitoring stations, along with their corresponding values of PM2.5 over a period of time {d_1, d_2, ..., d_n}, where d_i represents day i, we want to choose a new location s′ such that installing a station at s′ gives us the best estimate of air quality at unknown locations.
Approach: We perform active learning using Query by Committee (QBC) [6]. We maintain three sets of stations: the train set, the test set, and the pool set. The train set contains currently monitored locations, the test set contains the locations where we wish to estimate the air quality, and the pool set contains candidate stations for querying, i.e., we query from the pool set and observe how our estimation improves on the test set. To query from the pool set, we need a measure of uncertainty for the stations in the pool set. To obtain this uncertainty, we train an ensemble of learners and take the standard deviation of their predictions for each station in the pool set. We add the station with maximum standard deviation to our train set and remove the same station from the pool set. We repeat this process as time progresses. We use K Neighbors Regressor (KNN) as our main model, inspired by the fact that nearby days will likely have similar air quality (temporal locality), and so will nearby stations (spatial
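The querying step described in the abstract above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the committee here is a set of nearest-neighbor regressors that differ only in k, the station coordinates and PM2.5 readings are made up, and the helper names `knn_predict` and `query_by_committee` are hypothetical.

```python
import math
import statistics


def knn_predict(train, query, k):
    """Mean PM2.5 of the k training stations nearest to `query`."""
    nearest = sorted(train, key=lambda s: math.dist(s[0], query))[:k]
    return sum(value for _, value in nearest) / len(nearest)


def query_by_committee(train, pool, ks=(1, 2, 3)):
    """Return the pool location where the committee disagrees most.

    Disagreement is the standard deviation of the committee's predictions.
    Building the committee from different k values is an assumption.
    """
    def disagreement(location):
        predictions = [knn_predict(train, location, k) for k in ks]
        return statistics.pstdev(predictions)

    return max(pool, key=disagreement)


# Hypothetical data: ((x, y) coordinates, PM2.5 reading).
train = [((0.0, 0.0), 40.0), ((1.0, 0.0), 42.0), ((10.0, 10.0), 120.0)]
# Candidate stations; their readings stay hidden until a station is queried.
pool = {(0.5, 0.0): 41.0, (6.0, 6.0): 95.0}

chosen = query_by_committee(train, pool)
# Move the queried station from the pool set to the train set.
train.append((chosen, pool.pop(chosen)))
print(chosen)
```

With this data the committee agrees closely near the two clustered stations and disagrees about the location between the cluster and the distant station, so the latter is queried first. Repeating the last three lines gives the iterative loop the abstract describes.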

Citations: 2

Adversarial Demotion of Bias in Natural Language Generation

Proceedings of the 7th ACM IKDD CoDS and 25th COMAD Pub Date : 2020-01-05 DOI: 10.1145/3371158.3371229

M. Jegadeesan

Citations: 0

Predicting Outcomes in Limited-Overs Cricket Matches

Proceedings of the 7th ACM IKDD CoDS and 25th COMAD Pub Date : 2020-01-05 DOI: 10.1145/3371158.3371166

Natwar Modani, Manoj Kilaru, Anjan Kaur, Ritwik Sinha, Harsh Khetan

Abstract: Cricket is a popular sport in the Commonwealth countries, particularly in its limited-overs formats. As with any sport, predicting the outcome of a game of cricket is of popular interest. For the first innings, the task is to predict the eventual score that the team batting first will reach. For the second innings, the task is to predict the match result. Existing algorithms for predicting the outcome of limited-overs cricket matches are simplistic and their performance leaves room for improvement. In this paper, we provide novel features, including team strength indicators, that capture the situation of the match more comprehensively and accurately. We use a collection of state-of-the-art supervised Machine Learning (ML) approaches for the prediction tasks. Further, we present an approach based on Long Short-Term Memory (LSTM) networks to incorporate the oft-mentioned concept of 'momentum' for predicting the outcomes. We show with real data that these ML models outperform the current state of the art (WASP) in outcome prediction for cricket. Further, we show that incorporating the proposed features improves prediction accuracy. Finally, the LSTM model outperforms all other models with the same set of features, thereby confirming that 'momentum' indeed helps in better prediction of outcomes.

Citations: 1

Topic Influence Graph Based Analysis of Seminal Papers

Proceedings of the 7th ACM IKDD CoDS and 25th COMAD Pub Date : 2020-01-05 DOI: 10.1145/3371158.3371191

Abhirut Gupta, Sandipan Sikdar, P. Mohapatra, Niloy Ganguly

Citations: 1

A Study of Efficacy of Cross-lingual Word Embeddings for Indian Languages

Proceedings of the 7th ACM IKDD CoDS and 25th COMAD Pub Date : 2020-01-05 DOI: 10.1145/3371158.3371219

Jyotsana Khatri, V. Rudra Murthy, P. Bhattacharyya

Citations: 4

Knowledge Graph based Automated Generation of Test Cases in Software Engineering

Proceedings of the 7th ACM IKDD CoDS and 25th COMAD Pub Date : 2020-01-05 DOI: 10.1145/3371158.3371202

Anmol Nayak, Vaibhav Kesri, R. Dubey

Citations: 12

Stance Detection in Hindi-English Code-Mixed Data

Proceedings of the 7th ACM IKDD CoDS and 25th COMAD Pub Date : 2020-01-05 DOI: 10.1145/3371158.3371226

Jethva Utsav, Dhaiwat Kabaria, Ribhu Vajpeyi, Mohit Mina, Vivek Srivastava

Citations: 2

Interaction dynamics between hate and counter users on Twitter

Proceedings of the 7th ACM IKDD CoDS and 25th COMAD Pub Date : 2020-01-05 DOI: 10.1145/3371158.3371172

Binny Mathew, Navish Kumar, Pawan Goyal, Animesh Mukherjee

Abstract: Social media platforms usually tackle the proliferation of hate speech by blocking or suspending the message or account. One of the major drawbacks of such measures is the restriction of free speech. In this paper, we investigate the interaction of hate speech and the responses that counter it (aka counter-speech). One of the prime contributions of this work is that we developed and released a dataset in which we annotate pairs of hate and counter users. We perform several lexical, linguistic and psycholinguistic analyses on these annotated accounts and observe that the counterspeakers of the target communities employ different strategies to tackle the hate speech. The hate users seem to be more popular, as we observe that they are more subjective, express more negative sentiment, tweet more and have more followers. While the hate users seem to use more words about envy, hate, negative emotion, swearing terms and ugliness, the counter users use more words related to government, law and leaders. Finally, we build a classifier to detect whether a user is a hateful or a counter speaker. This identification can help the platform devise different incentive mechanisms to demote hate and promote counter speakers. Overall, our study unfolds, for the first time, the interaction dynamics of hate and counter users, which could pave a more effective way of combating hate content on Twitter than just suspending the hate accounts.

Citations: 11