Proceedings of the 3rd IKDD Conference on Data Science, 2016: Latest Papers

Learning to Collectively Link Entities
Proceedings of the 3rd IKDD Conference on Data Science, 2016 Pub Date : 2016-03-13 DOI: 10.1145/2888451.2888454
Ashish Kulkarni, Kanika Agarwal, Pararth Shah, Sunny Raj Rathod, Ganesh Ramakrishnan
Abstract: Recently Kulkarni et al. [20] proposed an approach for collective disambiguation of entity mentions occurring in natural language text. Their model achieves disambiguation by efficiently computing exact MAP inference in a binary labeled Markov Random Field. Here, we build on their disambiguation model and propose an approach to jointly learn the node and edge parameters of such a model. We use a max margin framework, which is efficiently implemented using projected subgradient, for collective learning. We leverage this in an online and interactive annotation system which incrementally trains the model as data gets curated progressively. We demonstrate the usefulness of our system by manually completing annotations for a subset of the Wikipedia collection. We have made this data publicly available. Evaluation shows that learning helps and our system performs better than several other systems including that of Kulkarni et al.
Citations: 2
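The abstract above mentions max-margin learning implemented with projected subgradient. As a rough illustration only, the sketch below applies projected subgradient descent to a plain binary hinge loss rather than to the paper's Markov Random Field objective. The function name `projected_subgradient_hinge`, the L2-ball feasible set, the `radius` and `steps` parameters, and the toy data are all assumptions made for this example and are not taken from the paper.

```python
import math


def projected_subgradient_hinge(data, radius=10.0, steps=100):
    """Minimise the average hinge loss subject to ||w||_2 <= radius.

    data: list of (features, label) pairs with label in {-1, +1}.
    The constraint set (an L2 ball) is an assumption for illustration.
    """
    dim = len(data[0][0])
    w = [0.0] * dim
    for t in range(1, steps + 1):
        # Subgradient of the average of max(0, 1 - y * <w, x>).
        grad = [0.0] * dim
        for x, y in data:
            margin = y * sum(wi * xi for wi, xi in zip(w, x))
            if margin < 1.0:
                for i in range(dim):
                    grad[i] -= y * x[i] / len(data)
        # Diminishing step size, then a plain subgradient step.
        step = 1.0 / math.sqrt(t)
        w = [wi - step * gi for wi, gi in zip(w, grad)]
        # Projection: rescale w back onto the ball if it left it.
        norm = math.sqrt(sum(wi * wi for wi in w))
        if norm > radius:
            w = [wi * radius / norm for wi in w]
    return w


# Hypothetical, linearly separable toy data.
toy = [
    ([2.0, 1.0], 1),
    ([1.5, 2.0], 1),
    ([-1.0, -1.5], -1),
    ([-2.0, -0.5], -1),
]
weights = projected_subgradient_hinge(toy)
```

The projection step is what distinguishes this from ordinary subgradient descent: after every update the iterate is mapped back to the feasible set, so any constraint on the parameters holds throughout training.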
Learning from Gurus: Analysis and Modeling of Reopened Questions on Stack Overflow
Proceedings of the 3rd IKDD Conference on Data Science, 2016 Pub Date : 2016-03-13 DOI: 10.1145/2888451.2888460
Rishabh Gupta, P. Reddy
Abstract: Community-driven Question Answering (Q&A) platforms are gaining popularity, and the number of posts on such platforms is increasing tremendously. Thus, the challenge of keeping these platforms noise-free is attracting the interest of the research community. Stack Overflow is one such popular Q&A platform for computer programming. Established users on Stack Overflow have learnt the acceptable format and scope of questions over time. Even if their questions get closed, they are aware of the required edits, so the chances of their questions being reopened increase. On the other hand, non-established users have not adapted to the Stack Overflow system and find it difficult to edit their closed questions. In this work, we aim to identify features that help differentiate the editing approaches of established and non-established users, and motivate the need for a recommendation model. Such a recommendation model can assist every user in editing their closed questions by leveraging the edit style of the platform's established users.
Citations: 6
Improving Urban Transportation through Social Media Analytics
Proceedings of the 3rd IKDD Conference on Data Science, 2016 Pub Date : 2016-03-13 DOI: 10.1145/2888451.2888478
Manjira Sinha, P. Varma, Gayatri Sivakumar, Mridula Singh, Tridib Mukherjee, D. Chander, K. Dasgupta
Abstract: Citizens tend to discuss issues in public forums, social media, and web blogs. Given that issues related to public transportation are most actively reported across web-based sources, we present a holistic framework for collection, categorization, aggregation and visualization of urban public transportation issues. The primary challenges in deriving useful insights from web-based sources stem from (a) the number of reports; (b) incomplete or implicit spatio-temporal context; and (c) the unstructured nature of text in these reports. The work starts with formal complaint data from the largest public transportation agency in Bangalore, complemented by complaint reports from web-based and social media sources. Text data is categorized into different transportation-related problems, and spatio-temporal context is added to the text data for geo-tagging and identifying persistent issues. A well-organized dashboard is developed for efficient visualization. The dashboard is currently being piloted with the largest transportation agency in Bangalore.
Citations: 1
AMEO 2015: A dataset comprising AMCAT test scores, biodata details and employment outcomes of job seekers
Proceedings of the 3rd IKDD Conference on Data Science, 2016 Pub Date : 2016-03-13 DOI: 10.1145/2888451.2892037
V. Aggarwal, Shashank Srikant, Harsh Nisar
Abstract: More than a million engineers enter the global workforce every year. A relevant question is what determines the jobs and salaries these engineers are offered right after graduation. Previous studies have shown the influence of various factors such as college reputation, grades, the field one specializes in and market conditions for specific industries. An important input which such analyses do not have is a standardized measure of job skills taken at the time of completion of studies. We present here Aspiring Minds' Employability Outcomes 2015 (AMEO 2015), a unique dataset which provides engineering graduates' employment outcomes (salaries, job titles and job locations) together with standardized assessment scores in three fundamental areas: cognitive skills, technical skills and personality. Coupled with biodata information, AMEO 2015 provides an opportunity for a unique and comprehensive study of the entry-level labor market. The data could be used not only to build an accurate salary predictor but also to understand what influences salary and job titles in the labor market. In this paper we describe the details of the dataset and discuss a spectrum of questions around meritocracy in labor markets, biases in labor selection and other prevalent market forces it can help uncover and answer. You can download the dataset at: http://research.aspiringminds.com/resources/
Citations: 3
Events Describe Places: Tagging Places with Event Based Social Network Data
Proceedings of the 3rd IKDD Conference on Data Science, 2016 Pub Date : 2016-03-13 DOI: 10.1145/2888451.2888477
Vinod Hegde, A. Mileo, A. Pozdnoukhov
Abstract: Location-based services and geospatial web applications have become popular in recent years due to the wide adoption of mobile devices. Search and recommendation of places or Points of Interest (PoIs) are prominent services available on them. The effectiveness of these services crucially depends on the availability of tags that are descriptive of places. The major geospatial databases that contain data about places suffer from a lack of descriptive tags for places, since writing them is a time-consuming process and only a few users do it despite having knowledge about places. In order to tackle this issue and automatically generate descriptive tags for places, we propose a solution that utilizes data about a set of events that happen in a specific place and uses it to extract meaningful descriptive tags for that place. We use data about events held at places on Meetup, a well-known event-based social network, and apply Latent Dirichlet Allocation (LDA) to derive sets of probable descriptive tags for any place. In order to evaluate our approach, we measure semantic relatedness between tags derived for places on Meetup and manually assigned tags from Foursquare, a location-based service. Results show that event data can be used to derive semantically relevant place tags. This shows that location-based services can benefit from capturing data about events to derive place tags.
Citations: 2
Detecting Community Structures in Social Networks by Graph Sparsification
Proceedings of the 3rd IKDD Conference on Data Science, 2016 Pub Date : 2016-03-13 DOI: 10.1145/2888451.2888479
Partha Basuchowdhuri, Satyaki Sikdar, Sonu Sreshtha, S. Majumder
Abstract: Community structures are inherent in social networks and finding them is an interesting and well-studied problem. Finding community structures in social networks is similar to locating densely connected clusters of nodes in a graph. One of the popular methods for finding communities is to first find the inter-community edges and then remove them to reveal the communities. It is well known that a network centrality measure named edge betweenness can be used to detect the inter-community edges. The edges with high edge betweenness are those that lie on a large number of the shortest paths between all possible pairs of nodes. Finding all-pair shortest paths is a computationally expensive task, especially for large graphs. So we construct a t-spanner, a known graph sparsification technique, for finding edges with high betweenness and eventually find communities by removing such edges. Using the t-spanner, we then detect the inter-community edges in O(km) running time by building a distance oracle of size O(kn^(1+1/k)), where t = 2k-1. Compared to traditional community detection methods that depend on calculating betweenness values, our algorithm runs much faster. Experiments show that our algorithm finds communities of quality comparable to other state-of-the-art community detection algorithms.
Citations: 5
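For context, the betweenness-based baseline that the abstract above contrasts itself with can be sketched as follows. This is a minimal illustration of exact edge betweenness on an unweighted, undirected graph (a Brandes-style accumulation over one breadth-first search per node), followed by removal of the top-scoring edge. It is not the t-spanner method the paper proposes. The function names and the two-triangle toy graph are assumptions made for the example.

```python
from collections import deque


def edge_betweenness(adj):
    """Exact edge betweenness for an unweighted, undirected graph.

    adj: dict mapping each node to a list of its neighbours.
    Returns a dict mapping frozenset({u, v}) to its betweenness score.
    """
    score = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for source in adj:
        # Breadth-first search that also counts shortest paths (sigma).
        dist = {source: 0}
        sigma = {v: 0 for v in adj}
        sigma[source] = 1
        preds = {v: [] for v in adj}
        order = []
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate path dependencies from the farthest nodes inwards.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                share = sigma[v] / sigma[w] * (1.0 + delta[w])
                score[frozenset((v, w))] += share
                delta[v] += share
    # Each unordered node pair was visited from both endpoints.
    return {edge: value / 2.0 for edge, value in score.items()}


def components(adj):
    """Connected components as a list of node sets."""
    seen, parts = set(), []
    for start in adj:
        if start in seen:
            continue
        part, stack = set(), [start]
        while stack:
            v = stack.pop()
            if v not in part:
                part.add(v)
                stack.extend(adj[v])
        seen |= part
        parts.append(part)
    return parts


# Hypothetical toy graph: two triangles joined by the edge c-d.
graph = {
    "a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"],
    "d": ["c", "e", "f"], "e": ["d", "f"], "f": ["d", "e"],
}
scores = edge_betweenness(graph)
bridge = max(scores, key=scores.get)
pruned = {n: [m for m in nbrs if frozenset((n, m)) != bridge]
          for n, nbrs in graph.items()}
communities = components(pruned)
```

On this toy graph the single edge joining the two triangles carries every cross-triangle shortest path, so it receives the highest score, and removing it leaves the two triangles as separate components. The one-search-per-node loop is the kind of all-pairs cost the abstract describes as expensive on large graphs.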
CitizenPulse: A Text Analytics framework for Proactive e-Governance - A Case Study of Mygov.in
Proceedings of the 3rd IKDD Conference on Data Science, 2016 Pub Date : 2016-03-13 DOI: 10.1145/2888451.2888463
Ankit Lamba, Deepak Yadav, A. Lele
Abstract: Indian citizens are beginning to express themselves via social media on a regular basis on various issues. The Government of India has started an initiative called Mygov.in, a collaborative portal where citizens can voice their opinions via free-form comments. Analyzing this free-form data is a huge challenge. In this paper we present a work in progress, the CitizenPulse framework, capable of performing text analytics on unstructured text using off-the-shelf text analytics components such as Named Entity Recognition, Part-of-Speech tagging and Stemming, to name a few. Apart from integrating the text analytics components, the CitizenPulse framework abstracts these building blocks as objects, which can be dragged, dropped and connected to construct a text analytics pipeline called an Analytics Softcore. As a case study we report the analysis of the Mygov.in portal, specifically for the topic of Cleanliness in School Curriculum.
Citations: 6
Some algorithms for correlated bandits with non-stationary rewards: Regret bounds and applications
Proceedings of the 3rd IKDD Conference on Data Science, 2016 Pub Date : 2016-03-13 DOI: 10.1145/2888451.2888475
Prathamesh Mayekar, N. Hemachandra
Abstract: We first propose an online learning model wherein rewards for different actions/arms used by the user can be correlated and the reward stream can be non-stationary. Thus, this extends the standard multi-armed bandit learning model. We propose two algorithms, Greedy and Regression based UCB, that attempt to minimize the expected regret. We also obtain non-trivial upper bounds for the expected regret through theoretical analysis. We also provide some evidence for a sub-polynomial increase in expected regret upon appropriate tuning of algorithm input parameters. These models are motivated by the problem of dynamic pricing of a product faced by a typical online retailer.
Citations: 0
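As background for the abstract above, here is a minimal sketch of the standard UCB1 index policy for the classical multi-armed bandit with independent, stationary arms, i.e. the model the paper says it extends. It is not the paper's Greedy or Regression based UCB algorithm. The function name `ucb1`, the `pull` callback, and the deterministic toy rewards are assumptions chosen so that the run is reproducible.

```python
import math


def ucb1(pull, n_arms, horizon):
    """Run UCB1 for `horizon` rounds (assumes horizon >= n_arms).

    pull: callable taking an arm index and returning a reward in [0, 1].
    Returns the number of times each arm was played.
    """
    counts = [0] * n_arms
    totals = [0.0] * n_arms
    for t in range(1, horizon + 1):
        if t <= n_arms:
            # Play every arm once so each empirical mean is defined.
            arm = t - 1
        else:
            # Index = empirical mean + exploration bonus.
            arm = max(
                range(n_arms),
                key=lambda a: totals[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        totals[arm] += pull(arm)
        counts[arm] += 1
    return counts


# Hypothetical toy problem: each arm always pays its mean reward.
toy_means = [0.2, 0.5, 0.8]
pull_counts = ucb1(lambda arm: toy_means[arm], n_arms=3, horizon=1000)
```

The exploration bonus shrinks as an arm is played more often, so worse arms are revisited only occasionally and most plays go to the arm with the highest mean. Handling correlated arms or rewards that drift over time, which is the setting of the paper, needs more than this baseline provides.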