{"title":"TraffTrend: Real time traffic updates and traffic trends using social media analytics","authors":"A. Jain, Ashok Kumar, Javesh Garg, Utkarsh Patange, P. Jalan","doi":"10.1145/2778865.2778875","DOIUrl":"https://doi.org/10.1145/2778865.2778875","url":null,"abstract":"Traffic management has had difficulty gaining insights about the traffic situation in a city. Here, we classify the data from social media into various cause-effect pairs to mark problems in a locality at a particular time along with its most prominent causes. For this, we classified data into multiple labels such as congestion, accidents, construction etc. using random forest classifier with an accuracy of 82.3%. Using these labels, we find the traffic problems and their probable causes and map it to the location and time of occurrence. Then, this mapping is used to extract useful traffic trends. Also, we show events happening in real time in our dashboard for a particular location so as to keep the common people updated about current traffic situation at various locations.","PeriodicalId":116839,"journal":{"name":"Proceedings of the 2nd IKDD Conference on Data Sciences","volume":"344 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116350101","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Computer Agents that Interact Proficiently with People","authors":"Sarit Kraus","doi":"10.1145/2778865.2778868","DOIUrl":"https://doi.org/10.1145/2778865.2778868","url":null,"abstract":"Automated agents that interact proficiently with people can be useful in supporting or replacing people in complex tasks. The inclusion of people presents novel problems for the design of automated agents' strategies. People do not necessarily adhere to the optimal, monolithic strategies that can be derived analytically. Their behavior is affected by a multitude of social and psychological factors. In this talk I will show how combining machine learning techniques for human modeling, human behavioral models, formal decision-making and game theory approaches enables agents to interact well with people. Applications include intelligent agents that help drivers reduce energy consumption, agents that support rehabilitation, employer-employee negotiation and agents that support a human operator in managing a team of low-cost mobile robots in search and rescue tasks.","PeriodicalId":116839,"journal":{"name":"Proceedings of the 2nd IKDD Conference on Data Sciences","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128704955","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"STAR: Real-time Spatio-Temporal Analysis and Prediction of Traffic Insights using Social Media","authors":"D. Semwal, Sonal Patil, Sainyam Galhotra, Akhil Arora, Narayanan Unny","doi":"10.1145/2778865.2778872","DOIUrl":"https://doi.org/10.1145/2778865.2778872","url":null,"abstract":"The steady growth of data from social networks has resulted in wide-spread research in a host of application areas including transportation, health-care, customer-care and many more. Owing to the ubiquity and popularity of transportation (more recently) the growth in the number of problems reported by the masses has no bounds. With the advent of social media, reporting problems has become easier than before. In this paper, we address the problem of efficient management of transportation related woes by leveraging the information provided by social media sources such as -- Facebook, Twitter etc. We develop techniques for viral event detection, identify frequently co-occurring problem patterns and their root-causes and mine suggestions to solve the identified problems. We predict the occurrence of different problems, (with an accuracy of ≈ 80%) at different locations and times leveraging the analysis done above along with weather information and news reports. In addition, we design a feature-packed visualization that significantly enhances the ability to analyse data in real-time.","PeriodicalId":116839,"journal":{"name":"Proceedings of the 2nd IKDD Conference on Data Sciences","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126377452","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"TrafficKarma: Estimating Effective Traffic Indicators using Public Data","authors":"Kireet Pant, Dibyendu Talukder, Pravesh Biyani","doi":"10.1145/2778865.2778871","DOIUrl":"https://doi.org/10.1145/2778865.2778871","url":null,"abstract":"TrafficKarma optimizes the use of publicly available information to monitor and estimate traffic data in order to map and visualize it for its effective use. This is achieved by using state of the art optimization and machine learning techniques coupled with insightful visualization of the data. TrafficKarma can be used by authorities in transportation, traffic police and various agencies that need current (online) or past (statistical) information about traffic. This paper will concisely explain the features of the application proposed, strategy, data sources and methodology, system architecture and the data analysis techniques used.","PeriodicalId":116839,"journal":{"name":"Proceedings of the 2nd IKDD Conference on Data Sciences","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116010449","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Machine Learning @ Amazon","authors":"R. Rastogi","doi":"10.1145/2778865.2778867","DOIUrl":"https://doi.org/10.1145/2778865.2778867","url":null,"abstract":"In this talk, I will first provide an overview of the key Machine Learning (ML) applications we are developing at Amazon. I will then describe a matrix factorization model that we have developed for making product recommendations -- the salient characteristics of the model are: (1) It uses a Bayesian approach to handle data sparsity, (2) It leverages user and item features to handle the cold start problem, and (3) It introduces latent variables to handle multiple personas associated with a user account (e.g. family members). Our experimental results with synthetic and real-life datasets show that leveraging user and item features, and incorporating user personas enables our model to provide lower RMSE and perplexity compared to baselines.","PeriodicalId":116839,"journal":{"name":"Proceedings of the 2nd IKDD Conference on Data Sciences","volume":"61 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127961664","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Broad Data: Challenges on the emerging Web of data","authors":"J. Hendler","doi":"10.1145/2778865.2778870","DOIUrl":"https://doi.org/10.1145/2778865.2778870","url":null,"abstract":"\"Big Data\" usually refers to the very large datasets generated by scientists, to the many petabytes of data held by companies like Facebook and Google, and to analyzing real-time data assets like the stream of twitter messages emerging from events around the world. Key areas of interest include technologies to manage much larger datasets, technologies for the visualization and analysis of databases, cloud-based data management and data mining algorithms. Recently, however, we have begun to see the emergence of another, and equally compelling data challenge -- that of the \"Broad data\" that emerges from millions and millions of raw datasets available on the World Wide Web. For broad data the new challenges that emerge include Web-scale data search and discovery, rapid and potentially ad hoc integration of datasets, visualization and analysis of only-partially modeled datasets, and issues relating to the policies for data use, reuse and combination. In this talk, we present the broad data challenge and discuss potential starting points for solutions including those arising from research in the Semantic Web area. We illustrate these approaches using data from a \"meta-catalog\" of over 1,000,000 open datasets that have been collected from about two hundred governments around the world.","PeriodicalId":116839,"journal":{"name":"Proceedings of the 2nd IKDD Conference on Data Sciences","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121090394","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Tweeting Traffic: Analyzing Twitter for generating real-time city traffic insights and predictions","authors":"Priyam Tejaswin, Rohan Kumar, Siddharth Gupta","doi":"10.1145/2778865.2778874","DOIUrl":"https://doi.org/10.1145/2778865.2778874","url":null,"abstract":"Crowd sourced road traffic management is an open, unexplored problem in data science. With the growth of mobile communications and social media networks, more people are expressing their traffic situations in real-time. We explore how this social media data can be analyzed to generate valuable insights, useful for traffic management and city planning. Our method utilizes background knowledge from structured data repositories for entity extraction from tweets. We proceed to use this spatio-temporal data for traffic incident clustering and prediction. With accuracy and precision measurements providing encouraging results, we build on our methods and present our Continuous Traffic Management Dashboard (CTMD) system: an automated computer system for generating real-time, historic and predictive traffic insights.","PeriodicalId":116839,"journal":{"name":"Proceedings of the 2nd IKDD Conference on Data Sciences","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130774301","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A City Traffic Dashboard using Social Network Data","authors":"Apurva Pathak, Bidyut Kr. Patra, Arnab Chakraborty, Abhishek Agarwal","doi":"10.1145/2778865.2778873","DOIUrl":"https://doi.org/10.1145/2778865.2778873","url":null,"abstract":"With growing urbanization and globalization, long commutes and traffic problems have become the everyday nightmare of an Indian metro city dweller. The non-existence of a singular dashboard, which can provide a holistic view of the city traffic, has aggravated this problem manifold for the traffic authorities and citizens. This paper describes the methodology we employed for the CoDS 2015 Data Challenge to solve this problem. We show how data from social networks can yield useful information about the road and traffic issues in a city. We propose to design a dashboard for obtaining a real-time view of the traffic data scattered across various user status updates, tweets and comments on social networks using state-of-the-art machine learning algorithms. We present empirical results and discuss various methods for extracting useful information from the social feeds. The proposed dashboard can provide straightforward, actionable information to users and traffic authorities for handling traffic issues in an efficient manner.","PeriodicalId":116839,"journal":{"name":"Proceedings of the 2nd IKDD Conference on Data Sciences","volume":"166 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133901505","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MapReduce Algorithms","authors":"J. Ullman","doi":"10.1145/2778865.2778866","DOIUrl":"https://doi.org/10.1145/2778865.2778866","url":null,"abstract":"We begin with a sketch of how MapReduce works and how MapReduce algorithms differ from general parallel algorithms. While algorithm analysis usually centers on the serial or parallel running time of the algorithms that solve a given problem, in the MapReduce world, the critical issue is a tradeoff between interprocessor communication and the parallel running time. We examine a fundamental problem, in which the output depends on comparison of all pairs of inputs (the \"all-pairs\" problem), and show matching upper and lower bounds for the communication/time tradeoff. Finally, we consider special cases of all-pairs, where only a subset of the pairs of inputs are of interest; an example is the problem of similarity join.","PeriodicalId":116839,"journal":{"name":"Proceedings of the 2nd IKDD Conference on Data Sciences","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117103336","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Resilient Cities and Urban Analytics: The Role of Big Data and High Performance Pervasive Computing","authors":"M. Marathe","doi":"10.1145/2778865.2778869","DOIUrl":"https://doi.org/10.1145/2778865.2778869","url":null,"abstract":"Developing practical informatics tools and decision support environments to analyze socio-technical systems that support our cities is complicated and scientifically challenging. The increased urbanization across the globe, specifically in the developing countries poses further challenges. Recent quantitative changes in high performance and pervasive computing, Bigdata and network science have created new opportunities for collecting, integrating, analyzing and accessing information related to coupled urban socio-technical systems. Innovative information systems that leverage this new capability have already proved immensely useful. After a brief overview, I will describe an urban analytics approach rooted in synthetic information, pervasive high performance computing and data analytics to study resilient and sustainable cities. Examples in public health epidemiology and urban transport planning and security will be used to guide the discussion. Computational challenges and directions for future research will be discussed.","PeriodicalId":116839,"journal":{"name":"Proceedings of the 2nd IKDD Conference on Data Sciences","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114014030","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}