{"title":"Diversity Analysis of Web Search Results","authors":"Suneel Kumar Kingrani, M. Levene, Dell Zhang","doi":"10.1145/2786451.2786502","DOIUrl":"https://doi.org/10.1145/2786451.2786502","url":null,"abstract":"Are web search results usually dominated by major websites and therefore lacking diversity? In this paper, we aim to answer this question by quantitatively modelling the diversity of search results for popular queries using two diversity measures well-studied in ecology, namely Simpson's diversity index and Shannon's diversity index. Our theoretical analysis shows how the diversity of search results is determined by the Zipfian distribution of websites. Our empirical analysis reveals that comparing Google and Bing, the former is more diverse in the top-50 search results, while the latter is more diverse in the top-10 search results.","PeriodicalId":93136,"journal":{"name":"Proceedings of the ... ACM Web Science Conference. ACM Web Science Conference","volume":"13 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91169241","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The influence of visual salience on video consumption behavior: A survival analysis approach","authors":"Rafael Huber, B. Scheibehenne, Alexandre Chapiro, Seth Frey, R. Sumner","doi":"10.1145/2786451.2786507","DOIUrl":"https://doi.org/10.1145/2786451.2786507","url":null,"abstract":"In an increasingly competitive media environment, producers of online content need analytics that can predict the success of a video. In recent years the field of visual computation has produced a variety of mathematical models that quantify an image's salience, that is, its potential to capture attention. To test how a video's content might predict its success, we applied the standard saliency model of Itti, Koch, and Niebur [1] to more than 1000 video clips that were broadcast on a large video streaming website. We also obtained fine-grained data on the viewership of these clips. Based on a survival analysis, we find that people prefer more salient videos. The results were robust towards the inclusion of other predictors such as the genre of the video, but not to video length, which remains correlated with salience even after comparing videos only within show and genre. Our analyses suggest that visual salience provides an objective and easy-to-compute supplement to previously suggested predictors of video consumption behavior.","PeriodicalId":93136,"journal":{"name":"Proceedings of the ... ACM Web Science Conference. ACM Web Science Conference","volume":"57 7","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91435039","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Observing Social Machines Part 2: How to Observe?","authors":"D. D. Roure, C. Hooper, Kevin R. Page, S. Tarte, P. Willcox","doi":"10.1145/2786451.2786475","DOIUrl":"https://doi.org/10.1145/2786451.2786475","url":null,"abstract":"Social machines are increasingly attracting study. In our paper \"Observing Social Machines Part 1: what to observe?\" we scoped the task of observing them. Several exercises that have followed have further informed our thinking and methodologies. Here, in Part 2, we reflect on how to observe? We promote a variety of methodologies that transcend the study of individual social machines, recognizing social machines as co-constituted processes within the evolving Web, and the intersection of social machines with the physical world through the Internet of Things. Our approaches emphasize the importance of sociality and human-centric perspectives.","PeriodicalId":93136,"journal":{"name":"Proceedings of the ... ACM Web Science Conference. ACM Web Science Conference","volume":"76 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74682875","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Linked Data Scalability Challenge: Concept Reuse Leads to Semantic Decay","authors":"Paolo Pareti, Ewan Klein, A. Barker","doi":"10.1145/2786451.2786485","DOIUrl":"https://doi.org/10.1145/2786451.2786485","url":null,"abstract":"The increasing amount of available Linked Data resources is laying the foundations for more advanced Semantic Web applications. One of their main limitations, however, remains the general low level of data quality. In this paper we focus on a measure of quality which is negatively affected by the increase of the available resources. We propose a measure of semantic richness of Linked Data concepts and we demonstrate our hypothesis that the more a concept is reused, the less semantically rich it becomes. This is a significant scalability issue, as one of the core aspects of Linked Data is the propagation of semantic information on the Web by reusing common terms. We prove our hypothesis with respect to our measure of semantic richness and we validate our model empirically. Finally, we suggest possible future directions to address this scalability problem.","PeriodicalId":93136,"journal":{"name":"Proceedings of the ... ACM Web Science Conference. ACM Web Science Conference","volume":"61 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83831353","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automatic Identification of Personal Life Events in Twitter","authors":"Thomas Dickinson, Miriam Fernández, Lisa A. Thomas, P. Mulholland, P. Briggs, Harith Alani","doi":"10.1145/2786451.2786513","DOIUrl":"https://doi.org/10.1145/2786451.2786513","url":null,"abstract":"New social media has led to an explosion in personal digital data that encompasses both those expressions of self chosen by the individual as well as reflections of self provided by other, third parties. The resulting Digital Personhood (DP) data is complex and for many users it is too easy to become lost in the mire of digital data. This paper studies the automatic detection of personal life events in Twitter. Six relevant life events are considered from psychological research including: beginning school; first full time job; falling in love; marriage; having children and parent's death. We define a variety of features (user, content, semantic and interaction) to capture the characteristics of those life events and present the results of several classification methods to automatically identify these events in Twitter.","PeriodicalId":93136,"journal":{"name":"Proceedings of the ... ACM Web Science Conference. ACM Web Science Conference","volume":"34 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80983220","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Considering a Wider Web?: Employing Multimodal Critical Discourse Analysis in Exploration of Multiple Online Spaces","authors":"Rebecca Nash","doi":"10.1145/2786451.2786483","DOIUrl":"https://doi.org/10.1145/2786451.2786483","url":null,"abstract":"What sets the Web apart from 'traditional' mass media is almost instantaneous access to diverse spaces that users navigate in customized ways. Users are often bound up as producers and consumers of materials online [1]. As a result, new avenues for research have emerged for both large ('Big Data') and small-scale Web studies. Research across this spectrum, however, has tended to focus on singular types of Web platform (i.e. Twitter data, online forums etc.). Web users, conversely, are unlikely to relegate browsing to discrete types of Web space. What will be argued here -- with reference to an ongoing case study researching the role of the Web on production and consumption of aesthetic surgery - is usefulness and significance of multimodal critical discourse analysis (MMCDA) for qualitative research across multiple online spaces. MMCDA examines intersecting visual media and texts to recognize and comprehend (re)production of dominant meanings in various contexts. Employing MMCDA across a selection of different types of websites -- assembling a 'snapshot' of a topic(s) - enables wider qualitative exploration of complementary, competing, and contradictory visual and textual sources confronting users on an everyday, experiential level. This raises important epistemological and ethical issues pertinent to undertaking qualitative research on the Web. How do different Web spaces contribute to construction of dominant discourses? How do we - as researchers - gather, analyze and use various data ethically? From this emerges potential for developing more intricate understandings of diverse content available at the click of a hyperlink.","PeriodicalId":93136,"journal":{"name":"Proceedings of the ... ACM Web Science Conference. ACM Web Science Conference","volume":"34 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86928557","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Avoiding Chinese Whispers: Controlling End-to-End Join Quality in Linked Open Data Stores","authors":"Jan-Christoph Kalo, S. Homoceanu, J. Rose, Wolf-Tilo Balke","doi":"10.1145/2786451.2786466","DOIUrl":"https://doi.org/10.1145/2786451.2786466","url":null,"abstract":"Today Linked Open Data is a central trend in information provisioning. Data is collected in distributed data stores, individually curated with high quality, and made available over the Web for a wide variety of Web applications providing their own business logic for data utilization. Thus, the key promise of Linked Open Data is to provide a holistic view for a wide range of data items or entities. But parallel to the problems of database integration or schema matching, linking data over several sources remains a challenge and is currently severely hampering the vision of a working Semantic Web. One possible solution are instance matching systems that automatically create owl:sameAs links between data stores. According to existing benchmarks, the matching quality has even reached a satisfying level. However, our extensive analysis shows that instance matching systems are not yet ready for large-scale data interlinking. This is because query processors joining even via a single incorrectly created link implicitly use also all transitive owl:sameAs links that may in turn be mismatched again. The result is similar to the game Chinese Whispers: watered-down sameAs semantics step-by-step lead to a terrible end-to-end quality of joins. We develop innovative structural mechanisms on top of instance matching systems to significantly improve query processing avoiding Chinese Whispers.","PeriodicalId":93136,"journal":{"name":"Proceedings of the ... ACM Web Science Conference. ACM Web Science Conference","volume":"61 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88548976","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Web Practice of Mathematicians on the Web: An Insight into Significant but Neglected Web Groups","authors":"Mandy Lo, H. Davis, J. Edwards, C. Bokhove","doi":"10.1145/2786451.2786498","DOIUrl":"https://doi.org/10.1145/2786451.2786498","url":null,"abstract":"In this paper, we describe the findings from a three-year multi-phased investigation into the Web practice of online mathematics communities. Our results indicate that the equivalent technologies that enable text-input or image-uploads without the need to understand programming languages have not been made available for the mathematics/scientific communities to enable fluid communications. Given the global importance of mathematical and scientific collaborations, we argue that the mathematical and scientific communities are significant but neglected groups, and that more attention should be given to the user-interface designs to support fluid online mathematics communications.","PeriodicalId":93136,"journal":{"name":"Proceedings of the ... ACM Web Science Conference. ACM Web Science Conference","volume":"7 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82543421","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analyzing Discourse Communities with Distributional Semantic Models","authors":"Igor Brigadir, Derek Greene, P. Cunningham","doi":"10.1145/2786451.2786470","DOIUrl":"https://doi.org/10.1145/2786451.2786470","url":null,"abstract":"This paper presents a new corpus-driven approach applicable to the study of language patterns in social and political contexts, or Critical Discourse Analysis (CDA) using Distributional Semantic Models (DSMs). This approach considers changes in word semantics, both over time and between communities with differing viewpoints. The geometrical spaces constructed by DSMs or \"word spaces\" offer an objective, robust exploratory analysis tool for revealing novel patterns and similarities between communities, as well as highlighting when these changes occur. To quantify differences between word spaces built on different time periods and from different communities, we analyze the nearest neighboring words in the DSM, a process we relate to analyzing \"concordance lines\". This makes the approach intuitive and interpretable to practitioners. We demonstrate the usefulness of the approach with two case studies, following groups with opposing political ideologies in the Scottish Independence Referendum, and the US Midterm Elections 2014.","PeriodicalId":93136,"journal":{"name":"Proceedings of the ... ACM Web Science Conference. ACM Web Science Conference","volume":"48 18 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90296620","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"'/Command' and Conquer: Analysing Discussion in a Citizen Science Game","authors":"Ramine Tinati, Markus Luczak-Rösch, E. Simperl, N. Shadbolt, W. Hall","doi":"10.1145/2786451.2786455","DOIUrl":"https://doi.org/10.1145/2786451.2786455","url":null,"abstract":"Citizen science is changing the process of scientific knowledge discovery. Successful projects rely on an active and able collection of volunteers. In order to attract, and sustain citizen scientists, designers are faced with the task of transforming complex scientific tasks into something accessible, interesting, and hopefully, engaging. In this paper, we examine the citizen science game EyeWire. Our analysis draws upon a dataset of over 4,000,000 completed games and 885,000 chat entries, made by over 90,000 players. The analysis provides a detailed understanding of how features of the system facilitate player interaction and communication alongside completing the gamified scientific task. Based on the analysis we describe a set of behavioural characteristics which identify different types of players within the EyeWire platform.","PeriodicalId":93136,"journal":{"name":"Proceedings of the ... ACM Web Science Conference. ACM Web Science Conference","volume":"26 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91072764","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}