CrowdMM '12 · Pub Date: 2012-10-29 · DOI: 10.1145/2390803.2390811
M. Lux, M. Taschwer, Oge Marques
"A closer look at photographers' intentions: a test dataset"
Abstract: Taking a photo is a process typically triggered by an intention. Some people want to document the progress of a task; others just want to capture the moment to revisit the situation later on. In this contribution we present a novel, openly available dataset of 1,309 photos with annotations specifying the intentions of the photographers, which were eventually validated using Amazon Mechanical Turk.

CrowdMM '12 · Pub Date: 2012-10-29 · DOI: 10.1145/2390803.2390813
Raynor Vliegendhart, E. Dolstra, J. Pouwelse
"Crowdsourced user interface testing for multimedia applications"
Abstract: Conducting a conventional experiment to test an application's user interface in a lab environment is a costly and time-consuming process. In this paper, we show that it is feasible to carry out A/B tests for a multimedia application through Amazon's crowdsourcing platform Mechanical Turk, involving hundreds of workers at low cost. We let workers test user interfaces within a remote virtual machine that is embedded within the HIT, and we show that the technical issues arising in this approach can be overcome.

CrowdMM '12 · Pub Date: 2012-10-29 · DOI: 10.1145/2390803.2390816
Sunghyun Park, Gelareh Mohammadi, Ron Artstein, Louis-Philippe Morency
"Crowdsourcing micro-level multimedia annotations: the challenges of evaluation and interface"
Abstract: This paper presents a new evaluation procedure and tool for crowdsourcing micro-level multimedia annotations and shows that such annotations can achieve a quality comparable to that of expert annotations. We propose a new evaluation procedure, called MM-Eval (Micro-level Multimedia Evaluation), which compares fine-grained time-aligned annotations using Krippendorff's alpha metric, and we introduce two new metrics to evaluate the types of disagreement between coders. We also introduce OCTAB (Online Crowdsourcing Tool for Annotations of Behaviors), a web-based annotation tool that allows precise and convenient multimedia behavior annotations directly from the Amazon Mechanical Turk interface. With an experiment using this tool and evaluation procedure, we show that a majority vote among annotations from three crowdsourced workers leads to a quality comparable to that of local expert annotations.

CrowdMM '12 · Pub Date: 2012-10-29 · DOI: 10.1145/2390803.2390807
César Moltedo, H. Astudillo, Marcelo Mendoza
"Tagging tagged images: on the impact of existing annotations on image tagging"
Abstract: Crowdsourcing has been widely used to generate metadata for multimedia resources. By presenting partially described resources to human annotators, resources are tagged, yielding better descriptions. Although significant improvements in metadata quality have been reported, as yet there is no understanding of how taggers are biased by previously acquired resource tags. We hypothesize that the number of existing annotations, which we take here to reflect the degree of tag completeness, influences taggers: rather empty descriptions (initial tagging stages) encourage creating more tags, whereas better tags are created for fuller descriptions (later tagging stages). We explore empirically the relationship between tag quality/quantity and completeness degree by conducting a study with human crowdsourcing annotators on a collection of images with different completeness degrees. Experimental results show a significant relation between completeness and image tagging. To the best of our knowledge, this study is the first to explore the impact of existing annotations on image tagging.

CrowdMM '12 · Pub Date: 2012-10-29 · DOI: 10.1145/2390803.2390808
A. Foncubierta-Rodríguez, H. Müller
"Ground truth generation in medical imaging: a crowdsourcing-based iterative approach"
Abstract: As in many other scientific domains where computer-based tools need to be evaluated, medical imaging often requires the expensive generation of manual ground truth. For some specific tasks medical doctors may be required to guarantee high-quality, valid results, whereas other tasks, such as the image modality classification described in this text, can be performed with sufficiently high quality by general domain experts. Crowdsourcing has recently received much attention in many domains, as volunteers perform so-called human intelligence tasks for often small amounts of money, reducing the cost of creating manually annotated datasets and ground truth for evaluation tasks. On the other hand, the quality obtained from unknown workers has often been debated: controlling task quality remains one of the main challenges of crowdsourcing, since the persons performing the tasks may be interested less in result quality than in their payment.

Several crowdsourcing platforms, such as the Crowdflower platform we used, allow creating interfaces and sharing them with only a limited number of known persons. This text describes the interfaces developed and the quality obtained through manual annotation by several domain experts and one medical doctor. In particular, the feedback loop of semi-automatic tools is explained. The results of an initial crowdsourcing round classifying medical images into a set of image categories were manually controlled by domain experts and then used to train an automatic system that visually classified these images. The automatic classification results were then manually confirmed or refused, reducing the time required compared to the initial tasks.

Crowdsourcing platforms allow creating a large variety of interfaces for judgements. Whether used among known experts or with paid unknown workers, they increase the speed of ground truth creation and limit the amount of money to be paid.

CrowdMM '12 · Pub Date: 2012-10-29 · DOI: 10.1145/2390803.2390820
Thi Phuong Nghiem, A. Carlier, Géraldine Morin, V. Charvillat
"Enhancing online 3D products through crowdsourcing"
Abstract: In this paper, we propose to build semantic links between a product's textual description and its corresponding 3D visualization. These links help gather knowledge about a product and ease browsing of its 3D model. Our goal is to support the common behavior that, when reading textual information about a product, users naturally imagine what it looks like in real life. We generate the association between a textual description and a 3D feature through crowdsourcing. A user study with 82 people assesses the usefulness of the association for subsequent users, in terms of both correctness and efficiency. Users are asked to identify features on 3D models; from the traces, associations leading to recommended views are derived. This information (the recommended view) is proposed to subsequent users performing the same task. Whereas the associations could simply be given by an expert, crowdsourcing offers advantages: we have inexpensive experts in the crowd as well as natural access to users' (e.g., customers') preferences and opinions.

CrowdMM '12 · Pub Date: 2012-10-29 · DOI: 10.1145/2390803.2390812
M. Avlonitis, K. Chorianopoulos, David A. Shamma
"Crowdsourcing user interactions within web video through pulse modeling"
Abstract: Semantic video research has employed crowdsourcing techniques on social web video datasets such as comments, tags, and annotations, but these datasets require extra effort on behalf of the user. We propose a pulse modeling method that analyzes implicit user interactions within web video, such as rewinds. In particular, we model user information-seeking behavior as a time series and the semantic regions as discrete pulses of fixed width. We constructed these pulses from user interactions with a documentary video that has a very rich visual style, with many cuts and camera angles for the same scene. Next, we calculated the correlation coefficient between the user pulses dynamically detected at the local maxima and the reference pulse. We found that when people are actively seeking information in a video, their activity (these pulses) significantly matches the semantics of the video. The proposed pulse analysis method complements previous work in content-based information retrieval and provides an additional user-based dimension for modeling the semantics of a web video.

CrowdMM '12 · Pub Date: 2012-10-29 · DOI: 10.1145/2390803.2390815
L. Gottlieb, Jaeyoung Choi, P. Kelm, T. Sikora, G. Friedland
"Pushing the limits of mechanical turk: qualifying the crowd for video geo-location"
Abstract: In this article we review the methods we have developed for finding Mechanical Turk participants for the manual annotation of the geo-location of random videos from the web. We require high quality annotations for this project, as we are attempting to establish a human baseline for future comparison to machine systems. This task is different from a standard Mechanical Turk task in that it is difficult for both humans and machines, whereas a standard Mechanical Turk task is usually easy for humans and difficult or impossible for machines. This article discusses the varied difficulties we encountered while qualifying annotators and the steps that we took to select the individuals most likely to do well at our annotation task in the future.

{"title":"PodCastle and songle: crowdsourcing-based web services for spoken content retrieval and active music listening","authors":"Masataka Goto, J. Ogata, Kazuyoshi Yoshii, Hiromasa Fujihara, Matthias Mauch, Tomoyasu Nakano","doi":"10.1145/2390803.2390805","DOIUrl":"https://doi.org/10.1145/2390803.2390805","url":null,"abstract":"In this keynote talk, we describe two crowdsourcing-based web services, PodCastle (http://en.podcastle.jp for the English version and http://podcastle.jp for the Japanese version) and Songle (http://songle.jp). PodCastle and Songle collect voluntary contributions by anonymous users in order to improve the experiences of users listening to speech and music content available on the web. These services use automatic speech-recognition and music-understanding technologies to provide content analysis results, such as full-text speech transcriptions and music scene descriptions, that let users enjoy content-based multimedia retrieval and active browsing of speech and music signals without relying on metadata.\u0000 When automatic content analysis is used, however, errors are inevitable. PodCastle and Songle therefore provide an efficient error correction interface that let users easily correct errors by selecting from a list of candidate alternatives. Through these corrections, users gain a real sense of contributing for their own benefit and that of others and can be further motivated to contribute by seeing corrections made by other users.\u0000 Our services promote the popularization and use of speech-recognition and music-understanding technologies by raising user awareness. Users can grasp the nature of those technologies just by seeing results obtained when the technologies applied to speech data and songs available on the web.","PeriodicalId":429491,"journal":{"name":"CrowdMM '12","volume":"65 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124416526","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
CrowdMM '12 · Pub Date: 2012-10-29 · DOI: 10.1145/2390803.2390819
Dominik Henter, Damian Borth, A. Ulges
"Tag suggestion on youtube by personalizing content-based auto-annotation"
Abstract: We address the challenge of tag recommendation for web video clips on portals such as YouTube. In a quantitative study on 23,000 YouTube videos, we first evaluate different tag suggestion strategies employing user profiling (using tags from the user's upload history), social signals (the channels a user has subscribed to), and content analysis. Our results confirm earlier findings that, at least when employing users' original tags as ground truth, a history-based approach outperforms the other techniques. Second, we suggest a novel approach that integrates the strengths of history-based tag suggestion with content matching crowdsourced from a large repository of user-generated videos. Our approach performs a visual similarity matching and merges neighbors found in a large-scale reference dataset of user-tagged content with others from the user's personal history. This way, signals gained by crowdsourcing can help disambiguate tag suggestions, for example in cases of heterogeneous user interest profiles or a non-existing user history. Our quantitative experiments indicate that such a personalized tag transfer gives strong improvements over standard content matching, and moderate ones over a content-free history-based ranking.
