Proceedings of the ... ACM International Conference on Information & Knowledge Management. ACM International Conference on Information and Knowledge Management: Latest Articles
{"title":"Image clustering fusion technique based on BFS","authors":"Luca Costantini, Raffaele Nicolussi","doi":"10.1145/2063576.2063898","DOIUrl":"https://doi.org/10.1145/2063576.2063898","url":null,"abstract":"With the increase in the number and size of databases dedicated to the storage of visual content, the need for effective retrieval systems has become crucial. The proposed method makes a significant contribution to meeting this need through a technique in which sets of clusters are fused together to create a unique and more significant set of clusters. The images are represented by a set of features and are then grouped by these features, which are considered one by one. A probability matrix is then built and explored by the breadth-first search algorithm with the aim of selecting a unique set of clusters. Experimental results, obtained using two different datasets, show the effectiveness of the proposed technique. Furthermore, the proposed approach overcomes the drawback of tuning a set of parameters that fuse the similarity measurement obtained by each feature to get an overall similarity between two images.","PeriodicalId":74507,"journal":{"name":"Proceedings of the ... ACM International Conference on Information & Knowledge Management. ACM International Conference on Information and Knowledge Management","volume":"9 1","pages":"2093-2096"},"PeriodicalIF":0.0,"publicationDate":"2011-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90436068","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Semantic convolution kernels over dependency trees: smoothed partial tree kernel","authors":"D. Croce, Alessandro Moschitti, Roberto Basili","doi":"10.1145/2063576.2063878","DOIUrl":"https://doi.org/10.1145/2063576.2063878","url":null,"abstract":"In recent years, natural language processing techniques have been used more and more in IR. Among others, syntactic and semantic parsing are effective methods for the design of complex applications such as question answering and sentiment analysis. Unfortunately, extracting feature representations suitable for machine learning algorithms from linguistic structures is typically difficult. In this paper, we describe one of the most advanced pieces of technology for automatic engineering of syntactic and semantic patterns. This method merges convolution dependency tree kernels with lexical similarities. It can efficiently and effectively measure the similarity between dependency structures whose lexical nodes are partly or completely different. Its use in powerful algorithms such as Support Vector Machines (SVMs) allows for fast design of accurate automatic systems. We report some experiments on question classification, which show an unprecedented result, e.g. a 41% error reduction over the former state of the art, along with an analysis of the nice properties of the approach.","PeriodicalId":74507,"journal":{"name":"Proceedings of the ... ACM International Conference on Information & Knowledge Management. ACM International Conference on Information and Knowledge Management","volume":"65 1","pages":"2013-2016"},"PeriodicalIF":0.0,"publicationDate":"2011-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80731405","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The list Viterbi training algorithm and its application to keyword search over databases","authors":"Silvia Rota, S. Bergamaschi, F. Guerra","doi":"10.1145/2063576.2063808","DOIUrl":"https://doi.org/10.1145/2063576.2063808","url":null,"abstract":"Hidden Markov Models (HMMs) are today employed in a variety of applications, ranging from speech recognition to bioinformatics. In this paper, we present the List Viterbi training algorithm, a version of the Expectation-Maximization (EM) algorithm based on the List Viterbi algorithm instead of the commonly used forward-backward algorithm. We developed the batch and online versions of the algorithm, and we also describe an interesting application in the context of keyword search over databases, where we exploit an HMM for matching keywords to database terms. In our experiments we tested the online version of the training algorithm in a semi-supervised setting that allows us to take into account the feedback provided by the users.","PeriodicalId":74507,"journal":{"name":"Proceedings of the ... ACM International Conference on Information & Knowledge Management. ACM International Conference on Information and Knowledge Management","volume":"12 1","pages":"1601-1606"},"PeriodicalIF":0.0,"publicationDate":"2011-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79662087","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Insights into explicit semantic analysis","authors":"Thomas Gottron, Maik Anderka, Benno Stein","doi":"10.1145/2063576.2063865","DOIUrl":"https://doi.org/10.1145/2063576.2063865","url":null,"abstract":"Since its debut the Explicit Semantic Analysis (ESA) has received much attention in the IR community. ESA has been proven to perform surprisingly well in several tasks and in different contexts. However, given the conceptual motivation for ESA, recent work has observed unexpected behavior. In this paper we look at the foundations of ESA from a theoretical point of view and employ a general probabilistic model for term weights which reveals how ESA actually works. Based on this model we explain some of the phenomena that have been observed in previous work and support our findings with new experiments. Moreover, we provide a theoretical grounding on how the size and the composition of the index collection affect the ESA-based computation of similarity values for texts.","PeriodicalId":74507,"journal":{"name":"Proceedings of the ... ACM International Conference on Information & Knowledge Management. ACM International Conference on Information and Knowledge Management","volume":"46 1","pages":"1961-1964"},"PeriodicalIF":0.0,"publicationDate":"2011-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79678444","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"YANA: an efficient privacy-preserving recommender system for online social communities","authors":"Dongsheng Li, Q. Lv, L. Shang, Ning Gu","doi":"10.1145/2063576.2063943","DOIUrl":"https://doi.org/10.1145/2063576.2063943","url":null,"abstract":"In online social communities, many recommender systems use collaborative filtering, a method that makes recommendations based on what is liked by other users with similar interests. Serious privacy issues may arise in this process, as sensitive personal information (e.g., content interests) may be collected and disclosed to other parties, especially the recommender server. In this paper, we propose YANA (short for \"you are not alone\"), an efficient group-based privacy-preserving collaborative filtering system for content recommendation in online social communities. We have developed a prototype system on desktop and mobile devices, and evaluated it using real-world data. The results demonstrate that YANA can effectively protect users' privacy, while achieving high recommendation quality and energy efficiency.","PeriodicalId":74507,"journal":{"name":"Proceedings of the ... ACM International Conference on Information & Knowledge Management. ACM International Conference on Information and Knowledge Management","volume":"23 1","pages":"2269-2272"},"PeriodicalIF":0.0,"publicationDate":"2011-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83224737","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Finding redundant and complementary communities in multidimensional networks","authors":"M. Berlingerio, M. Coscia, F. Giannotti","doi":"10.1145/2063576.2063921","DOIUrl":"https://doi.org/10.1145/2063576.2063921","url":null,"abstract":"Community Discovery in networks is the problem of detecting, for each node, its membership to one or more groups of nodes, the communities, that are densely connected or highly interactive. We define the community discovery problem in multidimensional networks, where more than one connection may reside between any two nodes. We also introduce two measures able to characterize the communities found. Our experiments on real-world multidimensional networks support the methodology proposed in this paper, and open the way for a new class of algorithms aimed at capturing the multifaceted complexity of connections among nodes in a network.","PeriodicalId":74507,"journal":{"name":"Proceedings of the ... ACM International Conference on Information & Knowledge Management. ACM International Conference on Information and Knowledge Management","volume":"6 1","pages":"2181-2184"},"PeriodicalIF":0.0,"publicationDate":"2011-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81173088","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SISP: a new framework for searching the informative subgraph based on PSO","authors":"Chen Chen, Guoren Wang, Huilin Liu, Junchang Xin, Ye Yuan","doi":"10.1145/2063576.2063645","DOIUrl":"https://doi.org/10.1145/2063576.2063645","url":null,"abstract":"A significant number of graph applications require the key relations among a group of query nodes. Given a relational graph such as a social network or biochemical interaction network, there is an urgent need for an informative subgraph that best explains the relationships among a group of given query nodes. Based on Particle Swarm Optimization (PSO), a new framework, SISP (Searching the Informative Subgraph based on PSO), is proposed. SISP contains three key stages. In the initialization stage, a random spreading method is proposed, which can effectively guarantee the connectivity of the nodes in each particle. In the fitness calculation stage, a fitness function is designed by incorporating a sign function with the goodness score. In the update stage, the intersection-based particle extension method and the rule-based particle compression method are proposed. To evaluate the quality of returned subgraphs, the appropriate calculation of the goodness score is studied. Considering the importance and relevance of a node together, we present the PNR method, which makes the definition of informativeness more reliable and the returned subgraph more satisfying. Finally, we present experiments on a real dataset and a synthetic dataset separately. The experimental results confirm that the proposed methods achieve increased accuracy and are efficient for any query set.","PeriodicalId":74507,"journal":{"name":"Proceedings of the ... ACM International Conference on Information & Knowledge Management. ACM International Conference on Information and Knowledge Management","volume":"57 1","pages":"453-462"},"PeriodicalIF":0.0,"publicationDate":"2011-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84809007","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Probabilistic model for discovering topic based communities in social networks","authors":"Mrinmaya Sachan, Danish Contractor, T. Faruquie, L. V. Subramaniam","doi":"10.1145/2063576.2063963","DOIUrl":"https://doi.org/10.1145/2063576.2063963","url":null,"abstract":"Social graphs have received renewed interest as a research topic with the advent of social networking websites. These online networks provide a rich source of data to study user relationships and interaction patterns on a large scale. In this paper, we propose a generative Bayesian model for extracting latent communities from a social graph. We assume that community memberships depend on topics of interest between users and the link relationships between them in the social graph topology. In addition, we make use of the nature of interaction to gauge user interests. Our model allows communities to be related to multiple topics and each user in the graph can be a member of multiple communities. This gives an insight into user interests and topical distribution in communities. We show the effectiveness of our model using a real world data set and also compare our model with existing community discovery methods.","PeriodicalId":74507,"journal":{"name":"Proceedings of the ... ACM International Conference on Information & Knowledge Management. ACM International Conference on Information and Knowledge Management","volume":"17 1","pages":"2349-2352"},"PeriodicalIF":0.0,"publicationDate":"2011-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85296433","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Building a generic debugger for information extraction pipelines","authors":"A. Sarma, Alpa Jain, P. Bohannon","doi":"10.1145/2063576.2063933","DOIUrl":"https://doi.org/10.1145/2063576.2063933","url":null,"abstract":"Complex information extraction (IE) pipelines are becoming an integral component of most text processing frameworks. We introduce a first system to help IE users analyze extraction pipeline semantics and operator transformations interactively while debugging. This allows the effort to be proportional to the need, and to focus on the portions of the pipeline under the greatest suspicion. We present a generic debugger for running post-execution analysis of any IE pipeline consisting of arbitrary types of operators. For this, we propose an effective provenance model for IE pipelines which captures a variety of operator types, ranging from those for which full to no specifications are available. We have evaluated our proposed algorithms and provenance model on large-scale real-world extraction pipelines.","PeriodicalId":74507,"journal":{"name":"Proceedings of the ... ACM International Conference on Information & Knowledge Management. ACM International Conference on Information and Knowledge Management","volume":"49 1","pages":"2229-2232"},"PeriodicalIF":0.0,"publicationDate":"2011-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90660444","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Using games with a purpose and bootstrapping to create domain-specific sentiment lexicons","authors":"A. Weichselbraun, Stefan Gindl, A. Scharl","doi":"10.1145/2063576.2063729","DOIUrl":"https://doi.org/10.1145/2063576.2063729","url":null,"abstract":"Sentiment detection analyzes the positive or negative polarity of text. The field has received considerable attention in recent years, since it plays an important role in providing means to assess user opinions regarding an organization's products, services, or actions. Approaches towards sentiment detection include machine learning techniques as well as computationally less expensive methods. Both approaches rely on the use of language-specific sentiment lexicons, which are lists of sentiment terms with their corresponding sentiment value. The effort involved in creating, customizing, and extending sentiment lexicons is considerable, particularly if less common languages and domains are targeted without access to appropriate language resources. This paper proposes a semi-automatic approach for the creation of sentiment lexicons which assigns sentiment values to sentiment terms via crowd-sourcing. Furthermore, it introduces a bootstrapping process operating on unlabeled domain documents to extend the created lexicons, and to customize them according to the particular use case. This process considers sentiment terms as well as sentiment indicators occurring in the discourse surrounding a particular topic. Such indicators are associated with a positive or negative context in a particular domain, but might have a neutral connotation in other domains. A formal evaluation shows that bootstrapping considerably improves the method's recall. Automatically created lexicons yield a performance comparable to professionally created language resources such as the General Inquirer.","PeriodicalId":74507,"journal":{"name":"Proceedings of the ... ACM International Conference on Information & Knowledge Management. ACM International Conference on Information and Knowledge Management","volume":"59 1","pages":"1053-1060"},"PeriodicalIF":0.0,"publicationDate":"2011-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91015992","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}