Proceedings of the ... ACM International Conference on Information & Knowledge Management. ACM International Conference on Information and Knowledge Management最新文献
Raúl Ernesto Gutiérrez de Piñerez Reyes, Juan Francisco Díaz-Frías
{"title":"Preprocessing of informal mathematical discourse in context ofcontrolled natural language","authors":"Raúl Ernesto Gutiérrez de Piñerez Reyes, Juan Francisco Díaz-Frías","doi":"10.1145/2396761.2398487","DOIUrl":"https://doi.org/10.1145/2396761.2398487","url":null,"abstract":"Informal Mathematical Discourse (IMD) is characterized by the mixture of natural language and symbolic expressions in the context of textbooks, publications in mathematics and mathematical proof. We focused the IMD processing at the low level of discourse. In this paper, we proposed the preprocessing phase before the IMD structure analysis within the context of Controlled Natural Language (CNL). Our contribution is defined in context of the IMD processing and the use of machine learning; first, we present a CNL, a pure corpus and Matemathical Treebank for processing IMD; second, we present a preprocessing phase for IMD analysis with connectives disambiguation and verbs treatment, finally, we found a satisfactory result on input text parsing using a statistical parsing model. We will propagate these results for classification of argumentative informal practices via the low level discourse in IMD processing.","PeriodicalId":74507,"journal":{"name":"Proceedings of the ... ACM International Conference on Information & Knowledge Management. ACM International Conference on Information and Knowledge Management","volume":"468 1","pages":"1632-1636"},"PeriodicalIF":0.0,"publicationDate":"2012-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78332125","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enabling Ontology Based Semantic Queries in Biomedical Database Systems.","authors":"Shuai Zheng, Fusheng Wang, James Lu, Joel Saltz","doi":"10.1145/2396761.2398715","DOIUrl":"10.1145/2396761.2398715","url":null,"abstract":"<p><p>While current biomedical ontology repositories offer primitive query capabilities, it is difficult or cumbersome to support ontology based semantic queries directly in semantically annotated biomedical databases. The problem may be largely attributed to the mismatch between the models of the ontologies and the databases, and the mismatch between the query interfaces of the two systems. To fully realize semantic query capabilities based on ontologies, we develop a system DBOntoLink to provide unified semantic query interfaces by extending database query languages. With DBOntoLink, semantic queries can be directly and naturally specified as extended functions of the database query languages without any programming needed. DBOntoLink is adaptable to different ontologies through customizations and supports major biomedical ontologies hosted at the NCBO BioPortal. We demonstrate the use of DBOntoLink in a real world biomedical database with semantically annotated medical image annotations.</p>","PeriodicalId":74507,"journal":{"name":"Proceedings of the ... ACM International Conference on Information & Knowledge Management. ACM International Conference on Information and Knowledge Management","volume":" ","pages":"2651-2654"},"PeriodicalIF":0.0,"publicationDate":"2012-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3567445/pdf/nihms-436207.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"31325251","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mirwaes Wahabzada, K. Kersting, A. Pilz, C. Bauckhage
{"title":"More influence means less work: fast latent dirichlet allocation by influence scheduling","authors":"Mirwaes Wahabzada, K. Kersting, A. Pilz, C. Bauckhage","doi":"10.1145/2063576.2063944","DOIUrl":"https://doi.org/10.1145/2063576.2063944","url":null,"abstract":"There have recently been considerable advances in fast inference for (online) latent Dirichlet allocation (LDA). While it is widely recognized that the scheduling of documents in stochastic optimization and in turn in LDA may have significant consequences, this issue remains largely unexplored. Instead, practitioners schedule documents essentially uniformly at random, due perhaps to ease of implementation, and to the lack of clear guidelines on scheduling the documents.\u0000 In this work, we address this issue and propose to schedule documents for an update that exert a disproportionately large influence on the topics of the corpus before less influential ones. More precisely, we justify to sample documents randomly biased towards those ones with higher norms to form mini-batches. On several real-world datasets, including 3M articles from Wikipedia and 8M from PubMed, we demonstrate that the resulting influence scheduled LDA can handily analyze massive document collections and find topic models as good or better than those found with online LDA, often at a fraction of time.","PeriodicalId":74507,"journal":{"name":"Proceedings of the ... ACM International Conference on Information & Knowledge Management. ACM International Conference on Information and Knowledge Management","volume":"27 1","pages":"2273-2276"},"PeriodicalIF":0.0,"publicationDate":"2011-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74125151","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Examining the \"leftness\" property of Wikipedia categories","authors":"Karl Gyllstrom, Marie-Francine Moens","doi":"10.1145/2063576.2063953","DOIUrl":"https://doi.org/10.1145/2063576.2063953","url":null,"abstract":"Wikipedia's rich category structure has helped make it one of the largest semantic taxonomies in existence, a property that has been central to much recent research. However, Wikipedia's category representation is simplistic: an article contains a single list of categories, with no data about their relative importance. We investigate the ordering of category lists to determine how a category's position in the list correlates with its relevance to the article and overall significance. We identify a number of interesting connections between a category's position and its persistence within the article, age, popularity, size, and descriptiveness.","PeriodicalId":74507,"journal":{"name":"Proceedings of the ... ACM International Conference on Information & Knowledge Management. ACM International Conference on Information and Knowledge Management","volume":"38 1","pages":"2309-2312"},"PeriodicalIF":0.0,"publicationDate":"2011-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74456995","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Large-scale behavioral targeting with a social twist","authors":"Kun Liu, Lei Tang","doi":"10.1145/2063576.2063838","DOIUrl":"https://doi.org/10.1145/2063576.2063838","url":null,"abstract":"Behavioral targeting (BT) is a widely used technique for online advertising. It leverages information collected on an individual's web-browsing behavior, such as page views, search queries and ad clicks, to select the ads most relevant to user to display. With the proliferation of social networks, it is possible to relate the behavior of individuals and their social connections. Although the similarity among connected individuals are well established (i.e., homophily), it is still not clear whether and how we can leverage the activities of one's friends for behavioral targeting; whether forecasts derived from such social information are more accurate than standard behavioral targeting models. In this paper, we strive to answer these questions by evaluating the predictive power of social data across 60 consumer domains on a large online network of over 180 million users in a period of two and a half months. To our best knowledge, this is the most comprehensive study of social data in the context of behavioral targeting on such an unprecedented scale. Our analysis offers interesting insights into the value of social data for developing the next generation of targeting services.","PeriodicalId":74507,"journal":{"name":"Proceedings of the ... ACM International Conference on Information & Knowledge Management. ACM International Conference on Information and Knowledge Management","volume":"1 1","pages":"1815-1824"},"PeriodicalIF":0.0,"publicationDate":"2011-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73271046","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
G. Cleuziou, D. Buscaldi, Vincent Levorato, G. Dias
{"title":"A pretopological framework for the automatic construction of lexical-semantic structures from texts","authors":"G. Cleuziou, D. Buscaldi, Vincent Levorato, G. Dias","doi":"10.1145/2063576.2063990","DOIUrl":"https://doi.org/10.1145/2063576.2063990","url":null,"abstract":"We present in this paper a new approach for the automatic generation of lexical structures from texts. This tedious task is based on the strong hypothesis that simple statistical observations on textual usages can provide pieces of semantics about the lexicon. Using such \"naive\" observations only, we propose a (pre)-topological framework to formalize and combine various hypothesis on textual data usages and then to derive a structure similar to usual lexical knowledge basis such as WordNet. In addition we also consider the evaluation problem for obtained lexical structures ; a multi-level evaluation strategy is proposed that measures the fitting between a given reference structure and automatically generated structures on different point of views : intrinsic/structural and application-based points of view. The evaluation strategy is then used to quantify the contribution of the new structuring approach with respect to the corresponding solution proposed by (Sanderson et al. 2000) on two case studies that differs on the domain and the size of the lexicon.","PeriodicalId":74507,"journal":{"name":"Proceedings of the ... ACM International Conference on Information & Knowledge Management. ACM International Conference on Information and Knowledge Management","volume":"19 1","pages":"2453-2456"},"PeriodicalIF":0.0,"publicationDate":"2011-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75321984","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Statistical information retrieval modelling: from the probability ranking principle to recent advances in diversity, portfolio theory, and beyond","authors":"Jun Wang, Kevyn Collins-Thompson","doi":"10.1145/2063576.2064033","DOIUrl":"https://doi.org/10.1145/2063576.2064033","url":null,"abstract":"Statistical modelling of Information Retrieval (IR) systems is a key driving force in the development of the IR field. The goal of this tutorial is to provide a comprehensive and up-to-date introduction to statistical IR modelling. We take a fresh and systematic perspective from the viewpoint of portfolio theory of IR and risk management. A unified treatment and new insights will be given to reflect the recent developments of considering the ranked retrieval results as a whole. Recent research progress in diversification, risk management, and portfolio theory will be covered, in addition to classic methods such as Maron and Kuhns' Probabilistic Indexing, Robertson-Sparck Jones model (and the resulting BM25 formula) and language modelling approaches. The tutorial also reviews the resulting practical algorithms of risk-aware query expansion, diverse ranking, IR metric optimization as well as their performance evaluations. Practical IR applications such as web search, multimedia retrieval, and collaborative filtering are also introduced, as well as discussion of new opportunities for future research and applications that intersect among information retrieval, knowledge management, and databases.","PeriodicalId":74507,"journal":{"name":"Proceedings of the ... ACM International Conference on Information & Knowledge Management. ACM International Conference on Information and Knowledge Management","volume":"13 29 1","pages":"2603-2604"},"PeriodicalIF":0.0,"publicationDate":"2011-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78662160","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Naoki Tani, Danushka Bollegala, N. P. Chandrasiri, Keisuke Okamoto, Kazunari Nawa, S. Iitsuka, Y. Matsuo
{"title":"Collaborative exploratory search in real-world context","authors":"Naoki Tani, Danushka Bollegala, N. P. Chandrasiri, Keisuke Okamoto, Kazunari Nawa, S. Iitsuka, Y. Matsuo","doi":"10.1145/2063576.2063909","DOIUrl":"https://doi.org/10.1145/2063576.2063909","url":null,"abstract":"We propose Collaborative Exploratory Search (CES), which is an integration of dialog analysis and web search that involves multiparty collaboration to accomplish an exploratory information retrieval goal. Given a real-time dialog between users on a single topic; we define CES as the task of automatically detecting the topic of the dialog and retrieving task-relevant web pages to support the dialog. To recognize the task of the dialog, we apply the Author--Topic model as a topic model. Then, attribute extraction is applied to the dialog to obtain the attributes of the tasks. Finally, a specific search query is generated to identify the task-relevant information. We implement and evaluate the CES system for a commercial in-vehicle conversation. We also develop an iPad application that listens to conversations among users and continuously retrieves relevant web pages. Our experimental results reveal that the proposed method outperforms existing methods, which demonstrates the potential usefulness of collaborative exploratory search with practically usable accuracy levels.","PeriodicalId":74507,"journal":{"name":"Proceedings of the ... ACM International Conference on Information & Knowledge Management. ACM International Conference on Information and Knowledge Management","volume":"8 1","pages":"2137-2140"},"PeriodicalIF":0.0,"publicationDate":"2011-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78398134","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Spreadsheet-based complex data transformation","authors":"Vu Hung, B. Benatallah, Régis Saint-Paul","doi":"10.1145/2063576.2063829","DOIUrl":"https://doi.org/10.1145/2063576.2063829","url":null,"abstract":"Spreadsheets are used by millions of users as a routine all-purpose data management tool. It is now increasingly necessary for external applications and services to consume spreadsheet data. In this paper, we investigate the problem of transforming spreadsheet data to structured formats required by these applications and services. Unlike prior methods, we propose a novel approach in which transformation logic is embedded into a familiar and expressive spreadsheet-like formula mapping language. Popular transformation patterns provided by transformation languages and mapping tools, that are relevant to spreadsheet-based data transformation, are supported in the language via formulas. Consequently, the language avoids cluttering the source spreadsheets with transformations and turns out to be helpful when multiple schemas are targeted. We implemented a prototype and evaluated the benefits of our approach via experiments in a real application. The experimental results confirmed the benefits of our approach.","PeriodicalId":74507,"journal":{"name":"Proceedings of the ... ACM International Conference on Information & Knowledge Management. ACM International Conference on Information and Knowledge Management","volume":"77 ","pages":"1749-1754"},"PeriodicalIF":0.0,"publicationDate":"2011-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1145/2063576.2063829","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72430786","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Social ranking for spoken web search","authors":"Shrey Sahay, Nitendra Rajput, Niketan Pansare","doi":"10.1145/2063576.2063840","DOIUrl":"https://doi.org/10.1145/2063576.2063840","url":null,"abstract":"Spoken Web is an alternative Web for low-literacy users in the developing world. People can create audio content over phone and share on the Spoken Web. This enables easy creation of locally relevant content. Even on the World Wide Web in developed regions, the recent increase in traffic is due to the locally relevant content created on social networking sites. This paper argues that content search and ranking in the new scenario needs a re-look. The generic model of using in-links for ranking such content is not an appropriate measure of the content relevance in such a collaborative Web 2.0 world. This paper aims to bring the social context in Spoken Web ranking. We formulate a relationship function between the query-creator and the content-creator and use this as one measure of the content relevance to the user. The relationship function uses the geographical location of the two people and their prior browsing preferences as parameters to determine the relationship between the two users. Further we also determine the trustability of the content based on the content creator's acceptance measure by the social network. We use these two features in addition to the term-frequency - inverse-term-frequency match to rank the search results in context of the social network of the query-creator and provide a more specific and socially relevant result to the user.","PeriodicalId":74507,"journal":{"name":"Proceedings of the ... ACM International Conference on Information & Knowledge Management. ACM International Conference on Information and Knowledge Management","volume":"4 4 1","pages":"1835-1840"},"PeriodicalIF":0.0,"publicationDate":"2011-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75941534","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}