{"title":"Exploring the Efficiency of Batch Active Learning for Human-in-the-Loop Relation Extraction","authors":"Ismini Lourentzou, D. Gruhl, Steve Welch","doi":"10.1145/3184558.3191546","DOIUrl":"https://doi.org/10.1145/3184558.3191546","url":null,"abstract":"Domain-specific relation extraction requires training data for supervised learning models, and thus significant labeling effort. Distant supervision is often leveraged for creating large annotated corpora; however, these methods require handling the inherent noise. On the other hand, active learning approaches can reduce the annotation cost by selecting the most beneficial examples to label in order to learn a good model. The choice of examples can be performed sequentially, i.e., selecting one example in each iteration, or in batches, i.e., selecting a set of examples in each iteration. The optimization of the batch size is a practical problem faced in every real-world application of active learning; however, it is often treated as a parameter decided in advance. In this work, we study the trade-off between model performance, the number of requested labels in a batch, and the time spent in each round for real-time, domain-specific relation extraction. Our results show that the use of an appropriate batch size produces competitive performance, even compared to a fully sequential strategy, while reducing the training time dramatically.","PeriodicalId":235572,"journal":{"name":"Companion Proceedings of the The Web Conference 2018","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129927587","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"What about Mood Swings: Identifying Depression on Twitter with Temporal Measures of Emotions","authors":"Xuetong Chen, M. Sykora, Thomas W. Jackson, Suzanne Elayan","doi":"10.1145/3184558.3191624","DOIUrl":"https://doi.org/10.1145/3184558.3191624","url":null,"abstract":"Depression is among the most commonly diagnosed mental disorders around the world. With the increasing popularity of online social network platforms and the advances in data science, more research efforts have been spent on understanding mental disorders through social media by analysing linguistic style, sentiment, online social networks and other activity traces. However, the role of basic emotions and their changes over time has not yet been fully explored in extant work. In this paper, we proposed a novel approach for identifying users with or at risk of depression by incorporating measures of eight basic emotions as features from Twitter posts over time, including a temporal analysis of these features. The results showed that emotion-related expressions can reveal insights into individuals' psychological states and that emotions measured from such expressions show predictive power for identifying depression on Twitter. We also demonstrated that the changes in an individual's emotions as measured over time bear additional information and can further improve the effectiveness of emotions as features, and hence improve the performance of our proposed model in this task.","PeriodicalId":235572,"journal":{"name":"Companion Proceedings of the The Web Conference 2018","volume":"409 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125480815","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Urban Perception of Commercial Activeness from Satellite Images and Streetscapes","authors":"Wenshan Wang, Su Yang, Zhiyuan He, Minjie Wang, Jiulong Zhang, Weishan Zhang","doi":"10.1145/3184558.3186581","DOIUrl":"https://doi.org/10.1145/3184558.3186581","url":null,"abstract":"People can perceive social attributes such as safety, richness, and happiness from streetscapes by means of visual perception, which inspires research on urban perception. To the best of our knowledge, this is the first work focused on revealing the relationship between visual patterns of satellite images as well as streetscapes and commercial activeness. We propose to make use of bag of features (BoF) in the context of computer vision and sparse representation in the sense of machine learning to predict commercial activeness of urban commercial districts. After obtaining the urban commercial districts via clustering, we predict their degrees of commercial activeness using four image features, namely, Histogram of Oriented Gradients (HOG), Autoencoder, GIST, and multifractal spectra for satellite images and street view images, respectively. The performance evaluation with four large-scale datasets demonstrates that the presented computational framework can not only predict the commercial activeness with satisfactory precision compared with that based on Point of Interest (POI) data but also discover the related visual patterns.","PeriodicalId":235572,"journal":{"name":"Companion Proceedings of the The Web Conference 2018","volume":"127 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121336449","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Language-Integrated Queries: a BOLDR Approach","authors":"Véronique Benzaken, Giuseppe Castagna, L. Daynès, Julien Lopez, K. Nguyen, R. Vernoux","doi":"10.1145/3184558.3185973","DOIUrl":"https://doi.org/10.1145/3184558.3185973","url":null,"abstract":"We present BOLDR, a modular framework that enables the evaluation in databases of queries containing application logic and, in particular, user-defined functions. BOLDR also allows the nesting of queries for different databases of possibly different data models. The framework detects the boundaries of queries present in an application, translates them into an intermediate representation together with the relevant language environment, rewrites them in order to avoid query avalanches and to make the most out of database optimizations, and converts the results back to the application. Our experiments show that the techniques we implemented are applicable to real-world database applications, successfully handling a variety of language-integrated queries with good performance.","PeriodicalId":235572,"journal":{"name":"Companion Proceedings of the The Web Conference 2018","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123128045","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"GeoSensor: On-line Scalable Change and Event Detection over Big Data","authors":"G. Argyriou, G. Papadakis, G. Stamoulis, Efi Karra Taniskidou, Nikiforos Pittaras, George Giannakopoulos, Sergio Albani, M. Lazzarini, E. Angiuli, A. Popescu, Argyros Argyridis, Manolis Koubarakis","doi":"10.1145/3184558.3186984","DOIUrl":"https://doi.org/10.1145/3184558.3186984","url":null,"abstract":"GeoSensor is a novel system that enriches change detection over satellite images with event detection over news items and social media content. GeoSensor faces the major challenges of Big Data: volume (a single satellite image may be a few GBs), variety (its data sources include two different types of satellite images and various types of user-generated content) and veracity, as the accuracy of the end result is crucial for the usefulness of our system. To overcome these three challenges, while offering on-line functionality, GeoSensor comprises a complex architecture that is based on the open-source platform developed in the H2020 project Big Data Europe. Through the presented demonstration, both the effectiveness and the efficiency of GeoSensor's functionalities are highlighted.","PeriodicalId":235572,"journal":{"name":"Companion Proceedings of the The Web Conference 2018","volume":"62 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126316384","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Structured Knowledge on the Web 7.0","authors":"Steffen Staab, Jens Lehmann, R. Verborgh","doi":"10.1145/3184558.3190666","DOIUrl":"https://doi.org/10.1145/3184558.3190666","url":null,"abstract":"Structured Knowledge on the Web had an intriguing history before it became successful. We briefly revisit this history before we go into a longer discussion about how structured knowledge on the Web should be devised such that it benefits even more applications. Core to this discussion will be issues like trust, information infrastructure usability and resilience, promising realms of structured knowledge, and principles and practices of data sharing.","PeriodicalId":235572,"journal":{"name":"Companion Proceedings of the The Web Conference 2018","volume":"109 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126108943","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Open Information Extraction with Global Structure Constraints","authors":"Qi Zhu, Xiang Ren, Jingbo Shang, Yu Zhang, Frank F. Xu, Jiawei Han","doi":"10.1145/3184558.3186927","DOIUrl":"https://doi.org/10.1145/3184558.3186927","url":null,"abstract":"Extracting entities and their relations from text is an important task for understanding massive text corpora. Open information extraction (IE) systems mine relation tuples (i.e., entity arguments and a predicate string to describe their relation) from sentences. However, current open IE systems ignore the fact that global statistics in a large corpus can be collectively leveraged to identify high-quality sentence-level extractions. In this paper, we propose a novel open IE system, called ReMine, which integrates local context signal and global structural signal in a unified framework with distant supervision. The new system can be efficiently applied to different domains as it uses facts from external knowledge bases as supervision; and can effectively score sentence-level tuple extractions based on corpus-level statistics. Specifically, we design a joint optimization problem to unify (1) segmenting entity/relation phrases in individual sentences based on local context; and (2) measuring the quality of sentence-level extractions with a translating-based objective. Experiments on real-world corpora from different domains demonstrate the effectiveness and robustness of ReMine when compared to other open IE systems.","PeriodicalId":235572,"journal":{"name":"Companion Proceedings of the The Web Conference 2018","volume":"76 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114082811","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"INF-UFG at FiQA 2018 Task 1: Predicting Sentiments and Aspects on Financial Tweets and News Headlines","authors":"Dayan de França Costa, Nádia Félix Felipe Da Silva","doi":"10.1145/3184558.3191828","DOIUrl":"https://doi.org/10.1145/3184558.3191828","url":null,"abstract":"This paper describes our system, which participated in Task 1 of FiQA 2018. The task's focus was to predict the sentiment and aspects of financial microblog posts and headlines. The sentiment for a specific company had to be predicted on a scale between -1 and 1, while the aspects had to be predicted from a set of aspects given in the training data. We used Support Vector Regression (SVR) to predict the sentiments in both cases (microblog posts and headlines).","PeriodicalId":235572,"journal":{"name":"Companion Proceedings of the The Web Conference 2018","volume":"124 5","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120970773","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Linked Data for Production (LD4P): a Multi-Institutional Approach to Technical Services Transformation","authors":"Philip E. Schreur","doi":"10.1145/3184558.3186201","DOIUrl":"https://doi.org/10.1145/3184558.3186201","url":null,"abstract":"Linked Data for Production (LD4P) is a collaboration between six institutions (Columbia, Cornell, Harvard, Library of Congress, Princeton, and Stanford) to begin the transition of technical services production workflows from a series of library-centric data formats (MARC) to ones based in Linked Open Data (LOD). This first phase of the transition focuses on the development of the ability to produce metadata as LOD communally, the enhancement of the BIBFRAME ontology to encompass the multiple resource formats that academic libraries must process, and the engagement of the broader academic library community to ensure a sustainable and extensible environment. As its name implies, LD4P focuses on the immediate needs of metadata production such as ontology coverage and workflow transition. The LD4P partners' work will be based, in part, on a collection of tools that currently exist, such as those developed by the Library of Congress. The cyclical feedback of use and enhancement request to the developers of these tools will allow for their enhancement based on use in an actual production environment. The six institutions involved will focus on materials ranging from art to rare books, from cartographic materials to music, from annotations to workflows. Tool development and enhancement will also be a key aspect of the project. By the end of the first phase of this project (Spring 2018), the partners will have the minimal tooling, workflows, and standards developed to begin the transformation from MARC to LOD in Phase 2 of the project.","PeriodicalId":235572,"journal":{"name":"Companion Proceedings of the The Web Conference 2018","volume":"90 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121673385","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learning Procedures from Text: Codifying How-to Procedures in Deep Neural Networks","authors":"Hogun Park, H. M. Nezhad","doi":"10.1145/3184558.3186347","DOIUrl":"https://doi.org/10.1145/3184558.3186347","url":null,"abstract":"A lot of knowledge about procedures and how-tos is described in text. Recently, extracting semantic relations from procedural text has been actively explored. Prior work has mostly focused on finding relationships among verb-noun pairs or clustering of extracted pairs. In this paper, we investigate the problem of learning individual procedure-specific relationships (e.g. is method of, is alternative of, or is subtask of) among sentences. To identify the relationships, we propose an end-to-end neural network architecture, which can selectively learn important procedure-specific relationships. Using this approach, we could construct a how-to knowledge base from the largest procedure-sharing community, wiki-how.com. The evaluation of our approach shows that it outperforms the existing entity relationship extraction algorithms.","PeriodicalId":235572,"journal":{"name":"Companion Proceedings of the The Web Conference 2018","volume":"92 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124897201","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}