João Filgueiras, Luís Barbosa, Gil Rocha, Henrique Lopes Cardoso, Luís Paulo Reis, J. Machado, Ana Maria Oliveira
{"title":"Complaint Analysis and Classification for Economic and Food Safety","authors":"João Filgueiras, Luís Barbosa, Gil Rocha, Henrique Lopes Cardoso, Luís Paulo Reis, J. Machado, Ana Maria Oliveira","doi":"10.18653/v1/D19-5107","DOIUrl":"https://doi.org/10.18653/v1/D19-5107","url":null,"abstract":"Governmental institutions are employing artificial intelligence techniques to deal with their specific problems and exploit their huge amounts of both structured and unstructured information. In particular, natural language processing and machine learning techniques are being used to process citizen feedback. In this paper, we report on the use of such techniques for analyzing and classifying complaints, in the context of the Portuguese Economic and Food Safety Authority. Grounded in its operational process, we address three different classification problems: target economic activity, implied infraction severity level, and institutional competence. We show promising results obtained using feature-based approaches and traditional classifiers, with accuracy scores above 70%, and analyze the shortcomings of our current results and avenues for further improvement, taking into account the intended use of our classifiers in helping human officers to cope with thousands of yearly complaints.","PeriodicalId":119881,"journal":{"name":"Proceedings of the Second Workshop on Economics and Natural Language Processing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124363008","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Forecasting Firm Material Events from 8-K Reports","authors":"Shuang (Sophie) Zhai, Zhu Zhang","doi":"10.18653/v1/D19-5104","DOIUrl":"https://doi.org/10.18653/v1/D19-5104","url":null,"abstract":"In this paper, we show deep learning models can be used to forecast firm material event sequences based on the contents in the company’s 8-K Current Reports. Specifically, we exploit state-of-the-art neural architectures, including sequence-to-sequence (Seq2Seq) architecture and attention mechanisms, in the model. Our 8K-powered deep learning model demonstrates promising performance in forecasting firm future event sequences. The model is poised to benefit various stakeholders, including management and investors, by facilitating risk management and decision making.","PeriodicalId":119881,"journal":{"name":"Proceedings of the Second Workshop on Economics and Natural Language Processing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126180575","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Berke Oral, Erdem Emekligil, S. Arslan, Gülşen Eryiğit
{"title":"Extracting Complex Relations from Banking Documents","authors":"Berke Oral, Erdem Emekligil, S. Arslan, Gülşen Eryiğit","doi":"10.18653/v1/D19-5101","DOIUrl":"https://doi.org/10.18653/v1/D19-5101","url":null,"abstract":"In order to automate banking processes (e.g. payments, money transfers, foreign trade), we need to extract banking transactions from different types of mediums such as faxes, e-mails, and scanners. Banking orders may be considered as complex documents since they contain quite complex relations compared to traditional datasets used in relation extraction research. In this paper, we present our method to extract intersentential, nested and complex relations from banking orders, and introduce a relation extraction method based on maximal clique factorization technique. We demonstrate 11% error reduction over previous methods.","PeriodicalId":119881,"journal":{"name":"Proceedings of the Second Workshop on Economics and Natural Language Processing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133158663","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jared Rivera, Jan Caleb Oliver Pensica, Jolene Valenzuela, Alfonso Secuya, C. Cheng
{"title":"Annotation Process for the Dialog Act Classification of a Taglish E-commerce Q&A Corpus","authors":"Jared Rivera, Jan Caleb Oliver Pensica, Jolene Valenzuela, Alfonso Secuya, C. Cheng","doi":"10.18653/v1/D19-5108","DOIUrl":"https://doi.org/10.18653/v1/D19-5108","url":null,"abstract":"With conversational agents or chatbots making up in quantity of replies rather than quality, the need to identify user intent has become a main concern to improve these agents. Dialog act (DA) classification tackles this concern, and while existing studies have already addressed DA classification in general contexts, no training corpora in the context of e-commerce is available to the public. This research addressed the said insufficiency by building a text-based corpus of 7,265 posts from the question and answer section of products on Lazada Philippines. The SWBD-DAMSL tagset for DA classification was modified to 28 tags fitting the categories applicable to e-commerce conversations. The posts were annotated manually by three (3) human annotators and preprocessing techniques decreased the vocabulary size from 6,340 to 1,134. After analysis, the corpus was composed dominantly of single-label posts, with 34% of the corpus having multiple intent tags. The annotated corpus allowed insights toward the structure of posts created with single to multiple intents.","PeriodicalId":119881,"journal":{"name":"Proceedings of the Second Workshop on Economics and Natural Language Processing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129806346","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}