NUT@EMNLP: Latest Papers

Learning to Define Terms in the Software Domain
NUT@EMNLP Pub Date: 2018-11-01 DOI: 10.18653/v1/W18-6122
Vidhisha Balachandran, Dheeraj Rajagopal, R. Catherine, William W. Cohen
Abstract: One way to test a person’s knowledge of a domain is to ask them to define domain-specific terms. Here, we investigate the task of automatically generating definitions of technical terms by reading text from the technical domain. Specifically, we learn definitions of software entities from a large corpus built from the user forum Stack Overflow. To model definitions, we train a language model and incorporate additional domain-specific information like word co-occurrence, and ontological category information. Our approach improves previous baselines by 2 BLEU points for the definition generation task. Our experiments also show the additional challenges associated with the task and the short-comings of language-model based architectures for definition generation.
Citations: 3
Assigning people to tasks identified in email: The EPA dataset for addressee tagging for detected task intent
NUT@EMNLP Pub Date: 2018-11-01 DOI: 10.18653/v1/W18-6104
Revanth Rameshkumar, P. Bailey, Abhishek Jha, Chris Quirk
Abstract: We describe the Enron People Assignment (EPA) dataset, in which tasks that are described in emails are associated with the person(s) responsible for carrying out these tasks. We identify tasks and the responsible people in the Enron email dataset. We define evaluation methods for this challenge and report scores for our model and naïve baselines. The resulting model enables a user experience operating within a commercial email service: given a person and a task, it determines if the person should be notified of the task.
Citations: 0
Modeling Student Response Times: Towards Efficient One-on-one Tutoring Dialogues
NUT@EMNLP Pub Date: 2018-11-01 DOI: 10.18653/v1/W18-6117
Luciana Benotti, J. Bhaskaran, Sigtryggur Kjartansson, David Lang
Abstract: In this paper we investigate the task of modeling how long it would take a student to respond to a tutor question during a tutoring dialogue. Solving such a task has applications in educational settings such as intelligent tutoring systems, as well as in platforms that help busy human tutors to keep students engaged. Knowing how long it would normally take a student to respond to different types of questions could help tutors optimize their own time while answering multiple dialogues concurrently, as well as deciding when to prompt a student again. We study this problem using data from a service that offers tutor support for math, chemistry and physics through an instant messaging platform. We create a dataset of 240K questions. We explore several strong baselines for this task and compare them with human performance.
Citations: 1
Distantly Supervised Attribute Detection from Reviews
NUT@EMNLP Pub Date: 2018-11-01 DOI: 10.18653/v1/W18-6110
Lisheng Fu, Pablo Barrio
Abstract: This work aims to detect specific attributes of a place (e.g., if it has a romantic atmosphere, or if it offers outdoor seating) from its user reviews via distant supervision: without direct annotation of the review text, we use the crowdsourced attribute labels of the place as labels of the review text. We then use review-level attention to pay more attention to those reviews related to the attributes. The experimental results show that our attention-based model predicts attributes for places from reviews with over 98% accuracy. The attention weights assigned to each review provide explanation of capturing relevant reviews.
Citations: 0
Using Author Embeddings to Improve Tweet Stance Classification
NUT@EMNLP Pub Date: 2018-11-01 DOI: 10.18653/v1/W18-6124
Adrian Benton, Mark Dredze
Abstract: Many social media classification tasks analyze the content of a message, but do not consider the context of the message. For example, in tweet stance classification – where a tweet is categorized according to a viewpoint it espouses – the expressed viewpoint depends on latent beliefs held by the user. In this paper we investigate whether incorporating knowledge about the author can improve tweet stance classification. Furthermore, since author information and embeddings are often unavailable for labeled training examples, we propose a semi-supervised pretraining method to predict user embeddings. Although the neural stance classifiers we learn are often outperformed by a baseline SVM, author embedding pre-training yields improvements over a non-pre-trained neural network on four out of five domains in the SemEval 2016 6A tweet stance classification task. In a tweet gun control stance classification dataset, improvements from pre-training are only apparent when training data is limited.
Citations: 20
Detecting Code-Switching between Turkish-English Language Pair
NUT@EMNLP Pub Date: 2018-11-01 DOI: 10.18653/v1/W18-6115
Zeynep Yirmibeşoğlu, Gülşen Eryiğit
Abstract: Code-switching (usage of different languages within a single conversation context in an alternative manner) is a highly increasing phenomenon in social media and colloquial usage which poses different challenges for natural language processing. This paper introduces the first study for the detection of Turkish-English code-switching and also a small test data collected from social media in order to smooth the way for further studies. The proposed system using character level n-grams and conditional random fields (CRFs) obtains 95.6% micro-averaged F1-score on the introduced test data set.
Citations: 22
Content Extraction and Lexical Analysis from Customer-Agent Interactions
NUT@EMNLP Pub Date: 2018-11-01 DOI: 10.18653/v1/W18-6118
Sergiu Nisioi, A. Bucur, Liviu P. Dinu
Abstract: In this paper, we provide a lexical comparative analysis of the vocabulary used by customers and agents in an Enterprise Resource Planning (ERP) environment and a potential solution to clean the data and extract relevant content for NLP. As a result, we demonstrate that the actual vocabulary for the language that prevails in the ERP conversations is highly divergent from the standardized dictionary and further different from general language usage as extracted from the Common Crawl corpus. Moreover, in specific business communication circumstances, where it is expected to observe a high usage of standardized language, code switching and non-standard expression are predominant, emphasizing once more the discrepancy between the day-to-day use of language and the standardized one.
Citations: 0
How do you correct run-on sentences it’s not as easy as it seems
NUT@EMNLP Pub Date: 2018-09-01 DOI: 10.18653/v1/W18-6105
Junchao Zheng, Courtney Napoles, Joel R. Tetreault, Kostiantyn Omelianchuk
Abstract: Run-on sentences are common grammatical mistakes but little research has tackled this problem to date. This work introduces two machine learning models to correct run-on sentences that outperform leading methods for related tasks, punctuation restoration and whole-sentence grammatical error correction. Due to the limited annotated data for this error, we experiment with artificially generating training data from clean newswire text. Our findings suggest artificial training data is viable for this task. We discuss implications for correcting run-ons and other types of mistakes that have low coverage in error-annotated corpora.
Citations: 3
Language Identification in Code-Mixed Data using Multichannel Neural Networks and Context Capture
NUT@EMNLP Pub Date: 2018-08-21 DOI: 10.18653/v1/W18-6116
Soumil Mandal, Anil Kumar Singh
Abstract: An accurate language identification tool is an absolute necessity for building complex NLP systems to be used on code-mixed data. Lot of work has been recently done on the same, but there’s still room for improvement. Inspired from the recent advancements in neural network architectures for computer vision tasks, we have implemented multichannel neural networks combining CNN and LSTM for word level language identification of code-mixed data. Combining this with a Bi-LSTM-CRF context capture module, accuracies of 93.28% and 93.32% is achieved on our two testing sets.
Citations: 24
Orthogonal Matching Pursuit for Text Classification
NUT@EMNLP Pub Date: 2018-07-01 DOI: 10.18653/v1/w18-6113
Konstantinos Skianis, Nikolaos Tziortziotis, M. Vazirgiannis
Abstract: In text classification, the problem of overfitting arises due to the high dimensionality, making regularization essential. Although classic regularizers provide sparsity, they fail to return highly accurate models. On the contrary, state-of-the-art group-lasso regularizers provide better results at the expense of low sparsity. In this paper, we apply a greedy variable selection algorithm, called Orthogonal Matching Pursuit, for the text classification task. We also extend standard group OMP by introducing overlapping Group OMP to handle overlapping groups of features. Empirical analysis verifies that both OMP and overlapping GOMP constitute powerful regularizers, able to produce effective and very sparse models. Code and data are available online.
Citations: 4