P. Madhyastha, Antigoni-Maria Founta, Lucia Specia
{"title":"A study towards contextual understanding of toxicity in online conversations","authors":"P. Madhyastha, Antigoni-Maria Founta, Lucia Specia","doi":"10.1017/s1351324923000414","DOIUrl":null,"url":null,"abstract":"\n Identifying and annotating toxic online content on social media platforms is an extremely challenging problem. Work that studies toxicity in online content has predominantly focused on comments as independent entities. However, comments on social media are inherently conversational, and therefore, understanding and judging the comments fundamentally requires access to the context in which they are made. We introduce a study and resulting annotated dataset where we devise a number of controlled experiments on the importance of context and other observable confounders – namely gender, age and political orientation – towards the perception of toxicity in online content. Our analysis clearly shows the significance of context and the effect of observable confounders on annotations. Namely, we observe that the ratio of toxic to non-toxic judgements can be very different for each control group, and a higher proportion of samples are judged toxic in the presence of contextual information.","PeriodicalId":49143,"journal":{"name":"Natural Language Engineering","volume":" ","pages":""},"PeriodicalIF":2.3000,"publicationDate":"2023-08-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Natural Language Engineering","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1017/s1351324923000414","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0
Abstract
Identifying and annotating toxic online content on social media platforms is an extremely challenging problem. Work that studies toxicity in online content has predominantly focused on comments as independent entities. However, comments on social media are inherently conversational, and therefore, understanding and judging the comments fundamentally requires access to the context in which they are made. We introduce a study and resulting annotated dataset where we devise a number of controlled experiments on the importance of context and other observable confounders – namely gender, age and political orientation – towards the perception of toxicity in online content. Our analysis clearly shows the significance of context and the effect of observable confounders on annotations. Namely, we observe that the ratio of toxic to non-toxic judgements can be very different for each control group, and a higher proportion of samples are judged toxic in the presence of contextual information.
期刊介绍:
Natural Language Engineering meets the needs of professionals and researchers working in all areas of computerised language processing, whether from the perspective of theoretical or descriptive linguistics, lexicology, computer science or engineering. Its aim is to bridge the gap between traditional computational linguistics research and the implementation of practical applications with potential real-world use. As well as publishing research articles on a broad range of topics - from text analysis, machine translation, information retrieval and speech analysis and generation to integrated systems and multi modal interfaces - it also publishes special issues on specific areas and technologies within these topics, an industry watch column and book reviews.