Improving open-domain event schema discovery with casual English normalization for noisy text
Assia Mezhar, A. Mzabi, M. Ramdani
2017 Intelligent Systems Conference (IntelliSys), 2017-09-01
DOI: 10.1109/INTELLISYS.2017.8324323 (https://doi.org/10.1109/INTELLISYS.2017.8324323)
Citation count: 0
Abstract
Social media enable people to share significant events from their daily lives. Mining social data raises the challenge of extracting information from casual language, because social media content is unstructured: users often communicate informally, using abbreviations, slang, misspelled words, and non-standard short forms. This paper therefore proposes a new open-domain event schema discovery approach that uses casual language normalization to normalize the text, extract events, and discover their appropriate schemas (event types and argument roles) from a noisy corpus. The approach exploits casual language normalization to improve both event schema discovery and event extraction. It can automatically normalize the input and generate high-quality schemas from extracted events whose types are unknown. The approach promises better results in terms of the accuracy and quality of the discovered schemas.
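To make the idea of casual language normalization concrete, the following is a minimal sketch of a lexicon-based normalizer. It is not the authors' method, which the abstract does not detail. The `CASUAL_LEXICON` table, the `normalize_token` and `normalize` functions, and the rule for collapsing elongated words are all assumptions introduced here for illustration.

```python
import re

# Hypothetical lookup table mapping casual forms to standard English.
# The entries are illustrative and are not taken from the paper.
CASUAL_LEXICON = {
    "u": "you",
    "gr8": "great",
    "2moro": "tomorrow",
    "b4": "before",
    "thx": "thanks",
    "pls": "please",
}


def normalize_token(token: str) -> str:
    """Lowercase a token, collapse elongated letters, then apply the lexicon."""
    lowered = token.lower()
    # Collapse runs of three or more identical letters ("sooooo" -> "so").
    # Digits are left alone so that numbers such as "1000" are not damaged.
    squeezed = re.sub(r"([a-z])\1{2,}", r"\1", lowered)
    return CASUAL_LEXICON.get(squeezed, squeezed)


def normalize(text: str) -> str:
    """Normalize every alphanumeric token in the text; punctuation is kept."""
    return re.sub(r"[A-Za-z0-9']+", lambda m: normalize_token(m.group(0)), text)


if __name__ == "__main__":
    print(normalize("Thx, see u 2moro b4 the gr8 concert!"))
    # prints "thanks, see you tomorrow before the great concert!"
```

In a pipeline like the one the abstract describes, a step of this kind would run before event extraction so that downstream components see standard word forms. A realistic normalizer would need a much larger lexicon and context-sensitive handling of ambiguous tokens.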