A Hybrid Distributed Model for Learning Representation of Short Texts with Attribute Labels
Shashi Kumar, S. Roy, Vishal Pathak
Proceedings of the 7th ACM IKDD CoDS and 25th COMAD, 2020-01-05
https://doi.org/10.1145/3371158.3371195
Short text documents in real-world applications, such as incident tickets, bug tickets, and feedback texts, contain fixed-field entries in the form of attribute instances as well as free-text entries capturing their summaries. We propose an approach based on the Paragraph Vector (due to Le and Mikolov) to learn fixed-length feature representations from these short texts of varying lengths with attribute instances appended. Our method extends the existing approach by learning representations from both the ticket summaries and the attribute contents captured in fixed-field entries. Further, we show that such representations of short texts perform better on a few learning tasks than other popular representations.
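As a rough illustration of what "short texts appended with attribute instances" could look like as model input, the sketch below turns one ticket into a single token sequence. It is not the authors' model. The function name `ticket_to_tokens`, the `FIELD=value` token format, and the sample ticket are all assumptions made for this example.

```python
from typing import Dict, List


def ticket_to_tokens(summary: str, attributes: Dict[str, str]) -> List[str]:
    """Tokenize the free-text summary and append one token per attribute instance.

    Hypothetical helper: the abstract does not specify how attribute
    instances are encoded, so the "FIELD=value" format is an assumption.
    """
    # Simple whitespace tokenization of the lower-cased summary.
    words = summary.lower().split()
    # Encode each fixed-field entry as a single token so it cannot be
    # confused with an ordinary word from the summary. Fields are sorted
    # by name so the output order is deterministic.
    attr_tokens = [
        f"{field.upper()}={value.lower().replace(' ', '_')}"
        for field, value in sorted(attributes.items())
    ]
    return words + attr_tokens


# Made-up ticket, for illustration only.
tokens = ticket_to_tokens(
    "Mail server not responding",
    {"priority": "high", "category": "email service"},
)
print(tokens)
```

A sequence like this could then be passed, together with a document tag, to any Paragraph Vector implementation to obtain a fixed-length vector. Whether the attributes are best treated as extra words, as additional tags, or in some other way is a design choice this sketch leaves open.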