Towards a Retrospective One-Class Oriented Approach to Parents Detection in Social Media

2020 27th Conference of Open Innovations Association (FRUCT). Pub Date: 2020-09-01. DOI: 10.23919/fruct49677.2020.9211021

Alexander Egorov, T. Sokhin, N. Butakov



Abstract

Social media is a source of data for many purposes: advertising, social research, and recruiting. Usually, however, we are limited to readily available structured information such as age, gender, education, and occupation. To extract more complex, implicit features, we have to work with unstructured data such as texts related to a user. We present a case of complex user analysis in social media using textual data: detecting parents on social networks. Our approach does not rely on content generated by the user; instead, it uses content the user showed interest in, either implicitly (posts the user liked) or explicitly (groups the user subscribed to, where the content was published). In this paper, we compare classification methods for the task of parent detection in social media. Given a user's likes and other information, the goal is to estimate the probability that the user already has a child or children. This task is an example of positive-unlabeled learning: data from social networks and media may contain explicit signals of a user's parenthood, but the absence of such signals is no ground for concluding that the user is not a parent. It can be considered a case of look-alike modelling or, in other words, a one-class classification problem. We propose a retrospective approach that exploits data from social media to make it possible to build a binary classifier. We compare the two approaches and conclude that the retrospective approach, although it requires more effort to implement, may yield better results. This approach may be useful in similar tasks with a look-alike problem statement.
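The contrast between the one-class and the binary formulation can be illustrated with a minimal sketch. It is not the authors' implementation: the bag-of-words features, the cosine-to-centroid scoring, the function names, and the toy texts are all assumptions made for illustration, and the abstract does not specify how the retrospective approach obtains its negative examples, so the negatives below are simply hypothetical.

```python
import math
from collections import Counter


def bow(text):
    """Bag-of-words vector for the text a user liked or subscribed to."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(vectors):
    """Sum of count vectors; cosine similarity ignores the scale."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return total


# Hypothetical toy data: text of liked content per user.
known_parents = [
    "stroller diapers kindergarten",
    "baby food stroller playground",
    "kindergarten toys diapers",
]
# Hypothetical negatives; how they are obtained is an assumption here.
assumed_non_parents = [
    "football beer concert",
    "concert travel nightclub",
    "gaming football travel",
]

parent_centroid = centroid(bow(t) for t in known_parents)
non_parent_centroid = centroid(bow(t) for t in assumed_non_parents)


def one_class_score(text):
    """One-class (look-alike) view: similarity to known positives only."""
    return cosine(bow(text), parent_centroid)


def binary_predict(text):
    """Binary view: nearest centroid, which requires negative examples."""
    v = bow(text)
    return cosine(v, parent_centroid) > cosine(v, non_parent_centroid)
```

In the one-class view a decision threshold on `one_class_score` has to be chosen without any negatives to calibrate against, whereas the binary view replaces that threshold with a comparison against a second class.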
