{"title":"Hubness as a Case of Technical Algorithmic Bias in Music Recommendation","authors":"A. Flexer, M. Dörfler, Jan Schlüter, Thomas Grill","doi":"10.1109/ICDMW.2018.00154","DOIUrl":null,"url":null,"abstract":"This paper tries to bring the problem of technical algorithmic bias to the attention of the high-dimensional data mining community. A system suffering from algorithmic bias results in systematic unfair treatment of certain users or data, with technical algorithmic bias arising specifically from technical constraints. We illustrate this problem, which so far has been neglected in high-dimensional data mining, for a real world music recommendation system. Due to a problem of measuring distances in high dimensional spaces, songs closer to the center of all data are recommended over and over again, while songs far from the center are not recommended at all. We show that these so-called hub songs do not carry a specific semantic meaning and that deleting them from the data base promotes other songs to hub songs being recommended disturbingly often as a consequence. We argue that it is the ethical responsibility of data mining researchers to care about the fairness of their algorithms in high-dimensional spaces.","PeriodicalId":259600,"journal":{"name":"2018 IEEE International Conference on Data Mining Workshops (ICDMW)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 IEEE International Conference on Data Mining Workshops (ICDMW)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICDMW.2018.00154","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 5
Abstract
This paper brings the problem of technical algorithmic bias to the attention of the high-dimensional data mining community. A system suffering from algorithmic bias treats certain users or data in a systematically unfair way, with technical algorithmic bias arising specifically from technical constraints. We illustrate this problem, which has so far been neglected in high-dimensional data mining, for a real-world music recommendation system. Due to a problem of measuring distances in high-dimensional spaces, songs closer to the center of all data are recommended over and over again, while songs far from the center are not recommended at all. We show that these so-called hub songs do not carry a specific semantic meaning, and that deleting them from the database promotes other songs to hub songs, which are then recommended disturbingly often. We argue that it is the ethical responsibility of data mining researchers to care about the fairness of their algorithms in high-dimensional spaces.
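To make the described effect concrete, here is a minimal sketch, not taken from the paper, that measures hubness via the k-occurrence N_k (how often a point appears among the k nearest neighbours of other points) on synthetic Gaussian data. All names, parameters, and data in this example are illustrative assumptions; the paper works with real music similarity data instead.

```python
import numpy as np
from scipy.spatial.distance import cdist
from scipy.stats import skew

def k_occurrence(X, k=5):
    """N_k(x): how often point x appears among the k nearest
    neighbours of all other points in X."""
    d = cdist(X, X)                      # pairwise Euclidean distances
    np.fill_diagonal(d, np.inf)          # a point is not its own neighbour
    knn = np.argsort(d, axis=1)[:, :k]   # indices of the k nearest neighbours
    return np.bincount(knn.ravel(), minlength=len(X))

rng = np.random.default_rng(0)
for dim in (3, 100):
    X = rng.standard_normal((2000, dim))  # synthetic data, 2000 points
    n_k = k_occurrence(X, k=5)
    # Hubness shows up as a strongly right-skewed N_k distribution:
    # a few "hub" points appear in many neighbour lists (and would be
    # recommended over and over), while "anti-hubs" with N_k = 0 are
    # never recommended at all.
    print(f"dim={dim:4d}  max N_k={n_k.max():3d}  "
          f"anti-hubs={np.sum(n_k == 0):4d}  skew(N_k)={skew(n_k):.2f}")
```

In low dimensions the N_k values stay close to k for every point; as the dimensionality grows, the skewness and the number of anti-hubs increase, which is the distance-concentration effect the abstract refers to.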