Shuang Wang;He Zhang;Quan Z. Sheng;Xiaoping Li;Zhu Sun;Taotao Cai;Wei Emma Zhang;Jian Yang;Qing Gao
A Survey on Truth Discovery: Concepts, Methods, Applications, and Opportunities
IEEE Transactions on Big Data, vol. 11, no. 2, pp. 314-332. Published 2024-07-05. DOI: 10.1109/TBDATA.2024.3423677
URL: https://ieeexplore.ieee.org/document/10587116/
Journal impact factor: 7.5 (JCR Q1, Computer Science, Information Systems)
Citation count: 0
Abstract
In the era of information explosion, an object (e.g., the height of the Himalayas) can have different observations from different sources on the web and in social sensing, crowd sensing, and data sensing applications. Observations of the same object from different sources can conflict with each other due to errors, missing records, typos, outdated data, etc. Discovering the true facts about objects from various sources is therefore essential and urgent. In this paper, we aim to deliver a comprehensive survey of truth discovery problems from the perspectives of concepts, methods, applications, and opportunities. We first systematically review and compare problems in terms of objects, sources, and observations. Based on these problem properties, different methods are analyzed and compared in depth along four dimensions: observations with single or multiple values, independent or dependent sources, static or dynamic sources, and supervised or unsupervised learning. This is followed by the surveyed applications in various scenarios. For future studies in the truth discovery field, we summarize the code sources and datasets used in the above methods. Finally, we point out the potential challenges and opportunities in truth discovery, with the goal of shedding light on and promoting further investigation in this area.
About the journal:
The IEEE Transactions on Big Data publishes peer-reviewed articles focusing on big data. These articles present innovative research ideas and application results across disciplines, including novel theories, algorithms, and applications. Research areas cover a wide range, such as big data analytics, visualization, curation, management, semantics, infrastructure, standards, performance analysis, intelligence extraction, scientific discovery, security, privacy, and legal issues specific to big data. The journal also prioritizes applications of big data in fields generating massive datasets.