A deep learning-based approach for identifying unresolved questions on Stack Exchange Q &A communities through graph-based communication modelling

IF 2.8 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

International Journal of Data Science and Analytics Pub Date : 2023-09-30 DOI:10.1007/s41060-023-00454-0

Hassan Abedi Firouzjaei

{"title":"A deep learning-based approach for identifying unresolved questions on Stack Exchange Q &A communities through graph-based communication modelling","authors":"Hassan Abedi Firouzjaei","doi":"10.1007/s41060-023-00454-0","DOIUrl":null,"url":null,"abstract":"Abstract In recent years, online question–answer (Q &A) platforms, such as Stack Exchange (SE), have become increasingly popular for information and knowledge sharing. Despite the vast amount of information available on these platforms, many questions remain unresolved. In this work, we aim to address this issue by proposing a novel approach to identify unresolved questions in SE Q &A communities. Our approach utilises the graph structure of communication formed around a question by users to model the communication network surrounding it. We employ a property graph model and graph neural networks (GNNs), which can effectively capture both the structure of communication and the content of messages exchanged among users. By leveraging the power of graph representation and GNNs, our approach can effectively identify unresolved questions in SE communities. Experimental results on the complete historical data from three distinct Q &A communities demonstrate the superiority of our proposed approach over baseline methods that only consider the content of questions. Finally, our work represents a first but important step towards better understanding the factors that can affect questions becoming and remaining unresolved in SE communities.","PeriodicalId":45667,"journal":{"name":"International Journal of Data Science and Analytics","volume":"40 1","pages":"0"},"PeriodicalIF":2.8000,"publicationDate":"2023-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal of Data Science and Analytics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1007/s41060-023-00454-0","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 0

Abstract

Abstract In recent years, online question–answer (Q &A) platforms, such as Stack Exchange (SE), have become increasingly popular for information and knowledge sharing. Despite the vast amount of information available on these platforms, many questions remain unresolved. In this work, we aim to address this issue by proposing a novel approach to identify unresolved questions in SE Q &A communities. Our approach utilises the graph structure of communication formed around a question by users to model the communication network surrounding it. We employ a property graph model and graph neural networks (GNNs), which can effectively capture both the structure of communication and the content of messages exchanged among users. By leveraging the power of graph representation and GNNs, our approach can effectively identify unresolved questions in SE communities. Experimental results on the complete historical data from three distinct Q &A communities demonstrate the superiority of our proposed approach over baseline methods that only consider the content of questions. Finally, our work represents a first but important step towards better understanding the factors that can affect questions becoming and remaining unresolved in SE communities.

查看原文本刊更多论文

一种基于深度学习的方法，通过基于图的通信建模来识别Stack Exchange Q &A社区中未解决的问题

近年来，以Stack Exchange (SE)为代表的在线问答(Q &A)平台在信息和知识共享方面越来越受欢迎。尽管这些平台上有大量的信息，但许多问题仍未得到解决。在这项工作中，我们的目标是通过提出一种新的方法来识别SE Q & a社区中未解决的问题来解决这个问题。我们的方法利用用户围绕问题形成的通信图结构来建模围绕该问题的通信网络。我们采用属性图模型和图神经网络(gnn)，可以有效地捕获用户之间的通信结构和交换的消息内容。通过利用图表示和gnn的力量，我们的方法可以有效地识别SE社区中未解决的问题。来自三个不同问答社区的完整历史数据的实验结果表明，我们提出的方法优于仅考虑问题内容的基线方法。最后，我们的工作代表了第一步，但重要的一步，朝着更好地理解可能影响在东南社区中产生和保持未解决问题的因素。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

International Journal of Data Science and Analytics Multiple-

CiteScore

6.40

自引率

8.30%

发文量

期刊介绍： Data Science has been established as an important emergent scientific field and paradigm driving research evolution in such disciplines as statistics, computing science and intelligence science, and practical transformation in such domains as science, engineering, the public sector, business, social science, and lifestyle. The field encompasses the larger areas of artificial intelligence, data analytics, machine learning, pattern recognition, natural language understanding, and big data manipulation. It also tackles related new scientific challenges, ranging from data capture, creation, storage, retrieval, sharing, analysis, optimization, and visualization, to integrative analysis across heterogeneous and interdependent complex resources for better decision-making, collaboration, and, ultimately, value creation.The International Journal of Data Science and Analytics (JDSA) brings together thought leaders, researchers, industry practitioners, and potential users of data science and analytics, to develop the field, discuss new trends and opportunities, exchange ideas and practices, and promote transdisciplinary and cross-domain collaborations. The journal is composed of three streams: Regular, to communicate original and reproducible theoretical and experimental findings on data science and analytics; Applications, to report the significant data science applications to real-life situations; and Trends, to report expert opinion and comprehensive surveys and reviews of relevant areas and topics in data science and analytics.Topics of relevance include all aspects of the trends, scientific foundations, techniques, and applications of data science and analytics, with a primary focus on:statistical and mathematical foundations for data science and analytics;understanding and analytics of complex data, human, domain, network, organizational, social, behavior, and system characteristics, complexities and intelligences;creation and extraction, processing, representation and modelling, learning and discovery, fusion and integration, presentation and visualization of complex data, behavior, knowledge and intelligence;data analytics, pattern recognition, knowledge discovery, machine learning, deep analytics and deep learning, and intelligent processing of various data (including transaction, text, image, video, graph and network), behaviors and systems;active, real-time, personalized, actionable and automated analytics, learning, computation, optimization, presentation and recommendation; big data architecture, infrastructure, computing, matching, indexing, query processing, mapping, search, retrieval, interoperability, exchange, and recommendation;in-memory, distributed, parallel, scalable and high-performance computing, analytics and optimization for big data;review, surveys, trends, prospects and opportunities of data science research, innovation and applications;data science applications, intelligent devices and services in scientific, business, governmental, cultural, behavioral, social and economic, health and medical, human, natural and artificial (including online/Web, cloud, IoT, mobile and social media) domains; andethics, quality, privacy, safety and security, trust, and risk of data science and analytics