GNN: Graph Neural Network and Large Language Model Based for Data Discovery
Thomas Hoang
arXiv - CS - Databases, 2024-08-24
DOI: arxiv-2408.13609 (https://doi.org/arxiv-2408.13609)
Citation count: 0
Abstract
Our algorithm, GNN (Graph Neural Network and Large Language Model Based for
Data Discovery), inherits the benefits of PLOD (Predictive Learning Optimal
Data Discovery) \cite{hoang2024plod} and BOD (Blindly Optimal Data Discovery)
\cite{Hoang2024BODBO}: it requires neither a predefined utility function nor
human input for attribute ranking, which avoids a time-consuming iterative
loop. Beyond these earlier works, GNN uses graph neural networks and large
language models to interpret text-valued attributes that PLOD and MOD cannot
handle, making outcome prediction more reliable. GNN can be seen as an
extension of PLOD that understands text values and models the user's
preferences from both numerical and text values, which makes it promising for
data science and analytics.