Jiao Chen, Luyi Ma, Xiaohan Li, Jianpeng Xu, Jason H. D. Cho, Kaushiki Nag, Evren Korpeoglu, Sushant Kumar, Kannan Achan
International Journal of Machine Learning and Cybernetics (IJMLC). DOI: 10.1007/s13042-024-02274-5. Published: 2024-08-15. Impact factor: 3.1 (JCR Q2, Computer Science, Artificial Intelligence).
Relation labeling in product knowledge graphs with large language models for e-commerce
Product Knowledge Graphs (PKGs) play a crucial role in enhancing e-commerce system performance by providing structured information about entities and their relationships, such as complementary or substitutable relations between products or product types, which can be utilized in recommender systems. However, relation labeling in PKGs remains a challenging task due to the dynamic nature of e-commerce domains and the associated cost of human labor. Recently, breakthroughs in Large Language Models (LLMs) have produced surprising results on numerous natural language processing tasks, especially in in-context learning (ICL). In this paper, we conduct an empirical study of LLMs for relation labeling in e-commerce PKGs, investigating their powerful natural-language learning capabilities and their effectiveness in predicting relations between product types with few-shot in-context learning. We evaluate the performance of various LLMs, including PaLM-2, GPT-3.5, and Llama-2, on benchmark datasets for e-commerce relation labeling tasks, and use different prompt engineering techniques to examine their impact on model performance. Our results show that LLMs can achieve performance competitive with human labelers using just 1–5 labeled examples per relation. We also illustrate bias issues in LLMs towards minority ethnic groups. Additionally, we show that LLMs significantly outperform existing KG completion models and classification methods in relation labeling for e-commerce KGs, exhibiting performance strong enough to replace human labeling. Beyond empirical investigations, we also carry out a theoretical analysis to explain the superior capability of LLMs in few-shot ICL by comparing it with kernel regression.
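As a rough illustration of the few-shot in-context-learning setup the abstract describes, the sketch below assembles a prompt that asks an LLM to label the relation between two product types given a handful of labeled examples. This is not code from the paper; the relation names, example product types, and prompt wording are all hypothetical, chosen only to show the general 1–5-shot prompt structure.

```python
# Illustrative sketch (not from the paper): building a few-shot ICL prompt
# for product-type relation labeling. All relation names and examples below
# are hypothetical placeholders.

RELATIONS = ["complementary", "substitutable", "unrelated"]

# A small set of labeled demonstrations (1-5 per relation, per the abstract).
FEW_SHOT_EXAMPLES = [
    ("laptop", "laptop sleeve", "complementary"),
    ("running shoes", "trail running shoes", "substitutable"),
    ("coffee maker", "garden hose", "unrelated"),
]

def build_prompt(type_a: str, type_b: str) -> str:
    """Assemble a few-shot prompt asking an LLM to label the relation
    between two product types, ending with the unlabeled query pair."""
    lines = [
        "Label the relation between two product types as one of: "
        + ", ".join(RELATIONS) + "."
    ]
    for a, b, label in FEW_SHOT_EXAMPLES:
        lines.append(f"Product type A: {a}\nProduct type B: {b}\nRelation: {label}")
    # The query pair is left unlabeled; the LLM's completion is the prediction.
    lines.append(f"Product type A: {type_a}\nProduct type B: {type_b}\nRelation:")
    return "\n\n".join(lines)

prompt = build_prompt("phone", "phone case")
print(prompt)
```

The completion the model generates after the final "Relation:" is taken as the predicted label; prompt-engineering variants (instruction wording, example order, number of shots) can be compared by swapping out the demonstration list.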
Journal description:
Cybernetics is concerned with describing complex interactions and interrelationships between systems which are omnipresent in our daily life. Machine Learning discovers fundamental functional relationships between variables and ensembles of variables in systems. The merging of the disciplines of Machine Learning and Cybernetics is aimed at the discovery of various forms of interaction between systems through diverse mechanisms of learning from data.
The International Journal of Machine Learning and Cybernetics (IJMLC) focuses on the key research problems emerging at the junction of machine learning and cybernetics and serves as a broad forum for rapid dissemination of the latest advancements in the area. The emphasis of IJMLC is on the hybrid development of machine learning and cybernetics schemes inspired by different contributing disciplines such as engineering, mathematics, cognitive sciences, and applications. New ideas, design alternatives, implementations and case studies pertaining to all the aspects of machine learning and cybernetics fall within the scope of the IJMLC.
Key research areas to be covered by the journal include:
Machine Learning for modeling interactions between systems
Pattern Recognition technology to support discovery of system-environment interaction
Control of system-environment interactions
Biochemical interaction in biological and biologically-inspired systems
Learning for improvement of communication schemes between systems