{"title":"Leveraging Large Language Model for Automatic Evolving of Industrial Data-Centric R&D Cycle","authors":"Xu Yang, Xiao Yang, Weiqing Liu, Jinhui Li, Peng Yu, Zeqi Ye, Jiang Bian","doi":"arxiv-2310.11249","DOIUrl":null,"url":null,"abstract":"In the wake of relentless digital transformation, data-driven solutions are\nemerging as powerful tools to address multifarious industrial tasks such as\nforecasting, anomaly detection, planning, and even complex decision-making.\nAlthough data-centric R&D has been pivotal in harnessing these solutions, it\noften comes with significant costs in terms of human, computational, and time\nresources. This paper delves into the potential of large language models (LLMs)\nto expedite the evolution cycle of data-centric R&D. Assessing the foundational\nelements of data-centric R&D, including heterogeneous task-related data,\nmulti-facet domain knowledge, and diverse computing-functional tools, we\nexplore how well LLMs can understand domain-specific requirements, generate\nprofessional ideas, utilize domain-specific tools to conduct experiments,\ninterpret results, and incorporate knowledge from past endeavors to tackle new\nchallenges. 
We take quantitative investment research as a typical example of\nindustrial data-centric R&D scenario and verified our proposed framework upon\nour full-stack open-sourced quantitative research platform Qlib and obtained\npromising results which shed light on our vision of automatic evolving of\nindustrial data-centric R&D cycle.","PeriodicalId":501372,"journal":{"name":"arXiv - QuantFin - General Finance","volume":"28 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2023-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - QuantFin - General Finance","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2310.11249","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0
Abstract
In the wake of relentless digital transformation, data-driven solutions are
emerging as powerful tools to address multifarious industrial tasks such as
forecasting, anomaly detection, planning, and even complex decision-making.
Although data-centric R&D has been pivotal in harnessing these solutions, it
often comes with significant costs in terms of human, computational, and time
resources. This paper delves into the potential of large language models (LLMs)
to expedite the evolution cycle of data-centric R&D. Assessing the foundational
elements of data-centric R&D, including heterogeneous task-related data,
multifaceted domain knowledge, and diverse computing-functional tools, we
explore how well LLMs can understand domain-specific requirements, generate
professional ideas, utilize domain-specific tools to conduct experiments,
interpret results, and incorporate knowledge from past endeavors to tackle new
challenges. We take quantitative investment research as a typical example of an
industrial data-centric R&D scenario, verify our proposed framework on our
full-stack open-source quantitative research platform Qlib, and obtain
promising results that shed light on our vision of an automatically evolving
industrial data-centric R&D cycle.