Large Language Models to generate meaningful feature model instances

Proceedings of the 27th ACM International Systems and Software Product Line Conference - Volume A
Pub Date: 2023-08-28
DOI: 10.1145/3579027.3608973

J. Galindo, Antonio J. Dominguez, Jules White, David Benavides


Citations: 1

Abstract



Feature models are the de facto standard for representing variability in software-intensive systems. Automated analysis of feature models is the computer-aided extraction of information from feature models, and it is used in testing, maintenance, configuration, and derivation, among other tasks. Testing these analyses often requires a large number of models that are as realistic as possible. Several proposals generate synthetic feature models using random techniques or metamorphic relations; however, these methods do not take into account the semantics of the domain concepts being represented or the interrelations between them, which leads to less realistic feature models. In this paper, we propose a novel approach that uses Large Language Models (LLMs), such as Codex or GPT-3, to generate realistic feature models that preserve semantic coherence while maintaining syntactic validity. The approach automatically generates instances of feature models from a given domain. Concretely, two language models were used: first, OpenAI's Codex to generate new feature model instances in Universal Variability Language (UVL) syntax, and then Cohere's semantic analysis to verify whether the newly introduced concepts belong to the same domain. With this approach, 90% of the generated instances were valid according to the UVL syntax. In addition, the valid models score well on model complexity metrics, and the generated features mirror the domain of the original UVL instance used as the prompt. With this work, we envision a new thread of research in which variability is generated and analyzed using LLMs. This opens the door to a new generation of techniques and tools for variability management.
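The two-stage idea in the abstract (generate a UVL instance, then check that newly introduced features stay in the seed model's domain) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: `GENERATED_UVL` is a hand-written stand-in for an LLM output, `extract_features` is a simplified reader rather than a full UVL parser, and `TOY_VECTORS` is a made-up embedding table standing in for a real semantic embedding service such as the one the paper attributes to Cohere.

```python
import math

# UVL group keywords; any other indented line in `features` is treated as a feature name.
GROUP_KEYWORDS = {"mandatory", "optional", "alternative", "or"}

# Seed model used as the prompt (hypothetical example domain).
SEED_UVL = """\
features
    Sandwich
        mandatory
            Bread
        optional
            Cheese
            Sauce
                alternative
                    Ketchup
                    Mustard
constraints
    Ketchup => Cheese
"""

# Hand-written stand-in for a model-generated instance (assumption, not real LLM output).
GENERATED_UVL = """\
features
    Sandwich
        mandatory
            Bread
        optional
            Cheese
            Lettuce
            Carburetor
"""


def extract_features(uvl_text):
    """Collect feature names from the `features` section (simplified, not a full UVL parser)."""
    names = []
    in_features = False
    for raw in uvl_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if not raw[0].isspace():
            # Unindented lines are section headers such as `features` or `constraints`.
            in_features = line == "features"
            continue
        if in_features and line not in GROUP_KEYWORDS:
            names.append(line.split()[0])
    return names


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def off_domain(seed_features, generated_features, embed, threshold=0.5):
    """Return new features whose best similarity to any seed feature is below `threshold`."""
    seed_vectors = [embed(name) for name in seed_features]
    flagged = []
    for name in generated_features:
        if name in seed_features:
            continue  # only newly introduced concepts are checked
        best = max(cosine(embed(name), vec) for vec in seed_vectors)
        if best < threshold:
            flagged.append(name)
    return flagged


# Made-up 2-D vectors: first axis is "food-like", second is "machine-like".
TOY_VECTORS = {
    "Sandwich": (1.0, 0.0), "Bread": (1.0, 0.0), "Cheese": (0.9, 0.1),
    "Sauce": (0.9, 0.1), "Ketchup": (0.9, 0.1), "Mustard": (0.9, 0.1),
    "Lettuce": (0.8, 0.2), "Carburetor": (0.0, 1.0),
}

seed = extract_features(SEED_UVL)
generated = extract_features(GENERATED_UVL)
flagged = off_domain(seed, generated, TOY_VECTORS.__getitem__)
print(flagged)  # ['Carburetor']
```

In a real pipeline the `embed` argument would wrap an actual embedding model, and syntactic validity would be checked with a proper UVL parser before the semantic step; the threshold of 0.5 is arbitrary and would need tuning.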

