{"title":"探讨代码语言模型的算术性和逻辑性","authors":"Razan Baltaji, Parth Thakkar","doi":"10.1109/InteNSE59150.2023.00006","DOIUrl":null,"url":null,"abstract":"Machine learning techniques have found a widespread use in the software engineering community. In particular, language models (LMs) trained on code form the backbone of a majority of these applications, spanning tasks such as code completion, summarization, refactoring, execution prediction, and test generation. These tasks require reasoning about both the syntax and semantics of code. Recent work has shown that language models learn to capture the syntactic properties of code, but it is unclear to what extent they can reason about the semantics of code. In this work, we explore the ability of 3 language models of code to reason about a specific kind of semantics: numerical and logical properties of code. We propose several probing tasks to test the numerical and logical reasoning abilities of these models. We find that the models we explore - CodeBERT, GraphCodeBERT and CodeGen do indeed learn many numerical and logical properties of code, such as finding maximum in a list of numbers, comparing numbers, evaluating boolean expressions and representing numbers. They do not perform as well on complex tasks such as evaluating arithmetic expressions and substituting variables in such expressions. Our results indicate that while these models hold promise, there is a lot of room for improvement of their numeric and logical reasoning abilities.","PeriodicalId":166762,"journal":{"name":"2023 IEEE/ACM International Workshop on Interpretability and Robustness in Neural Software Engineering (InteNSE)","volume":"36 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Probing Numeracy and Logic of Language Models of Code\",\"authors\":\"Razan Baltaji, Parth Thakkar\",\"doi\":\"10.1109/InteNSE59150.2023.00006\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Machine learning techniques have found a widespread use in the software engineering community. In particular, language models (LMs) trained on code form the backbone of a majority of these applications, spanning tasks such as code completion, summarization, refactoring, execution prediction, and test generation. These tasks require reasoning about both the syntax and semantics of code. Recent work has shown that language models learn to capture the syntactic properties of code, but it is unclear to what extent they can reason about the semantics of code. In this work, we explore the ability of 3 language models of code to reason about a specific kind of semantics: numerical and logical properties of code. We propose several probing tasks to test the numerical and logical reasoning abilities of these models. We find that the models we explore - CodeBERT, GraphCodeBERT and CodeGen do indeed learn many numerical and logical properties of code, such as finding maximum in a list of numbers, comparing numbers, evaluating boolean expressions and representing numbers. They do not perform as well on complex tasks such as evaluating arithmetic expressions and substituting variables in such expressions. 
Our results indicate that while these models hold promise, there is a lot of room for improvement of their numeric and logical reasoning abilities.\",\"PeriodicalId\":166762,\"journal\":{\"name\":\"2023 IEEE/ACM International Workshop on Interpretability and Robustness in Neural Software Engineering (InteNSE)\",\"volume\":\"36 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-05-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2023 IEEE/ACM International Workshop on Interpretability and Robustness in Neural Software Engineering (InteNSE)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/InteNSE59150.2023.00006\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2023 IEEE/ACM International Workshop on Interpretability and Robustness in Neural Software Engineering (InteNSE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/InteNSE59150.2023.00006","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Probing Numeracy and Logic of Language Models of Code
Machine learning techniques have found widespread use in the software engineering community. In particular, language models (LMs) trained on code form the backbone of a majority of these applications, spanning tasks such as code completion, summarization, refactoring, execution prediction, and test generation. These tasks require reasoning about both the syntax and semantics of code. Recent work has shown that language models learn to capture the syntactic properties of code, but it is unclear to what extent they can reason about its semantics. In this work, we explore the ability of three language models of code to reason about a specific kind of semantics: the numerical and logical properties of code. We propose several probing tasks to test the numerical and logical reasoning abilities of these models. We find that the models we explore (CodeBERT, GraphCodeBERT, and CodeGen) do indeed learn many numerical and logical properties of code, such as finding the maximum in a list of numbers, comparing numbers, evaluating boolean expressions, and representing numbers. They do not perform as well on more complex tasks such as evaluating arithmetic expressions and substituting variables in such expressions. Our results indicate that while these models hold promise, there is substantial room for improvement in their numerical and logical reasoning abilities.
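To make the probing methodology concrete, below is a minimal sketch of one such task in the spirit the abstract describes: freeze CodeBERT, embed short synthetic snippets containing a boolean comparison, and train a linear probe to predict the expression's truth value from the frozen representations. The snippet template, mean-pooling strategy, data sizes, and choice of a logistic-regression probe are illustrative assumptions, not the paper's exact setup.

```python
# Sketch of a semantic probing task over a frozen code LM (assumed setup,
# not the authors' exact pipeline): can a linear probe recover whether a
# boolean comparison in the snippet evaluates to True?
import random

import torch
from sklearn.linear_model import LogisticRegression
from transformers import AutoModel, AutoTokenizer

random.seed(0)

tokenizer = AutoTokenizer.from_pretrained("microsoft/codebert-base")
model = AutoModel.from_pretrained("microsoft/codebert-base")
model.eval()  # the LM stays frozen; only the probe is trained


def embed(snippet: str) -> torch.Tensor:
    """Mean-pool the final hidden states of the frozen model."""
    inputs = tokenizer(snippet, return_tensors="pt", truncation=True)
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # (1, seq_len, 768)
    return hidden.mean(dim=1).squeeze(0)  # (768,)


# Synthetic probing data: comparisons with ground-truth labels.
snippets, labels = [], []
for _ in range(200):
    a, b = random.randint(0, 99), random.randint(0, 99)
    snippets.append(f"x = {a} > {b}")
    labels.append(int(a > b))

X = torch.stack([embed(s) for s in snippets]).numpy()
train_X, test_X = X[:150], X[150:]
train_y, test_y = labels[:150], labels[150:]

# A linear probe: if it succeeds well above chance, the frozen
# representations linearly encode the comparison's outcome.
probe = LogisticRegression(max_iter=1000).fit(train_X, train_y)
print("probe accuracy:", probe.score(test_X, test_y))
```

The same skeleton adapts to the other properties the abstract lists (e.g., emitting list snippets and probing for the index of the maximum); only the snippet generator and label function change. High probe accuracy on simple tasks alongside low accuracy on arithmetic evaluation would mirror the pattern of results the abstract reports.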