Hongjie Ye, Wei Chen, Wensheng Dou, Guoquan Wu, Jun Wei
{"title":"基于知识的Python程序环境依赖推理","authors":"Hongjie Ye, Wei Chen, Wensheng Dou, Guoquan Wu, Jun Wei","doi":"10.1145/3510003.3510127","DOIUrl":null,"url":null,"abstract":"Besides third-party packages, the Python interpreter and system libraries are also critical dependencies of a Python program. In our empirical study, 34% programs are only compatible with specific Python interpreter versions, and 24% programs require specific system libraries. However, existing techniques mainly focus on inferring third-party package dependencies. Therefore, they can lack other necessary dependencies and violate version constraints, thus resulting in program build failures and runtime errors. This paper proposes a knowledge-based technique named PyEGo, which can automatically infer dependencies of third-party packages, the Python interpreter, and system libraries at compatible versions for Python programs. We first construct the dependency knowl-edge graph PyKG, which can portray the relations and constraints among third-party packages, the Python interpreter, and system libraries. Then, by querying PyKG with extracted program features, PyEGo constructs a program-related sub-graph with dependency candidates of the three types. It finally outputs the latest compatible dependency versions by solving constraints in the sub-graph. We evaluate PyEGo on 2,891 single-file Python gists, 100 open-source Python projects and 4,836 jupyter notebooks. The experimental re-sults show that PyEGo achieves better accuracy, 0.2x to 3.5x higher than the state-of-the-art approaches.","PeriodicalId":202896,"journal":{"name":"2022 IEEE/ACM 44th International Conference on Software Engineering (ICSE)","volume":"81 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"7","resultStr":"{\"title\":\"Knowledge-Based Environment Dependency Inference for Python Programs\",\"authors\":\"Hongjie Ye, Wei Chen, Wensheng Dou, Guoquan Wu, Jun Wei\",\"doi\":\"10.1145/3510003.3510127\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Besides third-party packages, the Python interpreter and system libraries are also critical dependencies of a Python program. In our empirical study, 34% programs are only compatible with specific Python interpreter versions, and 24% programs require specific system libraries. However, existing techniques mainly focus on inferring third-party package dependencies. Therefore, they can lack other necessary dependencies and violate version constraints, thus resulting in program build failures and runtime errors. This paper proposes a knowledge-based technique named PyEGo, which can automatically infer dependencies of third-party packages, the Python interpreter, and system libraries at compatible versions for Python programs. We first construct the dependency knowl-edge graph PyKG, which can portray the relations and constraints among third-party packages, the Python interpreter, and system libraries. Then, by querying PyKG with extracted program features, PyEGo constructs a program-related sub-graph with dependency candidates of the three types. It finally outputs the latest compatible dependency versions by solving constraints in the sub-graph. We evaluate PyEGo on 2,891 single-file Python gists, 100 open-source Python projects and 4,836 jupyter notebooks. The experimental re-sults show that PyEGo achieves better accuracy, 0.2x to 3.5x higher than the state-of-the-art approaches.\",\"PeriodicalId\":202896,\"journal\":{\"name\":\"2022 IEEE/ACM 44th International Conference on Software Engineering (ICSE)\",\"volume\":\"81 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-05-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"7\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 IEEE/ACM 44th International Conference on Software Engineering (ICSE)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3510003.3510127\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE/ACM 44th International Conference on Software Engineering (ICSE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3510003.3510127","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Knowledge-Based Environment Dependency Inference for Python Programs
Besides third-party packages, the Python interpreter and system libraries are also critical dependencies of a Python program. In our empirical study, 34% programs are only compatible with specific Python interpreter versions, and 24% programs require specific system libraries. However, existing techniques mainly focus on inferring third-party package dependencies. Therefore, they can lack other necessary dependencies and violate version constraints, thus resulting in program build failures and runtime errors. This paper proposes a knowledge-based technique named PyEGo, which can automatically infer dependencies of third-party packages, the Python interpreter, and system libraries at compatible versions for Python programs. We first construct the dependency knowl-edge graph PyKG, which can portray the relations and constraints among third-party packages, the Python interpreter, and system libraries. Then, by querying PyKG with extracted program features, PyEGo constructs a program-related sub-graph with dependency candidates of the three types. It finally outputs the latest compatible dependency versions by solving constraints in the sub-graph. We evaluate PyEGo on 2,891 single-file Python gists, 100 open-source Python projects and 4,836 jupyter notebooks. The experimental re-sults show that PyEGo achieves better accuracy, 0.2x to 3.5x higher than the state-of-the-art approaches.