Finding code on the World Wide Web: a preliminary investigation
J. Bieman, Vanessa Murdock
Proceedings First IEEE International Workshop on Source Code Analysis and Manipulation, 2001-11-10
DOI: 10.1109/SCAM.2001.972668 (https://doi.org/10.1109/SCAM.2001.972668)
To find out what kinds of design structures programmers really use, we need to examine a wide variety of programs. Unfortunately, most program source code is proprietary and unavailable for analysis. The World Wide Web (Web) can potentially provide a rich source of programs for study. The freely available code on the Web, if of sufficient quality and quantity, can provide a window into software design as it is practiced today. In a preliminary study of source code availability on the Web, we estimate that 4% of URLs contain object-oriented source code, and 9% of URLs contain executable code: either binary or class files. This represents an enormous resource for program analysis. We can, with some risk of inaccuracy, conservatively project our sampling results to the entire Web. Our estimate is that the Web contains at least 3.4 million files containing either Java, C++, or Perl source code, 20.3 million files containing C source code, and 8.7 million files containing executable code.
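The abstract does not say how the sample was projected to the whole Web, so the following is only a minimal sketch of one way a conservative projection could be computed, not the authors' method. It assumes a simple random sample of URLs and uses the lower end of the Wilson score interval as the conservative proportion. The function names, the sample size, the hit count, and the Web size are all hypothetical values chosen for illustration; none of them come from the paper.

```python
import math


def wilson_lower_bound(hits: int, n: int, z: float = 1.96) -> float:
    """Lower end of the Wilson score interval for a binomial proportion.

    hits: number of sampled URLs that contained code
    n:    total number of sampled URLs
    z:    normal quantile (1.96 corresponds to roughly 95% confidence)
    """
    if n <= 0:
        raise ValueError("sample size must be positive")
    if not 0 <= hits <= n:
        raise ValueError("hits must be between 0 and n")
    p = hits / n
    denom = 1 + z * z / n
    centre = p + z * z / (2 * n)
    margin = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    # Clamp at zero to absorb floating-point error when hits == 0.
    return max(0.0, (centre - margin) / denom)


def conservative_projection(hits: int, n: int, population: int) -> int:
    """Scale the lower-bound proportion up to a population of URLs."""
    return math.floor(wilson_lower_bound(hits, n) * population)


if __name__ == "__main__":
    # Hypothetical inputs: NOT figures from the paper.
    sample_size = 1_000          # assumed number of sampled URLs
    hits = 40                    # 4% of the assumed sample
    web_urls = 1_000_000_000     # assumed total number of URLs on the Web

    naive = hits / sample_size * web_urls
    conservative = conservative_projection(hits, sample_size, web_urls)
    print(f"naive projection:        {naive:,.0f}")
    print(f"conservative projection: {conservative:,}")
```

Using the interval's lower bound rather than the raw sample proportion is what makes the projection an "at least" figure: a small sample yields a wide interval and therefore a lower, more cautious estimate.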