Authors: U. Krohn, C. Boldyreff
Journal: J. Softw. Maintenance Res. Pract.
DOI: https://doi.org/10.1002/(SICI)1096-908X(199905/06)11:3%3C151::AID-SMR189%3E3.0.CO;2-G
Application of cluster algorithms for batching of proposed software changes
This paper proposes the application of cluster algorithms for the identification of changes which may be batched together. The results of impact-analysis sessions are represented as binary data where each variable has two values indicating the presence or absence of an impact on a particular software component. These data are then used to produce a matrix containing the similarity or the dissimilarity of each pair of proposed changes which are to be clustered. There are many clustering techniques for binary data. Most of the empirical investigations indicate that average-linkage and centroid-method clustering may be most useful in practice. Both clustering methods produced similar results in an example application. Proposed software changes that impacted a large number of the same components were merged early into common clusters, showing the maintainer which changes may be batched together. Copyright 1999 John Wiley & Sons, Ltd.
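The pipeline described in the abstract (binary impact data, a pairwise dissimilarity matrix, then average-linkage clustering) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the change ids, the component names, the impact values, and the choice of the Jaccard coefficient as the binary dissimilarity measure, which the abstract does not name. It is not the authors' implementation.

```python
from itertools import combinations

# Hypothetical components and impact-analysis results (1 = impacted, 0 = not).
COMPONENTS = ["parser", "lexer", "ast", "cli", "db", "cache", "orm", "ui"]
IMPACTS = {
    "C1": [1, 1, 1, 0, 0, 0, 0, 0],
    "C2": [1, 1, 1, 1, 0, 0, 0, 0],
    "C3": [0, 0, 0, 0, 1, 1, 0, 0],
    "C4": [0, 0, 0, 0, 1, 1, 1, 0],
    "C5": [0, 0, 0, 0, 0, 0, 0, 1],
}


def jaccard_dissimilarity(u, v):
    """1 - |both impacted| / |either impacted| for two binary vectors.

    Assumed measure: the abstract only says 'similarity or dissimilarity'.
    """
    both = sum(1 for x, y in zip(u, v) if x and y)
    either = sum(1 for x, y in zip(u, v) if x or y)
    return 0.0 if either == 0 else 1.0 - both / either


def average_linkage(impacts):
    """Agglomerative clustering; returns the merge history.

    Each entry is (cluster_a, cluster_b, distance), where the distance
    between two clusters is the mean pairwise dissimilarity of their members.
    """
    clusters = [(name,) for name in sorted(impacts)]

    def dist(c1, c2):
        pairs = [(a, b) for a in c1 for b in c2]
        total = sum(jaccard_dissimilarity(impacts[a], impacts[b]) for a, b in pairs)
        return total / len(pairs)

    merges = []
    while len(clusters) > 1:
        # Pick the closest pair of clusters and merge them.
        a, b = min(combinations(clusters, 2), key=lambda p: dist(*p))
        merges.append((a, b, dist(a, b)))
        clusters = [c for c in clusters if c not in (a, b)]
        clusters.append(tuple(sorted(a + b)))
    return merges


if __name__ == "__main__":
    for a, b, d in average_linkage(IMPACTS):
        print(f"merge {a} + {b} at dissimilarity {d:.3f}")
```

In this toy data, changes that share most of their impacted components merge first at a low dissimilarity, while changes with no components in common only merge at the maximum dissimilarity of 1.0. A maintainer could read the early merges as candidate batches.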