Retracted on March 14, 2023: Cross-lingual transfer learning for statistical type inference

Proceedings of the 31st ACM SIGSOFT International Symposium on Software Testing and Analysis; Pub Date: 2021-07-01; DOI: 10.1145/3533767.3534411

Zhiming Li, Xiaofei Xie, Hao Li, Zhengzi Xu, Yi Li, Yang Liu

Citations: 1

Abstract

NOTE OF RETRACTION: The authors of the paper “Cross-lingual transfer learning for statistical type inference” (Zhiming Li, Xiaofei Xie, Haoliang Li, Zhengzi Xu, Yi Li, and Yang Liu) have requested that the paper be retracted due to errors. All authors agree that the major conclusions are erroneous:

1. (Major) In RQ4, the results of the LambdaNet and Typilus baseline methods are erroneous, and the PLATO results were obtained without incorporating cross-lingual data. Some numbers were also recorded erroneously in the table. Together, these errors invalidate the paper's important conclusion that “Plato can significantly outperform the baseline”.
2. (Major) In RQ1, the implementations of the rule-based tools (CheckJS and Pytype) (Page 8) are erroneous, and we find it is not possible to compare PLATO with the Pytype tool fairly. This invalidates the paper's conclusion that “With Plato, one can achieve comparative or even better performance by using cross-lingual labeled data instead of implementing rule-based tool from scratch that requires significant manual effort and expert knowledge.”
3. Also in RQ1, the type set used for the Python and TypeScript transfer uses only 6 and 4 meta-types, which is somewhat inconsistent with the description on Page 6. The implementations of the ADV baseline for the Java transfer benchmarks and of the supervised_o of the TypeScript baselines are erroneous. The ensemble method used for PLATO is inconsistent with the description in the methodology section. RQ1 used an outdated checkpoint (different from the one used in the other RQs). The pre-trained model, training process, and ensemble strategy were implemented in settings somewhat different from those described in the methodology section.
4. The visualizations in Figures 6 and 8 are somewhat inconsistent with real cases.
5. In RQ3, the description of the baseline method (Bert with supervised learning) is wrong (Page 9); it should be “only trained on partially labeled target language data”. Some tokens were erroneously normalized during preprocessing. Some data points' results are erroneous, so the “Plato without Kernel” and “PLATO” methods would not achieve improvements as high as claimed.
6. In RQ2, the ablation of the PLATO model is erroneous, and we find that the sequence submodel performs better than the kernel submodel (Table 3).
