Integrating code search into the development session

2011 IEEE 27th International Conference on Data Engineering. Pub Date: 2011-04-11. DOI: 10.1109/ICDE.2011.5767948

Mu-Woong Lee, Seung-won Hwang, Sunghun Kim

Citations: 12

Abstract

To support rapid and efficient software development, we propose to demonstrate our tool, which integrates code search into the software development process. For example, while writing a module, a developer can find code that shares the same syntactic structure in a large corpus representing the collective knowledge of other developers on the same team (or of the wider open-source community). Commercial code search engines exist for this code universe, but they treat software as text, ignore syntactic structure, and therefore fail to find semantically related code. Existing tools that search for syntactic clones, meanwhile, do not prioritize efficiency; they target a “post-mortem” scenario in which clones are detected after development is complete. In contrast, we focus on making syntactic code search efficient enough to be “interactive” over a large-scale corpus, complementing both of these lines of research. Our demonstration will show how such interactive search supports rapid software development, a benefit also claimed recently in the SE and HCI communities [1], [2]. As the enabling technology, we design efficient index building and traversal techniques optimized for code corpora and code search workloads. Our tool identifies relevant code in a corpus of 1.7 million code pieces with sub-second response time, without sacrificing the accuracy of a state-of-the-art tool; our extensive evaluation results are reported in [3].
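The contrast the abstract draws between text search and syntactic search can be illustrated with a toy sketch. This is not the authors' index or traversal technique, which the abstract does not describe. It assumes Python's standard `ast` module as a stand-in parser, and the names `structure_key`, `build_index`, and `search` are hypothetical. The idea is only that two snippets with different identifiers can share one structural key, which a plain text match would miss.

```python
import ast
from collections import defaultdict


def structure_key(source: str) -> tuple:
    """Return the sequence of AST node type names for a snippet.

    Identifiers and literal values are not AST nodes, so they do not
    affect the key; only the syntactic shape does.
    """
    tree = ast.parse(source)
    return tuple(type(node).__name__ for node in ast.walk(tree))


def build_index(corpus: dict) -> dict:
    """Map each structure key to the names of snippets that share it."""
    index = defaultdict(list)
    for name, source in corpus.items():
        index[structure_key(source)].append(name)
    return index


def search(index: dict, query: str) -> list:
    """Return the snippets whose structure key equals the query's key."""
    return index.get(structure_key(query), [])


# A tiny illustrative corpus (hypothetical snippets).
corpus = {
    "sum_list": "def total(xs):\n    s = 0\n    for x in xs:\n        s += x\n    return s\n",
    "greet": "def greet(name):\n    print('hello ' + name)\n",
}
index = build_index(corpus)

# Same shape as "sum_list", but every identifier differs.
query = "def acc(items):\n    r = 0\n    for i in items:\n        r += i\n    return r\n"
matches = search(index, query)
print(matches)
```

An exact-key lookup like this only finds snippets with an identical shape. A tool of the kind described would also need approximate matching and an index that scales to millions of code pieces, which this sketch does not attempt.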
