Daniel Bernau, O. Mordvinova, J. Karstens, S. Hickl
2011 IEEE 2nd International Conference on Computing, Control and Industrial Engineering, pp. 247-250. Published 2011-09-01. DOI: 10.1109/CCIENG.2011.6008112 (https://doi.org/10.1109/CCIENG.2011.6008112)
Investigating influence of data storage organization on structured code search performance
Code search in an industrial environment is driven by the programmers' wish to scan huge source code repositories with high precision in a very short time. For a huge software repository, the choice of an efficient code search backend therefore matters. This paper discusses which data storage model is appropriate for a structured code search engine applied in an industrial development scenario, where searches on large software repositories are common. To investigate this, a search engine approach with integrated Abstract Syntax Trees is adapted. Using the capabilities of a hybrid in-memory database, we stored a large amount of structured data obtained from the source code repository in column-, row-, and hybrid store layouts and ran a set of typical queries against them through an SQL interface. The results show the superiority of the column-oriented approach for the investigated scenario.
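
To make the distinction between the row and column layouts concrete, the following is a minimal sketch in Python, not the paper's implementation. The `AstNode` schema, the sample records, and the function names are all hypothetical and chosen only for illustration; the paper itself used a hybrid in-memory database queried through SQL. The sketch shows how the same attribute filter touches whole records in a row layout but only the relevant attribute lists in a column layout.

```python
from collections import namedtuple

# Hypothetical AST node record; this schema is an assumption, not the paper's.
AstNode = namedtuple("AstNode", ["node_id", "file", "node_type", "name"])

# Row layout: each record is stored as one unit.
ROWS = [
    AstNode(1, "a.c", "FunctionDecl", "main"),
    AstNode(2, "a.c", "CallExpr", "printf"),
    AstNode(3, "b.c", "FunctionDecl", "helper"),
    AstNode(4, "b.c", "CallExpr", "helper"),
]


def to_columns(rows):
    """Pivot row-oriented records into one list per attribute (column layout)."""
    return {field: [getattr(r, field) for r in rows] for field in AstNode._fields}


def query_rows(rows, node_type):
    """Row layout: every full record is visited to test a single attribute."""
    return [r.name for r in rows if r.node_type == node_type]


def query_columns(cols, node_type):
    """Column layout: scan only the node_type column, then read matching names."""
    hits = [i for i, t in enumerate(cols["node_type"]) if t == node_type]
    return [cols["name"][i] for i in hits]


COLUMNS = to_columns(ROWS)

if __name__ == "__main__":
    # Both layouts answer the same query; they differ in what data is scanned.
    print(query_rows(ROWS, "FunctionDecl"))
    print(query_columns(COLUMNS, "FunctionDecl"))
```

Both functions return the same names for a given node type. The sketch says nothing about measured performance, which depends on the database engine, the data volume, and the query mix studied in the paper.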