An Optimization Method for Elasticsearch Index Shard Number
Bizhong Wei, Jian Dai, Liqiang Deng, Haiyan Huang
2020 16th International Conference on Computational Intelligence and Security (CIS), 2020-11-01
DOI: 10.1109/CIS52066.2020.00048 (https://doi.org/10.1109/CIS52066.2020.00048)
Elasticsearch, an open-source distributed search and analytics engine, has been widely adopted in recent years. However, across the wide range of settings in which it is deployed, it does not suit every scenario and requirement. This paper therefore proposes a method for optimizing the number of shards in an Elasticsearch index, based on Elasticsearch full-text retrieval technology and the characteristics of the data in practical applications. The method analyzes the remaining storage space and the index shard size on each node of a distributed cluster and uses them to calculate the optimal number of index shards for the system, which improves the efficiency of data retrieval. Experimental results show that, compared with traditional methods, the proposed method improves system performance in terms of data distribution, data writing efficiency, and data query latency.
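
The abstract does not give the formula the authors use, so the following is only a minimal sketch of the general idea of deriving a shard count from per-node free storage and shard size. The function name `estimate_shard_count`, its parameters, and the 30 GB default target shard size are all assumptions made for illustration; they are not taken from the paper or from the Elasticsearch API.

```python
import math


def estimate_shard_count(expected_index_gb, free_gb_per_node, target_shard_gb=30.0):
    """Hypothetical heuristic, NOT the paper's method.

    Picks the smallest shard count that keeps each shard at or below
    target_shard_gb, then checks that the nodes' remaining storage can
    actually host that many shards.
    """
    if expected_index_gb <= 0 or target_shard_gb <= 0:
        raise ValueError("sizes must be positive")
    if not free_gb_per_node:
        raise ValueError("at least one node is required")

    # Smallest number of shards so that no shard exceeds the target size.
    shard_count = math.ceil(expected_index_gb / target_shard_gb)
    shard_gb = expected_index_gb / shard_count

    # How many shards of that size fit into each node's free space.
    capacity = sum(math.floor(free / shard_gb) for free in free_gb_per_node)
    if capacity < shard_count:
        raise ValueError(
            f"cluster can host only {capacity} shards of {shard_gb:.1f} GB, "
            f"but {shard_count} are needed"
        )
    return shard_count


if __name__ == "__main__":
    # Assumed example values: a 200 GB index on three nodes.
    print(estimate_shard_count(200, [150, 100, 80]))
```

The sketch treats only primary shards and ignores replicas, write throughput, and query latency, all of which the paper reports evaluating; a real calculation would need to account for them.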