High performance CDR processing with MapReduce

2015 9th International Conference on Telecommunication Systems Services and Applications (TSSA). Publication date: 2015-11-01. DOI: 10.1109/TSSA.2015.7440424

Mulya Agung, A. I. Kistijantoro


Citations: 2

Abstract


A Call Detail Record (CDR) is a data record produced by telecommunication equipment that contains the details of a call transaction. CDRs carry valuable information for several domains, such as billing, fraud detection, and analytics. In practice, however, these uses face a big data challenge: billions of CDRs are generated every day, and processing systems are expected to deliver results in a timely manner. In our case, the system must also run on limited computational resources. We found that our current production system could not meet these needs. We analyzed the system's bottleneck and identified its root cause. Based on this analysis, we designed and implemented a better-performing system, called MS2, which is based on MapReduce and runs on a Hadoop cluster. This paper presents the analysis of the previous system and the design and implementation of MS2. We also provide empirical evidence demonstrating the efficiency and linearity of MS2. In a test case for a telecommunication mediation system, MS2 reduced overhead by 44% and improved performance by nearly a factor of two compared to the previous system. Benchmarks against several related large-scale data processing technologies also show MS2 performing better for CDR batch processing. Running on a cluster with eight CPU cores and two conventional disks, MS2 is able to process 67,000 CDRs per second.
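The abstract does not describe MS2's internals, so the following is only a minimal, single-process sketch of the general MapReduce pattern applied to CDR batch processing. The record layout (caller, callee, duration in seconds), the sample data, and all function names are assumptions made for illustration; they are not taken from the paper or from the Hadoop API.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical CDR layout (an assumption, not from the paper):
# caller,callee,duration_seconds
SAMPLE_CDRS = [
    "6281100001,6281100002,60",
    "6281100003,6281100001,30",
    "6281100001,6281100004,45",
]


def map_cdr(line: str) -> Iterator[Tuple[str, int]]:
    """Map step: parse one raw CDR line and emit (caller, duration)."""
    caller, _callee, duration = line.split(",")
    yield caller, int(duration)


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle step: group all emitted values by key."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_durations(caller: str, durations: List[int]) -> Tuple[str, int]:
    """Reduce step: total call duration for one caller."""
    return caller, sum(durations)


def run_job(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence within one process."""
    mapped = (pair for line in lines for pair in map_cdr(line))
    grouped = shuffle(mapped)
    return dict(reduce_durations(k, v) for k, v in grouped.items())


if __name__ == "__main__":
    print(run_job(SAMPLE_CDRS))
```

On a real Hadoop cluster the map and reduce steps would run in parallel across nodes and the framework would perform the shuffle; this sketch only shows how a per-caller aggregation decomposes into those three phases.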
