Performance Evaluation: Latest Articles

Optimizing parallel I/O performance in NVMe SSDs by Dynamic cache partitioning
IF 1 | Zone 4 | Computer Science
Performance Evaluation | Pub Date: 2025-03-10 | DOI: 10.1016/j.peva.2025.102479
Zecheng Li, Shu Yin, Xiaojun Ruan
Abstract: Solid State Drive cache, implemented as on-board shared DRAM memory, can significantly enhance I/O performance by caching frequently accessed data. Although SSD caching strategies for single I/O data flows have been extensively explored, studies on cache partitioning to optimize parallel I/O in an SSD are scarce. In this paper, we present a novel dynamic cache partitioning approach designed to improve the overall performance of multiple parallel I/O data flows by minimizing the performance degradation caused by cache pollution and resource contention. By dynamically adjusting the cache partition size of each data flow according to its cache sensitivity, our strategy seeks to determine the optimal cache partition sizes to maximize overall I/O throughput. We implemented the strategy in the SSD simulator MQSim and evaluated its performance using various synthetic and real-world workloads. Our experimental results indicate that our dynamic cache partitioning strategy achieves an overall throughput increase of up to 33.22% compared to shared cache methods and outperforms static cache partitioning strategies by up to 21.19%.
Performance Evaluation, Volume 168, Article 102479.
Citations: 0
Statistical properties of a class of randomized binary search algorithms
IF 1 | Zone 4 | Computer Science
Performance Evaluation | Pub Date: 2025-03-05 | DOI: 10.1016/j.peva.2025.102478
Ye Xia
Abstract: In this paper, we analyze the statistical properties of a randomized binary search algorithm and its variants. These algorithms have applications in caching and load balancing in distributed environments such as peer-to-peer networks, cloud storage, data centers, and content distribution networks. The basic discrete version of the problem is as follows. Suppose there are $m$ servers, numbered $1, 2, \ldots, m$, out of which the first $k$ servers are marked as special, where $k$ is unknown. These $k$ servers may contain a particular file or service that clients want. The objective is to select one of the marked servers uniformly at random. Considering the intended applications, we impose the constraint that there is no central controller to facilitate the selection process. We start with a basic algorithm: In each step, the client requesting the service chooses a number $y$ uniformly at random from $1, 2, \ldots, x$, where $x$ is the number chosen in the previous step, initially set to $m$ in the first step. A query is then sent to server $y$ asking whether $y$ is marked. If the answer is yes, the algorithm returns $y$; otherwise, the process is repeated with $x \leftarrow y$. In this paper, we primarily consider two batch versions of this algorithm in which multiple numbers are chosen in each step and multiple queries are made in parallel. We derive the mean and variance (exact and/or asymptotic) for the number of search steps in each version of the algorithm, and when possible, we give its distribution. Additionally, we analyze the access pattern of queries across the entire search space.
Performance Evaluation, Volume 168, Article 102478.
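The basic (non-batch) algorithm described in this abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's code: the function name `randomized_search`, the use of `y <= k` as a stand-in for querying server y, and the returned step count are all hypothetical choices made for the example, and it assumes at least one server is marked (1 <= k <= m).

```python
import random

def randomized_search(m, k, rng=random):
    """Basic randomized binary search over servers 1..m.

    Servers 1..k are marked; k is unknown to the client and is used here
    only to answer the "is server y marked?" query.
    Returns (selected_server, number_of_search_steps).
    """
    x = m          # upper end of the current search range
    steps = 0
    while True:
        y = rng.randint(1, x)   # uniform on {1, ..., x}
        steps += 1
        if y <= k:              # stand-in for the query sent to server y
            return y, steps
        x = y                   # unmarked: shrink the range and repeat

if __name__ == "__main__":
    server, steps = randomized_search(m=1000, k=7)
    print(f"selected server {server} after {steps} step(s)")
```

Every query before the last one lands on an unmarked server, so the range always still contains all of 1..k when the final draw is made; conditioned on hitting a marked server, that draw is uniform over 1..k.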
Citations: 0
The Multiserver Job Queuing Model with big and small jobs: Stability in the case of infinite servers
IF 1 | Zone 4 | Computer Science
Performance Evaluation | Pub Date: 2025-02-27 | DOI: 10.1016/j.peva.2025.102477
Adityo Anggraito, Diletta Olliaro, Marco Ajmone Marsan, Andrea Marin
Abstract: The Multiserver Job Queuing Model (MJQM) is a queuing system that plays a key role in the study of the dynamics of resource allocation in data centers. The MJQM comprises a waiting line with infinite capacity and a large number of servers. In this paper, we look at the limiting case in which the number of servers is infinite. Jobs are termed "multiserver" because each one is characterized by a resource demand in terms of number of simultaneously used servers and by a service duration. Job classes are defined by collecting all jobs that require the same number of servers. Job service times are independent and identically distributed random variables whose distributions depend on the class of the job. We consider the case of only two job classes: "small" jobs use a fixed number of servers, while "big" jobs use all servers in the system. The service discipline is First-In First-Out (FIFO). This means that if the job at the Head-of-Line (HOL) cannot enter service because the number of free servers is not sufficient to meet the job requirement, it blocks all subsequent jobs, even if there are sufficient free servers for them. Despite its importance, only a few results exist for the MJQM, whose analysis is challenging, especially because the MJQM is not work-conserving. This implies that even the stability region of the MJQM is known only in special cases. In a previous work, we obtained a closed-form stability condition for the MJQM with big and small jobs under the assumption of exponentially distributed service times for small jobs. In this paper, we compute the stability condition of the MJQM with an infinite number of servers processing big and small jobs, considering different distributions of the service times of small jobs. Simulations are used to support the analytical results and to investigate the impact of service time distributions on the average job waiting time before saturation.
Performance Evaluation, Volume 168, Article 102477.
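The head-of-line blocking rule in this abstract is easy to see in code. The sketch below is only an illustration of the FIFO admission step with a finite number of servers; the function name `admit_fifo` and the representation of the queue as a deque of server demands are assumptions made for the example, not taken from the paper.

```python
from collections import deque

def admit_fifo(queue, free_servers):
    """Admit jobs in strict FIFO order until the head job no longer fits.

    `queue` is a deque of server demands (head at the left) and is
    modified in place. Returns (admitted_demands, remaining_free_servers).
    """
    admitted = []
    while queue and queue[0] <= free_servers:
        need = queue.popleft()
        free_servers -= need
        admitted.append(need)
    # If the queue is non-empty here, its head blocks every job behind it,
    # even those whose demand would fit in the remaining free servers.
    return admitted, free_servers

if __name__ == "__main__":
    waiting = deque([2, 8, 1, 1])   # a "big" job (8) sits behind one small job
    print(admit_fifo(waiting, free_servers=4))
    print(list(waiting))            # the two 1-server jobs stay blocked
```

In the example, two servers remain idle while two one-server jobs wait behind the blocked head job, which is the sense in which the model is not work-conserving.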
Citations: 0
Computational algorithms and arrival theorem for non-conventional product-form solutions
IF 1 | Zone 4 | Computer Science
Performance Evaluation | Pub Date: 2025-02-21 | DOI: 10.1016/j.peva.2025.102469
Diletta Olliaro, Gianfranco Balbo, Andrea Marin, Matteo Sereno
Abstract: Queuing networks with finite capacity are widely discussed in performance analysis literature. One approach to address the finite capacity of stations involves the implementation of a skip-over policy. Under this policy, when a customer arrives at a saturated station, service at that station is skipped, and the customer is rerouted based on the predefined network routing protocol.
Skip-over networks have been extensively investigated, and they exhibit a product-form stationary distribution under the exponential assumptions of Jackson networks. However, a comprehensive understanding of the celebrated Arrival Theorem for this class of product-form models is still lacking and relies on certain conjectures.
This paper makes three contributions: (i) it provides an in-depth comprehension of the Arrival Theorem for skip-over networks by offering a proof for the conjectures outlined in existing literature, (ii) it introduces a Mean Value Analysis (MVA) algorithm tailored for this type of queuing networks, and (iii) it explores the implications of these findings on the class of product-form queuing networks with fetching and repetitive service discipline.
Performance Evaluation, Volume 168, Article 102469.
Citations: 0
Energy-performance tradeoffs in server farms with batch services and setup times
IF 1 | Zone 4 | Computer Science
Performance Evaluation | Pub Date: 2025-01-30 | DOI: 10.1016/j.peva.2025.102468
Thu Le-Anh, Tuan Phung-Duc
Abstract: Data centers consume a large amount of energy, much of which is wasted due to idle servers. Turning off idle servers might be an effective power-saving solution; however, there is a trade-off between energy savings and system performance. Hence, we propose a setup queueing model with a batching policy that allows servers to process a set of jobs simultaneously to minimize power consumption while maintaining acceptable performance. We consider an M/M/$c$/SET–BATCH queue, a multi-server batch service queue with a fixed batch size and setup times, and some variants, including systems in which idle servers delay before turning off or systems in which the batch size is dynamic. We analyze the steady-state probabilities and system performance of the M/M/$c$/SET–BATCH system and its variants. Our analysis of the M/M/$c$/SET–BATCH system with lower computational complexity is made possible by utilizing the special structure of the model. In addition, we use simulations to compare the M/M/$c$/SET–BATCH model with some other variants with different setup time distributions. The results suggest that the model performs better when the setup time has a larger coefficient of variation. Our results indicate that the batching policy enhances the system performance, especially when we allow servers to be idle before turning them off.
Performance Evaluation, Volume 168, Article 102468.
Citations: 0
Foreword - Special Issue - MASCOTS 2023
IF 1 | Zone 4 | Computer Science
Performance Evaluation | Pub Date: 2025-01-03 | DOI: 10.1016/j.peva.2025.102467
Maria Carla Calzarossa, Anshul Gandhi
Abstract: (none)
Performance Evaluation, Volume 167, Article 102467.
Citations: 0
Coupled queues with server interruptions: Some solutions
IF 1 | Zone 4 | Computer Science
Performance Evaluation | Pub Date: 2024-12-18 | DOI: 10.1016/j.peva.2024.102466
Herwig Bruneel, Arnaud Devos
Abstract: We study three different discrete-time queueing systems, which accommodate two types of customers, named type 1 and type 2. New customers arrive independently from slot to slot, but the numbers of arrivals of both types in any slot are possibly mutually dependent; their joint probability generating function (pgf) is $A(z_1, z_2)$. The service times of all customers are deterministically equal to one time slot.
We first consider a scenario (Option A) with one single server which is to be shared by the two customer types. Here, we assume that type-1 customers have absolute service priority over type-2 customers. Moreover, the server is subject to random server interruptions, which occur independently from slot to slot. We derive a functional equation for the steady-state joint pgf $U(z_1, z_2)$ of the numbers of type-1 and type-2 customers in the system. Relying on the application of Rouché's theorem, we are able to explicitly solve the functional equation for arbitrary arrival pgfs $A(z_1, z_2)$, but more elegant results are obtained for some specific choices of $A(z_1, z_2)$.
Next, we focus on two different scenarios (Option B and Option C) where both customer types have their own dedicated server. Here, there are no service priorities involved. In Option B, the two servers experience simultaneous interruptions, whereas in Option C, only one of the servers is subject to interruptions. Again, we derive functional equations for the pgf $U(z_1, z_2)$. Although solving these equations for arbitrary arrival pgfs $A(z_1, z_2)$ …
Performance Evaluation, Volume 167, Article 102466.
Citations: 0
Formal error bounds for the state space reduction of Markov chains
IF 1 | Zone 4 | Computer Science
Performance Evaluation | Pub Date: 2024-12-18 | DOI: 10.1016/j.peva.2024.102464
Fabian Michel, Markus Siegle
Abstract: We study the approximation of a Markov chain on a reduced state space, for both discrete- and continuous-time Markov chains. In this context, we extend the existing theory of formal error bounds for the approximated transient distributions. In the discrete-time setting, we bound the stepwise increment of the error, and in the continuous-time setting, we bound the rate at which the error grows. In addition, the same error bounds can also be applied to bound how far an approximated stationary distribution is from stationarity. As a special case, we consider aggregated (or lumped) Markov chains, where the state space reduction is achieved by partitioning the state space into macro states. Subsequently, we compare the error bounds with relevant concepts from the literature, such as exact and ordinary lumpability, as well as deflatability and aggregatability. These concepts provide stricter than necessary conditions for settings in which the aggregation error is zero. We also present possible algorithms for finding suitable aggregations for which the formal error bounds are low, and we analyze first experiments with these algorithms on a range of different models.
Performance Evaluation, Volume 167, Article 102464.
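For readers unfamiliar with the lumping terminology, the following sketch checks the textbook condition for ordinary lumpability of a discrete-time Markov chain: every state in a block must have the same total transition probability into each block of the partition. It is not the paper's error-bound computation or aggregation algorithm; the function name `is_ordinarily_lumpable`, the list-of-lists matrix format, and the tolerance are assumptions made for the example.

```python
def is_ordinarily_lumpable(P, partition, tol=1e-12):
    """Check ordinary lumpability of transition matrix P for a partition.

    P is a row-stochastic matrix given as a list of rows; `partition` is a
    list of blocks, each block a list of state indices.
    """
    for target in partition:
        for block in partition:
            # Probability of jumping from each state of `block` into `target`.
            sums = [sum(P[i][j] for j in target) for i in block]
            if max(sums) - min(sums) > tol:
                return False
    return True

if __name__ == "__main__":
    P = [
        [0.5, 0.25, 0.25],
        [0.2, 0.4, 0.4],
        [0.2, 0.3, 0.5],
    ]
    print(is_ordinarily_lumpable(P, [[0], [1, 2]]))   # states 1 and 2 agree
    print(is_ordinarily_lumpable(P, [[0, 1], [2]]))   # states 0 and 1 differ
```

When the check passes, the aggregated chain on the macro states is itself a Markov chain, which is the zero-error setting the abstract refers to.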
Citations: 0
Editorial: Special issue on Performance Analysis and Evaluation of Systems for Artificial Intelligence
IF 1 | Zone 4 | Computer Science
Performance Evaluation | Pub Date: 2024-12-13 | DOI: 10.1016/j.peva.2024.102465
Anshul Gandhi, Bo Jiang, Shaolei Ren
Abstract: (none)
Performance Evaluation, Volume 167, Article 102465.
Citations: 0
Job assignment in machine learning inference systems with accuracy constraints
IF 1 | Zone 4 | Computer Science
Performance Evaluation | Pub Date: 2024-12-12 | DOI: 10.1016/j.peva.2024.102463
Tuhinangshu Choudhury, Gauri Joshi, Weina Wang
Abstract: Modern machine learning inference systems often host multiple models that can perform the same task with different levels of accuracy and latency. For example, a large model can be more accurate but slow, whereas a smaller, less accurate model can be faster in serving inference queries. Amidst the rapid advancements in Large Language Models (LLMs), it is paramount for such systems to strike the best trade-off between latency and accuracy. In this paper, we consider the problem of designing job assignment policies for a multi-server queueing system where servers have heterogeneous rates and accuracies, and our goal is to minimize the expected inference latency while meeting an average accuracy target. To the best of our knowledge, such queueing systems with constraints have been sparsely studied in prior literature. We first identify a lower bound on the minimum achievable latency under any policy that achieves the target accuracy $a^*$ using a linear programming (LP) formulation. Building on the LP solution, we introduce a Randomized-Join-the Idle Queue (R-JIQ) policy, which consistently meets the accuracy target and asymptotically (as system size increases) achieves the optimal latency $T_{\text{LP-LB}}(\lambda)$. However, the R-JIQ policy relies on the knowledge of the arrival rate $\lambda$ to solve the LP. To address this limitation, we propose the Prioritize Ordered Pairs (POP) policy that incorporates the concept of ordered pairs of servers into waterfilling to iteratively solve the LP. This allows the POP policy to function without relying on the arrival rate. Experiments suggest that POP performs robustly across different system sizes and load scenarios, achieving near-optimal performance.
Performance Evaluation, Volume 167, Article 102463.
Citations: 0