A Reinforcement-Learning Approach to Failure-Detection Scheduling
Fancong Zeng
Seventh International Conference on Quality Software (QSIC 2007), published 2007-10-11
DOI: 10.1109/QSIC.2007.4385492 (https://doi.org/10.1109/QSIC.2007.4385492)
A failure-detection scheduler for an online production system must strike a tradeoff between performance and reliability. If failure-detection processes are run too frequently, valuable system resources are spent checking and rechecking for failures. However, if failure-detection processes are run too rarely, a failure can remain undetected for a long time. In both cases, system performability suffers. We present a model-based learning approach that estimates the failure rate and then performs an optimization to find the tradeoff that maximizes system performability. We show that our approach is not only theoretically sound but practically effective, and we demonstrate its use in an implemented automated deadlock-detection system for Java.
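To make the estimate-then-optimize idea concrete, here is a minimal sketch in Python. It is not the paper's model, which the abstract does not specify. It assumes a deliberately simple stand-in: failures arrive at a constant rate, each detection run costs a fixed amount of time, and a failure stays undetected for about half a detection interval on average (a reasonable approximation only when failures are rare relative to the interval). All function names, parameters, and the smoothing prior are hypothetical and chosen only for illustration.

```python
import math


def estimate_failure_rate(failures, observed_seconds,
                          prior_failures=1.0, prior_seconds=3600.0):
    """Smoothed failure-rate estimate, in failures per second.

    The prior terms are an illustrative assumption; they keep the estimate
    positive and finite before any failure has been observed.
    """
    return (failures + prior_failures) / (observed_seconds + prior_seconds)


def expected_loss_rate(interval, check_cost, failure_rate):
    """Approximate fraction of time lost when checking every `interval` seconds.

    First term: overhead of the detection runs themselves.
    Second term: time during which a failure sits undetected, assuming it
    waits about interval / 2 on average for the next check.
    """
    return check_cost / interval + failure_rate * interval / 2.0


def best_interval(check_cost, failure_rate):
    """Interval that minimizes expected_loss_rate under the assumed model.

    Setting the derivative -c/T^2 + rate/2 to zero gives T = sqrt(2c / rate).
    """
    return math.sqrt(2.0 * check_cost / failure_rate)


if __name__ == "__main__":
    # Hypothetical observations: 3 failures seen during 2 hours of operation,
    # with each detection run costing half a second.
    rate = estimate_failure_rate(failures=3, observed_seconds=7200.0)
    interval = best_interval(check_cost=0.5, failure_rate=rate)
    print(f"estimated rate: {rate:.6f} failures/s")
    print(f"suggested detection interval: {interval:.1f} s")
    print(f"expected loss: {expected_loss_rate(interval, 0.5, rate):.4f}")
```

Under this toy model, the two failure modes described in the abstract show up as the two terms of `expected_loss_rate`: a short interval inflates the overhead term, a long one inflates the undetected-failure term. Re-estimating the rate as new observations arrive and recomputing the interval gives a simple learning loop.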