J. Murphree. 2016 IEEE AUTOTESTCON, published 2016-09-01. DOI: 10.1109/AUTEST.2016.7589589 (https://doi.org/10.1109/AUTEST.2016.7589589)
Machine learning anomaly detection in large systems
We need methods to efficiently determine the health of a system. Diagnostics and prognostics determine system health through analysis of sensor data. Anomalies in the data can help us determine whether there is a failure or a pending failure. Common statistical methods can detect anomalies in individual measurements. For systems with many measurements, however, anomalies may occur only as specific combinations of values. Large systems also have various associated states and modes which define the valid measurements, and the amount of data to analyze grows very quickly as the system becomes more complex. In recent years, techniques have been developed to address large-scale data analysis. Machine learning encompasses a broad selection of tools to optimize a statistical model of the data. These tools include supervised learning techniques, such as linear regression and logistic regression, in which labeled training data exists to tune the model. Unsupervised learning, such as clustering, is used to explore data which does not have a defined output label associated with the input data. Standard approaches to training supervised learning systems require a large sample of both positive and negative outcome data. Some uses of machine learning involve data where there are very few cases of negative outcomes. Machine learning algorithms classed as anomaly detection are designed to deal with this type of data. Simple algorithms include Gaussian distribution analysis, which assumes that the measurements are distributed independently of one another. Large systems whose anomalies are defined by dependent combinations of data require either manual creation of combinations of the independent variables, or multivariate Gaussian distribution analysis, which does not scale well for large systems. A further complication is the mixture of linear and discrete data. Neural networks are a type of learning system which has been applied to each of the individual needs addressed above.
This paper describes a neural network approach to anomaly detection that addresses these specific problems in large systems in order to efficiently determine system health.
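The contrast the abstract draws between Gaussian distribution analysis with independent features and multivariate Gaussian distribution analysis can be illustrated with a minimal sketch. This is not the paper's method or data: the function names, the synthetic two-sensor data set, and the test points are all hypothetical and chosen only for illustration. The sketch assumes NumPy is available.

```python
import numpy as np

def fit_independent(X):
    """Per-feature mean and variance; features are treated as independent."""
    return X.mean(axis=0), X.var(axis=0, ddof=1)

def log_density_independent(x, mu, var):
    """Sum of univariate Gaussian log-densities, one term per feature."""
    return float(np.sum(-0.5 * np.log(2 * np.pi * var) - (x - mu) ** 2 / (2 * var)))

def fit_multivariate(X):
    """Mean vector and full covariance matrix."""
    return X.mean(axis=0), np.cov(X, rowvar=False)

def log_density_multivariate(x, mu, cov):
    """Log-density of a multivariate Gaussian with full covariance."""
    d = x - mu
    n = mu.shape[0]
    _, logdet = np.linalg.slogdet(cov)
    maha = d @ np.linalg.solve(cov, d)  # squared Mahalanobis distance
    return float(-0.5 * (n * np.log(2 * np.pi) + logdet + maha))

def parameter_counts(n):
    """Parameters to estimate for n features: (independent, multivariate)."""
    return 2 * n, n + n * (n + 1) // 2

# Hypothetical data: two strongly correlated sensor readings.
rng = np.random.default_rng(0)
X = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.95], [0.95, 1.0]], size=5000)

typical = np.array([1.5, 1.5])   # follows the correlation
odd = np.array([1.5, -1.5])      # each value is ordinary, the combination is not

mu_i, var_i = fit_independent(X)
mu_m, cov_m = fit_multivariate(X)

# The independent model scores both points almost identically.
print(log_density_independent(typical, mu_i, var_i),
      log_density_independent(odd, mu_i, var_i))
# The multivariate model assigns the odd combination a far lower density.
print(log_density_multivariate(typical, mu_m, cov_m),
      log_density_multivariate(odd, mu_m, cov_m))
# The full covariance grows quadratically with the number of features.
print(parameter_counts(1000))
```

In this sketch a point would be flagged as anomalous when its log-density falls below a threshold chosen on held-out data. The quadratic growth reported by `parameter_counts` is one concrete reason the full-covariance model becomes costly as the number of measurements grows.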