SQL Server

Machine Learning algorithms together with SCOM for correlations between performance, alerts and events data

By | Event Management, Subject Matter Expert, System Center Operations Manager

Introduction

This summer we had the opportunity to do some research with the new features built into SQL Server, in particular the more advanced analytical capabilities of SQL Server 2016 such as R. The research explores whether correlations can be found between certain events, such as broken hardware or server unavailability, and deviations in performance data from related computer systems. The end goal is to automatically find the most probable causes of certain events, especially where there are no obvious connections or where the number of candidate causes is far too large for manual analysis. The analysis took place in a smaller environment containing simulated errors. Performance, alert and event data was collected over a period of 30 days. Performance data from each point in time was compared with other data from the same entity to measure its deviation. The resulting deviation ratio was used to find patterns by filtering out data considered “not deviating”. Some of the filtered data showed patterns similar to those of certain alerts and events. Calculating deviations for 30 days of data (7.7 million observations) took less than 3 minutes. The method should therefore be applicable to larger environments with decent…
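The deviation step described above can be illustrated with a minimal sketch. The excerpt does not say how the deviation ratio is calculated, so the version below assumes a simple z-score: the distance of each value from the mean of its own entity, divided by that entity's standard deviation. The function names, the tuple layout and the threshold of 2.0 are all hypothetical, and the sketch is written in Python rather than in SQL Server 2016 with R as used in the research.

```python
from collections import defaultdict
from statistics import mean, pstdev


def deviation_ratios(observations):
    """Attach a deviation ratio to each (entity, timestamp, value) tuple.

    Assumption: the ratio is a z-score against the entity's own history.
    The original research may use a different measure.
    """
    # Group values per entity so each value is compared only with
    # other data from the same entity.
    values_by_entity = defaultdict(list)
    for entity, _timestamp, value in observations:
        values_by_entity[entity].append(value)

    # Mean and population standard deviation per entity.
    stats = {
        entity: (mean(values), pstdev(values))
        for entity, values in values_by_entity.items()
    }

    rows = []
    for entity, timestamp, value in observations:
        mu, sigma = stats[entity]
        # A constant series has no spread, so nothing in it deviates.
        ratio = 0.0 if sigma == 0 else abs(value - mu) / sigma
        rows.append((entity, timestamp, value, ratio))
    return rows


def deviating(observations, threshold=2.0):
    """Keep only rows whose ratio reaches the (hypothetical) threshold,
    i.e. filter out everything considered "not deviating"."""
    return [row for row in deviation_ratios(observations) if row[3] >= threshold]


if __name__ == "__main__":
    # Invented sample data: one spike on srv01, a flat series on srv02.
    sample = [("srv01", t, v) for t, v in
              enumerate([10, 11, 9, 10, 10, 11, 9, 10, 10, 90])]
    sample += [("srv02", t, 5.0) for t in range(5)]
    for row in deviating(sample):
        print(row)
```

The rows that survive the filter could then be lined up in time against alerts and events to look for similar patterns. How that comparison is done is outside the scope of this sketch.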
