Introduction
This summer we had the opportunity to do some research with the new features built into SQL Server 2016, in particular its more advanced analytical capabilities such as R. The research explores whether correlations can be found between certain events, such as broken hardware or server unavailability, and deviations in performance data from related computer systems. The final goal is to automatically find the most probable causes of certain events, especially where there are no obvious connections or where the number of candidate causes is far too large for manual analysis. The analysis took place in a smaller environment containing simulated errors. Performance, alert and event data were collected over a period of 30 days. Performance data from each point in time was compared with other data from the same entity to measure its deviation. The resulting deviation ratio was used to find patterns by filtering out data considered “not deviating”. Some of the filtered data showed patterns similar to the patterns of some alerts and events. Calculating deviations for 30 days (7.7 million observations) was possible in less than 3 minutes. The method should therefore be applicable to larger environments with decent…
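To make the deviation idea concrete, here is a minimal sketch in Python. It is not the method used in the research, which ran on SQL Server 2016 with R and whose exact deviation measure is not given here. The sketch assumes a z-score-like ratio: the absolute distance of a value from its entity's mean, divided by that entity's standard deviation. The function names, the sample data and the threshold of 2.0 are all hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def deviation_ratios(observations):
    """Score each (entity, timestamp, value) row against its own entity.

    Assumption: "deviation ratio" is taken to be |value - mean| / stddev,
    computed per entity. The original research may define it differently.
    """
    observations = list(observations)  # iterated twice below

    # Group values per entity so each value is compared only with
    # other data from the same entity.
    values_by_entity = defaultdict(list)
    for entity, _timestamp, value in observations:
        values_by_entity[entity].append(value)

    stats = {
        entity: (mean(values), pstdev(values))
        for entity, values in values_by_entity.items()
    }

    scored = []
    for entity, timestamp, value in observations:
        mu, sigma = stats[entity]
        # A constant series has no spread, so nothing in it deviates.
        ratio = 0.0 if sigma == 0 else abs(value - mu) / sigma
        scored.append((entity, timestamp, value, ratio))
    return scored


def filter_deviating(scored, threshold=2.0):
    """Drop rows considered "not deviating" (hypothetical threshold)."""
    return [row for row in scored if row[3] >= threshold]


# Hypothetical sample: hourly CPU readings for one server with one spike,
# plus a second server whose readings never change.
sample = [
    ("srv01/cpu", 0, 10), ("srv01/cpu", 1, 11), ("srv01/cpu", 2, 9),
    ("srv01/cpu", 3, 10), ("srv01/cpu", 4, 10), ("srv01/cpu", 5, 50),
    ("srv02/cpu", 0, 20), ("srv02/cpu", 1, 20), ("srv02/cpu", 2, 20),
]
flagged = filter_deviating(deviation_ratios(sample))
for entity, timestamp, value, ratio in flagged:
    print(entity, timestamp, value, round(ratio, 2))
```

The rows that survive the filter are the ones whose timestamps could then be lined up against alert and event timestamps to look for similar patterns.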
CSI, or Continual Service Improvement, is one of the more important processes when working with IT Service Management. To achieve good results when implementing IT Service Management, you need to be able to measure, follow up and evaluate complete processes, not just separate parts one at a time. Showing a holistic picture of how functional your IT services are is yet another challenge when using System Center products. They are all sold under one System Center flag and yet are completely separate entities, which makes your life hard once you realize that all of the separate bits and pieces are actually parts of one big puzzle. Here is how we help our clients overcome the System Center segregation issue and give them full visibility of their IT Service Delivery.
The world of IT is changing, and so are IT organizations. The cloud era has begun and the cloud is now being widely adopted. As a result, new technologies and products are evolving rapidly, and many of them are powered by the cloud platform. While it is appealing to explore all the new capabilities of these technologies, it is easy to forget that most organizations in the real world still manage, or partially manage, their datacenters on-premises. More importantly for this article, many of those datacenters are managed with products from the Microsoft System Center Suite. In this article I will talk about combining and analyzing the result sets from different technologies, such as agent-based monitoring with Microsoft System Center Operations Manager (SCOM), but also some rather new technologies, such as log analytics with Microsoft Operations Management Suite (OMS). In addition, I’ll address some of the concerns and requests we get from our customers throughout the industry, such as service modeling, security considerations and the wish for a holistic perspective of the whole IT Service Delivery supported by their processes.
The transition from reactive to proactive monitoring can in some cases be a real challenge for IT organizations. It usually involves a lot of tuning of thresholds and alerts to give an early indication that a resource might fail. It can also increase the alert noise in your environment, and for the proactive initiative to succeed, more IT staff might be required to work through all the new alerts and indications in order to prevent a possible service failure. To implement proactive monitoring successfully while the number of events generated by a growing number of devices increases each year, another approach needs to be considered.