You have both hourly and daily data for service component availability, performance, events and alerts gathered in SCOM. Since it is just heaps of data, it can be difficult to tell whether any of the logged events follow some kind of time pattern and whether any of them relate to one another. Recognizing recurrences is very beneficial when managing sparse and, at first sight, unrelated events. After grouping and filtering all events by an identified time pattern, we are able to find correlations and (hopefully) causes of specific incidents. In this blog post we will review the built-in SCOM reports for analyzing this kind of data and will also show you how adding some extra capabilities makes life much easier.
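To make the idea of grouping events by a time pattern concrete, here is a minimal Python sketch. It assumes the events have already been exported as (timestamp, event name) pairs; the sample rows, the function names and the threshold of three occurrences are all invented for illustration and are not taken from SCOM or its reports.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample data: (timestamp, event name) pairs, not a real SCOM export.
events = [
    ("2024-01-01 02:05", "Disk latency warning"),
    ("2024-01-01 23:30", "Backup job failed"),
    ("2024-01-02 02:10", "Disk latency warning"),
    ("2024-01-02 14:00", "Backup job failed"),
    ("2024-01-03 02:01", "Disk latency warning"),
]


def count_by_hour(rows):
    """Count how often each event name occurs in each hour of the day."""
    counts = Counter()
    for stamp, name in rows:
        hour = datetime.strptime(stamp, "%Y-%m-%d %H:%M").hour
        counts[(name, hour)] += 1
    return counts


def recurring(rows, min_count=3):
    """Return (event name, hour) pairs seen at least min_count times."""
    return sorted(key for key, n in count_by_hour(rows).items() if n >= min_count)


for name, hour in recurring(events):
    print(f"{name} tends to occur around {hour:02d}:00")
```

With this sample, only the disk latency warning clears the threshold, because it shows up in the same hour on three different days. The same grouping could be done by weekday or day of month, depending on the pattern you suspect.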
I bet we would all agree that the ultimate goal of IT monitoring is to prevent IT service outages. But let's admit it: even with all the monitors set up in System Center Operations Manager and alerts configured to notify admins when vital thresholds have been reached, we are still not efficient enough at solving (not to mention preventing) critical issues on time. Dealing with alerts generated in SCOM comes with multiple complications. Too many alerts and no clear priority system are two examples, and they lead administrators into ‘alert ignorance’. In this article we are going to look at a different approach to identifying upcoming issues in your IT environment, one that brings clarity and guidance to the jungle of alerts and capacity issues.
With the massive amount of data collected in System Center Operations Manager from all servers and other monitored equipment, IT Operations departments are sitting on a gold mine of data just begging to be used. One of the areas that can benefit from such internal data capital is forecasting. By implementing forecasting processes you can predict the behavior of managed objects some months into the future. This knowledge enables you to act in advance in order to prevent service failures and service level breaches. Most business areas use some kind of forecasting method when planning new investments, calculating yearly budgets, etc. We believe that IT organizations should be no different and should start using operational data to gain insights and learn from the past while planning their future.
In the previous blog post we looked at a custom SLO reporting solution for SCOM. Now we will take a closer look at what can be done with all the alert-related data that is stored in the SCOM data warehouse.

Understanding Alert Nature

At first it might seem that everything we want to see about alerts is just sitting and waiting for us in the alert view (Alert.vAlert). But by now we all know that nothing is as easy as it seems in System Center data warehouses. There are three main points that complicate matters when it comes to alert querying. First, alerts can be generated either by Rules or by Monitors, and both alert sets end up in one fact table, making it a little more complicated to figure out the exact entity which generated each row. Second, your query might return results generated by Managed Entities which are no longer available in SCOM (deleted old stuff etc.), and we don’t want those cluttering our reports either. Finally, figuring out the current (latest) resolution state is a bit of a painful task (both for the DB server and…
Service Level Agreements allow IT organizations and their clients to have a common view of service delivery quality. Service Level Objectives are defined within every SLA in order to obtain specific and measurable metrics which help evaluate and improve the delivery of IT services. Monitoring SLO targets and outcomes is therefore a very important task, especially within organizations that work by (or are starting to implement) the ITIL framework. The whole Microsoft System Center product family is designed to support ITIL processes, and measuring uptime and managing SLO values is no exception. In this article we will look at three ways to track SLO performance based on data gathered by System Center Operations Manager.