The transition from reactive to proactive monitoring can be a real challenge for IT organizations. It usually involves extensive tuning of thresholds and alerts so that they give an early indication that a resource might fail. It can also increase the alert noise in your environment, and for the proactive initiative to succeed, more IT staff might be required to work through all the new alerts and indications and prevent a possible service failure. With the number of devices, and the events they generate, growing every year, a successful implementation of proactive monitoring calls for another approach.
I bet we would all agree that the ultimate goal of IT monitoring is to prevent IT service outages. But let's admit it: even with all the monitors set up in System Center Operations Manager (SCOM) and alerts configured to notify admins when vital thresholds have been reached, we are still not efficient enough at solving (not to mention preventing) critical issues in time. Dealing with alerts generated in SCOM comes with several complications. Too many alerts and no clear priority system are two examples, and they lead administrators into ‘alert ignorance’. In this article we are going to look at a different approach to identifying upcoming issues in your IT environment, one that brings clarity and guidance to the jungle of alerts and capacity issues.
With the massive amount of data that System Center Operations Manager collects from servers and other monitored equipment, IT Operations departments are sitting on a gold mine of data just begging to be used. One of the areas that can benefit from this internal data capital is forecasting. By implementing forecasting processes you can predict the behavior of managed objects some months into the future. This knowledge enables you to act in advance to prevent service failures and service level breaches. Most business areas use some kind of forecasting method when planning new investments, calculating yearly budgets, and so on. We believe that IT organizations should be no different: they should start using operational data to gain insights and learn from the past while planning their future.
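To make the idea of forecasting from operational data concrete, here is a minimal sketch in Python. It assumes you have already exported a series of daily samples for one performance counter (free disk space in percent is used as the example), and it uses a plain least-squares trend line, which is only one of many possible forecasting methods. The function names and the sample values are hypothetical and are not part of SCOM.

```python
def fit_linear_trend(values):
    """Least-squares fit of the samples against their index (0, 1, 2, ...).

    Returns (slope, intercept) of the trend line.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def days_until_threshold(values, threshold):
    """Estimate how many days after the last sample the trend reaches threshold.

    Returns None when the trend is flat or is moving away from the threshold.
    Assumes one sample per day.
    """
    slope, intercept = fit_linear_trend(values)
    if slope == 0:
        return None
    last_x = len(values) - 1
    crossing_x = (threshold - intercept) / slope
    remaining = crossing_x - last_x
    return remaining if remaining > 0 else None


# Hypothetical daily samples: free disk space in percent.
free_space = [60.0, 59.0, 58.0, 57.0, 56.0]
estimate = days_until_threshold(free_space, threshold=10.0)
if estimate is None:
    print("Trend is not heading towards the threshold")
else:
    print(f"Estimated days until 10% free space: {estimate:.0f}")
```

A straight line is a deliberately simple choice here. Real counters often show weekly or monthly patterns, so a seasonal model may fit better, but even a rough trend gives you a date to plan against instead of an alert to react to.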
Service Level Agreements (SLAs) give IT organizations and their clients a common view of service delivery quality. Service Level Objectives (SLOs) are defined within every SLA to provide specific, measurable metrics that help evaluate and improve the delivery of IT services. Monitoring SLO targets and outcomes is therefore a very important task, especially in organizations that work by (or are starting to implement) the ITIL framework. The entire Microsoft System Center product family is designed to support ITIL processes, and measuring uptime and managing SLO values is no exception. In this article we will look at three ways to track SLO performance based on data gathered by System Center Operations Manager.
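As a small illustration of the arithmetic behind an availability SLO, the Python sketch below computes the uptime percentage for a reporting period from a list of outage intervals and compares it with a target. It is independent of how Operations Manager calculates availability. The function names, the dates and the 99.9% target are assumptions made for this example, and the outage intervals are assumed not to overlap.

```python
from datetime import datetime


def availability_percent(period_start, period_end, outages):
    """Percentage of the period during which the service was up.

    outages is a list of (start, end) datetime pairs. Each outage is clipped
    to the reporting period; overlapping outages are not merged.
    """
    total = (period_end - period_start).total_seconds()
    if total <= 0:
        raise ValueError("period_end must be after period_start")
    down = 0.0
    for start, end in outages:
        clipped_start = max(start, period_start)
        clipped_end = min(end, period_end)
        if clipped_end > clipped_start:
            down += (clipped_end - clipped_start).total_seconds()
    return 100.0 * (total - down) / total


def meets_slo(availability, target_percent):
    """True when the measured availability reaches the SLO target."""
    return availability >= target_percent


# Hypothetical 30-day reporting period with a single outage of 7.2 hours.
period_start = datetime(2024, 4, 1)
period_end = datetime(2024, 5, 1)
outages = [(datetime(2024, 4, 10, 2, 0), datetime(2024, 4, 10, 9, 12))]

measured = availability_percent(period_start, period_end, outages)
print(f"Availability: {measured:.2f}%")
print("SLO met" if meets_slo(measured, 99.9) else "SLO breached")
```

With a 30-day period of 720 hours, 7.2 hours of downtime is one percent of the period, so the measured availability falls short of a 99.9% target.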