In previous blog post we looked at custom SLO reporting solution for SCOM. Now we will take a closer look at what can be done with all the alert-related data that is stored in SCOM Data warehouse.
Understanding Alert Nature
At first it might seem that all we want to see about alerts is just sitting and waiting for us in the alert view (Alert.vAlert). But by now we all know that nothing is as easy as it seems in System Center data warehouses.
There are three main points that complicate matters when it comes to alert querying.
First of all comes the fact that alerts can be generated either by Rules or by Monitors. And both alert sets end up in one fact table, making it a little more complicated to figure out the exact entity which generated each row.
Second thing that might happen is that your query might return results generated by Managed Entities which are no longer available in SCOM (deleted old stuff etc.). And we don’t want those to be cluttering our reports either.
Finally, figuring out the current (latest) resolution state is a bit of a painful task (both for the DB server and us analysts) because all of those states reside in a separate table and there are quite a few of them to choose from!
Let us tackle each issue separately and in the end we will get to a nice looking data table like this:
Which was it: Rule or Monitor?
Every alert can be generated either by a Rule or a Monitor and the only trace of the difference we get within the fact rows are numerical indicators (1 or 0).
But we want to know more than just what it was. We want the name here as well. And since Monitors reside in vMonitor and Rule names can be found in vRule, we end up having to join our Alert view to the two lookup views and pick the right value depending on the indicator flag.
LEFT OUTER JOIN [SEGOTSQL01\SCOM2012R2].OperationsManagerDW.dbo.vRule R ON A.WorkflowRowId = R.RuleRowId LEFT OUTER JOIN [SEGOTSQL01\SCOM2012R2].OperationsManagerDW.dbo.vMonitor M ON A.WorkflowRowId = M.MonitorRowId
Using outer join allows either value to be null and does not affect the number of rows that are returned from the Alert view. Now we just need to pick the right name.
This can be done by checking the Monitor Indicator. For example like this:
CASE A.MonitorAlertInd WHEN 1 THEN M.MonitorDefaultName WHEN 0 THEN R.RuleDefaultName END AS [RaisedByMonitor/Rule]
In human language that would mean: when MonitorAlertInd field says we are dealing with a monitor (value = 1) then it should use MonitorDefaultName field coming from v.Monitor, otherwise (when value = 0) use RuleDefaultName from v.Rule.
To make the results more readable, we can re-label the numerical alert source identifiers to ‘Monitor’ when it is 1 and ‘Rule’ when it is 0. Like this:
CASE A.MonitorAlertInd WHEN 1 THEN 'Monitor' WHEN 0 THEN 'Rule' END AS AlertSource
Now we are fully aware which entity generated every single row and we have nice labels for them. Time to make sure that we only see values generated by Managed Entities which are still available in SCOM.
Managed Entities from Existing Management Packs Only
In most real-life scenarios we don’t want to report anything that is obsolete. That would be the case when analyzing alert data as well. Thus our goal should be to only include data that is relevant to management packs currently available in SCOM.
Also, some alerts stay open in SCOM Data Warehouse for ever and ever and the reason is quite simple- management pack which created the alert was deleted BEFORE the alert was resolved. So in the end there was nothing that could resolve the open issue, thus it is still open and skewing our result set.
vManagedEntityManagementGroup view is the place where you can find traces of validity timeframe for managed entities via ToDateTime field:
What we really want from this data set is to make sure our managed entity was within a ManagedEntityManagementGroup at the time when it was created.
Using a bit more advanced join can help here:
INNER JOIN OperationsManagerDW.dbo.vManagedEntityManagementGroup MEMG ON A.ManagedEntityRowId = MEMG.ManagedEntityRowId AND (A.RaisedDateTime >= MEMG.FromDateTime AND MEMG.ToDateTime IS NULL)
MEMG.ToDateTime IS NULL ensures our Entity is still available and not part of a deleted MP while A.RaisedDateTime >= MEMG.FromDateTime part of the join ensures a valid connection between ME and MEG existed at the time of alert creation.
Two down one to go. Let’s find the latest resolution state of our alerts.
Current Resolution State of Alert
Each Alert has one or more states associated with it. All state changes are stored in Alert.vAlertResolutionState view and can be reached via a one-to-many join from the Alert.vAlert view.
In cases when we want to analyze alerts, having all of their states in the same result set is not really useful and only causes clutter. What we will do instead is fetch the latest available state for every Alert.
First we need to identify the timestamp of the latest state change. That is easily done with SQL MAX statement:
SELECT AlertGuid, MAX(StateSetDateTime) AS StateSetDateTime FROM OperationsManagerDW.Alert.vAlertResolutionState GROUP BY AlertGuid
This subquery can now be complemented with the actual Resolution State from the same view:
SELECT I.AlertGuid, O.ResolutionState FROM ( SELECT AlertGuid, MAX(StateSetDateTime) AS StateSetDateTime FROM OperationsManagerDW.Alert.vAlertResolutionState GROUP BY AlertGuid ) INNER JOIN OperationsManagerDW.Alert.vAlertResolutionState O ON I.AlertGuid = O.AlertGuid AND I.StateSetDateTime = O.StateSetDateTime
The combination of these two queries gives us the latest state by AlertGuid. This result can now be inner-joined to the main query and the result will only show one state for every alert.
The full query
We have looked at all the building blocks required to fetch Alerts and data related to them.
Time to complete the puzzle. Copy the whole query and give it a go! Let us know how it goes!
SELECT A.RaisedDateTime, E.DisplayName AS RaisedByEntity, CASE A.MonitorAlertInd WHEN 1 THEN M.MonitorDefaultName WHEN 0 THEN R.RuleDefaultName END AS [RaisedByMonitor/Rule], CASE A.MonitorAlertInd WHEN 1 THEN 'Monitor' WHEN 0 THEN 'Rule' END AS AlertSource, A.Category AS AlertCategory, CASE A.Severity WHEN 2 THEN 'Critical' WHEN 1 THEN 'Warning' WHEN 0 THEN 'Information' END AS AlertSeverity, CASE A.Priority WHEN 2 THEN 'High' WHEN 1 THEN 'Medium' WHEN 0 THEN 'Low' END AS AlertPriority, A.AlertName, RS.ResolutionStateName AS ResolutionState, ARS.TimeFromRaisedSeconds, A.RepeatCount FROM OperationsManagerDW.Alert.vAlert A INNER JOIN OperationsManagerDW.dbo.vManagedEntity E ON A.ManagedEntityRowId = E.ManagedEntityRowId --Show only alerts that are part of a working CP INNER JOIN OperationsManagerDW.dbo.vManagedEntityManagementGroup MEMG ON A.ManagedEntityRowId = MEMG.ManagedEntityRowId AND (A.RaisedDateTime >= MEMG.FromDateTime AND MEMG.ToDateTime IS NULL) INNER JOIN OperationsManagerDW.Alert.vAlertResolutionState ARS ON A.AlertGuid = ARS.AlertGuid --Return only largest resolution state value for alert INNER JOIN (SELECT I.AlertGuid, O.ResolutionState FROM (SELECT AlertGuid, MAX(StateSetDateTime) AS StateSetDateTime FROM OperationsManagerDW.Alert.vAlertResolutionState GROUP BY AlertGuid ) I INNER JOIN OperationsManagerDW.Alert.vAlertResolutionState O ON I.AlertGuid = O.AlertGuid AND I.StateSetDateTime = O.StateSetDateTime) MARS ON ARS.ResolutionState = MARS.ResolutionState AND ARS.AlertGuid = MARS.AlertGuid INNER JOIN OperationsManagerDW.dbo.vResolutionState RS ON ARS.ResolutionState = RS.ResolutionStateId --Outer joins for Rule and Monitor lookup LEFT OUTER JOIN OperationsManagerDW.dbo.vRule R ON A.WorkflowRowId = R.RuleRowId LEFT OUTER JOIN OperationsManagerDW.dbo.vMonitor M ON A.WorkflowRowId = M.MonitorRowId --Return only latest month. WHERE A.RaisedDateTime > DATEADD(MM, -1, GETDATE())
So know we have a great way of querying alerts from SCOM Data warehouse and the next step is to get control of Distributed Applications and how we can relate alerts to them.
Stay tuned for upcoming posts.