Event ManagementSystem Center Operations Manager

Querying Alerts From SCOM Data Warehouse

By October 25, 2015 No Comments

Reporting_Computerstripped
In previous blog post we looked at custom SLO reporting solution for SCOM. Now we will take a closer look at what can be done with all the alert-related data that is stored in SCOM Data warehouse.

Understanding Alert Nature

At first it might seem that all we want to see about alerts is just sitting and waiting for us in the alert view (Alert.vAlert). But by now we all know that nothing is as easy as it seems in System Center data warehouses.

There are three main points that complicate matters when it comes to alert querying.

First of all comes the fact that alerts can be generated either by Rules or by Monitors. And both alert sets end up in one fact table, making it a little more complicated to figure out the exact entity which generated each row.

Second thing that might happen is that your query might return results generated by Managed Entities which are no longer available in SCOM (deleted old stuff etc.). And we don’t want those to be cluttering our reports either.

Finally, figuring out the current (latest) resolution state is a bit of a painful task (both for the DB server and us analysts) because all of those states reside in a separate table and there are quite a few of them to choose from!

Let us tackle each issue separately and in the end we will get to a nice looking data table like this:

Which was it: Rule or Monitor?

Every alert can be generated either by a Rule or a Monitor and the only trace of the difference we get within the fact rows are numerical indicators (1 or 0).

But we want to know more than just what it was. We want the name here as well. And since Monitors reside in vMonitor and Rule names can be found in vRule, we end up having to join our Alert view to the two lookup views and pick the right value depending on the indicator flag.

LEFT OUTER JOIN [SEGOTSQL01\SCOM2012R2].OperationsManagerDW.dbo.vRule R ON
A.WorkflowRowId = R.RuleRowId
LEFT OUTER JOIN [SEGOTSQL01\SCOM2012R2].OperationsManagerDW.dbo.vMonitor M ON
A.WorkflowRowId = M.MonitorRowId

Using outer join allows either value to be null and does not affect the number of rows that are returned from the Alert view. Now we just need to pick the right name.

This can be done by checking the Monitor Indicator. For example like this:

CASE A.MonitorAlertInd
WHEN 1 THEN M.MonitorDefaultName
WHEN 0 THEN R.RuleDefaultName
END AS [RaisedByMonitor/Rule]

In human language that would mean: when MonitorAlertInd field says we are dealing with a monitor (value = 1) then it should use MonitorDefaultName field coming from v.Monitor, otherwise (when value = 0) use RuleDefaultName from v.Rule.

To make the results more readable, we can re-label the numerical alert source identifiers to ‘Monitor’ when it is 1 and ‘Rule’ when it is 0. Like this:

CASE A.MonitorAlertInd
WHEN 1 THEN 'Monitor'
WHEN 0 THEN 'Rule'
END AS AlertSource

Now we are fully aware which entity generated every single row and we have nice labels for them. Time to make sure that we only see values generated by Managed Entities which are still available in SCOM.

Managed Entities from Existing Management Packs Only

In most real-life scenarios we don’t want to report anything that is obsolete. That would be the case when analyzing alert data as well. Thus our goal should be to only include data that is relevant to management packs currently available in SCOM.

Also, some alerts stay open in SCOM Data Warehouse for ever and ever and the reason is quite simple- management pack which created the alert was deleted BEFORE the alert was resolved. So in the end there was nothing that could resolve the open issue, thus it is still open and skewing our result set.

vManagedEntityManagementGroup view is the place where you can find traces of validity timeframe for managed entities via ToDateTime field:

What we really want from this data set is to make sure our managed entity was within a ManagedEntityManagementGroup at the time when it was created.

Using a bit more advanced join can help here:

INNER JOIN OperationsManagerDW.dbo.vManagedEntityManagementGroup MEMG ON
A.ManagedEntityRowId = MEMG.ManagedEntityRowId
AND
(A.RaisedDateTime >= MEMG.FromDateTime AND MEMG.ToDateTime IS NULL)

MEMG.ToDateTime IS NULL ensures our Entity is still available and not part of a deleted MP while A.RaisedDateTime >= MEMG.FromDateTime part of the join ensures a valid connection between ME and MEG existed at the time of alert creation.

Two down one to go. Let’s find the latest resolution state of our alerts.

Current Resolution State of Alert

Each Alert has one or more states associated with it. All state changes are stored in Alert.vAlertResolutionState view and can be reached via a one-to-many join from the Alert.vAlert view.

In cases when we want to analyze alerts, having all of their states in the same result set is not really useful and only causes clutter. What we will do instead is fetch the latest available state for every Alert.

First we need to identify the timestamp of the latest state change. That is easily done with SQL MAX statement:

SELECT
 AlertGuid,
 MAX(StateSetDateTime) AS StateSetDateTime
FROM OperationsManagerDW.Alert.vAlertResolutionState
GROUP BY AlertGuid

This subquery can now be complemented with the actual Resolution State from the same view:

SELECT I.AlertGuid, O.ResolutionState
FROM (
 SELECT
AlertGuid,
MAX(StateSetDateTime) AS StateSetDateTime
FROM OperationsManagerDW.Alert.vAlertResolutionState
GROUP BY AlertGuid
 )
INNER JOIN OperationsManagerDW.Alert.vAlertResolutionState O ON
I.AlertGuid = O.AlertGuid
 AND
 I.StateSetDateTime = O.StateSetDateTime

The combination of these two queries gives us the latest state by AlertGuid. This result can now be inner-joined to the main query and the result will only show one state for every alert.

The full query

We have looked at all the building blocks required to fetch Alerts and data related to them.

Time to complete the puzzle. Copy the whole query and give it a go! Let us know how it goes!

SELECT
A.RaisedDateTime,
E.DisplayName AS RaisedByEntity,
CASE A.MonitorAlertInd
WHEN 1 THEN M.MonitorDefaultName
WHEN 0 THEN R.RuleDefaultName
END AS [RaisedByMonitor/Rule],
CASE A.MonitorAlertInd
WHEN 1 THEN 'Monitor'
WHEN 0 THEN 'Rule'
END AS AlertSource,
A.Category AS AlertCategory,
CASE A.Severity
WHEN 2 THEN 'Critical'
WHEN 1 THEN 'Warning'
WHEN 0 THEN 'Information'
END AS AlertSeverity,
CASE A.Priority
WHEN 2 THEN 'High'
WHEN 1 THEN 'Medium'
WHEN 0 THEN 'Low'
END AS AlertPriority,
A.AlertName,
RS.ResolutionStateName AS ResolutionState,
ARS.TimeFromRaisedSeconds,
A.RepeatCount
FROM
OperationsManagerDW.Alert.vAlert A
INNER JOIN OperationsManagerDW.dbo.vManagedEntity E ON
A.ManagedEntityRowId = E.ManagedEntityRowId
--Show only alerts that are part of a working CP
INNER JOIN OperationsManagerDW.dbo.vManagedEntityManagementGroup MEMG ON
A.ManagedEntityRowId = MEMG.ManagedEntityRowId
AND
(A.RaisedDateTime >= MEMG.FromDateTime AND MEMG.ToDateTime IS NULL)
INNER JOIN OperationsManagerDW.Alert.vAlertResolutionState ARS ON
A.AlertGuid = ARS.AlertGuid
--Return only largest resolution state value for alert
INNER JOIN
(SELECT I.AlertGuid, O.ResolutionState
FROM (SELECT
AlertGuid,
MAX(StateSetDateTime) AS StateSetDateTime
FROM OperationsManagerDW.Alert.vAlertResolutionState
GROUP BY AlertGuid ) I
INNER JOIN OperationsManagerDW.Alert.vAlertResolutionState O ON
I.AlertGuid = O.AlertGuid AND I.StateSetDateTime = O.StateSetDateTime) MARS ON
ARS.ResolutionState = MARS.ResolutionState
AND
ARS.AlertGuid = MARS.AlertGuid
INNER JOIN OperationsManagerDW.dbo.vResolutionState RS ON
ARS.ResolutionState = RS.ResolutionStateId
--Outer joins for Rule and Monitor lookup
LEFT OUTER JOIN OperationsManagerDW.dbo.vRule R ON
A.WorkflowRowId = R.RuleRowId
LEFT OUTER JOIN OperationsManagerDW.dbo.vMonitor M ON
A.WorkflowRowId = M.MonitorRowId
--Return only latest month.
WHERE
A.RaisedDateTime > DATEADD(MM, -1, GETDATE())

Conclusion

So know we have a great way of querying alerts from SCOM Data warehouse and the next step is to get control of Distributed Applications and how we can relate alerts to them.

Stay tuned for upcoming posts.