I have been talking to a number of people, both online as well as in person (at MMS, TechED and in other occasions) and one of the recurring questions I get, as people plan for their deployment of System Center 2012 – Operations Manager and want to use the APM feature, is “how much more space will I need for APM data?”.
Now this really is a tough question… and, as much as I hate having to say so, the answer is “it depends”.
We do have APM factored in the Sizing Helper Spreadsheet – but as you might know from some other earlier posts of mine, I am no big fan of this type of calculations. I admit they are useful, but deep within me I consider them a necessary evil. This is because I realize how HARD it really is to accurately predict the amount of data for things (such as events) that are really the result of so many different factors and variables coming into play… that is almost impossible to give an accurate measure of how things will look like, without actually looking at the data a specific environment produces. APM events are not different: the sizer uses some math based off our tests, and generally errs on the side of caution, but it is still a very rough approximation when it comes to APM events. Every environment is unique, and the real amount of APM events that will be generated (hence the size they’ll take in your database) depends on how your applications behave in production, on the way your users interact with those, when they do it, how many of them there are, the thresholds and advanced settings you chose… Basically, trying to predict this is a bit of a divining act. And I don’t believe that is truly possible to do it accurately in advance.
What IS possible, instead, and much more reliable in my experience, is OBSERVATION and ADAPTATION (call it “tuning” if you wish): in short, taking a look of what data certain applications with certain settings produce over a period of time… and try to use that as a baseline for a forecast. And then, as time passes and the environment evolves, re-baseline and take it from there (tuning configuration and the applications themselves). This is what I have been trying to teach people about Tuning Management Packs when I was working in Premier Field Engineering, this is what I also write in the already-linked-post about ACS, and I am trying to apply the same logic here to APM.
So, what would a first step be in understanding the amount of database space that APM data is taking?
Well, my process would be like:
So, here’s the query from step #4: (to be executed against the Data Warehouse – note that it is a single query):
---APM Events and Alerts DW Version --APM Events by day select CONVERT(VARCHAR(20),UTCEVENTDATE,102) as Day, COUNT(*) as TotalAPMEvents into #APMEvents from apm.EVENT group by CONVERT(VARCHAR(20),UTCEVENTDATE,102) order by CONVERT(VARCHAR(20),UTCEVENTDATE,102) DESC --CSM Events by day select CONVERT(VARCHAR(20),UTCDATE,102) as Day, COUNT(*) as TotalCSMEvents into #CSMEvents from apm.CSEVENT group by CONVERT(VARCHAR(20),UTCDATE,102) order by CONVERT(VARCHAR(20),UTCDATE,102) DESC --get count of alerts per day select CONVERT(VARCHAR(20),a.DBCreatedDateTime,102) as Day, count(*) as TotalAPMAlerts INTO #APMAlerts from Alert.vAlert a left outer join vRule r on a.WorkflowRowId = r.RuleRowId left outer join vMonitor m on a.WorkflowRowId = m.MonitorRowId where r.ManagementPackRowId in (select ManagementPackRowId from ManagementPack mp with (nolock) where mp.ManagementPackSystemName in ('Microsoft.SystemCenter.Apm.Web','Microsoft.SystemCenter.Apm.Library')) or m.ManagementPackRowId in (select ManagementPackRowId from ManagementPack mp with (nolock) where mp.ManagementPackSystemName in ('Microsoft.SystemCenter.Apm.Web','Microsoft.SystemCenter.Apm.Library')) group by CONVERT(VARCHAR(20),a.DBCreatedDateTime,102) order by CONVERT(VARCHAR(20),a.DBCreatedDateTime,102) desc --combined table to see ratio select e.Day, CASE WHEN e.TotalAPMEvents IS NULL THEN 0 ELSE e.TotalAPMEvents END as 'Total APM Events', CASE WHEN ce.TotalCSMEvents IS NULL THEN 0 ELSE ce.TotalCSMEvents END as 'Total CSM Events', ((CASE WHEN e.TotalAPMEvents IS NULL THEN 0 ELSE e.TotalAPMEvents END) + (CASE WHEN ce.TotalCSMEvents IS NULL THEN 0 ELSE ce.TotalCSMEvents END)) as 'Total APM/CSM Events', CASE WHEN a.TotalAPMAlerts IS NULL THEN 0 ELSE a.TotalAPMAlerts END as 'Total APM Alerts' from #APMAlerts a right outer join #APMEvents e on a.Day = e.Day left outer join #CSMEvents ce on a.Day = ce.Day order by e.Day desc --cleanup drop table #APMAlerts drop table #CSMEvents drop table #APMEvents
The result set should look a bit like the following screenshot:
And by quickly copying and pasting the results in Excel you can get a graph like the following, which is fairly eloquent in terms of showing you ratio of APM events versus Alerts that have been generated based off those events:
Other useful queries could get fancier and look at things like:
These can also be accomplished with queries, of course, but before going the hard route, and very good starting point is using the report provided by Application Advisor, that give us this type of info already, i.e.:
Based on your findings you then can triage the issues and do a few things:
A sample flow on how to use the Advisor Reports is illustrated by the following diagram:
I hope this gives you useful tips on how to approach slicing and dicing the data collected by APM in order to get the most value out of it while also controlling the amount of space that your database requires… and eventually improving your applications!
Disclaimer
This posting is provided "AS IS" with no warranties, and confers no rights. Use of included utilities are subject to the terms specified at http://www.microsoft.com/info/cpyright.htm