You will likely find that the default retention in the OpsMgr data warehouse will need to be adjusted for your environment. I often find customers are reluctant to adjust these settings - because they don't know what they want to keep. So - they assume the defaults are good - and they just keep EVERYTHING.
This is a bad idea.
A data warehouse will often be one of the largest databases supported by a company. Large databases cost money. They cost money to support. They are more difficult to maintain. They cost more to back up in time, tape capacity, network impact, etc. They take longer to restore in the case of a disaster. The larger they get, the more they cost in hardware (disk space) to support them. The larger they get, the longer reports can take to complete.
For these reasons - you should give STRONG consideration to reducing your warehouse retention to your reporting REQUIREMENTS. If you don't have any - MAKE SOME!
Originally - when the product released - you had to directly edit SQL tables to adjust this. Then - a command line tool was released to adjust these values - making the process easier and safer. This post is just going to be a walk through of this process to better understand using this tool - and what each dataset actually means.
Here is the link to the command line tool:
http://blogs.technet.com/momteam/archive/2008/05/14/data-warehouse-data-retention-policy-dwdatarp-exe.aspx
Different data types are kept in the Data Warehouse in unique “Datasets”. Each dataset represents a different data type (events, alerts, performance, etc.) and the aggregation type (raw, hourly, daily).
Not every customer will have exactly the same data sets. This is because some management packs will add their own dataset - if that MP has something very unique that it will collect - that does not fit into the default “buckets” that already exist.
So - first - we need to understand the different datasets available - and what they mean. All the datasets for an environment are kept in the “Dataset” table in the Warehouse database.
select * from dataset
order by DataSetDefaultName
This will show us the available datasets. Common datasets are:
Alert data set
Client Monitoring data set
Event data set
Microsoft.Windows.Client.Vista.Dataset.ClientPerf
Microsoft.Windows.Client.Vista.Dataset.DiskFailure
Microsoft.Windows.Client.Vista.Dataset.Memory
Microsoft.Windows.Client.Vista.Dataset.ShellPerf
Performance data set
State data set
Alert, Event, Performance, and State are the most common ones we look at.
However - in the warehouse - we also keep different aggregations of some of the datasets - where it makes sense. The most common datasets that we will aggregate are Performance data, State data, and Client Monitoring data (AEM). The reason we have raw, hourly, and daily aggregations - is to be able to keep data for longer periods of time - but still have very good performance on running reports.
In MOM 2005 - we used to stick ALL the raw performance data into a single table in the Warehouse. After a year of data was reached - this meant the perf table would grow to a HUGE size - and running multiple queries against this table would be impossible to complete with acceptable performance. It also meant grooming this table would take forever, and would be prone to timeouts and failures.
In OpsMgr - now we aggregate this data into hourly and daily aggregations. These aggregations allow us to “summarize” the performance, or state data, into MUCH smaller table sizes. This means we can keep data for a MUCH longer period of time than ever before. We also optimized this by splitting these into multiple tables. When a table reaches a pre-determined size, or number of records - we will start a new table for inserting. This allows grooming to be incredibly efficient - because now we can simply drop the old tables when all of the data in a table is older than the grooming retention setting.
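To make the "drop the whole table" idea concrete, here is a minimal Python sketch of the concept. It is NOT how the warehouse grooming is actually implemented - the Partition class, the table names, and the dates are all made up for illustration. It only shows the rule described above: a table becomes eligible to drop once even its newest row is older than the retention setting.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Partition:
    """One physical table holding rows for a contiguous date range (hypothetical)."""
    name: str
    newest_row: date  # date of the most recent row stored in this table


def partitions_to_drop(partitions, today, max_age_days):
    """Return the partitions whose newest row is already past retention.

    A table can only be dropped when ALL of its rows are too old,
    which is true exactly when its newest row is too old.
    """
    cutoff = today - timedelta(days=max_age_days)
    return [p for p in partitions if p.newest_row < cutoff]


# Hypothetical table names and dates, for illustration only.
tables = [
    Partition("Perf_Hourly_1", date(2009, 1, 31)),
    Partition("Perf_Hourly_2", date(2009, 4, 30)),
    Partition("Perf_Hourly_3", date(2009, 7, 15)),
]

dropped = partitions_to_drop(tables, today=date(2009, 7, 20), max_age_days=90)
print([p.name for p in dropped])
```

With a 90 day retention, only the first table is dropped. The second table still holds some rows inside the retention window, so it stays until all of its rows have aged out.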
Ok - that’s the background on aggregations. To see this information - we will need to look at the StandardDatasetAggregation table.
select * from StandardDatasetAggregation
That table contains all the datasets, and their aggregation settings. To help make more sense of this - I will join the dataset and the StandardDatasetAggregation tables in a single query - to only show you what you need to look at:
SELECT DataSetDefaultName,
AggregationTypeId,
MaxDataAgeDays
FROM StandardDatasetAggregation sda
INNER JOIN dataset ds on ds.datasetid = sda.datasetid
ORDER BY DataSetDefaultName
This query will give us the common dataset name, the aggregation type, and the current maximum retention setting.
For the AggregationTypeId:
0 = Raw
20 = Hourly
30 = Daily
Here is my output:
DataSetDefaultName | AggregationTypeId | MaxDataAgeDays |
Alert data set | 0 | 400 |
Client Monitoring data set | 0 | 30 |
Client Monitoring data set | 30 | 400 |
Event data set | 0 | 100 |
Microsoft.Windows.Client.Vista.Dataset.ClientPerf | 0 | 7 |
Microsoft.Windows.Client.Vista.Dataset.ClientPerf | 30 | 91 |
Microsoft.Windows.Client.Vista.Dataset.DiskFailure | 0 | 7 |
Microsoft.Windows.Client.Vista.Dataset.DiskFailure | 30 | 182 |
Microsoft.Windows.Client.Vista.Dataset.Memory | 0 | 7 |
Microsoft.Windows.Client.Vista.Dataset.Memory | 30 | 91 |
Microsoft.Windows.Client.Vista.Dataset.ShellPerf | 0 | 7 |
Microsoft.Windows.Client.Vista.Dataset.ShellPerf | 30 | 91 |
Performance data set | 0 | 10 |
Performance data set | 20 | 400 |
Performance data set | 30 | 400 |
State data set | 0 | 180 |
State data set | 20 | 400 |
State data set | 30 | 400 |
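If you export the query results and want the aggregation IDs shown as names, a small lookup does the job. This is just a sketch: the ID-to-name mapping comes from the list above, the three sample rows are copied from my output, and the describe helper is a name I made up for the example.

```python
# AggregationTypeId values as listed in this post.
AGGREGATION_NAMES = {0: "Raw", 20: "Hourly", 30: "Daily"}

# A few (DataSetDefaultName, AggregationTypeId, MaxDataAgeDays) rows
# copied from the query output above.
rows = [
    ("Performance data set", 0, 10),
    ("Performance data set", 20, 400),
    ("Performance data set", 30, 400),
]


def describe(row):
    """Turn one result row into a readable sentence (hypothetical helper)."""
    name, agg_id, max_age = row
    # Management packs may add aggregation types not covered by this post.
    label = AGGREGATION_NAMES.get(agg_id, f"Unknown ({agg_id})")
    return f"{name}: {label} kept for {max_age} days"


for row in rows:
    print(describe(row))
```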
You will probably notice - that we only keep 10 days of RAW Performance by default. Generally - you don't want to mess with this. This is simply to keep a short amount of raw data - to build our hourly and daily aggregations from. All built-in performance reports in SCOM run from Hourly, or Daily aggregations by default.
Now we are cooking!
Fortunately - there is a command line tool published that will help make changes to these retention periods, and provide more information about how much data we have currently. This tool is called DWDATARP.EXE. It is available for download from the link given earlier in this post.
This gives us a nice way to view the current settings. Download this to your tools machine, your RMS, or directly on your warehouse machine. Run it from a command line.
Run just the tool with no parameters to get help:
C:\>dwdatarp.exe
To get our current settings - run the tool with ONLY the -s (server\instance) and -d (database) parameters. This will output the current settings. However - it does not format well to the screen - so output it to a TXT file and open it:
C:\>dwdatarp.exe -s OMDW\i01 -d OperationsManagerDW > c:\dwoutput.txt
Here is my output (I removed some of the vista/client garbage for brevity)
Dataset name | Aggregation name | Max Age | Current Size, Kb |
Alert data set | Raw data | 400 | 18,560 ( 1%) |
Client Monitoring data set | Raw data | 30 | 0 ( 0%) |
Client Monitoring data set | Daily aggregations | 400 | 16 ( 0%) |
Configuration dataset | Raw data | 400 | 153,016 ( 4%) |
Event data set | Raw data | 100 | 1,348,168 ( 37%) |
Performance data set | Raw data | 10 | 467,552 ( 13%) |
Performance data set | Hourly aggregations | 400 | 1,265,160 ( 35%) |
Performance data set | Daily aggregations | 400 | 61,176 ( 2%) |
State data set | Raw data | 180 | 13,024 ( 0%) |
State data set | Hourly aggregations | 400 | 305,120 ( 8%) |
State data set | Daily aggregations | 400 | 20,112 ( 1%) |
Right off the bat - I can see how little space the daily performance data actually consumes. I can see how much space only 10 days of RAW perf data consumes. I also see a surprising amount of event data consuming space in the database. Typically - you will see that perf hourly will consume the most space in a warehouse.
So - with this information in hand - I can do two things:
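If you have a lot of datasets, it can help to sort that output by size instead of eyeballing it. Here is a rough Python sketch that does so. Big assumption: it expects the pipe-delimited layout exactly as pasted in this post - the real TXT file from the tool may be laid out differently, so adjust the splitting to match what you actually get. The parse_dwdatarp function name is mine, not part of the tool.

```python
# Sample rows copied from the output shown in this post.
SAMPLE = """
Dataset name | Aggregation name | Max Age | Current Size, Kb |
Event data set | Raw data | 100 | 1,348,168 ( 37%) |
Performance data set | Raw data | 10 | 467,552 ( 13%) |
Performance data set | Hourly aggregations | 400 | 1,265,160 ( 35%) |
Performance data set | Daily aggregations | 400 | 61,176 ( 2%) |
"""


def parse_dwdatarp(text):
    """Parse pipe-delimited rows into (dataset, aggregation, max_age, size_kb)."""
    rows = []
    for line in text.strip().splitlines():
        cells = [c.strip() for c in line.split("|")]
        if len(cells) < 4 or cells[0] == "Dataset name":
            continue  # skip the header row and anything malformed
        dataset, aggregation, max_age, size = cells[:4]
        # Size looks like "1,348,168 ( 37%)": keep the number before "(".
        size_kb = int(size.split("(")[0].replace(",", "").strip())
        rows.append((dataset, aggregation, int(max_age), size_kb))
    return rows


# Largest consumers first.
ranked = sorted(parse_dwdatarp(SAMPLE), key=lambda r: r[3], reverse=True)
for dataset, aggregation, max_age, size_kb in ranked:
    print(f"{size_kb:>12,} KB  {dataset} / {aggregation} ({max_age} days)")
```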
I can know what is using up most of the space in my warehouse.
I can know the Dataset name, and Aggregation name… to input to the command line tool to adjust it!
Now - on to the retention adjustments.
First thing - I will need to gather my Reporting service level agreement from management. This is my requirement for how long I need to keep data for reports. I also need to know “what kind” of reports they want to be able to run for this period.
From this discussion with management - we determined:
We require detailed performance reports for 90 days (hourly aggregations)
We require less detailed performance reports (daily aggregations) for 1 year for trending and capacity planning.
We want to keep a record of all ALERTS for 6 months.
We don't use any event reports, so we can reduce this retention from 100 days to 30 days.
We don't use AEM (Client Monitoring Dataset) so we will leave this unchanged.
We don't report on state changes much (if any) so we will set all of these to 90 days.
Now I will use the DWDATARP.EXE tool - to adjust these values based on my company reporting SLA:
dwdatarp.exe -s OMDW\i01 -d OperationsManagerDW -ds "Performance data set" -a "Hourly aggregations" -m 90
dwdatarp.exe -s OMDW\i01 -d OperationsManagerDW -ds "Performance data set" -a "Daily aggregations" -m 365
dwdatarp.exe -s OMDW\i01 -d OperationsManagerDW -ds "Alert data set" -a "Raw data" -m 180
dwdatarp.exe -s OMDW\i01 -d OperationsManagerDW -ds "Event data set" -a "Raw data" -m 30
dwdatarp.exe -s OMDW\i01 -d OperationsManagerDW -ds "State data set" -a "Raw data" -m 90
dwdatarp.exe -s OMDW\i01 -d OperationsManagerDW -ds "State data set" -a "Hourly aggregations" -m 90
dwdatarp.exe -s OMDW\i01 -d OperationsManagerDW -ds "State data set" -a "Daily aggregations" -m 90
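If you would rather keep your SLA in one place and generate the command lines from it, something like the following Python sketch works. It only builds and prints the same commands shown above, using the same -s, -d, -ds, -a, and -m parameters - it does not run the tool. The SLA dictionary and the build_command helper are my own names for the example, and the server and database names are the ones from my lab.

```python
SERVER = r"OMDW\i01"              # server\instance used in this post
DATABASE = "OperationsManagerDW"  # warehouse database used in this post

# (dataset name, aggregation name) -> retention in days, per the SLA above.
SLA = {
    ("Performance data set", "Hourly aggregations"): 90,
    ("Performance data set", "Daily aggregations"): 365,
    ("Alert data set", "Raw data"): 180,
    ("Event data set", "Raw data"): 30,
    ("State data set", "Raw data"): 90,
    ("State data set", "Hourly aggregations"): 90,
    ("State data set", "Daily aggregations"): 90,
}


def build_command(dataset, aggregation, days):
    """Build one dwdatarp.exe command line (hypothetical helper)."""
    return (f'dwdatarp.exe -s {SERVER} -d {DATABASE} '
            f'-ds "{dataset}" -a "{aggregation}" -m {days}')


# Print the commands so they can be reviewed before anyone runs them.
for (dataset, aggregation), days in SLA.items():
    print(build_command(dataset, aggregation, days))
```

Printing instead of executing is deliberate - retention changes eventually groom data away, so it is worth a second pair of eyes on the list first.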
Now my table reflects my reporting SLA - and my actual space needed in the warehouse will be much reduced in the long term:
Dataset name | Aggregation name | Max Age | Current Size, Kb |
Alert data set | Raw data | 180 | 18,560 ( 1%) |
Client Monitoring data set | Raw data | 30 | 0 ( 0%) |
Client Monitoring data set | Daily aggregations | 400 | 16 ( 0%) |
Configuration dataset | Raw data | 400 | 152,944 ( 4%) |
Event data set | Raw data | 30 | 1,348,552 ( 37%) |
Performance data set | Raw data | 10 | 468,960 ( 13%) |
Performance data set | Hourly aggregations | 90 | 1,265,992 ( 35%) |
Performance data set | Daily aggregations | 365 | 61,176 ( 2%) |
State data set | Raw data | 90 | 13,024 ( 0%) |
State data set | Hourly aggregations | 90 | 305,120 ( 8%) |
State data set | Daily aggregations | 90 | 20,112 ( 1%) |
Here are some general rules of thumb (might be different if your environment is unique)
Only keep the maximum retention of data in the warehouse per your reporting requirements.
Do not modify the performance RAW dataset.
Most performance reports are run against Perf Hourly data for detailed performance throughout the day. For reports that span long periods of time (weeks/months) you should generally use Daily aggregation.
Daily aggregations should generally be kept for the same retention as hourly - or longer.
Hourly datasets use up much more space than daily aggregations.
Most people don't use events in reports - and these can often be groomed much sooner than the default of 100 days.
Most people don't do a lot of state reporting beyond 30 days, and these can be groomed much sooner as well if desired.
Don't modify a setting if you don't use it. There is no need.
The Configuration dataset generally should not be modified. This keeps data about objects to report on, in the warehouse. It should be set to at LEAST the longest of any perf, alert, event, or state datasets that you use for reporting.
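A few of these rules of thumb are easy to check automatically before you apply a new set of values. The Python sketch below is only an illustration of three of them (daily kept at least as long as hourly, raw performance left at the 10 day default shown in this post, Configuration at least as long as the longest perf/alert/event/state dataset). The check_retention function and the settings layout are my own invention, and the numbers are my adjusted values from the table above.

```python
REPORTING_DATASETS = ("Performance data set", "Alert data set",
                      "Event data set", "State data set")


def check_retention(settings):
    """Return a list of warnings for {(dataset, aggregation): days}."""
    warnings = []

    # Daily aggregations should be kept at least as long as hourly.
    for (dataset, aggregation), days in settings.items():
        if aggregation != "Hourly aggregations":
            continue
        daily = settings.get((dataset, "Daily aggregations"))
        if daily is not None and daily < days:
            warnings.append(f"{dataset}: daily ({daily}) < hourly ({days})")

    # Raw performance data should stay at the default shown in this post.
    raw_perf = settings.get(("Performance data set", "Raw data"))
    if raw_perf is not None and raw_perf != 10:
        warnings.append(f"Performance raw data changed to {raw_perf} days")

    # Configuration should cover the longest reporting dataset.
    config = settings.get(("Configuration dataset", "Raw data"))
    longest = max((d for (ds, _), d in settings.items()
                   if ds in REPORTING_DATASETS), default=0)
    if config is not None and config < longest:
        warnings.append(f"Configuration ({config}) < longest dataset ({longest})")

    return warnings


# My adjusted values from the table above.
adjusted = {
    ("Alert data set", "Raw data"): 180,
    ("Configuration dataset", "Raw data"): 400,
    ("Event data set", "Raw data"): 30,
    ("Performance data set", "Raw data"): 10,
    ("Performance data set", "Hourly aggregations"): 90,
    ("Performance data set", "Daily aggregations"): 365,
    ("State data set", "Raw data"): 90,
    ("State data set", "Hourly aggregations"): 90,
    ("State data set", "Daily aggregations"): 90,
}
print(check_retention(adjusted))

# Same values, but with Configuration cut below the longest dataset.
too_short = dict(adjusted)
too_short[("Configuration dataset", "Raw data")] = 200
print(check_retention(too_short))
```

My adjusted values pass cleanly, while dropping Configuration to 200 days gets flagged because daily performance data is kept for 365.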