Management pack synchronization, MP sync for short, is a key component of the Data Warehouse in Service Manager. The synchronization of management packs from the Service Manager management server to the Data Warehouse (DW) is responsible for defining the content of the data in the DW.
The figure below shows an overview of how MP sync works. The MP Sync job synchronizes all the management packs from the Service Manager source. This job starts to run as soon as you register the Service Manager management group. Its initial run can take several hours to complete, while subsequent runs take much less time. Additionally, the MP sync job is currently hardcoded to run on an hourly basis.
The issue here is that MP sync has for a while now been a hotbed of errors related to the deployment and cleanup of management packs in the data warehouse. The primary complaint about MP sync is that when it “fails”, it usually gets stuck in a running state and neither recovers nor fails safely. Consequently, the DW has earned a reputation as a component that is rather easy to break but incredibly difficult to recover.
To make matters worse, when our support team gets hit with MP sync issues, the difficulty of diagnosis and the amount of time and care required for a reliable recovery can in many cases be extremely high. In fact, one of our senior escalation engineers, Richard Usher, likened solving DW issues to “brain surgery”.
Based on the information collected from our support team, we’ve learned that most MP sync issues in fact stem from custom MPs being imported into the management server. For example, most customers “trying out” custom MPs are unaware that an hourly sync job is running in the background, trying to import those custom MPs into the DWStagingAndConfig database. Nor are they aware of the clean-up required on the DW side when those same MPs are later deleted. This, as you can imagine, causes considerable churn in the system, substantially increasing the odds of MP sync getting stuck in a failed state.
The plan is to make MP sync happen on demand, allowing users to choose the time, and possibly the MPs, they want synchronized, instead of the automatic, inflexible schedule and scope it currently follows. The idea was originally proposed by Manoj Parvathaneni, a longtime expert on SM and one of our best escalation engineers. A typical user scenario would look something like this:
1. The DW is registered with the Management Server, triggering a one-time forced synchronization of all MPs.
2. Content described by the imported MPs starts funneling from the CMDB to the DW for archival.
3. Admin imports new MPs to the management server to enable additional functionality and customizations, but since MP sync is not a scheduled job, no new changes get migrated over.
4. Once the admin is happy with the new MPs, they navigate to the Data Warehouse Jobs view in the console and select Resume under Tasks. What happens next depends on which of two approaches we take:
a. We synchronize all MPs over to the DW at this time, just like a regular MP sync job would have done.
b. We show a new form that lets administrators choose the MPs they would like to synchronize (see mockup below).
Approach (a) is clearly much easier to implement and can be delivered fairly quickly, possibly in the coming update rollup in October. However, while it allows some measure of control over when the job runs, it provides no control over what gets synchronized.
Approach (b) allows explicit specification of the MPs targeted for synchronization and so provides greater control. However, it incurs heavier engineering costs, making it a more expensive solution that will probably be delivered in vNext.
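To make the difference between the two approaches concrete (synchronize everything on demand, or synchronize only the MPs an administrator picks), here is a rough toy model in Python. It is only a sketch of the scenario above and not how Service Manager is implemented: the `ManagementServer` and `DataWarehouse` classes, their methods, and the MP names are all hypothetical, and the behavior on deleted MPs is an assumption made for illustration.

```python
from dataclasses import dataclass, field
from typing import Iterable, Set


@dataclass
class ManagementServer:
    """Hypothetical stand-in for the Service Manager management server."""
    imported_mps: Set[str] = field(default_factory=set)

    def import_mp(self, name: str) -> None:
        self.imported_mps.add(name)

    def delete_mp(self, name: str) -> None:
        self.imported_mps.discard(name)


@dataclass
class DataWarehouse:
    """Hypothetical stand-in for the DW; tracks which MPs it has synced."""
    synced_mps: Set[str] = field(default_factory=set)

    def register(self, server: ManagementServer) -> None:
        # Step 1 of the scenario: one-time forced sync of all MPs.
        self.synced_mps = set(server.imported_mps)

    def sync_all(self, server: ManagementServer) -> None:
        # First approach: on demand, bring over everything on the server.
        # Assumption: MPs deleted on the server are also dropped here.
        self.synced_mps = set(server.imported_mps)

    def sync_selected(self, server: ManagementServer,
                      selected: Iterable[str]) -> None:
        # Second approach: bring over only the MPs the admin picked.
        chosen = set(selected)
        unknown = chosen - server.imported_mps
        if unknown:
            raise ValueError(f"not imported on the server: {sorted(unknown)}")
        self.synced_mps |= chosen


server = ManagementServer({"Core", "Incident"})
dw = DataWarehouse()
dw.register(server)

# Step 3: the admin experiments; with no scheduled job the DW is untouched.
server.import_mp("CustomTrial")
server.import_mp("CustomKeep")
server.delete_mp("CustomTrial")  # the trial MP never reached the DW

# Step 4, second approach: sync only the MP the admin decided to keep.
dw.sync_selected(server, ["CustomKeep"])
print(sorted(dw.synced_mps))  # ['Core', 'CustomKeep', 'Incident']
```

In this model the trial MP is imported and deleted without the DW ever seeing it, which is the churn the on-demand design is meant to avoid. Calling `dw.sync_all(server)` instead of `dw.sync_selected(...)` would give the same result here, but without any say over which MPs are included.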
Here’s how you can help
We would love to hear your thoughts on this. Would you consider either of these approaches feasible? Which one do you prefer, and why? Are there any scenarios we might be overlooking?
Please use the comments section below to let us know what you think :)