Howdy System Center Data Protection Manager followers, Chris Butcher here again. I recently came across an issue that I thought was interesting, and also one that you can readily identify and fix, so you get another overly wordy blog that promises to keep you on the edge of your seat!
So, what did I find, you ask? Great question. This problem appears to rear its head in DPM 2012 Service Pack 1 (SP1) and DPM 2012 R2. The symptoms may be somewhat hidden, because what you see depends on whether you are logged on during the failure.
Here are the conditions under which you will see this:
You are protecting SS (System State) or BMR (Bare Metal Recovery).
The protected server has multiple drives, and the amount of free space on one drive has changed drastically.
The errors in DPM will simply show that jobs have failed because the DPM service has failed. The really telling part is what you will see in the event log:
Event ID: 999
An unexpected error caused a failure for process 'msdpm'. Restart the DPM process 'msdpm'.
<FatalServiceError><__System><ID>19</ID><Seq>5210</Seq><TimeCreated>19.12.2013 11:09:04</TimeCreated><Source>DpmThreadPool.cs</Source><Line>163</Line><HasError>True</HasError></__System><ExceptionType>FormatException</ExceptionType><ExceptionMessage>Input string was not in a correct format.</ExceptionMessage><ExceptionDetails>System.FormatException: Input string was not in a correct format.
at System.Text.StringBuilder.AppendFormat(IFormatProvider provider, String format, Object args)
at System.String.Format(IFormatProvider provider, String format, Object args)
at Microsoft.Internal.EnterpriseStorage.Dls.Trace.TraceProvider.Trace(TraceFlag flag, String fileName, Int32 fileLine, Guid* taskId, Boolean taskIdSpecified, String formatString, Object args)
at Microsoft.Internal.EnterpriseStorage.Dls.Trace.TraceProvider._TraceMessage(TraceFlag flag, String fileName, Int32 fileLine, String formatString, Object args)
at Microsoft.Internal.EnterpriseStorage.Dls.WriterHelper.SystemStateWriterHelper.RenameBMRReplicaFolderIfNeeded(String roFileSpec)
at Microsoft.Internal.EnterpriseStorage.Dls.WriterHelper.SystemStateWriterHelper.ValidateROListOnPreBackupSuccess(Message msg, RADataSourceStatusType raDatasourceStatus, Guid volumeBitmapId, List`1& missingVolumesList, ReplicaDataset& lastFullReplicaDataset, ROListType& roList)
at Microsoft.Internal.EnterpriseStorage.Dls.Prm.ReplicaPreBackupBlock.ValidateROList(Message msg, RADataSourceStatusType raDatasourceStatus, Guid datasetId)
at Microsoft.Internal.EnterpriseStorage.Dls.Prm.ReplicaPreBackupBlock.RAPreBackupSuccess(Message msg)
at Microsoft.Internal.EnterpriseStorage.Dls.TaskExecutor.Fsm.Transition.Execute(Message msg)
at Microsoft.Internal.EnterpriseStorage.Dls.TaskExecutor.Fsm.Engine.ChangeState(Message msg)
at Microsoft.Internal.EnterpriseStorage.Dls.TaskExecutor.TaskInstance.Process(Object dummy)
at Microsoft.Internal.EnterpriseStorage.Dls.TaskExecutor.FsmThreadFunction.Function(Object taskThreadContextObj)
at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
at System.Threading._ThreadPoolWaitCallback.PerformWaitCallbackInternal(_ThreadPoolWaitCallback tpWaitCallBack)
at System.Threading._ThreadPoolWaitCallback.PerformWaitCallback(Object state)</ExceptionDetails></FatalServiceError>
There are a couple of key points to this event.
The first is “An unexpected error caused a failure for process 'msdpm'.” This indicates that MSDPM.exe has crashed. When this happens, all jobs that are running at that time will fail.
The next item is <ExceptionMessage>Input string was not in a correct format.</ExceptionMessage>. MSDPM.exe can crash for other reasons, but this one is different in that the exception message is "Input string was not in a correct format".
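If you want to check a saved copy of the event text for these markers, here is a minimal Python sketch. The helper names `summarize_fatal_error` and `matches_this_issue` are made up for illustration, and the sample string is a shortened copy of the event above. It uses regular expressions rather than an XML parser because the stack trace contains raw `&` characters, which an XML parser would reject.

```python
import re


def summarize_fatal_error(event_text):
    """Extract the fields that identify this crash from Event ID 999 text."""
    def field(tag):
        # Non-greedy match between <tag> and </tag>, spanning line breaks.
        match = re.search(rf"<{tag}>(.*?)</{tag}>", event_text, re.DOTALL)
        return match.group(1).strip() if match else None

    details = field("ExceptionDetails") or ""
    return {
        "exception_type": field("ExceptionType"),
        "exception_message": field("ExceptionMessage"),
        # The frame seen in the stack trace for this particular failure.
        "in_bmr_rename": "RenameBMRReplicaFolderIfNeeded" in details,
    }


def matches_this_issue(summary):
    """True when the event looks like the crash described in this post."""
    return (
        summary["exception_type"] == "FormatException"
        and summary["in_bmr_rename"]
    )


# Shortened sample taken from the event shown above.
sample = (
    "<FatalServiceError><ExceptionType>FormatException</ExceptionType>"
    "<ExceptionMessage>Input string was not in a correct format."
    "</ExceptionMessage><ExceptionDetails>System.FormatException: "
    "Input string was not in a correct format.\n"
    "at Microsoft.Internal.EnterpriseStorage.Dls.WriterHelper."
    "SystemStateWriterHelper.RenameBMRReplicaFolderIfNeeded(String roFileSpec)"
    "</ExceptionDetails></FatalServiceError>"
)

summary = summarize_fatal_error(sample)
print(summary["exception_type"], matches_this_issue(summary))
```

Matching on both the exception type and the `RenameBMRReplicaFolderIfNeeded` frame is a deliberate choice here, since a `FormatException` on its own could come from somewhere else in the service.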
When DPM begins protection for SS or BMR, a file is created on the protected server. This file records which drive on that system has the most free space. At some point, free space on the protected server shifts and a different drive becomes the one with the most free space. If the protected server then recreates the file at any point, it will create it with the new “drive with the most free space” in it. DPM will attempt to update this information on the DPM server, and ultimately this fails, resulting in the crash in MSDPM.exe seen above.
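To make the trigger condition concrete, here is a rough Python sketch of "the drive with the most free space changed". It is only an illustration: how DPM itself picks the drive is not shown in this post, the function names are hypothetical, and the byte counts in the example are invented.

```python
import shutil


def drive_with_most_free_space(free_bytes_by_drive):
    """Return the drive whose free-byte count is largest."""
    return max(free_bytes_by_drive, key=free_bytes_by_drive.get)


def current_free_space(drives):
    """Query free bytes for each drive root, skipping drives that are absent."""
    result = {}
    for drive in drives:
        try:
            result[drive] = shutil.disk_usage(drive).free
        except OSError:
            # Drive letter not present or not readable on this machine.
            continue
    return result


# Invented numbers: D: starts out with the most free space, then fills up.
before = {"C:\\": 40 * 10**9, "D:\\": 120 * 10**9}
after = {"C:\\": 40 * 10**9, "D:\\": 5 * 10**9}

changed = drive_with_most_free_space(before) != drive_with_most_free_space(after)
print(changed)
```

On a protected server you could feed `current_free_space(["C:\\", "D:\\"])` into `drive_with_most_free_space` at two points in time and compare the answers to see whether the condition applies.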
The good news is that this problem was identified and has been fixed. To correct this issue, install the most recent version of DPM.