Forum Discussion
Persistent problem with DPMRA.EXE crashing and multiple inconsistent replicas in DPM 2019
As an update to this: I stopped all jobs from running during the troubleshooting process and used PowerShell to view the logs in real time on both the DPM server and the clients, while running a consistency check on the previously failed jobs one at a time. From this I noticed that the Watson error coincides with the DPMRA.exe-related Application Error in the event log, although the backup job does not fail immediately. After cancelling the failed job, I removed the datasource from its Protection Group, importantly ensuring that no disk or tape data was retained, and repeated this process until all offending datasources had been identified and removed (there were 2 that needed removing). After this, the remaining failed jobs all ran successfully, and once they had, I could add the two problem datasources back into their Protection Groups.
This has worked around the problem for me and I now have a functioning DPM server, but as to why these datasources caused the problem, I am still none the wiser. This is the second time in 2 months this has happened, and I don't know whether the same datasources caused the previous problem (a DPM database restore was how I got around it last time), but I will monitor this and check next month to see if it happens again.
If anyone has any ideas as to why this may be occurring, or even how to investigate the cause further, that would be great. Also, please let me know if anyone else experiences this problem. I have worked with DPM for a number of years now and, since DPM 2010, have not had many issues - certainly not ones as disruptive as this. I do find the DPM logs unintuitive: they record a lot of warnings and errors as part of normal behaviour, which makes using them for troubleshooting quite hard at times. They are also not the easiest to decipher, which adds further frustration.
I had the same thing happen, and it has been happening since DPM 2019 UR4.
I thought I had narrowed it down to certain replicas containing files/folders with paths longer than 250 characters.
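For anyone wanting to test the long-path theory on their own datasources, something like the following could list every file or folder whose full path exceeds a threshold. This is only a rough Python sketch, not anything that ships with DPM: the function name `find_long_paths` is made up for illustration, and the default of 250 is simply the figure mentioned above, not a documented DPM limit.

```python
import os

def find_long_paths(root, limit=250):
    """Return (length, path) pairs for entries under root whose
    full path is longer than `limit` characters, longest first.

    `limit` defaults to 250 only because that is the figure quoted
    in this thread; it is an assumption, not a documented DPM value.
    """
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Check both folders and files, since either can push a path over the limit.
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            if len(full) > limit:
                hits.append((len(full), full))
    return sorted(hits, reverse=True)
```

Calling `find_long_paths(r"D:\Shares")` (a hypothetical path) on the protected server would return the longest offenders first. On Windows, walking into very deep folders may itself need long path support enabled or a `\\?\` prefix on the root, so an empty result is not proof that no long paths exist.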
Usually stopping protection of those members mitigates the issue, whereas putting them back into protection causes DPMRA.exe to crash every 15 minutes (according to the Application log in Event Viewer).
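To check whether the crashes really do follow a fixed interval, the timestamps of the DPMRA.exe Application Error events can be compared. A small Python sketch, assuming the event times have already been exported from the Application log by whatever means is convenient; the function name `crash_intervals_minutes` is hypothetical:

```python
from datetime import datetime

def crash_intervals_minutes(timestamps):
    """Given crash event times (datetime objects, in any order),
    return the gaps between consecutive crashes in minutes."""
    ordered = sorted(timestamps)
    return [
        (later - earlier).total_seconds() / 60
        for earlier, later in zip(ordered, ordered[1:])
    ]
```

If the gaps cluster tightly around one value, that suggests a scheduled retry is triggering the crash rather than random load.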
I've also had this happen when using one DPM server for secondary protection of a primary server's protected items (or in some cases, just the database of the primary server).
Either way, it's bad design to have a backup destination crash your entire backup product.