We ran into a problem with Exchange 2010, a FAS2240-4, and SnapManager for Exchange (SME) 7.1: backups that used to fail randomly every now and then started failing consistently.
Our DFM server would send us an e-mail when the failure occurred that looked like this:
CLIENT APP ERROR Backup: SME Version 7.1: (111) on dora02 at Sun Sep 04 22:09:31 PDT 2016
The backup failure would also knock the databases offline and require us to re-sync them the next day.
Digging into the SME logs, we found the following:
[22:10:42.635] *****BACKUP DETAIL SUMMARY*****
[22:10:42.635] Backup group set #1:
[22:10:42.635] Backup SG/DB [FK] Error: SnapManager detected the following Exchange writer error. Please retry SnapManager operation. VSS_E_WRITERERROR_RETRYABLE: The writer failed due to an error that might not occur if another snapshot copy is created.
[22:10:42.635] Backup SG/DB [LQ] Error: SnapManager detected the following Exchange writer error. Please retry SnapManager operation. VSS_E_WRITERERROR_RETRYABLE: The writer failed due to an error that might not occur if another snapshot copy is created.
[22:10:42.635] Backup SG/DB [RZ] Error: SnapManager detected the following Exchange writer error. Please retry SnapManager operation. VSS_E_WRITERERROR_RETRYABLE: The writer failed due to an error that might not occur if another snapshot copy is created.
[22:10:42.635] Backup SG/DB [AE] Error: SnapManager detected the following Exchange writer error. Please retry SnapManager operation. VSS_E_WRITERERROR_RETRYABLE: The writer failed due to an error that might not occur if another snapshot copy is created.
[22:10:42.635] Backup SG/DB [Public Folders (<SERVER>)] Error: SnapManager detected the following Exchange writer error. Please retry SnapManager operation. VSS_E_WRITERERROR_RETRYABLE: The writer failed due to an error that might not occur if another snapshot copy is created.
[22:10:42.635] ***SNAPMANAGER BACKUP JOB ENDED AT: [09-04-2016_22.10.42]
[22:10:42.635] Failed to backup storage groups/databases.
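If you want to pull the affected databases out of a summary like the one above without reading it by eye, a few lines of Python will do. This is only a rough sketch: the function names and the regular expression are my own, and it assumes the summary uses the exact "Backup SG/DB [name] Error:" wording shown in our log, which may differ in other SME versions.

```python
import re
from typing import List

# Matches one failed-database entry in the BACKUP DETAIL SUMMARY.
# Assumption: entries look like "Backup SG/DB [<name>] Error:", as in the log above.
FAILED_DB = re.compile(r"Backup SG/DB \[([^\]]+)\] Error:")


def failed_databases(summary: str) -> List[str]:
    """Return the database names flagged with an error, in log order."""
    return FAILED_DB.findall(summary)


def is_retryable(summary: str) -> bool:
    """True if the summary reports the retryable VSS writer error."""
    return "VSS_E_WRITERERROR_RETRYABLE" in summary


# Shortened sample taken from the log above.
sample = (
    "[22:10:42.635] Backup SG/DB [FK] Error: SnapManager detected the following "
    "Exchange writer error. VSS_E_WRITERERROR_RETRYABLE: The writer failed. "
    "[22:10:42.635] Backup SG/DB [Public Folders (<SERVER>)] Error: SnapManager "
    "detected the following Exchange writer error. "
    "[22:10:42.635] ***SNAPMANAGER BACKUP JOB ENDED AT: [09-04-2016_22.10.42]"
)

if __name__ == "__main__":
    print(failed_databases(sample))
    print(is_retryable(sample))
```

It works the same whether the log is on one line or split per timestamp, since it only looks for the entry pattern.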
NetApp provides this page for what they call “Common VSS errors”: https://kb.netapp.com/support/index?page=content&id=1010785&locale=en_US
None of the suggestions there helped us.
In the end I found this forum post for a different product: https://community.emc.com/thread/168678?tstart=0 and applied the registry edits they suggested here: https://community.emc.com/message/705346#705346
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\TcpWindowSize=256000
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\GlobalMaxTcpWindowSize=16777216
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\KeepAliveInterval=1000
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\KeepAliveTime=600000
and then rebooted our Exchange server running SME.
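For anyone who would rather import the four values than type them into regedit, here is a minimal sketch that renders them as .reg file text. It assumes all four values are REG_DWORD and that the numbers above are decimal; the forum post does not spell out the type, so check before importing. The names `TCP_SETTINGS` and `render_reg` are mine, not from any tool.

```python
# Registry key that holds the TCP/IP parameters listed above.
KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters"

# Decimal values as applied in this post (assumed to be REG_DWORD).
TCP_SETTINGS = {
    "TcpWindowSize": 256000,
    "GlobalMaxTcpWindowSize": 16777216,
    "KeepAliveInterval": 1000,
    "KeepAliveTime": 600000,
}


def render_reg(key: str, settings: dict) -> str:
    """Render DWORD settings as the text of a .reg file."""
    lines = ["Windows Registry Editor Version 5.00", "", f"[{key}]"]
    for name, value in settings.items():
        # A DWORD is an unsigned 32-bit integer.
        if not 0 <= value <= 0xFFFFFFFF:
            raise ValueError(f"{name}={value} does not fit in a DWORD")
        # .reg files store DWORDs as 8 hexadecimal digits.
        lines.append(f'"{name}"=dword:{value:08x}')
    return "\r\n".join(lines) + "\r\n"


if __name__ == "__main__":
    print(render_reg(KEY, TCP_SETTINGS))
```

Save the output to a file, review it, and import it on the Exchange server; as above, a reboot is still needed afterwards.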
Since making the change roughly 20 days ago, we haven’t had a single failed backup.