Noticed that one of our scheduled SQL Server Agent jobs occasionally fail.
Reviewed the scheduled job
Here is the Job
Job Details – SQL Statement
sqlcmd -E -S $(ESCAPE_SQUOTE(SRVR)) -d AdminDB -Q "EXECUTE [dbo].[DatabaseBackup] @Databases = 'USER_DATABASES', @Directory = N'Z:\sqlbackups', @BackupType = 'LOG', @Verify = 'Y', @CleanupTime = 72, @CheckSum = 'Y',@Compress='Y', @LogToTable = 'Y'" -b
- For all User Databases, there transaction logs are being backed up
- Good Settings
- Backup files are verified
- Log backup files greater than 3 days are shed
- CheckSum = Y
- Checksum is validated
- Compression is on
- Commands and Results are logged to table
Audit Job Logs
Aforementioned, Ola hallengren’s beautiful Database Backup scripts has been setup to journal its steps and result of each step into the dbo.CommandLog table.
Let us review the result of that script.
SELECT top 1000 [src] = 'Errors' , tblCL.DatabaseName , tblCL.Command , tblCL.CommandType , tblCL.StartTime , tblCL.ErrorNumber , tblCL.ErrorMessage FROM [dbo].[CommandLog] tblCL where tblCL.ErrorNumber != 0 order by tblCL.StartTime desc
- We can see that on 2016-09-25, 2016-09-18, 2016-09-11, we are failing at midnight with an error code of 3009
What is SQL Server Error Code 3009?
Error Number & Details
From Jeremy Kadlec – SQL Server Database Backup and Restore Failure Notifications ( Link ), here is a brief listing of pertinent Backup Failures.
And, so thanks to Jeremy Kadlec, we know that 3009 means we have a problem writing data to related MSDB tables.
MSDB Database Problems?
More digging around we found that another job runs at the same time.
Here is the log of our sp_delete_backup history job
At midnight every Sunday, we are scheduled to prune the backup history tables.
SQL Server Errors – Levels of Severity
Backups succeed, but the logging to the MSDB tables sometimes fails.
SQL Server nicely eats up the error and assigns it a severity of 16.
Database Engine Error Severity
We see it that severity of 16 translates to … “Indicates general errors that can be corrected by the user“.
Dedicated to Ola Hallengren.
There are commercial products out there that run into the same cockroach.
But, in some cases, the error is not properly logged well enough to fully trace and diagnose it.
There are a couple of ways to get around this problem.
I will have you try out options on your own.