SQL Server – Number of Error Log Files

 

Background

Trying to review error logs on one of our SQL Server Instances and discovered that it is likely that the error logs we might have to mine is so far back and we might no longer have it.
And, so took to see if we have properly calibrated the Number of Error Logs Files to keep before rollover.

 

Review

GUI

SQL Server Management Studio ( SSMS )

Screen Shot

configuresqlservererrorlogs

 

Explanation

  1. In the screenshot above we can see that the Number of Errors has not been explicit set and so the system will use the default value of 6
  2. To set it we can check the “Limit the number of error log files before they are recycled” and provide a number to set it to

 

Script

For automation, script is a bit better and here is one …

Code

Stored Procedure

Stored Procedure – errorLog.sp_ConfigureNumberofErrorLogFiles


set nocount on;
go

USE [master]
GO

if schema_id('errorLog') is null
begin
	
	exec('create schema [errorLog] authorization [dbo] ')

end
go

if object_id('[errorLog].[sp_ConfigureNumberofErrorLogFiles]') is null
begin

	exec('create procedure [errorLog].[sp_ConfigureNumberofErrorLogFiles] as ')

end
go

alter procedure [errorLog].[sp_ConfigureNumberofErrorLogFiles] 
(
	  @numberofErrorLogsDesired int = 99
	, @scriptOnly bit = 0
)
as 
begin

	set nocount on;
	set XACT_ABORT on;

	declare @strNumberofErrorLogsCurrent	nvarchar(60)
	declare @iNumberofErrorLogsDefault		int
	declare @iNumberofErrorLogsCurrent		int
	declare @numberofErrorLogsMax			int

	declare @strLog							varchar(120)
	
	/*
		Set default values
	*/
	set @numberofErrorLogsMax = 99
	set @strNumberofErrorLogsCurrent = null
	set @iNumberofErrorLogsDefault = -1
	set @iNumberofErrorLogsCurrent = @iNumberofErrorLogsDefault

	if (@numberofErrorLogsDesired > 99)
	begin

		set @numberofErrorLogsDesired = @numberofErrorLogsMax

		set @strLog = 'Maximum Number of errors specified is : ' 
						+ cast(@numberofErrorLogsDesired as varchar(10))
						+ ' but max allowed is '		
						+ cast(@numberofErrorLogsMax as varchar(10))
						+ '!'
						+ ' will use ' 
						+ cast(@numberofErrorLogsMax as varchar(10))

		print @strLog

	end	

	/*
		Read Registry Value
			HKEY_LOCAL_MACHINE
			Branch : Software\Microsoft\MSSQLServer\MSSQLServer
			Item   :- NumErrorLogs
	*/
	exec master.dbo.xp_instance_regread
			  N'HKEY_LOCAL_MACHINE'
			, N'Software\Microsoft\MSSQLServer\MSSQLServer'
			, N'NumErrorLogs'
			, @iNumberofErrorLogsCurrent output

	/*
		If Number of Error Logs is set, then  convert to string
	*/
	if (@iNumberofErrorLogsCurrent is not null )
	begin


		/*
			If Number of Current Logs has been set
		*/
		if (
				(@iNumberofErrorLogsCurrent != @iNumberofErrorLogsDefault )
			)
		begin

			print '@iNumberofErrorLogsCurrent : ' 
					+ cast(
							isNull(@iNumberofErrorLogsCurrent, '')
								as varchar(10)
						  )


		end

		set @strNumberofErrorLogsCurrent
				= cast
					(
						isNull(@iNumberofErrorLogsCurrent, -1)
						as varchar(10)
					)


		/*
			If String is not null
		*/
		if (
				( @strNumberofErrorLogsCurrent is not null )
		   )
		begin

			set @strLog = '@strNumberofErrorLogsCurrent : ' 
								+ @strNumberofErrorLogsCurrent

			print @strLog

		end


	end -- if (@iNumberofErrorLogsCurrent is not null )


	/*
		If Number of Errors Logs is not set to what we desire, then review it
	*/
	if (
			( @iNumberofErrorLogsCurrent != @numberofErrorLogsDesired )
		)
	begin

		if (@scriptOnly = 0)
		begin

			set @strLog = 'Adjusting Number of Error Logs from ' 
						+ convert(varchar(10), @iNumberofErrorLogsCurrent )
						+  ' to '
						+ convert(varchar(10), @numberofErrorLogsDesired )
						+ ' .. '

			print @strLog


			EXEC master.dbo.xp_instance_regwrite 
						  N'HKEY_LOCAL_MACHINE'
						, N'Software\Microsoft\MSSQLServer\MSSQLServer'
						, N'NumErrorLogs'
						, REG_DWORD
						, @numberofErrorLogsDesired

			set @strLog = 'Adjusted Number of Error Logs from ' 
						+ convert(varchar(10), @iNumberofErrorLogsCurrent )
						+  ' to '
						+ convert(varchar(10), @numberofErrorLogsDesired )

			print @strLog

		end --script only
		else
		begin

			set @strLog = 'Skipping Adjustment of Number of Error Logs from ' 
						+ convert(varchar(10), @iNumberofErrorLogsCurrent )
						+  ' to '
						+ convert(varchar(10), @numberofErrorLogsDesired )

			print @strLog

		end
	end

end

GO


Invoke


use [master]
go

declare @numberofErrorLogsDesired int
declare @scriptOnly				  bit

set @numberofErrorLogsDesired = 99
set @scriptOnly =0

exec [errorLog].[sp_ConfigureNumberofErrorLogFiles] 
		  @numberofErrorLogsDesired = @numberofErrorLogsDesired
		, @scriptOnly = @scriptOnly

Output

RegQueryValueEx() returned error 2 – The System cannot find the file specified

Image

cannotfindthefilespecified

Textual
RegQueryValueEx() returned error 2, 'The system cannot find the file specified.'
Msg 22001, Level 1, State 1
@strNumberofErrorLogsCurrent : -1
Adjusting Number of Error Logs from -1 to 99 .. 
Adjusted Number of Error Logs from -1 to 99

Explanation

By default, the value is missing and so when we invoke master.dbo.xp_instance_regread to read it, we receive a warning message that reads “RegQueryValueEx() returned error 2, ‘The system cannot find the file specified.’

 

Registry

The actual registry branch where the information is kept is dependent on the version of the SQL Server Engine and the Instance Name

v2012

Here us what things look list for a SQL Server 2012 Instance, before setting the value

By Default

Image

v2012-default

Explanation

  1. This registry branch above is for a SQL Server v2012 instance
  2. The branch is HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQLServer

 

Post Setting

Once setting the value for Number of Error Logs to keep our registry will look like what we have set below.

v2012-set

 

Source Control

GitHub

Uploaded to github here

SQL Server – Backup to Network Storage – Error – The system cannot find the file specified ( 18210 )

Background

Taking a backup to get our LogShipping coupling back operational, but the backup is failing.

The error message is posted below.

Error Message

 

Event ID Message
18210 BackupMedium::ReportIoError: write failure on backup device ‘\\networkstorage\SQLBackups\MSSQL\HRDBProd\Full\hrdb.bak’. Operating system error 2(The system cannot find the file specified.).

 

Possible Remedy

There are a couple of areas that we will review:

  1. Adjust Network Session Timeout setting
  2. Disable and Stop “SQL Server VSS Service
  3. If on MS Windows 2003, apply KB 843515

Settings

  1. Increase Network Session Timeout
    • Branch :- HKLM\SYSTEM\CurrentControlSet\Services\LanmanWorkstation\Parameters
    • Item :- SessTimeOut
    • Default :- 15
    • Range :- 10–65,535 ( seconds )
    • What to do: Increase to 360
      • This will change the timeout to 6 minutes

It does not appear that the System needs to be restarted for this change to be effectual.

SQL Server VSS Service

Disable SQL Server VSS Service

Services

Hotfix

 

Microsoft

MS Windows 2003

  1. KB 843515
    • Your network connection may be reset when you try to perform a file operation on a remote Windows Server 2003-based computer
      https://support.microsoft.com/en-us/kb/843515

      • This patch alleviates a problem where concurrent SMB Sessions criss cross each other

 

Restore

After making the changes described above our backups are now running successfully.

But, still not out of the woods.

As we started the restore (with stats=1) noticed that the beacon is a bit slow.

Let us see what the metrics looks like…

DB Restore Profiling

Here we query dynamic management views  (dmvs)  to see expected timeline.

Code

No code writing here, as SQLDenis already has code.

SELECT
    d.PERCENT_COMPLETE AS [%Complete],
    d.TOTAL_ELAPSED_TIME/60000 AS ElapsedTimeMin,
    d.ESTIMATED_COMPLETION_TIME/60000   AS TimeRemainingMin,
    d.TOTAL_ELAPSED_TIME*0.00000024 AS ElapsedTimeHours,
    d.ESTIMATED_COMPLETION_TIME*0.00000024  AS TimeRemainingHours,
    s.text AS Command
FROM    sys.dm_exec_requests d
CROSS APPLY sys.dm_exec_sql_text(d.sql_handle)as s
WHERE  d.COMMAND LIKE 'RESTORE DATABASE%'
ORDER   BY 2 desc, 3 DESC

 

Output

RequestsTime

 

Explanation

  1. We already used up 2 hours
  2. And, have 12 more hours to go

 

Review Metrics

Resource Monitor

If you are using MS Windows 2008 and higher, please, please use “Resource Monitor”.

We can quickly see our Disk IO Stats.

ResourceMonitor-Disk

Quick Explanation:

  1. In the example above, we see that the System process is giving us about 10MB/sec for reads.  The data is being read from our backup (.bak) file
  2. The data is being written to our data file at about 10 MB/sec to our datafile (.mdf)

 

Resource Monitor

Performance Monitor

PerformanceMonitor.201601050457PM

Metric:

  1. SQLServer:BackupDevice
    • Device Throughput Bytes/sec at about 11 MB/sec
  2. PhysicalDisk
    • %Disk Read Time at 1.711
    • %Disk Time at 101.563
    • %Disk Write Time at 99.853
    • Avg. Disk Write Queue Length at 4.998
  3. Network Interface
    • Bytes Received/sec of 700 KB/sec
    • Bytes Sent/sec  of 300 KB/sec
    • Bytes Total/sec of 1MB/sec

Explanation

  1. Our backup device throughput matches our Resource Monitor File I/O Stats
  2. Our Network Interface is very tiny
  3. The big numbers is with Disk Utilization
    • Disk Read percentile is only at 1.711
    • Disk Time is at 101.563%
    • And, Disk Write is at 99.853%
    • Confirms that our burden is Disk Writes

 

Restore Stats

RestoreDBStats

Explanation
  1. The complete restore took 17 hours
  2. Thankfully the restore engine gives us a breakdown of pages restored at the filegroup level

 

Fess up

The reason why I wanted to review the backup file stats and compare with network stats is that I stupidly used a UNC path to reference the backup file, while infact the backup file is stored locally on our Log Shipping secondary.

Thankfully, MS Windows is covering for us, and not treating the backup file as a Network resource.

 

Commentary

We see we have problems with Disk I/O throughput.

Is the problem Disk RAID Level?

 

References

Hotfix

  1. Board index ‹ Discontinued and Previous Versions ‹ SQL Backup Previous Versions < Write failure on backup device
    https://forums.red-gate.com/viewtopic.php?p=11826

 

Q/A

  1. Home » SQL Server 7,2000 » Backups » Problems Backing up SQL 2000 over the network…
    http://www.sqlservercentral.com/Forums/Topic532254-24-1.aspx
  2. SBS – 2008 – VSS Backup
    http://serverfault.com/questions/294086/sbs-2008-vss-backup

 

Support

  1. Error message when you run the “vssadmin list writers” command on a Windows Server 2003-based computer: “Error: 0x8000FFFF”
    https://support.microsoft.com/en-us/kb/940184


Microsoft – System Monitor

  1. SQL Server: Backup Device Object
    https://technet.microsoft.com/en-us/library/aa905131(v=sql.80).aspx

 

Blogs

  1. Performance Tuning Tips for SQL Server Backup and Restore
    http://www.sql-server-performance.com/2007/backup-restore-tuning/

 

MSSQL Server – Database Recovery [Manual] (after a crash) – MS SQL Server v2005 and above

 

Tried out Paul Randal’s database recovery steps (http://www.sqlskills.com/blogs/paul/post/TechEd-Demo-Creating-detaching-re-attaching-and-fixing-a-suspect-database.aspx)  and it works a charm —

Note that the database server will try to and most like recover itself.  But, in our case, the time for a successful recovery was too much and so we chose to ‘circumvent’ possible recovery by renaming the database log files.

Here are the steps to re-create database crash & manual recovery:

  • Create database Demo
  • Create Table
  • Generate Transactions
  • Get MS SQL Server’s Process ID
  • Kill MS SQL Server Process
  • Rename Database Log Files
  • Restart MS SQL Server Service
  • Execute Manual Restore
    • Set database to emergency mode
    • Set database to single user
    • Execute DBCC CheckDB with repair_allow_data_loss
    • Set database to multi_user

 

Create Database Demo

 


if db_id('Demo') is not null
begin
    DROP DATABASE Demo;
end
go

CREATE DATABASE Demo
ON
PRIMARY
(
  NAME = Demo_PrimaryDataFile,
  FILENAME = 'E:\MSSQL\datafiles\DEMO\datafiles\Demo_PrimaryDataFile.mdf'
    , SIZE = 3 MB
    , MAXSIZE = 10 MB
    , FILEGROWTH = 10 %
),
(
  NAME = Demo_SecondaryDataFile,
  FILENAME = 'E:\MSSQL\datafiles\DEMO\datafiles\Demo_SecondaryDataFile.ndf'
    , SIZE = 1 MB
    , MAXSIZE = 10 MB
    , FILEGROWTH = 10 %
),
(
  NAME = Demo_SecondaryDataFile2,
  FILENAME = 'E:\MSSQL\datafiles\DEMO\datafiles\Demo_SecondaryDataFile2.ndf'
    , SIZE = 1 MB
    , MAXSIZE = 10 MB
    , FILEGROWTH = 10 %
)
LOG ON
(
  NAME = Demo_LogFile,
  FILENAME = 'E:\MSSQL\datafiles\DEMO\logfiles\Demo_LogFile.ldf'
    , SIZE = 2 MB
    , MAXSIZE = 15 MB
    , FILEGROWTH = 3 MB
)
,
(
  NAME = Demo_SecondaryLogFile,
  FILENAME = 'E:\MSSQL\datafiles\DEMO\logfiles\Demo_SecondaryLogFile.ldf'
    , SIZE = 2 MB
    , MAXSIZE = 15 MB
    , FILEGROWTH = 3 MB
)
COLLATE Latin1_General_CI_AS;

 

Create Table – Session

 

use [Demo]
go

if object_id('session') is not null
begin
drop table session
end
go

create table session
(
  [id] int not null identity(1,1) primary key
, [dateAdded] datetime not null default getdate()
, [serverName] sysname not null default @@servername
)
go

 

Generate Transactions


use [demo]
go

set nocount on
go

declare @id int

set @id = 1

while (@id != 0)
begin

      insert into session default values

      set @id = @id + 1

      print @id

end
go

 

Get MS SQL Server’s System Process ID (PID)

use [master]
go

select serverproperty('processid') as processID


 

Using Task Manager kill that process (PID)

  • Launch Task Manager
  • Identify the matching Process ID
  • Kill that Process

 

 

 

Using Windows Explorer – Rename \ Move the Log files

  • Rename E:\MSSQL\datafiles\DEMO\logfiles\Demo_LogFile.ldf to xDemo_LogFile.ldf

 

  • Rename E:\MSSQL\datafiles\DEMO\logfiles\Demo_SecondaryLogFile.ldf to xDemo_SecondaryLogFile.ldf

 

 

 

Restart MS SQL Server – And, the erroglog  will indicate that the Demo database can not be brought online


2010-09-27 22:34:28.48 spid24s     Error: 17207, Severity: 16, State: 1.

2010-09-27 22:34:28.48 spid24s     FileMgr::StartLogFiles: Operating system error 2(The system cannot find the file specified.) occurred while creating or opening file 'E:\MSSQL\datafiles\DEMO\logfiles\Demo_LogFile.ldf'. Diagnose and correct the operating system error, and retry the operation.

2010-09-27 22:34:28.48 spid24s     File activation failure. The physical file name "E:\MSSQL\datafiles\DEMO\logfiles\Demo_LogFile.ldf" may be incorrect.
2010-09-27 22:34:28.48 spid24s     Error: 17207, Severity: 16, State: 1.

2010-09-27 22:34:28.48 spid24s     FileMgr::StartLogFiles: Operating system error 2(The system cannot find the file specified.) occurred while creating or opening file 'E:\MSSQL\datafiles\DEMO\logfiles\Demo_SecondaryLogFile.ldf'. 

Diagnose and correct the operating system error, and retry the operation.
2010-09-27 22:34:28.48 spid24s     File activation failure. The physical file name "E:\MSSQL\datafiles\DEMO\logfiles\Demo_SecondaryLogFile.ldf" may be incorrect.

2010-09-27 23:20:26.75 spid24s     The log cannot be rebuilt because the database was not cleanly shut down.

 

To fix do the following:

  • Set database to emergency mode
  • Set database to single user
  • Execute DBCC CheckDB with repair_allow_data_loss
  • Set database to multi_user

 

 


ALTER DATABASE Demo SET EMERGENCY;
GO 

ALTER DATABASE Demo SET SINGLE_USER;
GO

DBCC CHECKDB (Demo, REPAIR_ALLOW_DATA_LOSS) WITH NO_INFOMSGS, ALL_ERRORMSGS;
GO

ALTER DATABASE Demo SET MULTI_USER;
GO

 

 

References

  1. Creating, detaching, re-attaching, and fixing a suspect database
    http://www.sqlskills.com/blogs/paul/post/TechEd-Demo-Creating-detaching-re-attaching-and-fixing-a-suspect-database.aspx