SQL Server – Transaction Log Usage – Metrics

Background

While reviewing Transaction Log utilization, I wanted to briefly document the SQL code snippets available out on the Internet.

 

Outline

  1. DBCC SQLPERF(logspace)
  2. sys.dm_os_performance_counters
    • Object Name
      • SQLServer:Databases
    • Counter Name
      • Log File(s) Size (KB)
      • Log File(s) Used Size (KB)

DBCC

DBCC sqlperf

Code

dbcc sqlperf(logspace)
go

Output

[Image: DBCCSQLPerf_20180611_1139AM.png – DBCC SQLPERF(LOGSPACE) output]
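
For reference, DBCC SQLPERF(LOGSPACE) returns one row per database, with the columns Database Name, Log Size (MB), Log Space Used (%), and Status.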

Performance Counters

Performance Counters – “Object Name :- SQLServer:Databases” // “Counter Names :- Log File(s) Size (KB), Log File(s) Used Size (KB)”

Code – Tabulated

SELECT
        [object]
		= rtrim(pc1.[object_name])

	, [database]
		= rtrim(pc1.instance_name)

	, [logSizeMB]
		= cast
			(
				pc1.cntr_value/1024.0
					as decimal(30, 2)
			)

	, [logUsedMB]
		= cast
			(
				pc2.cntr_value/1024.0
					as decimal(30, 2)
			)

	, [logAvailMB]
		= cast
			(
				(pc1.cntr_value - pc2.cntr_value)
					/1024.0
					as decimal(30, 2)
			)

	, [percentUtilization]
		= cast
			(
				(

						( pc2.cntr_value*100.00 )
					/ NULLIF(pc1.cntr_value, 0)

				)
				as decimal(10, 2)
			)

FROM   sys.dm_os_performance_counters as pc1

JOIN   sys.dm_os_performance_counters as pc2

	ON   pc1.[instance_name] = pc2.[instance_name]
	AND  pc1.[object_name] = pc2.[object_name]

where ( 1 = 1 )

AND pc1.[object_name] = 'SQLServer:Databases'

AND pc2.[object_name] = 'SQLServer:Databases'

AND pc1.counter_name = 'Log File(s) Size (KB)'

AND pc2.counter_name = 'Log File(s) Used Size (KB)'

AND pc1.instance_name not in
		(
				'_Total'
			, 'mssqlsystemresource'
		)

AND pc1.cntr_value > 0

order by

	[logUsedMB] desc

 

Output

[Image: performanceCounter_20180611_1133AM.png – tabulated query output]

Code – XML

select 

	tblInner.[node]

from
	(

		SELECT
              [object]
			    = rtrim(pc1.[object_name])

			, [database]
				= rtrim(pc1.instance_name)

			, [logSizeMB]
				= cast
					(
						pc1.cntr_value/1024.0
							as decimal(30, 2)
					)

			, [logUsedMB]
				= cast
					(
						pc2.cntr_value/1024.0
							as decimal(30, 2)
					)

			, [logAvailMB]
				= cast
					(
						(pc1.cntr_value - pc2.cntr_value)
							/1024.0
							as decimal(30, 2)
					)

			, [percentUtilization]
				= cast
					(
						(

								( pc2.cntr_value*100.00 )
							/ NULLIF(pc1.cntr_value, 0)

						)
						as decimal(10, 2)
					)

		FROM   sys.dm_os_performance_counters as pc1

		JOIN   sys.dm_os_performance_counters as pc2

			ON   pc1.[instance_name] = pc2.[instance_name]
			AND  pc1.[object_name] = pc2.[object_name]

		where ( 1 = 1 )

		AND pc1.[object_name] = 'SQLServer:Databases'

		AND pc2.[object_name] = 'SQLServer:Databases'

		AND pc1.counter_name = 'Log File(s) Size (KB)'

		AND pc2.counter_name = 'Log File(s) Used Size (KB)'

		AND pc1.instance_name not in
				(
					  '_Total'
					, 'mssqlsystemresource'
				)

		AND pc1.cntr_value > 0

		order by
			[logUsedMB] desc

		for xml
			  PATH('database')
			, root('list')

	) tblInner(node)

Output – XML

[Image: performanceCounter_XML_20180611_1137AM – XML query output]
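
For reference, the element names in the emitted XML come from the column aliases, nested per the FOR XML PATH('database'), ROOT('list') clause.  The shape is roughly as follows (values elided):

<list>
  <database>
    <object>SQLServer:Databases</object>
    <database>...</database>
    <logSizeMB>...</logSizeMB>
    <logUsedMB>...</logUsedMB>
    <logAvailMB>...</logAvailMB>
    <percentUtilization>...</percentUtilization>
  </database>
  ...
</list>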

 

References

  1. Microsoft – Developer Documentation
    • Is there something like DBCC SQLPERF(LOGSPACE) that will show me a list of the databases and how much free space in each?
      Link

SQLServer – BulkCopy – Optimization – Reviewing Impact of Block Size via the Transaction Log Monitoring – Take 2

Background

As a follow-up to SQLServer – BulkCopy – Optimization – Take 1 ( https://danieladeniji.wordpress.com/2015/06/05/sqlserver-bulkcopy-optimization-take-1/ ), let us see how much churn occurs in our Transaction Log as we vary our Bulk Copy Program ( BCP ) block sizes.

Assumption

Let us make the following assumptions:

  • Database Recovery Model
    • The recovery model is Simple or Bulk Logged
    • If the recovery model is Full, all changes, both row allocations and data changes, are fully logged
  • We can truncate the destination table
  • We can lock the destination table

Diagnostic Tools

There are a few barometers we can use to read Transaction Log utilization:

  1. DBCC SQLPERF
  2. sys.dm_db_log_space_usage
  3. fn_dblog
    • Undocumented SQL Server system table-valued function
    • Takes two parameters: the beginning and ending LSN

DBCC SQLPERF and sys.dm_db_log_space_usage offer database-level information, whereas fn_dblog works at the individual log record level.
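
As a minimal sketch of the second option, sys.dm_db_log_space_usage can be queried along the lines below.  Note that the DMV reports on the current database only, so it should be run from within the database of interest.

-- Log size and usage for the current database
select
          [database]            = db_name(database_id)
        , [logSizeMB]           = cast(total_log_size_in_bytes / 1048576.0 as decimal(30, 2))
        , [logUsedMB]           = cast(used_log_space_in_bytes / 1048576.0 as decimal(30, 2))
        , [percentUtilization]  = cast(used_log_space_in_percent as decimal(10, 2))
from   sys.dm_db_log_space_usage;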

Table Definition

Table - DBLab.NumbersSource

set noexec off;
SET ANSI_NULLS ON
SET QUOTED_IDENTIFIER ON
GO

if schema_id('DBLab') is null
begin

    exec('create schema [DBLab] ')

end;

go

if object_id('[DBLab].[NumbersSource]') is not null
begin

    set noexec on;

end
go

CREATE TABLE [DBLab].[NumbersSource]
(
	[Number] [int] NOT NULL,

     CONSTRAINT [DBLab.PK_NumbersSource] PRIMARY KEY CLUSTERED
        (
	        [Number] ASC
        )

) ON [PRIMARY]

GO

set noexec off;
go
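
As an aside, the set noexec on / set noexec off wrapper acts as an existence guard: if the table is already present, noexec is turned on so that the CREATE TABLE batch that follows is parsed but not executed, and it is then turned back off.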


Table - DBLab.Numbers


set noexec off;
SET ANSI_NULLS ON
SET QUOTED_IDENTIFIER ON
GO

if schema_id('DBLab') is null
begin

    exec('create schema [DBLab] ')

end;

go

if object_id('[DBLab].[Numbers]') is not null
begin

    set noexec on;

end
go

CREATE TABLE [DBLab].[Numbers]
(
	[Number] [int] NOT NULL,

     CONSTRAINT [DBLab.PK_Numbers] PRIMARY KEY CLUSTERED
        (
	        [Number] ASC
        )

) ON [PRIMARY]

GO

set noexec off;
go

Populate Data


SET NOCOUNT ON
SET ANSI_NULLS ON
SET QUOTED_IDENTIFIER ON
set noexec off
go

if schema_id('dblab') is null
begin

    exec('create schema [dblab] authorization [dbo]')

end
go

set noexec off
go

if exists
(
	select 1
	from   [dblab].[NumbersSource]
)
begin

	set noexec on

end
go
--Load the NumbersSource table with sequential numbers
--(the stacked CTEs below cross join small row sets to generate roughly 16.7 million rows)
WITH t1 AS
(
        SELECT 0 AS n UNION ALL SELECT 0 UNION ALL SELECT 0 UNION ALL SELECT 0)
        ,t2 AS (SELECT 0 AS n FROM t1 t1a, t1 t1b, t1 t1c, t1 t1d)
        ,t3 AS (SELECT 0 AS n FROM t2 t2a, t2 t2b, t2 t2c)
        ,tblVarNumbers AS (SELECT ROW_NUMBER() OVER(ORDER BY n) - 1 AS number FROM t3
)
INSERT INTO [dblab].[NumbersSource] WITH (TABLOCKX)
([Number])
SELECT number
FROM   tblVarNumbers
GO

set noexec off
go
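
For the curious, the row count comes from the stacked cross joins: t1 has 4 rows, t2 cross joins four copies of t1 for 4^4 = 256 rows, t3 cross joins three copies of t2 for 256^3 = 16,777,216 rows, and ROW_NUMBER() turns those rows into the 0-based sequence that is inserted.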

LAB

To make things more palatable, let us use a small dataset: a thousand records or so.

Data Extract


SETLOCAL

	set "DBSourceServer=.\SQLEXPRESS_V2014"
	set "TEMP_SESSION=%TEMP%"

	echo TEMP_SESSION is %TEMP_SESSION%	

	rem If the folder does not exist, then create it
	if not exist %TEMP_SESSION% mkdir %TEMP_SESSION%

	Rem Remove existing files
	if exist  %TEMP_SESSION%\*.data del %TEMP_SESSION%\*.data /Q

	bcp "select top 1000 * from DBLAB.dblab.NumbersSource" queryout %TEMP_SESSION%\dbo.NumbersSmall.data -T -n -S%DBSourceServer%

ENDLOCAL
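
In the extract above, bcp's queryout mode exports the result of the quoted query to a file; -T uses a trusted (Windows authentication) connection, -n writes SQL Server native format, and -S names the source server.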

Clean up Transaction Log

Let us clean up our transaction Log by doing the following:

  1. Set Recovery Mode to “Bulk Logged”
  2. Perform a full database backup to the NUL Device
  3. Perform a transaction log backup to the NUL Device

     alter database [DBLAB] set recovery bulk_logged;
   
     backup database [DBLAB] to disk = 'NUL';

     print '--------------------------------------------------------'
     backup log [DBLAB] to disk = 'NUL';
     print '--------------------------------------------------------'

    go
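
Note that backing up to the NUL device discards the backup output; it is a convenient way to reset log usage in a lab, but it should not be used against a production database, since it advances the backup chain without leaving anything restorable.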

Import in blocks of 100

Let us import records in blocks of 100 at a time.


SETLOCAL

   set "DBSourceServer=.\SQLEXPRESS_V2014"
   set "TEMP_SESSION=%TEMP%"
   set "BLOCK_SIZE=100"

   echo TEMP_SESSION is %TEMP_SESSION%

   rem If the folder does not exist, then create it
   if not exist %TEMP_SESSION% mkdir %TEMP_SESSION%

   sqlcmd -Q"truncate table DBLAB.DBLab.Numbers" -S%DBSourceServer%

   bcp DBLAB.DBLab.Numbers in %TEMP_SESSION%\dbo.NumbersSmall.data -T -n -S%DBSourceServer% -b%BLOCK_SIZE% -h "TABLOCK, ORDER (Number ASC)"

  ENDLOCAL
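
Here, the -b flag sets the batch (block) size, i.e., how many rows bcp commits per transaction, and the -h flag passes the TABLOCK and ORDER hints that help the load qualify for minimal logging.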

Command Shell Output

[Image: bcpDataInBatchesYes – command shell output]

SQL Transaction Log – Aggregate Contents

Here is the code we will use to aggregate our Transaction Log:


declare @spid int

--set @spid = 60

select
          tblFNDBLog.AllocUnitName
        , tblFNDBLog.Operation
        , [NumberofRecords] = count(*)
        , [Log Record Fixed Length] = sum(tblFNDBLog.[Log Record Fixed Length])
        , [Log Record Length] = sum(tblFNDBLog.[Log Record Length])
from   fn_dblog(null, null) tblFNDBLog

where ( 1 = 1 )

--and   tblFNDBLog.spid = @spid

and   tblFNDBLog.AllocUnitName is not null

and   tblFNDBLog.AllocUnitName not in ('Unknown Alloc Unit')

and   tblFNDBLog.AllocUnitName not like 'sys.%'

group by
          tblFNDBLog.AllocUnitName
        , tblFNDBLog.Operation
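
Since fn_dblog reads the transaction log of the current database only, remember to run the aggregation above from within the DBLAB database.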

SQL Transaction Log – Aggregate Contents – 100 records per block

[Image: bcpDataInBatches1E2 – aggregated log contents, block size of 100]

Import in blocks of 1 million

Let us go back and clean up the Transaction Log:


   alter database [DBLAB] set recovery bulk_logged;

   backup database [DBLAB] to disk = 'NUL';  

   print '--------------------------------------------------------'
   backup log [DBLAB] to disk = 'NUL'; 
   print '--------------------------------------------------------'

   go

Code – 1 million records

Let us import a million records at a time.


SETLOCAL

set "DBSourceServer=.\SQLEXPRESS_V2014"
set "TEMP_SESSION=%TEMP%"
set "BLOCK_SIZE=1000000"

echo TEMP_SESSION is %TEMP_SESSION%

rem If the folder does not exist, then create it
if not exist %TEMP_SESSION% mkdir %TEMP_SESSION%

sqlcmd -Q"truncate table DBLAB.DBLab.Numbers" -S%DBSourceServer%

bcp DBLAB.DBLab.Numbers in %TEMP_SESSION%\dbo.NumbersSmall.data -T -n -S%DBSourceServer% -b%BLOCK_SIZE% -h "TABLOCK, ORDER (Number ASC)"

ENDLOCAL

Command Shell – Output

[Image: bcpDataInBatchesNo – command shell output]

Aggregate Transaction Log – 1 million records

[Image: bcpDataInBatches1Million – aggregated log contents, block size of 1 million]

SQL Profiler

SQL Server Profiler is another popular tool for observing system performance.

Batch Size of 100

Here is what we capture for a batch size of 100.

[Image: bulkInsertInBatches – SQL Profiler capture, batch size of 100]

Batch Size of 1 Million

Here is what we capture for a batch size of 1 million.

[Image: bulkInsertInBatchesNo – SQL Profiler capture, batch size of 1 million]

Quick Comparison of SQL Profiler Results

  • It is obvious that there is a lot more dialog when we have a batch size of 100
  • It is less obvious why our write counts are so low.  It is likely that the actual BCP writes are not reflected in the captured Profiler readings

fn_dblog

More on fn_dblog.

There are quite a few things one should consider when thinking of using fn_dblog.  These include:

  1. fn_dblog accesses and reads the database's log file
    • It is expensive, as it reads the actual physical file
    • The log file has no indexes, so scanning it can be taxing on the system
  2. fn_dblog accepts two arguments
    • The starting LSN
    • The ending LSN
    • Passing NULL for both reads the entire active log
  3. Log files are database specific
    • So one needs to switch context to the specific database
  4. fn_dblog exposes a lot of detailed information
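
A minimal sketch of points 2 and 3 above, assuming our DBLAB lab database:

use [DBLAB];
go

-- Pass NULL for both LSN arguments to scan the entire active portion of the log
select [activeLogRecords] = count(*)
from   fn_dblog(null, null);
go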

Code

Here is a sample script for accessing a couple of columns.


--use [database-name];
declare @iNumberofRecordsMax int

set @iNumberofRecordsMax = 50;

; with cteAllocationUnit
as
(
    select
              tblSAU.data_space_id
            , tblSAU.type_desc

            , tblSAU.allocation_unit_id

            , tblSDF.physical_name

    from   sys.allocation_units tblSAU

            inner join sys.database_files tblSDF

                on tblSAU.data_space_id = tblSDF.data_space_id
)

, ctePartitionObject
as
(
    select
              tblSP.partition_id
            , tblSP.partition_number
            , tblSP.data_compression_desc

            , tblSP.object_id
            , tblSO.schema_id
            , [schemaName] = schema_name(tblSO.schema_id)
            , objectName = tblSO.name

    from   sys.partitions tblSP

            inner join sys.objects tblSO

                on tblSP.object_id = tblSO.object_id
)
select top (@iNumberofRecordsMax)
          tblFNDBLog.spid
        , tblFNDBLog.AllocUnitName
        , tblFNDBLog.Operation
        , [Log Record Fixed Length] = (tblFNDBLog.[Log Record Fixed Length])
        , [Log Record Length] = (tblFNDBLog.[Log Record Length])
        , tblFNDBLog.Context

        , cteAU.physical_name

        , tblFNDBLog.PartitionId
        , ctePO.object_id
        , ctePO.partition_number
        , ctePO.data_compression_desc

        , [schemaName] = ctePO.schemaName
        , [tableName] = ctePO.objectName

        , tblFNDBLog.[Article ID]

from   fn_dblog(null, null) tblFNDBLog

        left outer join cteAllocationUnit cteAU

            on tblFNDBLog.AllocUnitId = cteAU.allocation_unit_id

        left outer join ctePartitionObject ctePO

            on tblFNDBLog.PartitionId = ctePO.[partition_id]

where tblFNDBLog.AllocUnitName is not null

and   tblFNDBLog.AllocUnitName not in ('Unknown Alloc Unit')

and   tblFNDBLog.AllocUnitName not like 'sys.%'

Summary

In conclusion, if we use a high enough block size, we do not incur as much Transaction Log activity as we otherwise would.

Specifically, we shed the LOP_INSERT_ROWS operation.

The reason is subtle: each time bcp starts working on a new block, it revisits the underlying table and checks whether it is empty.

When the block size is smaller than the number of records, that check is repeated for every block and tests negative on every block after the first, and so those later blocks are fully logged.

With a block size large enough to cover all of the records, the check occurs only once, for the single block, and is never repeated.
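
To see this effect for ourselves, a sketch along the following lines counts the fully logged inserts attributed to our target table after each BCP run; the AllocUnitName pattern is an assumption based on our DBLab.Numbers table and its clustered primary key.

-- Count fully logged row inserts (LOP_INSERT_ROWS) against our target table
select [lopInsertRowsCount] = count(*)
from   fn_dblog(null, null) tblFNDBLog
where  tblFNDBLog.Operation = 'LOP_INSERT_ROWS'
and    tblFNDBLog.AllocUnitName like 'DBLab.Numbers%';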

References

I would love to say "Life is hard", but really?

…What can be so hard, when you Google things and are led to pasture by Sunil Agarwal and Remus Rusanu?

  1. Bulk Import Optimizations (Minimal Logging) – Sunil Agarwal
    http://blogs.msdn.com/b/sqlserverstorageengine/archive/2008/02/05/bulk-logging-optimizations-minimal-logging.aspx
  2. How to read and Interpret the SQL Server Log – Remus Rusanu – ( I write database engines by day – I use databases by night ).
    http://rusanu.com/2014/03/10/how-to-read-and-interpret-the-sql-server-log/
  3. SSIS and Minimally Logged Inserts – Jakub K
    http://jakubka.blogspot.com/2014/06/ssis-and-minimally-logged-inserts.html