SQL Server – Table Spool – Causation – Indexes on Non-Persisted Columns (along with an Output Clause)

Background

Table Spools is a fact of Life in SQL Server.

One of the cases that I recently discovered is when one has an index on a non-persisted computed column and one introduces an Output Clause to capture the effects of the change.

 

Create Tables

if DB_ID('Demo') is null
begin

    exec('create database [Demo]');

end
go

USE [Demo]
GO

if OBJECT_ID('accountActivity') is not null
begin

    drop table [accountActivity]

end
go

if OBJECT_ID('accountActivityPersisted') is not null
begin

    drop table [accountActivityPersisted]

end
go

CREATE TABLE [dbo].[accountActivity]
(
	[AccountID] [bigint] NOT NULL , --IDENTITY(1,1) NOT NULL,
	[AccountCreationDate] [datetime] NULL,
	[TransactionDate] [datetime] NULL,
	[ageCalc]  
            AS (datediff(day,[AccountCreationDate],[TransactionDate])),
     CONSTRAINT PK_AccountActivity PRIMARY KEY CLUSTERED 
    (
	    [AccountID] ASC
    )
) 
ON [PRIMARY]
GO

CREATE NONCLUSTERED INDEX [idx_AgeCalc] 
ON [dbo].[accountActivity] 
(
	[ageCalc] ASC
)
go

CREATE TABLE [dbo].[accountActivityPersisted]
(
	[AccountID] [bigint] NOT NULL , --IDENTITY(1,1) NOT NULL,
	[AccountCreationDate] [datetime] NULL,
	[TransactionDate] [datetime] NULL,
	[ageCalc]  AS 
          (datediff(day,[AccountCreationDate],[TransactionDate])) 
           persisted  ,
    CONSTRAINT PK_AccountActivityPersisted PRIMARY KEY CLUSTERED 
    (
	    [AccountID] ASC
    )
) 
ON [PRIMARY]
GO

CREATE NONCLUSTERED INDEX [idx_AgeCalc] 
ON [dbo].[accountActivityPersisted] 
(
	[ageCalc] ASC
)
go

</code></pre>
<h2></h2>
<h1>Test SQL</h1>
<pre><code>

use [Demo]
go

set statistics io on
set nocount on
go

begin tran

    DECLARE @Results AS TABLE
    (
      AccountID BIGINT
    )

    declare	
           @AccountID bigint
	 , @AccountCreationDate datetime
    	 , @TransactionDate datetime

    set @AccountID = RAND() * 1E10
    set @AccountCreationDate = '1/1/2001'
    set @TransactionDate = GETUTCDATE()

    Insert into dbo.[accountActivity]
    (
         [AccountID]
       , [AccountCreationDate]
       , [TransactionDate]

   )
   OUTPUT 
      INSERTED.AccountID

  INTO @Results		
   (
       AccountID
   )
   select
              @AccountID
	    , @AccountCreationDate
	    , @TransactionDate

   Insert into dbo.[accountActivityPersisted]
   (
        [AccountID]
      , [AccountCreationDate]
      , [TransactionDate]
   )
   OUTPUT 
      INSERTED.AccountID
   INTO @Results		
   (
       AccountID
   )
   select
             @AccountID
           , @AccountCreationDate
	   , @TransactionDate

rollback tran

Execution Plan

  • On a scale of 100, the cost of maintaining an index when the columns are not persisted is about 59%
  • Whereas for non-persisted columns, it is 41%

Table Spool - Index on Non-Persisted Columns

Statistics I/O

  • The worktable maintenance cost is quite a bit 

Non-Persisted Columns



Table '#744F2D60'. Scan count 0, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'Worktable'. Scan count 2, logical reads 7, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'accountActivity'. Scan count 0, logical reads 6, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Persisted Columns


Table '#744F2D60'. Scan count 0, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'accountActivityPersisted'. Scan count 0, logical reads 6, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Screen Shot:
Table Spool - Index on Non-Persisted Columns (Statistics IO)

 

Moral of the Story

Persist your computed columns.

 

References

Microsoft – SQL Server – Cached Plans – Looking for Table Spools

Introduction

Many times I have taking the time to sing the praise of Microsoft SQL Server Dynamic Management Views (DMVs).

If MSFT has an idealistic view of Science and Technology, it is its openness in allowing Operational Engineers and Developers to see and question how well things are going.

And, then being able to push and argue it out in public.

There is one vendor that has a closed knit support group and even upon commenting one’s comment are quickly hidden. It is almost like why bother!

Personally, I am not really writing for any particular Vendor, but more so for the community.

Here is a query that I have being meaning to write out for a while, but I could not find a good code (via Google) to fashion mine from.

So what I am trying to do is to is to get a good, firm view around which queries are using Table Spools.

I know Jonathan Kehayias has probably written the most around “Extended Events”.  And, I know he probably has something I can use.  But, I do not have it like that as I do not always have friends in High places…So I can not call or look in his cheat sheets.

All I can do is “Check out his demo(es)” @ http://sqlblog.com/blogs/jonathan_kehayias/.

Specifically, I am trying to comb through his “XEvent a Day”.  I mean he has an entry for each day; and he did not take a Leap Feb.  

Taking about high achievers and extra credits –  He did all 31.

Here is what I came up for Spools…


SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
GO

;WITH XMLNAMESPACES(DEFAULT N'http://schemas.microsoft.com/sqlserver/2004/07/showplan')

select 

          tblIn.plan_handle
        , tblIn.PhysicalOp        
        , tblIn.LogicalOp                
        , tblIn.useCounts
        , tblSQLText.text
        , tblQueryPlan.query_plan

from
(

  SELECT 

          cp.plan_handle
        , avg(cast(cp.usecounts as bigint)) as useCounts
        , max(operators.value('@PhysicalOp','nvarchar(50)'))
               as PhysicalOp
        , max(operators.value('@LogicalOp','nvarchar(50)'))
               as LogicalOp

   FROM sys.dm_exec_cached_plans cp

   CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) qp

   CROSS APPLY query_plan.nodes('//RelOp') rel(operators)

where 
(

  ---Look for Spools
  (

      --Spool (Physical)
      (operators.value('@PhysicalOp','nvarchar(50)')  like 'Table Spool')
   or (operators.value('@PhysicalOp','nvarchar(50)')  like 'Index Spool')

  )

  or

  (

    --Spool (Logical)
       (operators.value('@LogicalOp','nvarchar(50)')  like 'Eager Spool')
   or  (operators.value('@LogicalOp','nvarchar(50)')  like 'Lazy Spool')

  )

)        

group by

    cp.plan_handle

) tblIn

   CROSS APPLY sys.dm_exec_query_plan(tblIn.plan_handle) tblQueryPlan

   CROSS APPLY sys.dm_exec_sql_text(tblIn.[plan_handle]) tblSQLText

 order by tblIn.useCounts desc

A good reference for what Microsoft calls “Relational Operation Values” is available @

Using the USE PLAN Query Hint
http://msdn.microsoft.com/en-us/library/ms186954(v=sql.105).aspx

Here are the relevant entries for Spools – Physical and Logical Operation:
Relational Operator Values

Use Cases

What are the most probably causes for Table Spools

A) INSERT/SELECT from same table

For Halloween protection


use [tempdb]
go

if OBJECT_ID('dbo.Course') is not null
begin

    drop table dbo.[Course]

end
go

create table dbo.course
(
      [courseID] int
    , [courseName] sysname
    , [calendarYear] int
)
go

declare @year int
declare @yearNext int

set @year = DATEPART(year, getdate())
set @year = @year + 1

insert into dbo.[course]
([courseID], [courseName], [calendarYear])
select 1010, 'CS100', @year

--triggers Halloween Protection
insert into dbo.[course]
([courseID], [courseName], [calendarYear])
select courseID, [courseName], @year + 1 
from dbo.course 
where calendarYear = @year 

drop table dbo.[course] 
go

Screen Dump

Table Spool - Haloween Protection - Insert--select

B) When accessing and storing data from System Tables

 

 


set nocount on;

	declare @databaseMirroringState TABLE
	(
		  [id] int not null identity(1,1)
		, [DatabaseName] sysname not null
		, [mirroringState] sysname not null
		, [mirrroringSafetyLevel] sysname null

	)

        if OBJECT_ID('tempdb.dbo.#databaseMirroringState') is not null
        begin

          drop table #databaseMirroringState

        end

	create table #databaseMirroringState
	(
		  [id] int not null identity(1,1)
		, [DatabaseName] sysname not null
		, [mirroringState] sysname not null
		, [mirrroringSafetyLevel] sysname null

	)

	select 

	  	  tblD.name
		, tblDM.mirroring_state_desc
                , tblDM.mirroring_safety_level_desc

	from   sys.databases tblD

	  inner join sys.database_mirroring tblDM

	     on tblD.database_id = tblDM.database_id					

	where tblDM.mirroring_state_desc is not null;


	insert into @databaseMirroringState
	(
		  [DatabaseName]
		, [mirroringState]
		, [mirrroringSafetyLevel]
	)

	select 
	  	  tblD.name
		, tblDM.mirroring_state_desc
                , tblDM.mirroring_safety_level_desc

	from   sys.databases tblD

	  inner join sys.database_mirroring tblDM

	     on tblD.database_id = tblDM.database_id					

	where tblDM.mirroring_state_desc is not null


        insert into #databaseMirroringState
	(
		  [DatabaseName]
		, [mirroringState]
		, [mirrroringSafetyLevel]
	)

	select 

  		  tblD.name
		, tblD.mirroring_state_desc
                , tblD.mirroring_safety_level_desc

	from   sys.databases tblD

  	  inner join sys.database_mirroring tblDM

		    on tblD.database_id = tblDM.database_id					
	where tblDM.mirroring_state_desc is not null

    drop table #databaseMirroringState

    return

Querying and Capturing data in System Tables

Table Spool - Persising results of quering system tables

  • Noticed that Table Spools were occurring when we did not expect it to occur
  • Traced it to a piece of code where we are capturing the results of accessing system tables
  • If we only display the results of querying the system table, we do incur the Table Spool penalty

BTW, if you take a little time and review Query Plans, you will see that a lot of DMVs are just interfaces to a main table.

ie sys.databases and sys.database_mirroring is merely an interface to sys.sysdbreg

Kalen Delaney has a good coverage in:

Geek City : The Resource Database

http://sqlblog.com/blogs/kalen_delaney/archive/2007/09/13/geek-city-the-resource-database.aspx

 

C) Merge Statement

Merge Statements causes Table Spools, as well.

Here is a DDL and a DML.

 

DDL



if OBJECT_ID('dbo.customerMerge') is not null
begin
    drop table [dbo].[customerMerge] 
end    
go	

CREATE TABLE [dbo].[customerMerge]
(

    [customerID] [varchar](30) NOT NULL,
    [regionName] [varchar](50) NULL,

    [InsertTime] [datetime] 
        CONSTRAINT defaultCustomerMergeInsertTime 
          DEFAULT (getutcdate()) NULL,

    [salesDate] [datetime] not null 
        constraint defaultCustomerMergeSalesDate
          DEFAULT (getutcdate()),

    [salesYear]  
     as cast(isNull(datepart(year,[salesDate]), 1990) as smallint) 
        persisted not null,

 	CONSTRAINT [PK_customer_merge] PRIMARY KEY NONCLUSTERED 
	(
		  [customerID] ASC
		, [salesYear]
	)
	WITH (
	              PAD_INDEX  = OFF
	            , STATISTICS_NORECOMPUTE  = OFF
	            , IGNORE_DUP_KEY = OFF
	            , ALLOW_ROW_LOCKS  = ON
	            , ALLOW_PAGE_LOCKS  = ON
	            , FILLFACTOR = 50
            ) 
	--ON Default
)  --Default
go

DML


begin tran

    merge into [dbo].[customerMerge] as Target
    USING (
            VALUES 
                    (
                          'A1' --as customerID
                        , 'CA' --as RegionName
                        , getutcdate() --as InsertTime
                        , getutcdate() --as SalesDate                    
                    )
            )        
             As Source
                    (

                          [NewCustomerID]
                        , [NewRegionName]
                        , [NewInsertTime]
                        , [NewSalesDate]
                    )       

    ON Target.[customerID] = Source.[NewCustomerID]

    WHEN MATCHED THEN

        UPDATE
        SET  [regionName] = Source.[NewRegionName]

    WHEN NOT MATCHED THEN
    INSERT 
    (
          [customerID]
        , [regionName]
        , [InsertTime]
        , [SalesDate]
    )    
    VALUES
    (

          [NewcustomerID]
        , [NewRegionName]
        , [NewInsertTime]
        , [NewSalesDate]
    )    
    ;   

rollback tran


Screen shot:

Table - Spool - Merge Statements

D) When an output clause is used during Inserting\Update of a table that contains an un-persisted computed column.

Please read more here…

https://danieladeniji.wordpress.com/2013/05/08/technical-microsoft-sql-server-table-spool-causation-indexes-on-non-persisted-columns/

E) Insert Into Table Variable / from Scaler function



	set ansi_padding on;
	set statistics io on;
	set nocount on;

	declare @accountHolder TABLE
	(
		  [id]			int not null
		, [entityID]	nvarchar(60) not null
		, [firstname]	nvarchar(60) null
		, [lastname]	nvarchar(60) null
	)

begin

	-- Open the symmetric key with which to encrypt the data.
	OPEN SYMMETRIC KEY [SymmetricKeyPIA]
	   DECRYPTION BY CERTIFICATE  [certPIA];

		select

			 tblAH.id

		   , cast(
					DECRYPTBYKEY
					   (
							   tblAH.[entityID]
							 , 1
							 --, CONVERT(varbinary(8000), tblAH.[id])
							, dbo.Salt(tblAH.[id])
						)
					as varchar(255)
			) as [entityID]
 
		from    [dataEncrypted].[accountHolder] tblAH
 
		order by tblAH.[id] asc

		print replicate('*', 140)

		insert into @accountHolder
		(
			  [id]
			, [entityID]
		)
		select

			 tblAH.id
		   , cast(
					DECRYPTBYKEY
					   (
							   tblAH.[entityID]
							 , 1
						 	, dbo.Salt(tblAH.[id])
						)
					as varchar(255)
			) as [entityID]


		from    [dataEncrypted].[accountHolder] tblAH
 
		order by tblAH.[id] asc

		print replicate('*', 140)

		insert into @accountHolder
		(
			  [id]
			, [entityID]
		)
		select

			 tblAH.id
		   , cast(
			     DECRYPTBYKEY
			     (
				   tblAH.[entityID]
				 , 1
			 	--, dbo.Salt(tblAH.[id])
				, itvfS.[Salt]
			    )
			    as varchar(255)
			) as [entityID]


		from    [dataEncrypted].[accountHolder] tblAH

			  cross apply [dbo].[ITVF_Salt](tblAH.[id]) itvfS

 
		order by tblAH.[id] asc

	-- Close symmetric key
	CLOSE SYMMETRIC KEY [SymmetricKeyPIA]

end

go

 

InsertIntoTableVaribleFromSelectThatIncludesScalerFunction

 


Table 'accountHolder'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
********************************************************************************************************************************************
Table '#0D7A0286'. Scan count 0, logical reads 5, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 1, logical reads 13, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'accountHolder'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
********************************************************************************************************************************************
Table '#0D7A0286'. Scan count 0, logical reads 5, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'accountHolder'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.


 

Worst Offenders

Vendor – Microsoft

sp_tables

msdb.dbo.sp_sqlagent_get_perf_counters

  • Uses quite a bit of Table and Index spools
  • Violates Microsoft’s own advisory about using sp_ in a user database

#am_generate_waitstats

  • Activity Monitor ( I think)

#am_get_querystats

 Activity Monitor ( I think)

sp_sqlagent_get_perf_counters

Review

BTW, if you make some changes, and you want to clear your “Procedure Cache” and see how things play out thereafter, please do so via:


DBCC FREEPROCCACHE;

But, please do not clear everything all at once.  Rather clear pointed ones; get specific plan handle(s) and use those Plan Handles as the lone argument to FreeProcCache.

 


DBCC FREEPROCCACHE(plan_handle);

 

References:

 

 

Microsoft – SQL Server – Table Inserts – Table Spool [Eager Spool] – Is misaligned partitioning really the issue?

This is a follow-up to a couple of other posts:

Table Design

Create Database – demo


  if db_id('Demo') is not null
  begin

     create database [Demo];

  end
  go

Define Partition Scheme and Function

use [Demo]
go

if not exists   
                (

                    select *
                    from   sys.partition_functions
                    where  name = 'pfnYear'

                )
begin

    print 'Creating partition function pfnYear(smallint) ...'

    create partition function pfnYear(smallint)
    as range right for values
    (
          2010   
        , 2011
        , 2012
        , 2013
        , 2014
        , 2015

    )

        print 'Created partition function pfnYear(smallint) '

end

go

if not exists   
                (

                    select *
                    from   sys.partition_schemes
                    where  name = 'partitionSchemeYear'

                )
begin

    print 'Creating partition scheme partitionSchemeYear  ...'

        create partition scheme partitionSchemeYear 
        as partition pfnYear
        ALL TO ([default])

    print 'Created partition scheme partitionSchemeYear'

end

Define Table with misaligned index partition

To satisfy half-way correction, we will do the following:

  • Create a new table dbo.customer
  • The Primary Key will be a Non-clustered Index and it will be named PK_customer and it will be partitioned
  • We will create a Clustered Index (idx_DBA_pf_customerIDChecksumModulo20__customerID)
  • An Index on InsertTime (idx_InsertTime) will not be partitioned
  • An Index on regionName (idx_Region) will not be partitioned


/*

    exec sp_helpindex 'dbo.customer'
*/

if OBJECT_ID('dbo.customer') is not null
begin
    drop table [dbo].[customer] 
end    
go	

CREATE TABLE [dbo].[customer]
(

	[customerID] [varchar](30) NOT NULL,
	[regionName] [varchar](50) NULL,
	[InsertTime] [datetime] CONSTRAINT defaultCustomerInsertTime DEFAULT (getutcdate()) NULL,
	[salesDate] [datetime] not null constraint defaultCustomerSalesDate DEFAULT (getutcdate()),
	[SalesYear] as cast(isNull(datepart(year,[salesDate]), 1990) as smallint) persisted not null,

		CONSTRAINT [PK_customer] PRIMARY KEY NONCLUSTERED 
		(
			  [customerID] ASC
			, [salesYear]
		)
		WITH (
		              PAD_INDEX  = OFF
		            , STATISTICS_NORECOMPUTE  = OFF
		            , IGNORE_DUP_KEY = OFF
		            , ALLOW_ROW_LOCKS  = ON
		            , ALLOW_PAGE_LOCKS  = ON
		            , FILLFACTOR = 50
            ) 
		ON [partitionSchemeYear]([SalesYear])
) ON [partitionSchemeYear]([SalesYear])

GO

CREATE CLUSTERED INDEX [idx_pfnYear_SalesYear__customerID] 
ON [dbo].[customer] 
(
	[salesYear] ASC,
	[customerID] ASC
)
WITH 
(
      PAD_INDEX  = OFF
    , STATISTICS_NORECOMPUTE  = OFF
    , SORT_IN_TEMPDB = OFF
    , IGNORE_DUP_KEY = OFF
    , DROP_EXISTING = OFF
    , ONLINE = OFF
    , ALLOW_ROW_LOCKS  = ON
    , ALLOW_PAGE_LOCKS  = ON
    , FILLFACTOR = 50
) 
ON [partitionSchemeYear]([SalesYear])
GO

CREATE NONCLUSTERED INDEX [idx_InsertTime] 
ON [dbo].[customer] 
    (
        [InsertTime] ASC
    )
    INCLUDE ( [customerID]) 
    WITH (

              ONLINE = ON
            , ALLOW_ROW_LOCKS  = ON
            , ALLOW_PAGE_LOCKS  = ON
          ) 
          ON [PRIMARY]

GO

CREATE NONCLUSTERED INDEX [idx_Region] 
ON [dbo].[customer] 
    (
        [regionName] ASC
    )
    INCLUDE ( [customerID]) 
    WITH (

              ONLINE = ON
            , ALLOW_ROW_LOCKS  = ON
            , ALLOW_PAGE_LOCKS  = ON
          ) 
          ON [PRIMARY]

GO

Insert into Misaligned partition



set statistics time off
set statistics io off

truncate table [dbo].[customer];
go

set statistics profile on; 
set statistics time off;   
set statistics io on;
go

insert into [dbo].[customer]
(
    [customerID], [salesDate]
)
values
(
   'A1', GETUTCDATE()
)

Results of misalignment:

Misaligned Index - Full

Misaligned Index - Full - Statistics IO

Explanation:

  • Expensive Table Spool (Eager Spool) – Taking up 31%
  • Expensive Work Table using up 7 logical reads

Define Table with corrected index partition

To satisfy full correction, we will do the following:

  • Create a new table dbo.customerRevised
  • The Primary Key will be a Non-clustered Index and it will be named PK_customerRevised and it will be partitioned
  • We will create a Clustered Index (idx_DBA_pf_customerIDChecksumModulo20__customerID)
  • An Index on InsertTime (idx_InsertTime) will be partitioned
  • An Index on regionName (idx_Region) will be partitioned


/*
    exec sp_helpindex 'dbo.customerRevised'
*/

if OBJECT_ID('dbo.customerRevised') is not null
begin
    drop table [dbo].[customerRevised] 
end    
go	

CREATE TABLE [dbo].[customerRevised]
(

	[customerID] [varchar](30) NOT NULL,
	[regionName] [varchar](50) NULL,
	[InsertTime] [datetime] 
             CONSTRAINT defaultcustomerRevisedInsertTime 
                DEFAULT (getutcdate()) NULL,
	[salesDate] [datetime] not null 
             constraint defaultcustomerRevisedSalesDate 
                DEFAULT (getutcdate()),
	[SalesYear] as 
              cast(isNull(datepart(year,[salesDate]), 1990) as smallint) 
                  persisted not null,

		CONSTRAINT [PK_customerRevised] PRIMARY KEY NONCLUSTERED 
		(
			  [customerID] ASC
			, [salesYear]
		)
		WITH (
		              PAD_INDEX  = OFF
		            , STATISTICS_NORECOMPUTE  = OFF
		            , IGNORE_DUP_KEY = OFF
		            , ALLOW_ROW_LOCKS  = ON
		            , ALLOW_PAGE_LOCKS  = ON
		            , FILLFACTOR = 50
            ) 
		ON [partitionSchemeYear]([SalesYear])
) ON [partitionSchemeYear]([SalesYear])

GO

CREATE CLUSTERED INDEX [idx_DBA_pf_customerIDChecksumModulo20__customerID] 
ON [dbo].[customerRevised] 
(
	[salesYear] ASC,
	[customerID] ASC
)
WITH 
(
      PAD_INDEX  = OFF
    , STATISTICS_NORECOMPUTE  = OFF
    , SORT_IN_TEMPDB = OFF
    , IGNORE_DUP_KEY = OFF
    , DROP_EXISTING = OFF
    , ONLINE = OFF
    , ALLOW_ROW_LOCKS  = ON
    , ALLOW_PAGE_LOCKS  = ON
    , FILLFACTOR = 50
) 
ON [partitionSchemeYear]([SalesYear])
GO

CREATE NONCLUSTERED INDEX [idx_InsertTime] 
ON [dbo].[customerRevised] 
    (
        [InsertTime] ASC
    )
    INCLUDE ( [customerID]) 
    WITH (

              ONLINE = ON
            , ALLOW_ROW_LOCKS  = ON
            , ALLOW_PAGE_LOCKS  = ON
          ) 
          --ON [PRIMARY]
          ON [partitionSchemeYear]([SalesYear])

GO

CREATE NONCLUSTERED INDEX [idx_Region] 
ON [dbo].[customerRevised] 
    (
        [regionName] ASC
    )
    INCLUDE ( [customerID]) 
    WITH (

              ONLINE = ON
            , ALLOW_ROW_LOCKS  = ON
            , ALLOW_PAGE_LOCKS  = ON
          ) 
          --ON [PRIMARY]
          ON [partitionSchemeYear]([SalesYear])
GO


Misalignment Correction (Insert Statement)


set statistics time off
set statistics io off

truncate table [dbo].[customer];
truncate table [dbo].[customerRevised];
go

set statistics profile on; 
set statistics time off;   
set statistics io on;
go

insert into [dbo].[customer]
(
    [customerID], [salesDate]
)
values
(
   'A1', GETUTCDATE()
)

insert into [dbo].[customerRevised]
(
    [customerID], [salesDate]
)
values
(
    'B1', getutcdate()
)

Misalignment Correction:

Misaligned Index - Correction - Full

Misalignmed Index - Correction - Statistics IO

Explanation:

  • With proper partition correction we now have a much cleaner Plan (At a base of 100% -the original plan costed 57% and the revised plan costed 43%)
  • We have shed the 7 logical reads attributed to the “worktable”

Define Table with half-way corrected index partition

To satisfy half-way correction, we will do the following:

  • Create a new table dbo.customerRevisedHalfway
  • The Primary Key will be a Non-clustered Index and it will be named PK_customerRevisedHalfway and it will be partitioned
  • We will create a Clustered Index (idx_DBA_pf_customerIDChecksumModulo20__customerID)
  • An Index on InsertTime (idx_InsertTime) will be partitioned
  • An Index on regionName (idx_Region) will not be partitioned


/*

    exec sp_helpindex 'dbo.customerRevisedHalfway'
*/

if OBJECT_ID('dbo.customerRevisedHalfway') is not null
begin
    drop table [dbo].[customerRevisedHalfway] 
end    
go	

CREATE TABLE [dbo].[customerRevisedHalfway]
(

	[customerID] [varchar](30) NOT NULL,
	[regionName] [varchar](50) NULL,
	[InsertTime] [datetime] 
            CONSTRAINT defaultcustomerRevisedHalfwayInsertTime 
                DEFAULT (getutcdate()) NULL,
	[salesDate] [datetime] not null 
            constraint defaultcustomerRevisedHalfwaySalesDate
                DEFAULT (getutcdate()),
	[SalesYear] as 
            cast(isNull(datepart(year,[salesDate]), 1990) as smallint) 
               persisted not null,

		CONSTRAINT [PK_customerRevisedHalfway] PRIMARY KEY
                 NONCLUSTERED 
		(
			  [customerID] ASC
			, [salesYear]
		)
		WITH (
		              PAD_INDEX  = OFF
		            , STATISTICS_NORECOMPUTE  = OFF
		            , IGNORE_DUP_KEY = OFF
		            , ALLOW_ROW_LOCKS  = ON
		            , ALLOW_PAGE_LOCKS  = ON
		            , FILLFACTOR = 50
            ) 
		ON [partitionSchemeYear]([SalesYear])
) ON [partitionSchemeYear]([SalesYear])

GO

CREATE CLUSTERED INDEX [idx_DBA_pf_customerIDChecksumModulo20__customerID] 
ON [dbo].[customerRevisedHalfway] 
(
	[salesYear] ASC,
	[customerID] ASC
)
WITH 
(
      PAD_INDEX  = OFF
    , STATISTICS_NORECOMPUTE  = OFF
    , SORT_IN_TEMPDB = OFF
    , IGNORE_DUP_KEY = OFF
    , DROP_EXISTING = OFF
    , ONLINE = OFF
    , ALLOW_ROW_LOCKS  = ON
    , ALLOW_PAGE_LOCKS  = ON
    , FILLFACTOR = 50
) 
ON [partitionSchemeYear]([SalesYear])
GO

CREATE NONCLUSTERED INDEX [idx_InsertTime] 
ON [dbo].[customerRevisedHalfway] 
    (
        [InsertTime] ASC
    )
    INCLUDE ( [customerID]) 
    WITH (

              ONLINE = ON
            , ALLOW_ROW_LOCKS  = ON
            , ALLOW_PAGE_LOCKS  = ON
          ) 
          --ON [PRIMARY]
          ON [partitionSchemeYear]([SalesYear])

GO

--index is pointing @ default file group
CREATE NONCLUSTERED INDEX [idx_Region] 
ON [dbo].[customerRevisedHalfway] 
    (
        [regionName] ASC
    )
    INCLUDE ( [customerID]) 
    WITH (

              ONLINE = ON
            , ALLOW_ROW_LOCKS  = ON
            , ALLOW_PAGE_LOCKS  = ON
          ) 
          ON [PRIMARY]
          --ON [partitionSchemeYear]([SalesYear])

GO

Insert into Half way aligned partition


set statistics time off
set statistics io off

truncate table [dbo].[customer];
truncate table [dbo].[customerRevised];
truncate table [dbo].[customerRevisedHalfway];
go

set statistics profile on; 
set statistics time off;   
set statistics io on;
go

insert into [dbo].[customer]
(
    [customerID], [salesDate]
)
values
(
   'A1', GETUTCDATE()
)

insert into [dbo].[customerRevised]
(
    [customerID], [salesDate]
)
values
(
    'B1', getutcdate()
)

insert into [dbo].[customerRevisedHalfway]
(
  [customerID], [salesDate]
)
values
(
  'C1', getutcdate()
)

Misalignment Correction (Halfway):

Misaligned Index - Correction - Halfway

Statistics I/O

Misaligned Index - Correction - Halfway (Statistics IO)

Explanation:

  • The Query cost of misalignment is at 40%, fully corrected alignment is at 30%, and halfway correction is at 30%
  • When partitioning is fully corrected and when only halfway corrected we still shed the same amount of logical reads (7); directly attributable to the “worktable”
  • For the half way corrected partitioned index, you will see two distinct index inserts (the first one for the Clustered index and the secondary one for the non-partitioned aligned index); note that the half way corrected case is pointing at two different file groups (different partitions)
  • Whereas for the fully corrected index, we have a single index operator (all indexes are tied into the same partition)

CodeBase

To share the code, I have posted it to my GitHub Account.

Conclusion

Eager Spool

ScaryDBA in his web post,  Spools in Execution Plan ( http://www.scarydba.com/2009/09/09/spools-in-execution-plans/), did a good job describing this Table Spool thing.

Spool operations are temporary storage of the data for later reuse in a query plan. There are two types of spool operations, eager spool and lazy spool. A spool is basically a temporary table created within the execution of the query that is used when it’s likely that data will be needed again, and again during the execution of the query. This is not an explicit #temp temporary table, but a work table for operations within the processing necessary for a given query’s behavior. A spool is created when the optimizer thinks that it can work better with a semi-permanent sub-set of data rather than have to perform multiple seeks or scans against a table or index or in other places where data re-use is important (more in a bit).

Seemingly, in our original table definition, with more than one misaligned index, we ended up with a situation where post the “Clustered Index” operation, the Query Engine opted to go the route of creating and populating “Table Spools”.

It is not immediately clear at this point why the Query Engine chose this fairly in-efficient path.

Addendum

Addendum (2013-04-27)

The original incident was captured on MS SQL Server – v2008R2.

I have revised per v2012.

Execution Plan

TableSpool__MSSQLServer_v2012

IO Cost

TableSpool__MSSQLServer_v2012_IOStats

So Microsoft’s documentation did a fairly good job on describing what a Table Spool is:

http://msdn.microsoft.com/en-us/library/ms181032(v=sql.105).aspx
The Table Spool operator scans the input and places a copy of each row in a hidden spool table that is stored in the tempdb database and existing only for the lifetime of the query. If the operator is rewound (for example, by a Nested Loops operator) but no rebinding is needed, the spooled data is used instead of rescanning the input.

But, I will allow you the Performance Engineer to look at your queries and understand which ones can potentially follow this path and guard against it when other paths can be a bit less pernicious.

Microsoft – SQL Server – Insert into a table leads to “Table Spool \ Eager Spool”

Background

You never quite know about MS SQL Server Query Engine.

Here we found one of our database systems running really slow.  We tried everything, but nothing lead to Isaac’s well.

The system was just too slow.

Looked at every query and nothing came up.

Finally settled on an insert statement.  Straight record insert, not even an insert into <table-destination> select * from <table-source>.

The query plan of a performant query looked like the query on top.  And, that of the troublesome query looked like the one below.

The non-performant query has a couple of items that I could not place:

  • Table Spool
  • Eager Spool

SQL plan

Table Inserts - SQL Plan - Table Spool - Eager Spool

The Statistic I/O for the non-performance Query Plan is also a bit out-of-sorts.  It had :

  • More I/O (Logical Reads)
  • A worktable read

Statistics I/O

Table Inserts - Causes Statisics IO

So hear I am, goggled until I am sore.

There is a bit of ‘cries’ on the Net that talks about what can cause Table Spool \ Eager Spool.

Microsoft definition for “Eager Spool” is:

Eager Spool Operator
http://msdn.microsoft.com/en-us/library/ms190435.aspx

The Eager Spool operator takes the entire input, storing each row in a hidden temporary object stored in the tempdb database. If the operator is rewound (for example, by a Nested Loops operator) but no rebinding is needed, the spooled data is used instead of rescanning the input. If rebinding is needed, the spooled data is discarded and the spool object is rebuilt by rescanning the (rebound) input.

The Eager Spool operator builds its spool file in an “eager” manner: when the spool’s parent operator asks for the first row, the spool operator consumes all rows from its input operator and stores them in the spool.

Eager Spool is a logical operator.

Remas Rusanu did a noteworthy job – “Getting rid of Table Spool in SQL Server Execution plan” — http://stackoverflow.com/questions/1931100/getting-rid-of-table-spool-in-sql-server-execution-plan :

I certainly understand if you cannot show us the query. But is really hard to guess why the spooling is preffered by the engine whithout knowing any specifics. I recommend you go over Craig Freedman’s blog, he is an engineer in the query optimizer team and has explained a lot of the inner workings of SQL 2005/2008 optimizer. Here are some entries I could quickly find that touch the topic of spooling in one form or another:

 

SQL customer support team also has an interesting blog at http://blogs.msdn.com/psssql/

And ‘sqltips’ (the relational engine’s team blog) has some tips, like Spool operators in query plan…
And, so I have a bit more familiarity with what can trigger Table Spool.

To me it can occur when one is querying data from a Common Table Expression or sorting data, and when the system is caching that data for later re-use.

So here I am thinking this is a simple insert / values sql.

What can possibly trigger need for caching and later consideration or usage.

Checked the following:

  • Triggers
  • Computed columns
  • Complex column defaults

Finally, checked the indexes as the operator that follows the Table Spool / Eager Spool operators is the “Index Insert”.

As we have two indexes, disabled one of them.

And, retested.

Table Inserts - SQL Plan - Table Spool - Eager Spool (Good)

 

Wow – I am loving Isaac’s well.

So what was the problem.  It appeared that Microsoft’s engine triggers “Table Spool” / “Eager Spool” combination when there are two or more indexes that are defined with different partition schemes.

In our case:

First Index : Clustered

    create clustered index idx_pf_seed on
    [dbo].[customer]
    on [partitionScheme]
    ([regionID])

Second Index: NonClustered

create nonclustered index idx_CustomerName
on [dbo].[customer]
(
  [insertTime] asc
)
;

Just a thought as there is so little documentation.

A few days later, I posted an addendum to this entry.  That entry is available @

Microsoft – SQL Server – Table Inserts – Table Spool [Eager Spool] – Explained

https://danieladeniji.wordpress.com/2013/03/22/microsoft-sql-server-table-inserts-table-spool-eager-spool-explained/

References: