SQL Server – Table Spool – Causation – Indexes on Non-Persisted Columns (along with an Output Clause)

Background

Table Spools is a fact of Life in SQL Server.

One of the cases that I recently discovered is when one has an index on a non-persisted computed column and one introduces an Output Clause to capture the effects of the change.

 

Create Tables

if DB_ID('Demo') is null
begin

    exec('create database [Demo]');

end
go

USE [Demo]
GO

if OBJECT_ID('accountActivity') is not null
begin

    drop table [accountActivity]

end
go

if OBJECT_ID('accountActivityPersisted') is not null
begin

    drop table [accountActivityPersisted]

end
go

CREATE TABLE [dbo].[accountActivity]
(
	[AccountID] [bigint] NOT NULL , --IDENTITY(1,1) NOT NULL,
	[AccountCreationDate] [datetime] NULL,
	[TransactionDate] [datetime] NULL,
	[ageCalc]  
            AS (datediff(day,[AccountCreationDate],[TransactionDate])),
     CONSTRAINT PK_AccountActivity PRIMARY KEY CLUSTERED 
    (
	    [AccountID] ASC
    )
) 
ON [PRIMARY]
GO

CREATE NONCLUSTERED INDEX [idx_AgeCalc] 
ON [dbo].[accountActivity] 
(
	[ageCalc] ASC
)
go

CREATE TABLE [dbo].[accountActivityPersisted]
(
	[AccountID] [bigint] NOT NULL , --IDENTITY(1,1) NOT NULL,
	[AccountCreationDate] [datetime] NULL,
	[TransactionDate] [datetime] NULL,
	[ageCalc]  AS 
          (datediff(day,[AccountCreationDate],[TransactionDate])) 
           persisted  ,
    CONSTRAINT PK_AccountActivityPersisted PRIMARY KEY CLUSTERED 
    (
	    [AccountID] ASC
    )
) 
ON [PRIMARY]
GO

CREATE NONCLUSTERED INDEX [idx_AgeCalc] 
ON [dbo].[accountActivityPersisted] 
(
	[ageCalc] ASC
)
go

</code></pre>
<h2></h2>
<h1>Test SQL</h1>
<pre><code>

use [Demo]
go

set statistics io on
set nocount on
go

begin tran

    DECLARE @Results AS TABLE
    (
      AccountID BIGINT
    )

    declare	
           @AccountID bigint
	 , @AccountCreationDate datetime
    	 , @TransactionDate datetime

    set @AccountID = RAND() * 1E10
    set @AccountCreationDate = '1/1/2001'
    set @TransactionDate = GETUTCDATE()

    Insert into dbo.[accountActivity]
    (
         [AccountID]
       , [AccountCreationDate]
       , [TransactionDate]

   )
   OUTPUT 
      INSERTED.AccountID

  INTO @Results		
   (
       AccountID
   )
   select
              @AccountID
	    , @AccountCreationDate
	    , @TransactionDate

   Insert into dbo.[accountActivityPersisted]
   (
        [AccountID]
      , [AccountCreationDate]
      , [TransactionDate]
   )
   OUTPUT 
      INSERTED.AccountID
   INTO @Results		
   (
       AccountID
   )
   select
             @AccountID
           , @AccountCreationDate
	   , @TransactionDate

rollback tran

Execution Plan

  • On a scale of 100, the cost of maintaining an index when the columns are not persisted is about 59%
  • Whereas for non-persisted columns, it is 41%

Table Spool - Index on Non-Persisted Columns

Statistics I/O

  • The worktable maintenance cost is quite a bit 

Non-Persisted Columns



Table '#744F2D60'. Scan count 0, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'Worktable'. Scan count 2, logical reads 7, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'accountActivity'. Scan count 0, logical reads 6, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Persisted Columns


Table '#744F2D60'. Scan count 0, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'accountActivityPersisted'. Scan count 0, logical reads 6, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Screen Shot:
Table Spool - Index on Non-Persisted Columns (Statistics IO)

 

Moral of the Story

Persist your computed columns.

 

References

Microsoft – SQL Server – Cached Plans – Looking for Table Spools

Introduction

Many times I have taking the time to sing the praise of Microsoft SQL Server Dynamic Management Views (DMVs).

If MSFT has an idealistic view of Science and Technology, it is its openness in allowing Operational Engineers and Developers to see and question how well things are going.

And, then being able to push and argue it out in public.

There is one vendor that has a closed knit support group and even upon commenting one’s comment are quickly hidden. It is almost like why bother!

Personally, I am not really writing for any particular Vendor, but more so for the community.

Here is a query that I have being meaning to write out for a while, but I could not find a good code (via Google) to fashion mine from.

So what I am trying to do is to is to get a good, firm view around which queries are using Table Spools.

I know Jonathan Kehayias has probably written the most around “Extended Events”.  And, I know he probably has something I can use.  But, I do not have it like that as I do not always have friends in High places…So I can not call or look in his cheat sheets.

All I can do is “Check out his demo(es)” @ http://sqlblog.com/blogs/jonathan_kehayias/.

Specifically, I am trying to comb through his “XEvent a Day”.  I mean he has an entry for each day; and he did not take a Leap Feb.  

Taking about high achievers and extra credits –  He did all 31.

Here is what I came up for Spools…


SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
GO

;WITH XMLNAMESPACES(DEFAULT N'http://schemas.microsoft.com/sqlserver/2004/07/showplan')

select 

          tblIn.plan_handle
        , tblIn.PhysicalOp        
        , tblIn.LogicalOp                
        , tblIn.useCounts
        , tblSQLText.text
        , tblQueryPlan.query_plan

from
(

  SELECT 

          cp.plan_handle
        , avg(cast(cp.usecounts as bigint)) as useCounts
        , max(operators.value('@PhysicalOp','nvarchar(50)'))
               as PhysicalOp
        , max(operators.value('@LogicalOp','nvarchar(50)'))
               as LogicalOp

   FROM sys.dm_exec_cached_plans cp

   CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) qp

   CROSS APPLY query_plan.nodes('//RelOp') rel(operators)

where 
(

  ---Look for Spools
  (

      --Spool (Physical)
      (operators.value('@PhysicalOp','nvarchar(50)')  like 'Table Spool')
   or (operators.value('@PhysicalOp','nvarchar(50)')  like 'Index Spool')

  )

  or

  (

    --Spool (Logical)
       (operators.value('@LogicalOp','nvarchar(50)')  like 'Eager Spool')
   or  (operators.value('@LogicalOp','nvarchar(50)')  like 'Lazy Spool')

  )

)        

group by

    cp.plan_handle

) tblIn

   CROSS APPLY sys.dm_exec_query_plan(tblIn.plan_handle) tblQueryPlan

   CROSS APPLY sys.dm_exec_sql_text(tblIn.[plan_handle]) tblSQLText

 order by tblIn.useCounts desc

A good reference for what Microsoft calls “Relational Operation Values” is available @

Using the USE PLAN Query Hint
http://msdn.microsoft.com/en-us/library/ms186954(v=sql.105).aspx

Here are the relevant entries for Spools – Physical and Logical Operation:
Relational Operator Values

Use Cases

What are the most probably causes for Table Spools

A) INSERT/SELECT from same table

For Halloween protection


use [tempdb]
go

if OBJECT_ID('dbo.Course') is not null
begin

    drop table dbo.[Course]

end
go

create table dbo.course
(
      [courseID] int
    , [courseName] sysname
    , [calendarYear] int
)
go

declare @year int
declare @yearNext int

set @year = DATEPART(year, getdate())
set @year = @year + 1

insert into dbo.[course]
([courseID], [courseName], [calendarYear])
select 1010, 'CS100', @year

--triggers Halloween Protection
insert into dbo.[course]
([courseID], [courseName], [calendarYear])
select courseID, [courseName], @year + 1 
from dbo.course 
where calendarYear = @year 

drop table dbo.[course] 
go

Screen Dump

Table Spool - Haloween Protection - Insert--select

B) When accessing and storing data from System Tables

 

 


set nocount on;

	declare @databaseMirroringState TABLE
	(
		  [id] int not null identity(1,1)
		, [DatabaseName] sysname not null
		, [mirroringState] sysname not null
		, [mirrroringSafetyLevel] sysname null

	)

        if OBJECT_ID('tempdb.dbo.#databaseMirroringState') is not null
        begin

          drop table #databaseMirroringState

        end

	create table #databaseMirroringState
	(
		  [id] int not null identity(1,1)
		, [DatabaseName] sysname not null
		, [mirroringState] sysname not null
		, [mirrroringSafetyLevel] sysname null

	)

	select 

	  	  tblD.name
		, tblDM.mirroring_state_desc
                , tblDM.mirroring_safety_level_desc

	from   sys.databases tblD

	  inner join sys.database_mirroring tblDM

	     on tblD.database_id = tblDM.database_id					

	where tblDM.mirroring_state_desc is not null;


	insert into @databaseMirroringState
	(
		  [DatabaseName]
		, [mirroringState]
		, [mirrroringSafetyLevel]
	)

	select 
	  	  tblD.name
		, tblDM.mirroring_state_desc
                , tblDM.mirroring_safety_level_desc

	from   sys.databases tblD

	  inner join sys.database_mirroring tblDM

	     on tblD.database_id = tblDM.database_id					

	where tblDM.mirroring_state_desc is not null


        insert into #databaseMirroringState
	(
		  [DatabaseName]
		, [mirroringState]
		, [mirrroringSafetyLevel]
	)

	select 

  		  tblD.name
		, tblD.mirroring_state_desc
                , tblD.mirroring_safety_level_desc

	from   sys.databases tblD

  	  inner join sys.database_mirroring tblDM

		    on tblD.database_id = tblDM.database_id					
	where tblDM.mirroring_state_desc is not null

    drop table #databaseMirroringState

    return

Querying and Capturing data in System Tables

Table Spool - Persising results of quering system tables

  • Noticed that Table Spools were occurring when we did not expect it to occur
  • Traced it to a piece of code where we are capturing the results of accessing system tables
  • If we only display the results of querying the system table, we do incur the Table Spool penalty

BTW, if you take a little time and review Query Plans, you will see that a lot of DMVs are just interfaces to a main table.

ie sys.databases and sys.database_mirroring is merely an interface to sys.sysdbreg

Kalen Delaney has a good coverage in:

Geek City : The Resource Database

http://sqlblog.com/blogs/kalen_delaney/archive/2007/09/13/geek-city-the-resource-database.aspx

 

C) Merge Statement

Merge Statements causes Table Spools, as well.

Here is a DDL and a DML.

 

DDL



if OBJECT_ID('dbo.customerMerge') is not null
begin
    drop table [dbo].[customerMerge] 
end    
go	

CREATE TABLE [dbo].[customerMerge]
(

    [customerID] [varchar](30) NOT NULL,
    [regionName] [varchar](50) NULL,

    [InsertTime] [datetime] 
        CONSTRAINT defaultCustomerMergeInsertTime 
          DEFAULT (getutcdate()) NULL,

    [salesDate] [datetime] not null 
        constraint defaultCustomerMergeSalesDate
          DEFAULT (getutcdate()),

    [salesYear]  
     as cast(isNull(datepart(year,[salesDate]), 1990) as smallint) 
        persisted not null,

 	CONSTRAINT [PK_customer_merge] PRIMARY KEY NONCLUSTERED 
	(
		  [customerID] ASC
		, [salesYear]
	)
	WITH (
	              PAD_INDEX  = OFF
	            , STATISTICS_NORECOMPUTE  = OFF
	            , IGNORE_DUP_KEY = OFF
	            , ALLOW_ROW_LOCKS  = ON
	            , ALLOW_PAGE_LOCKS  = ON
	            , FILLFACTOR = 50
            ) 
	--ON Default
)  --Default
go

DML


begin tran

    merge into [dbo].[customerMerge] as Target
    USING (
            VALUES 
                    (
                          'A1' --as customerID
                        , 'CA' --as RegionName
                        , getutcdate() --as InsertTime
                        , getutcdate() --as SalesDate                    
                    )
            )        
             As Source
                    (

                          [NewCustomerID]
                        , [NewRegionName]
                        , [NewInsertTime]
                        , [NewSalesDate]
                    )       

    ON Target.[customerID] = Source.[NewCustomerID]

    WHEN MATCHED THEN

        UPDATE
        SET  [regionName] = Source.[NewRegionName]

    WHEN NOT MATCHED THEN
    INSERT 
    (
          [customerID]
        , [regionName]
        , [InsertTime]
        , [SalesDate]
    )    
    VALUES
    (

          [NewcustomerID]
        , [NewRegionName]
        , [NewInsertTime]
        , [NewSalesDate]
    )    
    ;   

rollback tran


Screen shot:

Table - Spool - Merge Statements

D) When an output clause is used during Inserting\Update of a table that contains an un-persisted computed column.

Please read more here…

https://danieladeniji.wordpress.com/2013/05/08/technical-microsoft-sql-server-table-spool-causation-indexes-on-non-persisted-columns/

E) Insert Into Table Variable / from Scaler function



	set ansi_padding on;
	set statistics io on;
	set nocount on;

	declare @accountHolder TABLE
	(
		  [id]			int not null
		, [entityID]	nvarchar(60) not null
		, [firstname]	nvarchar(60) null
		, [lastname]	nvarchar(60) null
	)

begin

	-- Open the symmetric key with which to encrypt the data.
	OPEN SYMMETRIC KEY [SymmetricKeyPIA]
	   DECRYPTION BY CERTIFICATE  [certPIA];

		select

			 tblAH.id

		   , cast(
					DECRYPTBYKEY
					   (
							   tblAH.[entityID]
							 , 1
							 --, CONVERT(varbinary(8000), tblAH.[id])
							, dbo.Salt(tblAH.[id])
						)
					as varchar(255)
			) as [entityID]
 
		from    [dataEncrypted].[accountHolder] tblAH
 
		order by tblAH.[id] asc

		print replicate('*', 140)

		insert into @accountHolder
		(
			  [id]
			, [entityID]
		)
		select

			 tblAH.id
		   , cast(
					DECRYPTBYKEY
					   (
							   tblAH.[entityID]
							 , 1
						 	, dbo.Salt(tblAH.[id])
						)
					as varchar(255)
			) as [entityID]


		from    [dataEncrypted].[accountHolder] tblAH
 
		order by tblAH.[id] asc

		print replicate('*', 140)

		insert into @accountHolder
		(
			  [id]
			, [entityID]
		)
		select

			 tblAH.id
		   , cast(
			     DECRYPTBYKEY
			     (
				   tblAH.[entityID]
				 , 1
			 	--, dbo.Salt(tblAH.[id])
				, itvfS.[Salt]
			    )
			    as varchar(255)
			) as [entityID]


		from    [dataEncrypted].[accountHolder] tblAH

			  cross apply [dbo].[ITVF_Salt](tblAH.[id]) itvfS

 
		order by tblAH.[id] asc

	-- Close symmetric key
	CLOSE SYMMETRIC KEY [SymmetricKeyPIA]

end

go

 

InsertIntoTableVaribleFromSelectThatIncludesScalerFunction

 


Table 'accountHolder'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
********************************************************************************************************************************************
Table '#0D7A0286'. Scan count 0, logical reads 5, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 1, logical reads 13, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'accountHolder'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
********************************************************************************************************************************************
Table '#0D7A0286'. Scan count 0, logical reads 5, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'accountHolder'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.


 

Worst Offenders

Vendor – Microsoft

sp_tables

msdb.dbo.sp_sqlagent_get_perf_counters

  • Uses quite a bit of Table and Index spools
  • Violates Microsoft’s own advisory about using sp_ in a user database

#am_generate_waitstats

  • Activity Monitor ( I think)

#am_get_querystats

 Activity Monitor ( I think)

sp_sqlagent_get_perf_counters

Review

BTW, if you make some changes, and you want to clear your “Procedure Cache” and see how things play out thereafter, please do so via:


DBCC FREEPROCCACHE;

But, please do not clear everything all at once.  Rather clear pointed ones; get specific plan handle(s) and use those Plan Handles as the lone argument to FreeProcCache.

 


DBCC FREEPROCCACHE(plan_handle);

 

References:

 

 

Microsoft – SQL Server – Table Inserts – Table Spool [Eager Spool] – Is misaligned partitioning really the issue?

This is a follow-up to a couple of other posts:

Table Design

Create Database – demo


  if db_id('Demo') is not null
  begin

     create database [Demo];

  end
  go

Define Partition Scheme and Function

use [Demo]
go

if not exists   
                (

                    select *
                    from   sys.partition_functions
                    where  name = 'pfnYear'

                )
begin

    print 'Creating partition function pfnYear(smallint) ...'

    create partition function pfnYear(smallint)
    as range right for values
    (
          2010   
        , 2011
        , 2012
        , 2013
        , 2014
        , 2015

    )

        print 'Created partition function pfnYear(smallint) '

end

go

if not exists   
                (

                    select *
                    from   sys.partition_schemes
                    where  name = 'partitionSchemeYear'

                )
begin

    print 'Creating partition scheme partitionSchemeYear  ...'

        create partition scheme partitionSchemeYear 
        as partition pfnYear
        ALL TO ([default])

    print 'Created partition scheme partitionSchemeYear'

end

Define Table with misaligned index partition

To satisfy half-way correction, we will do the following:

  • Create a new table dbo.customer
  • The Primary Key will be a Non-clustered Index and it will be named PK_customer and it will be partitioned
  • We will create a Clustered Index (idx_DBA_pf_customerIDChecksumModulo20__customerID)
  • An Index on InsertTime (idx_InsertTime) will not be partitioned
  • An Index on regionName (idx_Region) will not be partitioned


/*

    exec sp_helpindex 'dbo.customer'
*/

if OBJECT_ID('dbo.customer') is not null
begin
    drop table [dbo].[customer] 
end    
go	

CREATE TABLE [dbo].[customer]
(

	[customerID] [varchar](30) NOT NULL,
	[regionName] [varchar](50) NULL,
	[InsertTime] [datetime] CONSTRAINT defaultCustomerInsertTime DEFAULT (getutcdate()) NULL,
	[salesDate] [datetime] not null constraint defaultCustomerSalesDate DEFAULT (getutcdate()),
	[SalesYear] as cast(isNull(datepart(year,[salesDate]), 1990) as smallint) persisted not null,

		CONSTRAINT [PK_customer] PRIMARY KEY NONCLUSTERED 
		(
			  [customerID] ASC
			, [salesYear]
		)
		WITH (
		              PAD_INDEX  = OFF
		            , STATISTICS_NORECOMPUTE  = OFF
		            , IGNORE_DUP_KEY = OFF
		            , ALLOW_ROW_LOCKS  = ON
		            , ALLOW_PAGE_LOCKS  = ON
		            , FILLFACTOR = 50
            ) 
		ON [partitionSchemeYear]([SalesYear])
) ON [partitionSchemeYear]([SalesYear])

GO

CREATE CLUSTERED INDEX [idx_pfnYear_SalesYear__customerID] 
ON [dbo].[customer] 
(
	[salesYear] ASC,
	[customerID] ASC
)
WITH 
(
      PAD_INDEX  = OFF
    , STATISTICS_NORECOMPUTE  = OFF
    , SORT_IN_TEMPDB = OFF
    , IGNORE_DUP_KEY = OFF
    , DROP_EXISTING = OFF
    , ONLINE = OFF
    , ALLOW_ROW_LOCKS  = ON
    , ALLOW_PAGE_LOCKS  = ON
    , FILLFACTOR = 50
) 
ON [partitionSchemeYear]([SalesYear])
GO

CREATE NONCLUSTERED INDEX [idx_InsertTime] 
ON [dbo].[customer] 
    (
        [InsertTime] ASC
    )
    INCLUDE ( [customerID]) 
    WITH (

              ONLINE = ON
            , ALLOW_ROW_LOCKS  = ON
            , ALLOW_PAGE_LOCKS  = ON
          ) 
          ON [PRIMARY]

GO

CREATE NONCLUSTERED INDEX [idx_Region] 
ON [dbo].[customer] 
    (
        [regionName] ASC
    )
    INCLUDE ( [customerID]) 
    WITH (

              ONLINE = ON
            , ALLOW_ROW_LOCKS  = ON
            , ALLOW_PAGE_LOCKS  = ON
          ) 
          ON [PRIMARY]

GO

Insert into Misaligned partition



set statistics time off
set statistics io off

truncate table [dbo].[customer];
go

set statistics profile on; 
set statistics time off;   
set statistics io on;
go

insert into [dbo].[customer]
(
    [customerID], [salesDate]
)
values
(
   'A1', GETUTCDATE()
)

Results of misalignment:

Misaligned Index - Full

Misaligned Index - Full - Statistics IO

Explanation:

  • Expensive Table Spool (Eager Spool) – Taking up 31%
  • Expensive Work Table using up 7 logical reads

Define Table with corrected index partition

To satisfy full correction, we will do the following:

  • Create a new table dbo.customerRevised
  • The Primary Key will be a Non-clustered Index and it will be named PK_customerRevised and it will be partitioned
  • We will create a Clustered Index (idx_DBA_pf_customerIDChecksumModulo20__customerID)
  • An Index on InsertTime (idx_InsertTime) will be partitioned
  • An Index on regionName (idx_Region) will be partitioned


/*
    exec sp_helpindex 'dbo.customerRevised'
*/

if OBJECT_ID('dbo.customerRevised') is not null
begin
    drop table [dbo].[customerRevised] 
end    
go	

CREATE TABLE [dbo].[customerRevised]
(

	[customerID] [varchar](30) NOT NULL,
	[regionName] [varchar](50) NULL,
	[InsertTime] [datetime] 
             CONSTRAINT defaultcustomerRevisedInsertTime 
                DEFAULT (getutcdate()) NULL,
	[salesDate] [datetime] not null 
             constraint defaultcustomerRevisedSalesDate 
                DEFAULT (getutcdate()),
	[SalesYear] as 
              cast(isNull(datepart(year,[salesDate]), 1990) as smallint) 
                  persisted not null,

		CONSTRAINT [PK_customerRevised] PRIMARY KEY NONCLUSTERED 
		(
			  [customerID] ASC
			, [salesYear]
		)
		WITH (
		              PAD_INDEX  = OFF
		            , STATISTICS_NORECOMPUTE  = OFF
		            , IGNORE_DUP_KEY = OFF
		            , ALLOW_ROW_LOCKS  = ON
		            , ALLOW_PAGE_LOCKS  = ON
		            , FILLFACTOR = 50
            ) 
		ON [partitionSchemeYear]([SalesYear])
) ON [partitionSchemeYear]([SalesYear])

GO

CREATE CLUSTERED INDEX [idx_DBA_pf_customerIDChecksumModulo20__customerID] 
ON [dbo].[customerRevised] 
(
	[salesYear] ASC,
	[customerID] ASC
)
WITH 
(
      PAD_INDEX  = OFF
    , STATISTICS_NORECOMPUTE  = OFF
    , SORT_IN_TEMPDB = OFF
    , IGNORE_DUP_KEY = OFF
    , DROP_EXISTING = OFF
    , ONLINE = OFF
    , ALLOW_ROW_LOCKS  = ON
    , ALLOW_PAGE_LOCKS  = ON
    , FILLFACTOR = 50
) 
ON [partitionSchemeYear]([SalesYear])
GO

CREATE NONCLUSTERED INDEX [idx_InsertTime] 
ON [dbo].[customerRevised] 
    (
        [InsertTime] ASC
    )
    INCLUDE ( [customerID]) 
    WITH (

              ONLINE = ON
            , ALLOW_ROW_LOCKS  = ON
            , ALLOW_PAGE_LOCKS  = ON
          ) 
          --ON [PRIMARY]
          ON [partitionSchemeYear]([SalesYear])

GO

CREATE NONCLUSTERED INDEX [idx_Region] 
ON [dbo].[customerRevised] 
    (
        [regionName] ASC
    )
    INCLUDE ( [customerID]) 
    WITH (

              ONLINE = ON
            , ALLOW_ROW_LOCKS  = ON
            , ALLOW_PAGE_LOCKS  = ON
          ) 
          --ON [PRIMARY]
          ON [partitionSchemeYear]([SalesYear])
GO


Misalignment Correction (Insert Statement)


set statistics time off
set statistics io off

truncate table [dbo].[customer];
truncate table [dbo].[customerRevised];
go

set statistics profile on; 
set statistics time off;   
set statistics io on;
go

insert into [dbo].[customer]
(
    [customerID], [salesDate]
)
values
(
   'A1', GETUTCDATE()
)

insert into [dbo].[customerRevised]
(
    [customerID], [salesDate]
)
values
(
    'B1', getutcdate()
)

Misalignment Correction:

Misaligned Index - Correction - Full

Misalignmed Index - Correction - Statistics IO

Explanation:

  • With proper partition correction we now have a much cleaner Plan (At a base of 100% -the original plan costed 57% and the revised plan costed 43%)
  • We have shed the 7 logical reads attributed to the “worktable”

Define Table with half-way corrected index partition

To satisfy half-way correction, we will do the following:

  • Create a new table dbo.customerRevisedHalfway
  • The Primary Key will be a Non-clustered Index and it will be named PK_customerRevisedHalfway and it will be partitioned
  • We will create a Clustered Index (idx_DBA_pf_customerIDChecksumModulo20__customerID)
  • An Index on InsertTime (idx_InsertTime) will be partitioned
  • An Index on regionName (idx_Region) will not be partitioned


/*

    exec sp_helpindex 'dbo.customerRevisedHalfway'
*/

if OBJECT_ID('dbo.customerRevisedHalfway') is not null
begin
    drop table [dbo].[customerRevisedHalfway] 
end    
go	

CREATE TABLE [dbo].[customerRevisedHalfway]
(

	[customerID] [varchar](30) NOT NULL,
	[regionName] [varchar](50) NULL,
	[InsertTime] [datetime] 
            CONSTRAINT defaultcustomerRevisedHalfwayInsertTime 
                DEFAULT (getutcdate()) NULL,
	[salesDate] [datetime] not null 
            constraint defaultcustomerRevisedHalfwaySalesDate
                DEFAULT (getutcdate()),
	[SalesYear] as 
            cast(isNull(datepart(year,[salesDate]), 1990) as smallint) 
               persisted not null,

		CONSTRAINT [PK_customerRevisedHalfway] PRIMARY KEY
                 NONCLUSTERED 
		(
			  [customerID] ASC
			, [salesYear]
		)
		WITH (
		              PAD_INDEX  = OFF
		            , STATISTICS_NORECOMPUTE  = OFF
		            , IGNORE_DUP_KEY = OFF
		            , ALLOW_ROW_LOCKS  = ON
		            , ALLOW_PAGE_LOCKS  = ON
		            , FILLFACTOR = 50
            ) 
		ON [partitionSchemeYear]([SalesYear])
) ON [partitionSchemeYear]([SalesYear])

GO

CREATE CLUSTERED INDEX [idx_DBA_pf_customerIDChecksumModulo20__customerID] 
ON [dbo].[customerRevisedHalfway] 
(
	[salesYear] ASC,
	[customerID] ASC
)
WITH 
(
      PAD_INDEX  = OFF
    , STATISTICS_NORECOMPUTE  = OFF
    , SORT_IN_TEMPDB = OFF
    , IGNORE_DUP_KEY = OFF
    , DROP_EXISTING = OFF
    , ONLINE = OFF
    , ALLOW_ROW_LOCKS  = ON
    , ALLOW_PAGE_LOCKS  = ON
    , FILLFACTOR = 50
) 
ON [partitionSchemeYear]([SalesYear])
GO

CREATE NONCLUSTERED INDEX [idx_InsertTime] 
ON [dbo].[customerRevisedHalfway] 
    (
        [InsertTime] ASC
    )
    INCLUDE ( [customerID]) 
    WITH (

              ONLINE = ON
            , ALLOW_ROW_LOCKS  = ON
            , ALLOW_PAGE_LOCKS  = ON
          ) 
          --ON [PRIMARY]
          ON [partitionSchemeYear]([SalesYear])

GO

--index is pointing @ default file group
CREATE NONCLUSTERED INDEX [idx_Region] 
ON [dbo].[customerRevisedHalfway] 
    (
        [regionName] ASC
    )
    INCLUDE ( [customerID]) 
    WITH (

              ONLINE = ON
            , ALLOW_ROW_LOCKS  = ON
            , ALLOW_PAGE_LOCKS  = ON
          ) 
          ON [PRIMARY]
          --ON [partitionSchemeYear]([SalesYear])

GO

Insert into Half way aligned partition


set statistics time off
set statistics io off

truncate table [dbo].[customer];
truncate table [dbo].[customerRevised];
truncate table [dbo].[customerRevisedHalfway];
go

set statistics profile on; 
set statistics time off;   
set statistics io on;
go

insert into [dbo].[customer]
(
    [customerID], [salesDate]
)
values
(
   'A1', GETUTCDATE()
)

insert into [dbo].[customerRevised]
(
    [customerID], [salesDate]
)
values
(
    'B1', getutcdate()
)

insert into [dbo].[customerRevisedHalfway]
(
  [customerID], [salesDate]
)
values
(
  'C1', getutcdate()
)

Misalignment Correction (Halfway):

Misaligned Index - Correction - Halfway

Statistics I/O

Misaligned Index - Correction - Halfway (Statistics IO)

Explanation:

  • The Query cost of misalignment is at 40%, fully corrected alignment is at 30%, and halfway correction is at 30%
  • When partitioning is fully corrected and when only halfway corrected we still shed the same amount of logical reads (7); directly attributable to the “worktable”
  • For the half way corrected partitioned index, you will see two distinct index inserts (the first one for the Clustered index and the secondary one for the non-partitioned aligned index); note that the half way corrected case is pointing at two different file groups (different partitions)
  • Whereas for the fully corrected index, we have a single index operator (all indexes are tied into the same partition)

CodeBase

To share the code, I have posted it to my GitHub Account.

Conclusion

Eager Spool

ScaryDBA in his web post,  Spools in Execution Plan ( http://www.scarydba.com/2009/09/09/spools-in-execution-plans/), did a good job describing this Table Spool thing.

Spool operations are temporary storage of the data for later reuse in a query plan. There are two types of spool operations, eager spool and lazy spool. A spool is basically a temporary table created within the execution of the query that is used when it’s likely that data will be needed again, and again during the execution of the query. This is not an explicit #temp temporary table, but a work table for operations within the processing necessary for a given query’s behavior. A spool is created when the optimizer thinks that it can work better with a semi-permanent sub-set of data rather than have to perform multiple seeks or scans against a table or index or in other places where data re-use is important (more in a bit).

Seemingly, in our original table definition, with more than one misaligned index, we ended up with a situation where post the “Clustered Index” operation, the Query Engine opted to go the route of creating and populating “Table Spools”.

It is not immediately clear at this point why the Query Engine chose this fairly in-efficient path.

Addendum

Addendum (2013-04-27)

The original incident was captured on MS SQL Server – v2008R2.

I have revised per v2012.

Execution Plan

TableSpool__MSSQLServer_v2012

IO Cost

TableSpool__MSSQLServer_v2012_IOStats

So Microsoft’s documentation did a fairly good job on describing what a Table Spool is:

http://msdn.microsoft.com/en-us/library/ms181032(v=sql.105).aspx
The Table Spool operator scans the input and places a copy of each row in a hidden spool table that is stored in the tempdb database and existing only for the lifetime of the query. If the operator is rewound (for example, by a Nested Loops operator) but no rebinding is needed, the spooled data is used instead of rescanning the input.

But, I will allow you the Performance Engineer to look at your queries and understand which ones can potentially follow this path and guard against it when other paths can be a bit less pernicious.