SQL Server – CXPacket Waits – Compare Across SQL Server Instances

Background

Noticing varying performance across a few SQL Server Instances that are hosting the same Application.

There are four instances overall, the first Dev, the second Test, the third QA, and the fourth Production.

 

TroubleShooting

Dynamic Management Views ( DMV )

Wait Stats

Here is a SQL Code snippet lifted from Paul Randall @ SQLSkills.

The code is here.

Code



WITH [Waits] AS
(
	SELECT
        [wait_type],
        [wait_time_ms] / 1000.0 AS [WaitS],
        ([wait_time_ms] - [signal_wait_time_ms]) / 1000.0 AS [ResourceS],
        [signal_wait_time_ms] / 1000.0 AS [SignalS],
        [waiting_tasks_count] AS [WaitCount],
       100.0 * [wait_time_ms] / SUM ([wait_time_ms]) OVER() AS [Percentage],
        ROW_NUMBER() OVER(ORDER BY [wait_time_ms] DESC) AS [RowNum]
    FROM sys.dm_os_wait_stats
    WHERE [wait_type] NOT IN (
        N'BROKER_EVENTHANDLER', N'BROKER_RECEIVE_WAITFOR',
        N'BROKER_TASK_STOP', N'BROKER_TO_FLUSH',
        N'BROKER_TRANSMITTER', N'CHECKPOINT_QUEUE',
        N'CHKPT', N'CLR_AUTO_EVENT',
        N'CLR_MANUAL_EVENT', N'CLR_SEMAPHORE',
 
        -- Maybe uncomment these four if you have mirroring issues
        N'DBMIRROR_DBM_EVENT', N'DBMIRROR_EVENTS_QUEUE',
        N'DBMIRROR_WORKER_QUEUE', N'DBMIRRORING_CMD',
 
        N'DIRTY_PAGE_POLL', N'DISPATCHER_QUEUE_SEMAPHORE',
        N'EXECSYNC', N'FSAGENT',
        N'FT_IFTS_SCHEDULER_IDLE_WAIT', N'FT_IFTSHC_MUTEX',
 
        -- Maybe uncomment these six if you have AG issues
        N'HADR_CLUSAPI_CALL', N'HADR_FILESTREAM_IOMGR_IOCOMPLETION',
        N'HADR_LOGCAPTURE_WAIT', N'HADR_NOTIFICATION_DEQUEUE',
        N'HADR_TIMER_TASK', N'HADR_WORK_QUEUE',
 
        N'KSOURCE_WAKEUP', N'LAZYWRITER_SLEEP',
        N'LOGMGR_QUEUE', N'MEMORY_ALLOCATION_EXT',
        N'ONDEMAND_TASK_QUEUE',
        N'PREEMPTIVE_XE_GETTARGETSTATE',
        N'PWAIT_ALL_COMPONENTS_INITIALIZED',
        N'PWAIT_DIRECTLOGCONSUMER_GETNEXT',
        N'QDS_PERSIST_TASK_MAIN_LOOP_SLEEP', N'QDS_ASYNC_QUEUE',
        N'QDS_CLEANUP_STALE_QUERIES_TASK_MAIN_LOOP_SLEEP',
        N'QDS_SHUTDOWN_QUEUE', N'REDO_THREAD_PENDING_WORK',
        N'REQUEST_FOR_DEADLOCK_SEARCH', N'RESOURCE_QUEUE',
        N'SERVER_IDLE_CHECK', N'SLEEP_BPOOL_FLUSH',
        N'SLEEP_DBSTARTUP', N'SLEEP_DCOMSTARTUP',
        N'SLEEP_MASTERDBREADY', N'SLEEP_MASTERMDREADY',
        N'SLEEP_MASTERUPGRADED', N'SLEEP_MSDBSTARTUP',
        N'SLEEP_SYSTEMTASK', N'SLEEP_TASK',
        N'SLEEP_TEMPDBSTARTUP', N'SNI_HTTP_ACCEPT',
        N'SP_SERVER_DIAGNOSTICS_SLEEP', N'SQLTRACE_BUFFER_FLUSH',
        N'SQLTRACE_INCREMENTAL_FLUSH_SLEEP',
        N'SQLTRACE_WAIT_ENTRIES', N'WAIT_FOR_RESULTS',
        N'WAITFOR', N'WAITFOR_TASKSHUTDOWN',
        N'WAIT_XTP_RECOVERY',
        N'WAIT_XTP_HOST_WAIT', N'WAIT_XTP_OFFLINE_CKPT_NEW_LOG',
        N'WAIT_XTP_CKPT_CLOSE', N'XE_DISPATCHER_JOIN',
        N'XE_DISPATCHER_WAIT', N'XE_TIMER_EVENT')
 
    AND [waiting_tasks_count] > 0
 )
SELECT
    [serverName] = cast(serverproperty('servername') as sysname)
    , MAX ([W1].[wait_type]) AS [WaitType]
    , CAST (MAX ([W1].[WaitS]) AS DECIMAL (16,2)) AS [Wait_S]
    , CAST (MAX ([W1].[ResourceS]) AS DECIMAL (16,2)) AS [Resource_S]
    , CAST (MAX ([W1].[SignalS]) AS DECIMAL (16,2)) AS [Signal_S]
    , MAX ([W1].[WaitCount]) AS [WaitCount]
    , CAST (MAX ([W1].[Percentage]) AS DECIMAL (5,2)) AS [Percentage]
	/*
		, CAST ((MAX ([W1].[WaitS]) / MAX ([W1].[WaitCount])) AS DECIMAL (16,4)) AS [AvgWait_S]
		, CAST ((MAX ([W1].[ResourceS]) / MAX ([W1].[WaitCount])) AS DECIMAL (16,4)) AS [AvgRes_S]
		, CAST ((MAX ([W1].[SignalS]) / MAX ([W1].[WaitCount])) AS DECIMAL (16,4)) AS [AvgSig_S]
		, CAST ('https://www.sqlskills.com/help/waits/' + MAX ([W1].[wait_type]) as XML) AS [Help/Info URL]
	*/

FROM [Waits] AS [W1]

INNER JOIN [Waits] AS [W2]
    ON [W2].[RowNum] <= [W1].[RowNum]

GROUP BY [W1].[RowNum]

HAVING SUM ([W2].[Percentage]) - MAX( [W1].[Percentage] ) < 95; -- percentage threshold
GO

Output

Dev

Test

Q/A

Prod

Explanation

  1. waittype=CXPACKET
    • Dev, Test, QA, and Prod have percentiles of 71%, 50%, 16%, 68%
  2. waittype=LCK_M_X
    • Dev, Test, QA, and Prod have percentiles of  8%, 14%, 17%, 20%

 

 

OS Schedulers

Code


; with cte
(
	  scheduler_id
	, cpu_id
	, [status]
	, [is_online]
	, [current_workers_count]
)
as
(
	select
			  tblOS.scheduler_id
			, tblOS.cpu_id
			, tblOS.[status]
			, tblOS.[is_online]
			, tblOS.current_workers_count
  
	 from   sys.dm_os_schedulers tblOS

)
select 
		  [serverName] = cast(serverproperty('servername') as sysname)
		, cte.[status]
		, [count] = count(*)

from   cte

group by
		cte.[status]

Output

Dev

 

Test

QA

 

Prod

 

Explanation

  1. status=VISIBLE ONLINE
    • Dev, QA, and Prod have 4 CPUs
    • Test has 8 CPUs

Windows – System Information

Image

Test

QA

 

Prod

Explanation

  1. Test :- 4 CPUS, 2 Cores
  2. QA  :- 8 CPUS, 1 Core
  3. Prod :- 8 CPUS, 1 Core

Windows – Resource Monitor

CPU

Image

Prod

 

Test

 

Explanation

  1. Dev
    • 4 CPUs
    • 25% CPU Usage
    • CPU 0, CPU 1, CPU 2, CPU 3 – Parked
    • All 4 CPUs are being used
  2. Test
    • 8 CPUs
    • 14% CPU Usage
    • All 8 CPUs are being used

Conclusion

Depending on the edition of SQL Server that you are running, you might run into throttling walls.

Summary

As a tech head, I know I gotta to be critical and leave the groupie thing for the Rock Concert.
And, if I have to take the time to say it, better be something cunningly clever.

But, like Snoop said, I had to say all that just to get to say this.
Surprisingly, it looks the Product that couldn’t scale is managing to do so.
Not Edwin Moses like hurdles, but little DW Jobs.

 

Reference

  1. VARONIS
    • Kieran Laffan
      • SQL Server Best Practices, Part I: Configuration
        Link
      • SQL Server Best Practices, Part II: Virtualized Environments
        Link
  2. MSSQLWIKI
    • Karthick P.K
      • Troubleshooting SQL Server high CPU usage
        Link
      • How to find SQL Server and system CPU usage history
        Link
  3. Docs / S​QL / S​QL ​Server
    • Compute Capacity Limits by Edition of SQL Server
      Link

 

Leave a Reply

Please log in using one of these methods to post your comment:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s