SQL Server – BulkCopy ( BCP ) – Which data file?

Background

Using BCP, we are churning though quite a bit of files.

Unfortunately, the DOS batch file that I quickly stitched together missed an importantly functionality.

And, that functionality is to log the current file being processed.

Diagnostic

Microsoft

Resource Monitor

I am a big fan of Microsoft’s Resource Monitor.

Let us use it.

Preparation

We remote connect to the source computer and launched task Manager; from Task Manager accessed Resource Monitor.

Resource Monitor – Tab – Memory

Image

ResourceMonitor_Tab_CPU_ProcessesAndAssociatedHandles_20180706_0850AM.png

Explanation
  1. When we filter on the bcp.exe process, we see our data file as one of the files mentioned in the “Associated Handles” tab

 

Resource Monitor – Tab – Memory

Image

ResourceMonitor_Tab_Memory_20180706_0852AM.png

Explanation
  1. The BCP process is using about 14 MB

 

Resource Monitor – Tab – Disk

Image

ResourceMonitor_Tab_Disk_20180706_0907AM.png

Explanation
  1. sqlserver,exe is mentioned
  2. But, not our bcp.exe file

 

Summary

To get a reliable insight into which files are being accessed, please consider Microsoft’s own Resource Monitor; specifically the CPU tab.

 

 

SQL Server – Integration Services – Network Flow and Rules

Background

Wanted to cover the Network Ports that are used by Microsoft’s Integration Services.

Network Trace

Wireshark

Port Mapper ( Port 135 )

Network Flow

rpc-20160106-1111pm-cleanedup

 

Explanation

  1. From Ephemeral Port ( 57916) connect to Server Port 135
  2. Request from client to server to issue RemoteCreateInstance
  3. Authenticate User
    • via NTLMSSP_AUTH
    • Pass in username

 

Integration Services

Network Flow

ssis-20170106-1122pm-brushed-up

 

Explanation

  1. From Ephemeral Port ( 57917) connect to SQL Server Integration Services Component
  2. This is important has it depends on how thru Component Services the Integration Services Component’s endpoint is configured

 

Component – Microsoft SQL Server Integration Services [NN.MM]

Using Component Services, let us review the Component’s endpoint configuration

 

Configuration

Here are our choices:

  1. Disable Protocol sequence
  2. Use default endpoints
  3. Use static endpoint
  4. Use intranet range of dynamic endpoints
  5. Use internet range of dynamic endpoints

 

Digging Deeper
  1. Disable Protocol sequence
    • Disable Network
  2. Use default endpoints
    • Use ephemeral ports
  3. Use static endpoint
    • Use static endpoint
  4. Use intranet range of dynamic endpoints
    • Use endpoint’s defined for Intranet
  5. Use internet range of dynamic endpoints
    • Use endpoint’s defined for Internet

 

Our Choice

To streamline our conversation with the Firewall team, we chose to use a static endpoint

dcomendpointconfigurations-usestaticport

 

 

NetLogonSAMAccount

Network Flow

rpcnetlogon_20170107_1206am-brushed-up

 

Explanation

This area covers the Network Authentication.

We did not have to make special care in our environment and so I can not cover in details.

But, please keep it mind when connecting between hosts that are not in the same Active Directory Domain, etc.

 

Component – Windows Management & Instrumentation ( WMI )

Network Flow

wmi-20170106-1144pm-brushed-up

 

Explanation

  1. From Ephemeral Port ( 57919) we connect to the port we dedicated to WMI
  2. This is important has it depends on how thru Component Services the WMI Component’s endpoint is configured
Configuration

Using Component Services, we will configure Windows Management and Instrumentation to listen on a specific port

wmi-configure-staticport

 

Network Listening Ports

Resource Monitor

On newer MS Windows Oses, you will be well served to remote connect to the Integration Services host, and run Resource Monitor

MsDtsSrvr.exe

msdtssrvr-20170107-0617am

Explanation

We can see that MsDtsSrvr.exe is:

  1. listening on Network Port 50000
  2. We have a record each for IPv4 and IPv6
  3. The internal MS Windows Firewall is allowing access to the Port

RPCC – svchost (winmgmt)

svchost-rpcss-20170107-0640am

 

Explanation

We can see that svchost.exe ( RPCSS ) is:

  1. listening on Network Port 135
  2. We have a record each for IPv4 and IPv6
  3. The internal MS Windows Firewall is allowing access to the Port

Unlike Integration Service which has its own process, RPCSS is being processed by a svchost.exe process.

 

Windows Management & Instrumentation – svchost (winmgmt)

 

svchost-exe-winmgmt-20170107-0630am

 

Explanation

We can see that svchost.exe ( winmgmt ) is:

  1. listening on Network Port 50090
  2. We have a record each for IPv4 and IPv6
  3. The internal MS Windows Firewall is allowing access to the Port

Unlike Integration Service which has its own process, winmgmt is being processed by a svchost.exe process.

 

Tabulated View

Objective Filter
 RPC Port Mapper Port 135
 Microsoft Integration Services Ephemeral Ports
Static Port
Intranet range of dynamic endpoints
Internet range of dynamic endpoints
 Network Authentication Ephemeral Ports
 Windows Management and Instrumentation ( WMI ) Ephemeral Ports
Static Ports
Intranet range of dynamic endpoints
Internet range of dynamic endpoints

Scheduled Task – Xcopy does not copy file

Background

We have a scheduled script that was not working as new files were not being copied over.

But, also there were no errors logged.

 

Script – Backup

Here is what the original script looks like.

Script – Backup ( Original )


set "app=xcopy"

set "_HOME_DIR=%CD%"

set "_DATABASE=DBBackup"
set "_srcFolder=\\DBPROD\SQLBackups\%_DATABASE%\%_DATABASE%.bak"
set "_destFolder=F:\Microsoft\SQLServer\SQLRestore"

@rem if Folder Log does not exist, please create it
if not exist log mkdir log

@rem Initiate xcopy
%app% %_srcFolder% %_destFolder% /D /Y /J


Script – Backup ( Revision )

Added code to capture and expose ERRORLEVEL.


set "app=xcopy"
set "_DATABASE=HRDB"
set "_srcFolder=\\DBPROD\SQLBackups\%_DATABASE%\%_DATABASE%.bak"
set "_destFolder=F:\Microsoft\SQLServer\SQLRestore"

@rem if Folder Log does not exist, please create it
if not exist log mkdir log

@rem Initiate xcopy
%app% %_srcFolder% %_destFolder% /D /Y /J

set _errLevel=%errorlevel%

if _errLevel neq 0 (
	echo _errLevel %_errLevel% 
	echo ErrorLevel %errorLevel% 
	exit /b %_errLevel%		 
)	

Output

Image

xcopyDoesNotCopyNewFiles

Text


D:\Scripts\RestoreDB>xcopy \\DBPROD\SQLBackups\HRDB\HRDB.bak F:\Microsoft\SQLServer\SQLRestore /D /Y /J
\\DBPROD\SQLBackups\HRDB\HRDB.bak
Sharing violation

D:\Scripts\RestoreDB>set _errLevel=4

D:\Scripts\RestoreDB>if _errLevel NEQ 0 (
echo _errLevel 4
 echo ErrorLevel 0
 exit /b 4
)
_errLevel 4
ErrorLevel 0

D:\Scripts\RestoreDB>set _errLevel=4

D:\Scripts\RestoreDB>if _errLevel NEQ 0 (
echo restoreDB_DBBackup::_errLevel 4
 echo restoreDB_DBBackup::ErrorLevel 0
 exit /b 4
)


Image

We are experiencing a sharing violation.

Diagnostics

Who is using the file?

Resource Monitor

In use files

RestoreMonitor-InuseFiles

 

SQL Server

sp_whoIsActive

sp_whoIsActive

 

sys.dm_exec_requests

Is there a SQL script that I can use to determine the progress of a SQL Server backup or restore process?
Answered By – Veldmuis
Link

Code


SELECT 
		 r.session_id
		,r.command
		,[Percent Complete]
			= CONVERT(NUMERIC(6,2),r.percent_complete)
		, [ETA Completion Time]
			= CONVERT(VARCHAR(20),DATEADD(ms,r.estimated_completion_time,GetDate()),20) 
		, [Elapsed Min]
			= CONVERT(NUMERIC(10,2),r.total_elapsed_time/1000.0/60.0) 
		, [ETA Min]
			= CONVERT(NUMERIC(10,2),r.estimated_completion_time/1000.0/60.0) 
		, [ETA Hours]
			= CONVERT(NUMERIC(10,2),r.estimated_completion_time/1000.0/60.0/60.0) 
		, [sqlText]
			= CONVERT(VARCHAR(1000)
				,(
					SELECT SUBSTRING(text,r.statement_start_offset/2,
						CASE WHEN r.statement_end_offset = -1 THEN 1000 ELSE (r.statement_end_offset-r.statement_start_offset)/2 END)
						FROM sys.dm_exec_sql_text(sql_handle))
				  )
FROM sys.dm_exec_requests r 
WHERE command IN ('RESTORE DATABASE','BACKUP DATABASE')

 

Output

sys-dm-exec-requests

Summary

Our problem is that a prior run has the file we are trying to copy via xcopy locked.

We can wait for the database restore to complete, or terminate it.

Also, xcopy with /D works fine … just make sure that you check the ERRORLEVEL.

 

Fastcopy – Scheduling through Scripting

Background

Have a need to schedule weekly restoration of Production DB unto BETA.

As our DB is a bit big, I just did not want to use DOS Copy/XCOPY and so was happy that fastcopy is already available on server.

Code

List of files

  1. Files.txt
    • full file names of each source file
    • Delimited by newline
  2. CopyFiles.cmd
    • Invokes the fastcopy.exe
    • Added options not to display error message as we will be scheduling on Server and so UI will be counter-productive
    • Added speed=7
      • Files are copied in parallel and if we are not careful, we can overwhelm network
  3. RestoreDB.cmd
    • Calls copyfiles.cmd
    • Pause
      • Added pause and left it commented out
      • Will enable it, if we need to debug MS Windows Task Scheduler

 

Files.txt


\\HRDBPROD\SQLBackups\HRDB\HRDB.bak

CopyFiles.cmd


set "app=D:\Downloads\FastCopy\FastCopy311_x64\fastcopy.exe"

@rem set "srcfile=D:\Scripts\SyncFiles\Backup\files.txt"
set "_HOME_DIR=%CD%"

set "_srcfile=files.txt"
set "_srcfileFull=%_HOME_DIR%\%_srcfile%"
set "_logFolder=%_HOME_DIR%\log\"

set "_destFolder=F:\Microsoft\SQLServer\SQLRestore"

@rem if Folder Log does not exist, please create it
if not exist log mkdir log

@rem Initiate Fastcopy
%app% /cmd=diff ^
	 /srcfile=%_srcfileFull% ^
         /auto_close /force_close ^
	 /estimate ^
	 /no_ui ^
	 /no_confirm_del ^
	 /no_confirm_stop ^
	 /log=true ^
	 /logfile=%_logFolder%\fastcopy.log ^
	 /filelog=%_logFolder%\filecopy.log ^
	 /speed=7 ^
	 /to=%_destFolder%"

RestoreDB.cmd

copyfiles.cmd
@rem pause

Fastcopy Log files

Fastcopy produces plentiful throughput and error info.

There are two file fastcopy.log and filecopy.log

fastcopy.log

Here is what fastcopy.log looks like:

fastcopy

 

Monitoring

Windows Task Manager

WindowsTaskManager-20160513-0918AM

Explanation

  1. Our process, fastcopy.exe, is using some I/O

Resource Monitor

One can monitor ongoing network and IO throughput via built-in MS Windows Resource Monitor.

Disk

ResourceMonitor-20160513-0847AM

Explanation

  1. Note that Image is not fastcopy.exe, but the system
  2. We are getting about 150MB/sec for our copying process
  3. Overall 180 MB/sec IO

 

Network

ResourceMonitor-NetworkIO-20160513-0907AM

 

Explanation

  1. Again, Image is not fastcopy.exe, but the system
  2. Remote Port is 445
  3. Marked Latency

It seems obvious that Network IO is not reflecting true and overall Network I/O

 

Scheduling

MS Windows Task Scheduler

Task – General

Customization

  1. Pay attention to “Configure for:
    • You are given the choices of
      • Windows 7, Windows Server 2008 R2
      • Windows Vista, Windows Server 2008
    • To avoid errors, please choose “Windows Server 2008” even on “Windows Server 2008/R2

 

Task-General

 

Task-General-ConfigureFor

 

Event Viewer

Looking through MS Windows Event Viewer is a great way to troubleshoot scheduled tasks.

The specific branch to look at is Microsoft-Windows-TaskScheduler/Operational

Branch

Branch-Microsoft-Windows

Branch-Microsoft-Windows-TaskScheduler-Operational

 

Filter

Again, there is a lot to choose from and so please filter on …

Filtering Elements

  1. Event ID :-
    • ID is 201
      • 201 means “Action Completed
    • User is the name the task is scheduled to “run as”

 

FilterCurrentLog

Sample Entries

Return Code – 4294967295

EventDetails-4294967295

Return Code – 1

EventDetails-1

Return Code – 0

EventDetails-0

 

Interpretation

Log – General Causation
 Task Scheduler successfully completed task “\Corp\Restore DB” , instance “{76835bc9-9cb6-4d08-82ae-1bbe783f4d71}” , action “C:\Windows\SYSTEM32\cmd.exe” with return code 4294967295. Invalid arguments passed to fast copy
 Task Scheduler successfully completed task “\Corp\Restore DB” , instance “{482d0822-91ea-43a3-abce-fd7a53d14e1d}” , action “C:\Windows\SYSTEM32\cmd.exe” with return code 1. The Task has been configured for “Windows 7, MS Windows 2008/R2“.  Please configure for “Windows Vista, MS Windows 2008“.
 Task Scheduler successfully completed task “\Corp\Restore DB” , instance “{a4c53952-7caa-49f9-8df2-85b11b259b13}” , action “C:\Windows\SYSTEM32\cmd.exe” with return code 0.  We are good

 

 

Performance Comparison

Microsoft Windows 2008/R2

Xcopy


set appLogger="D:\Program Files (x86)\GnuWin32\bin\tee.exe"
set _appLoggerOption=--append 
set "_HOME_DIR=%CD%"
set "_LOG_FOLDER=%CD%\LOG"
set "_LOG_FILE=%CD%\LOG\benchmark_%DATEANDTIME%.log"
set "_SRC_FILES=\\HRDB\f$\SQLBackups\*.bak"
set "_DEST_FOLDER=F:\Microsoft\SQLServer\SQLRestore\"
set "_OPTIONS_XCOPY=/D /Y"

if not exist %_LOG_FOLDER% mkdir %_LOG_FOLDER%

set DATESTAMP=%DATE:~10,4%_%DATE:~4,2%_%DATE:~7,2%
set TIMESTAMP=%TIME:~0,2%_%TIME:~3,2%_%TIME:~6,2%
set DATEANDTIME=%DATESTAMP%_%TIMESTAMP%

ECHO "Before Copy %DATEANDTIME%  .... " | %appLogger% %_appLoggerOption% %_LOG_FILE%

xcopy %_SRC_FILES%  %_DEST_FOLDER% %_OPTIONS_XCOPY% | %appLogger% %_appLoggerOption% %_LOG_FILE%

set DATESTAMP=%DATE:~10,4%_%DATE:~4,2%_%DATE:~7,2%
set TIMESTAMP=%TIME:~0,2%_%TIME:~3,2%_%TIME:~6,2%
set DATEANDTIME=%DATESTAMP%_%TIMESTAMP%

ECHO "After Copy %DATEANDTIME%" | %appLogger% %_appLoggerOption% %_LOG_FILE%

Resource Monitor

Disk I/O

xcopy-Disk-20160526-0939AM

 

Network I/O

xcopy-network-20160526-0939AM

 

Performance Monitor

PerfomanceMonitor-20160526-1212

 

Matrix

 

Tool Disk I/O Network I/O Time Taken
 fastcopy 165 MB/sec 4 Hours 30 minutes
 xcopy 110 MB/sec  115 MB/sec 1 Hour

 

 

Summary

I think the SMB protocol has been greatly optimized in MS Windows 2008/R2 and later OS and so this only applies to OSes prior to that version .

Even if then…

SQL Server – v2012 – High CPU // RESOURCE MONITOR

Background

Here I am trying to steal a bit of work on my brother’s computer and I noticed a Symantec – Norton Anti-Virus Alert.

The Alert states that an Application is using quite a bit of CPU.  And, so I am thinking go away, this is just a home PC used for Development Stuff (C#, but lately more Java Script Framework stuff).

Compared to my machines, his is good – 64-Bit, 2 CPUs, 6 GB of Memory.

So if it is my machine, I will ignore.  But, on his computer and knowing no one is doing anything on his machine, let us go see what is going on.

Why Symantec?

Well as I was comparing Anti Virus (AV) Rescue CD Vendors a couple of weeks ago, I found out that I can get Symantec’s Norton for free.  It is available for 6 months and your path is documented here:

So I know you ‘re saying and thinking big deal.  You use GfiSoft/AVG (http://free.avg.com/us-en/homepage, but my machine is a full-blown server (MS Windows 2012).  And, just not sure of how many vendors give us free AV for Server OSes.

Is it a real problem or just false alert from monitoring tool?

So I went ahead and clicked on the prompt and got a fuller message, like the one pasted:

Symantec-FileInsight

So it looks to be real enough.

Which MS SQL Server Processes is using up CPU?

Let us determine which SQL Server Processes are using up CPU…

;with cte
( 
   [totalCPU]
)
as
(
   select sum(cpu)
   from   master.dbo.sysprocesses
)

select 
	  tblSysprocess.spid
	, tblSysprocess.cpu
	, (tblSysprocess.cpu * 100) / cte.totalCPU 
            as [percentileCPU]
	, tblSysprocess.physical_io
	, tblSysprocess.memusage
	, tblSysprocess.cmd
	, tblSysProcess.lastwaittype
from   master.dbo.sysprocesses tblSysprocess
         cross apply cte
order by tblSysprocess.cpu desc
go

Output:

SessionProcessesOrderedByCPU


Explanation:

  • There we have it, by far a process registered with the name “RESOURCE MONITOR” is using up most of our CPU
  • And, it is waiting quite a bit on PREEMPTIVE_XE_CALLBACKEXECUTE
  • Wait types bearing the signature PREEMPTIVE_XX are well chronicled in CSS SQL Server Engineers – The SQL Server Wait Type Repository… ( http://blogs.msdn.com/b/psssql/archive/2009/11/03/the-sql-server-wait-type-repository.aspx )
  • The blog post promises more to come but for now it states that “Used to indicate a worker is running coded that is not under the SQLOS Scheduling Systems”

SQL Profiler …

Wanted to see what commands are coming across on what should be a very silent and un-used SQL Server Instance.

So launched SQL Profiler and at this time the only active requests is the “SQL Server “Reporting Services” trying to see if there is work to be done.

Access Windows Service Applet and stopped that service and changed it start mode to “manual”.

That took Reporting Services out of the way.

But, still nothing coming in over the wire via SQL Server Profiler.  I suppose that SQL Server System processes are not tracked.

 

Task Manager (Order by CPU)

Is CPU still busy?

TaskManger with SQL Service On Top

Yes, SQL Server is still on top taking up 20% of used CPU and the CPU itself  is 29% of max.

SQL Server Error Log

I am seeing a few entries stating memory pressure:

errorLog

Memory Problem

Jonathan Kehayias has the most accessible script out there to detect memory pressure, how often they are occurring, and how they are being sourced (Internal to SQL Server or System)

Identifying External Memory Pressure with dm_os_ring_buffers and RING_BUFFER_RESOURCE_MONITOR
http://www.sqlskills.com/blogs/jonathan/identifying-external-memory-pressure-with-dm_os_ring_buffers-and-ring_buffer_resource_monitor/


--System Memory Usage
SELECT 
   EventTime
 , record.value('(/Record/ResourceMonitor/Notification)[1]', 'varchar(max)') 
      as [Type]
 , record.value('(/Record/ResourceMonitor/IndicatorsProcess)[1]', 'int') 
      as [IndicatorsProcess]
 , record.value('(/Record/ResourceMonitor/IndicatorsSystem)[1]', 'int')
      as [IndicatorsSystem]
 , record.value('(/Record/MemoryRecord/AvailablePhysicalMemory)[1]', 'bigint') 
      AS [Avail Phys Mem, Kb]
 ,  record.value('(/Record/MemoryRecord/AvailableVirtualAddressSpace)[1]', 'bigint')       AS [Avail VAS, Kb]
FROM (
    SELECT
        DATEADD (ss, (-1 * ((cpu_ticks / CONVERT (float, ( cpu_ticks / ms_ticks )))
           - [timestamp])/1000), GETDATE()) AS EventTime,
        CONVERT (xml, record) AS record
    FROM sys.dm_os_ring_buffers
    CROSS JOIN sys.dm_os_sys_info
    WHERE ring_buffer_type = 'RING_BUFFER_RESOURCE_MONITOR') AS tab
ORDER BY EventTime DESC;

Output:

RING_BUFFER_RESOURCE_MONITOR

 

 Explanation:

  • Quite a lot of entries came in on 9/21/2013 @ 8:26 bearing the Type = RESOURCE_MEMPHYSICAL_LOW
  • Some are sourced by the SQL Server Process — IndicatorProcess != 0
  • But, most are sourced by the System — IndicatorSystem != 0

Task Manager (Order by Memory)

Who is using memory?

Task Manager - Highest Users of memory

Explanation:

  • So brother uses Firefox and it is good
  • I much prefer Chrome, but Google opens up a new process for Window\Tab and memory usage can get a bit out of hand; especially if you do not go back and close those windows
  • Let us close some of them

Diagnostic Queries

Here are a couple of other diagnostic queries

Jack Li
http://blogs.msdn.com/b/psssql/archive/2009/08/20/high-cpu-consumed-by-resource-monitor-due-to-low-virtual-memory.aspx


declare @curCPU int
declare @prevCPU int
declare @delta int
declare @msg varchar(max)

declare @Iteration int
declare @iMaxNumberofIterations int

set @curCPU =0
set @prevCPU = 0

set @Iteration = 1
set @iMaxNumberofIterations = 5

while (
	 (1 = 1)
	and (@Iteration <= @iMaxNumberofIterations+1)      
     ) 
begin  

    select @curCPU = SUM(cpu_time)  
    from sys.dm_exec_requests 
    where command like '%Resource%Monitor%' 

    set @delta = @curCPU - @prevCPU  
    set @prevCPU = @curCPU  

    if (@Iteration > 1)
    begin

	 set @msg = CAST(GETDATE() as varchar(20)) 
		+  ' -- delta in CPU in sec (wait time 60 sec, ignore first run): '
		+ CAST((@delta / 1000.00) as varchar(max))

	 raiserror (@msg, 10, 1) with nowait
    end

    waitfor delay '0:1:0'

    set @Iteration = @Iteration + 1

end

Output:

JackLi

Supporting Web Literature

High CPU Consumed by Resource Monitor Due to Low Virtual Memory

http://blogs.msdn.com/b/psssql/archive/2009/08/20/high-cpu-consumed-by-resource-monitor-due-to-low-virtual-memory.aspx

We have had a few customers who hit the issue where Resource Monitor consumed a significant portion of CPU in SQL Server 2008.
This is a known issue that has been fixed in SQL Server  2008 RTM CU 6  (http://support.microsoft.com/kb/968722/).
Next CU release of SQL Server 2008 SP1 will also include a fix.

We have only got reports on 32 bit Servers.   The root cause is that something like extended procedure, or COM (loaded by sp_OACreate) exhausted virtual address space of SQL SErver.  SQL Server then is under a low virtual memory state.  This cause Resource Monitor to do extra work.  The best way to address this is to address the virtual memory issue such as using 64 bit or troubleshooting the component that contributed to the memory usage.    To alleviate the CPU overhead, this fix will stop Resource Monitor from doing wasteful work if it detects the memory condition will not go away regardless how hard it works.

Advisory

  • Be careful what Applications you have running on a box that has MS SQL Server running
  • Consider setting “Min Server memory” though not sure that the SQL Instance will not do the gentlemanly thing of trying to respond to the OS requests to release memory
  • High CPUs are more often than not reflective of an over-exercised system.  In this case, due to low memory, SQL Server is being asked to help out the OS and see if it can give some memory back and it is dutifully responding

References

References – Setting Min\Max Memory

References – Identifying Memory Pressure

Microsoft – SQLServer – High CPU – SPID – 1

The last few days has being interesting for this particular MS SQL Server Instance.

One weekend afternoon I saw it so busy.  As a DBA sometimes you learn a bit more from under-provisioned machines.  I hope I can learn a bit from this one.

Here is what I found so far:

  • Take it off the Internet.  It was being attacked and their were numerous entries of failed logins (sa – username no less)
  • Configured Symantec AntiVirus to exclude checking Network and MS SQL Server files; files with “known” SQL Server Extensions (mdf, ndf, ldf)
  • Configured Symantec AV to exclude C:\WINDOWS\system32\NavLogon.dll (http://www.symantec.com/connect/forums/rtvscanexe-high-memory-usage-sav)

Though these changes are in place and server restarted, some simple queries were still timing out and SQL Instance was still registering high CPU even when no queries were being processed.

From experience I know that Full-Text Search sometimes works in the background, but that did not appear to be the case at this time.

So ran a simple query looking for which Session was using high CPU.

SELECT tblSysProcess.cpu as processor, *
from master.dbo.sysprocesses tblSysProcess
order by tblSysProcess.cpu desc

And, here is what we got back:

So after a couple of days running this same query, SPID 1 stayed on top of our list of top CPU users.

Took the easy way and Googled for “SQL Server High CPU SP 1” and ended up where it made sense.

High CPU Usage for SPID #1 (SQL Server 2008)

http://social.msdn.microsoft.com/Forums/en/sqldatabaseengine/thread/bd9e4476-39e2-46f8-a426-bef9601f9ad9

In this particular post, Leila (posting as Amin S) did all the background work and asked the same question I was going to ask:

Hi all,

I have SQL Server 2008 (SP1) on Windows Server 2003 (R2).

Sometimes this server becomes very slow and task manager indicates high CPU usage. I added all SQL Server related threads in Performance Monitor to monitor Processor Time% per Thread.

By correlating Thread ID and KPID, I realized that this abnormal increase for CPU usage is always related to SPID #1 which is system process thus cannot be killed. After restarting SQL Server, everything works fine but after few minutes, exactly the same situation happens and SPID #1 consumes almost whole capacity of processors. I tried different ways to determine the source of problem. First I took our main database (financial) to offline state, and turned it back online. All users for this database were killed and SPID #1 released the processor! Sometimes this solution does not work and I use a loop to kill all users in every database. This works in most of times, but today I encountered situation that even killing all users did not work and I forced to restart the SQL Server engine several times during the day.

I was wondering if somebody could suggest a test for further observation or any solution for this issue.

Many thanks in advance,

Leila

Like I said, Leila did all the work.  A day later (s)he came back and answered his\her own question.

Leilia found a MS Link:

FIX: The CPU usage of the resource monitor is very high when the virtual memory is running low in SQL Server 2008

http://support.microsoft.com/kb/968722/en-us

And so it seems the next thing to do is determine the OS Thread ID for the “Resource Monitor”. This query will do so:

SELECT 
		  STasks.session_id
		, SThreads.os_thread_id
		, b.command 
FROM sys.dm_os_tasks AS STasks
	INNER JOIN sys.dm_os_threads AS SThreads 
		ON STasks.worker_address = SThreads.worker_address
	LEFT OUTER JOIN sys.dm_exec_requests b 
		ON STasks.session_id = b.session_id
WHERE STasks.session_id IS NOT NULL 
and command = 'RESOURCE MONITOR' 
ORDER BY sthreads.os_thread_id

I tried using “Performance Monitor” \ Perfmon. But, really not able to filter by Thread ID.
Probably should have tried SysInternals \ ProcessMonitor; as it has ability to observe at Thread ID, as well.

Need to address Memory Issue.

But, I think I have enough comfort level, that nothing will be hurt by upgrading from MS SQL Server v2008 SP1 to SP2.

Off we go.

 

References: