SQL Server – BCP – Memory Constraints

Background

As a follow-up to a recent post titled “SQLServer – BCP Optimization via Running Multiple Instances in Parallel” ( https://danieladeniji.wordpress.com/2015/07/22/sqlserver-bcp-optimization-via-running-multiple-instances-in-parallel/ ), let us see what happens on the DB Server as one pushes it a bit.
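
For context, the parallel push in question looks roughly like the sketch below; the table, file, and server names are placeholders, and Windows’ start command is used to fan the loads out concurrently.

rem Hedged sketch: fan out several bcp loads at once (table/file names are placeholders)
start "bcp load 1" bcp Dev.dbo.tblFeedOne   in c:\feeds\tblFeedOne.data   -T -n -b30000 -S .\SQLEXPRESS_V2014
start "bcp load 2" bcp Dev.dbo.tblFeedTwo   in c:\feeds\tblFeedTwo.data   -T -n -b30000 -S .\SQLEXPRESS_V2014
start "bcp load 3" bcp Dev.dbo.tblFeedThree in c:\feeds\tblFeedThree.data -T -n -b30000 -S .\SQLEXPRESS_V2014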

Monitoring

Activity Manager

Waits

[Screenshot: Resource Waits – Memory]

Explanation:

  • We can see that we are really waiting on Memory

Processes

[Screenshot: Process Waits]

Explanation:

  • A more granular view that shows the same thing: our individual BCP requests are suspended as they wait for their memory requests to be fulfilled

WhoIsActive

Running Adam Machanic’s sp_WhoIsActive shows…

[Screenshot: sp_WhoIsActive output]

Explanation:

  1. No Blocking Sessions
  2. Relative to writes, a lot of reads on the active BCP sessions
  3. Other BCP Sessions are waiting on RESOURCE_SEMAPHORE (see the wait-statistics sketch below)
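
To corroborate what Activity Monitor and sp_WhoIsActive are showing, one can also tally memory-grant waits straight from the wait-statistics DMV. A minimal sketch; keep in mind sys.dm_os_wait_stats accumulates since instance start, so on a busy server one would look at deltas:

SELECT
      wait_type
    , waiting_tasks_count
    , wait_time_ms
    , max_wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type = 'RESOURCE_SEMAPHORE';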

Dynamic Management Views

sys.dm_exec_query_memory_grants

Let us do a little bit more work.

Not that much more, as it is Friday; we will just align memory requests to the corresponding session’s query.

Query:

SELECT
          [sqlText] = dmvESQT.[text]
        , dmvDMEQMG.session_id
        , dmvDMEQMG.request_time
        , dmvDMEQMG.grant_time
        , [waitInMinutes] =
             case
                when dmvDMEQMG.grant_time is null then
                   datediff(minute, dmvDMEQMG.request_time, getdate())
                else null
             end
        , dmvDMEQMG.requested_memory_kb
        , requested_memory_mb = (dmvDMEQMG.requested_memory_kb / 1024)
        , dmvDMEQMG.granted_memory_kb
        , dmvDMEQMG.required_memory_kb
        , dmvDMEQMG.used_memory_kb
        , dmvDMEQMG.max_used_memory_kb
        , dmvDMEQMG.query_cost
        , dmvDMEQMG.resource_semaphore_id
        , dmvS.login_name

FROM sys.dm_exec_query_memory_grants dmvDMEQMG

    LEFT OUTER JOIN sys.dm_exec_sessions dmvS

       ON dmvDMEQMG.session_id = dmvS.session_id

    outer apply sys.dm_exec_sql_text(dmvDMEQMG.sql_handle) dmvESQT

Output:

[Screenshot: requested memory grants, 2015-07-29 09:31 AM]

Explanation:

  • Memory Requested
    • We can see that, depending on the targeted table and the size of the data feed, the requested memory and the query cost vary a bit
    • We can also see that the requested memory, in our case, is about 885 MB; a little below 1 GB
    • Once we take into account that this request is made for each BCP session, we need a few GBs of unused memory to simultaneously service our concurrent requests (the semaphore-level query below shows the overall budget)
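
To see the overall memory budget that these per-session grants draw against, one can also look one level up, at the semaphore itself. A minimal sketch against sys.dm_exec_query_resource_semaphores:

SELECT
      resource_semaphore_id
    , target_memory_kb
    , total_memory_kb
    , available_memory_kb
    , granted_memory_kb
    , grantee_count
    , waiter_count
FROM sys.dm_exec_query_resource_semaphores;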

sys.dm_tran_locks

Let us see what types of locks are being acquired.

Query:


set transaction isolation level read uncommitted;

SELECT

           lck.resource_type
         , lck.request_mode
         , lck.request_status
         , dmvS.login_name
         , lck.request_session_id
         , par.object_id
         --, object_schema_name(par.object_id) as schemaName
         --, object_name(par.object_id) as objectName
         , [schema] = [schema].name
         , objectName = obj.name
         , COUNT(*) number_locks

FROM sys.dm_tran_locks lck with (nolock)

        INNER JOIN sys.partitions par  with (nolock)

           ON lck.resource_associated_entity_id = par.hobt_id

        INNER JOIN sys.objects obj  with (nolock)

            ON par.object_id = obj.object_id

        INNER JOIN sys.schemas [schema] with (nolock)

            ON obj.schema_id = [schema].schema_id

        INNER JOIN sys.dm_exec_sessions dmvS  with (nolock)

            ON lck.request_session_id = dmvS.session_id

where  [schema].[name] not in ('sys')

GROUP BY
              lck.resource_type
            , lck.request_mode
            , lck.request_status
            , dmvS.login_name
            , lck.request_session_id
            , par.object_id
            , [schema].name
            , obj.name 

Output:

[Screenshot: page-level locks]

Explanation:

  • It looks like, though we requested Table Locks, we are getting more granular locks; in this case, PAGE locks (a TABLOCK sketch follows below)
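
For completeness, bcp can be asked for a table-level lock outright via the -h hint. A sketch, reusing the server/database variables and placeholder table name from this series’ scripts:

rem -h "TABLOCK" requests a table lock for the duration of the bulk load
bcp %DBTargetDB%.dbo.tblStaffingDateTime in %TEMP%\dbo.tblStaffingDateTime.data -T -n -h "TABLOCK" -b%BATCH_SIZE% -S%DBTargetServer%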

Summary

Again, I will suggest that you have yourself a nice LAB to tinker with things.

Your DBA might not grant your request for “VIEW SERVER STATE“.
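
For reference, the permission itself is a one-liner; [yourLogin] below is a placeholder for an actual server login:

-- Grant the server-level permission needed to query these DMVs
-- ([yourLogin] is a placeholder)
USE [master];
GO
GRANT VIEW SERVER STATE TO [yourLogin];
GO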

But, nevertheless, find a way around and earn your “Come up“.

Dedicated

Dedicating Sarah McLachlan’s Shipwreck to the beautiful warriors, Jillian Johnson and Mayci Breaux, who “laid down” viewing the movie Trainwreck in Lafayette, Louisiana, last night.

SQL Server – BulkCopy (BCP) – Identity Columns

Background

I am copying data from Production to my personal Development box, but my SQL Join Queries are not yielding any data.

Code

Original Code


SETLOCAL

 set DBTargetServer=.\SQLEXPRESS_V2014
 set DBTargetDB=Dev
 set BATCH_SIZE=30000

 sqlcmd -e -b -Q"truncate table dbo.tblStaffingDateTime" -S%DBTargetServer% -d%DBTargetDB%

 timer.exe

   bcp %DBTargetDB%.dbo.tblStaffingDateTime in %TEMP%\dbo.tblStaffingDateTime.data -T -n -b%BATCH_SIZE% -S%DBTargetServer%

 timer.exe /S

ENDLOCAL 

 

Revised Code

I traced the problem back to not preserving our Identity values.

When issuing bcp, we use -E to insist that our supplied identity values are used, rather than having the system generate its own.


SETLOCAL

 set DBTargetServer=.\SQLEXPRESS_V2014
 set DBTargetDB=Dev
 set BATCH_SIZE=30000

 sqlcmd -e -b -Q"truncate table dbo.tblStaffingDateTime" -S%DBTargetServer% -d%DBTargetDB%

 timer.exe

   rem -E Keep Identity Values ON
   bcp %DBTargetDB%.dbo.tblStaffingDateTime in %TEMP%\dbo.tblStaffingDateTime.data -T -n -E -b%BATCH_SIZE% -S%DBTargetServer%

 timer.exe /S

ENDLOCAL 

Metrics

Using the Gammadyne Timer utility ( http://www.gammadyne.com/cmdline.htm#timer ), we compared how long it takes to insert 150 thousand records.

 -E Skipped :- Does not Preserve Identity Values

[Screenshot: timing, -E skipped]

 -E Added :- Preserves Identity Values

[Screenshot: timing, -E added]

 

Conclusively, when we preserve our Identity column, our throughput is about 15% less.
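
A quick way to confirm that the identity values actually survived the load is to compare the identity high-water mark against the column’s maximum; a sketch, assuming the identity column is named [id]:

-- [id] is a placeholder for the table's identity column
SELECT
      [identCurrent] = IDENT_CURRENT('dbo.tblStaffingDateTime')
    , [maxID]        = (SELECT MAX([id]) FROM dbo.tblStaffingDateTime);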

Dedicated

Will like to go another way with this.  But, got to claim my own.  And, this morning it is Duke Blue Devils, the National Collegiate Basketball Champions, 2015.

But, even then it is April.  And, yet still considered going another road, Dave Loggins’ Pieces of April.

But, will rest with Touch-Of-Pennsylvania ( https://www.youtube.com/watch?v=9_sQv2hfVxg )

Microsoft – SQL Server – Bulking Copying data using XML Template File

Background

Preparing to perform a bit of load testing against MS SQL Server, we downloaded Spawner, an MS Windows desktop application that allows one to generate test data.

We were able to generate a little over 1 million records.

 

Generate test data

 

Spawner

Here is what the Spawner desktop looks like:

[Screenshot: Spawner]

 

 

Create Table


use [DBLabInMemory]
go

set noexec off
go


/*
 drop table [dbo].[personTraditionalClustered]
*/
if object_id('[dbo].[personTraditionalClustered]') is not null
begin
 set noexec on
end
go



create table [dbo].[personTraditionalClustered]
(

 [firstname] [varchar](40) 
      COLLATE Latin1_General_100_BIN2 NOT NULL

 , [lastname] [varchar](40) 
      COLLATE Latin1_General_100_BIN2 NOT NULL

 , [emailAddress] [varchar](40) 
      COLLATE Latin1_General_100_BIN2 NOT NULL

 , [phoneNumber] [varchar](40) 
      COLLATE Latin1_General_100_BIN2 NOT NULL

 , [addressStreet] [varchar](100) 
      COLLATE Latin1_General_100_BIN2 NOT NULL

 , [addressCity] [varchar](40) 
      COLLATE Latin1_General_100_BIN2 NOT NULL

 , [addressState] [varchar](40) 
      COLLATE Latin1_General_100_BIN2 NOT NULL

 , [addressPostalCode] [varchar](40) 
      COLLATE Latin1_General_100_BIN2 NOT NULL

 , [dateofBirth] datetime null


 , [uniqueID] uniqueIdentifier not null
     constraint defaultPersonTraditionalUniqueID 
        default newid() 



 , constraint PK_PersonTraditional
     primary key ([uniqueID])


 , index idx_Firstname nonclustered
     ([firstname])

 , index idx_Lastname nonclustered
     ([lastname])

 , index idx_AddressCity nonclustered
     ([addressCity])


 , index idx_AddressState nonclustered
     ([addressState])

 , index idx_AddressPostalCode nonclustered
     ([addressPostalCode])

 , index idx_EmailAddress nonclustered
     ([emailAddress])

 , index idx_DateofBirth nonclustered
     ([dateofBirth])
 

)
go

set noexec off
go

Recommendation

As a recommendation, I suggest that one follow this strategy:

  • Create table
  • Insert test data into the table
  • Create template file
  • Customize template file (separators, column ordering, number of columns)
  • Export a few trial records
  • Experiment with importing data

 

Create Table

Create the database table; in our case, we created the table earlier in this post.

Insert test data

Inserting test data is useful, as it helps us determine whether we have the correct data types, column lengths, and nullability; a sketch follows.
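
Even a single hand-written row will flush out length and nullability problems before the big load; the values below are made up:

insert into [dbo].[personTraditionalClustered]
(
      [firstname], [lastname], [emailAddress], [phoneNumber]
    , [addressStreet], [addressCity], [addressState], [addressPostalCode]
    , [dateofBirth]
)
values
(
      'John', 'Doe', 'john.doe@example.com', '555-0100'
    , '1 Main St', 'Springfield', 'CA', '90210'
    , '1980-01-01'
);
-- uniqueID is omitted; it is filled in by the newid() default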

 

Create XML Template file

Syntax

bcp <table> format nul -x -c -f <format-file> -S <server-name> -d <database-name> -T -q

 

Sample

bcp dbo.personTraditionalClustered format nul -x -c 
   -f targetFileTemplate.xml -S (local) -d DBLabInMemory -T -q 

This is sufficient when the incoming data file has all the columns that are in the destination table.  If not all of the columns are present, or if they are not in the same order, I suggest one use an editor and adjust the created format file.

The error message that we are trying to avoid is stated below:

When we have more columns in SQL Server than in the datafile

Unexpected EOF encountered in BCP data-file

 

Customize format file

Here is what our format file looks like:

<?xml version="1.0"?>
<BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
 <RECORD>
 <FIELD ID="1" xsi:type="CharTerm" TERMINATOR="\t" MAX_LENGTH="40" COLLATION="Latin1_General_100_BIN2"/>
 <FIELD ID="2" xsi:type="CharTerm" TERMINATOR="\t" MAX_LENGTH="40" COLLATION="Latin1_General_100_BIN2"/>
 <FIELD ID="3" xsi:type="CharTerm" TERMINATOR="\t" MAX_LENGTH="40" COLLATION="Latin1_General_100_BIN2"/>
 <FIELD ID="4" xsi:type="CharTerm" TERMINATOR="\t" MAX_LENGTH="40" COLLATION="Latin1_General_100_BIN2"/>
 <FIELD ID="5" xsi:type="CharTerm" TERMINATOR="\t" MAX_LENGTH="100" COLLATION="Latin1_General_100_BIN2"/>
 <FIELD ID="6" xsi:type="CharTerm" TERMINATOR="\t" MAX_LENGTH="40" COLLATION="Latin1_General_100_BIN2"/>
 <FIELD ID="7" xsi:type="CharTerm" TERMINATOR="\t" MAX_LENGTH="40" COLLATION="Latin1_General_100_BIN2"/>
 <FIELD ID="8" xsi:type="CharTerm" TERMINATOR="\t" MAX_LENGTH="40" COLLATION="Latin1_General_100_BIN2"/>
 <FIELD ID="9" xsi:type="CharTerm" TERMINATOR="\t" MAX_LENGTH="24"/>
 <FIELD ID="10" xsi:type="CharTerm" TERMINATOR="\r\n" MAX_LENGTH="37"/>
 </RECORD>
 <ROW>
 <COLUMN SOURCE="1" NAME="firstname" xsi:type="SQLVARYCHAR"/>
 <COLUMN SOURCE="2" NAME="lastname" xsi:type="SQLVARYCHAR"/>
 <COLUMN SOURCE="3" NAME="emailAddress" xsi:type="SQLVARYCHAR"/>
 <COLUMN SOURCE="4" NAME="phoneNumber" xsi:type="SQLVARYCHAR"/>
 <COLUMN SOURCE="5" NAME="addressStreet" xsi:type="SQLVARYCHAR"/>
 <COLUMN SOURCE="6" NAME="addressCity" xsi:type="SQLVARYCHAR"/>
 <COLUMN SOURCE="7" NAME="addressState" xsi:type="SQLVARYCHAR"/>
 <COLUMN SOURCE="8" NAME="addressPostalCode" xsi:type="SQLVARYCHAR"/>
 <COLUMN SOURCE="9" NAME="dateofBirth" xsi:type="SQLDATETIME"/>
 <COLUMN SOURCE="10" NAME="uniqueID" xsi:type="SQLUNIQUEID"/>
 </ROW>
</BCPFORMAT>


 

There are a couple of changes we will be making to our template file:

  • Because our input data file does not have the uniqueID column in it, we will remove the corresponding lines from the created format file.
  • We will also change our Terminator from \t (tab) to ||

Here is what our new format file looks like:

<?xml version="1.0"?>
<BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
 <RECORD>
 <FIELD ID="1" xsi:type="CharTerm" TERMINATOR="||" MAX_LENGTH="40" COLLATION="Latin1_General_100_BIN2"/>
 <FIELD ID="2" xsi:type="CharTerm" TERMINATOR="||" MAX_LENGTH="40" COLLATION="Latin1_General_100_BIN2"/>
 <FIELD ID="3" xsi:type="CharTerm" TERMINATOR="||" MAX_LENGTH="40" COLLATION="Latin1_General_100_BIN2"/>
 <FIELD ID="4" xsi:type="CharTerm" TERMINATOR="||" MAX_LENGTH="40" COLLATION="Latin1_General_100_BIN2"/>
 <FIELD ID="5" xsi:type="CharTerm" TERMINATOR="||" MAX_LENGTH="100" COLLATION="Latin1_General_100_BIN2"/>
 <FIELD ID="6" xsi:type="CharTerm" TERMINATOR="||" MAX_LENGTH="40" COLLATION="Latin1_General_100_BIN2"/>
 <FIELD ID="7" xsi:type="CharTerm" TERMINATOR="||" MAX_LENGTH="40" COLLATION="Latin1_General_100_BIN2"/>
 <FIELD ID="8" xsi:type="CharTerm" TERMINATOR="||" MAX_LENGTH="40" COLLATION="Latin1_General_100_BIN2"/>
 <FIELD ID="9" xsi:type="CharTerm" TERMINATOR="\r\n" MAX_LENGTH="24"/>
 </RECORD>
 <ROW>
 <COLUMN SOURCE="1" NAME="firstname" xsi:type="SQLVARYCHAR"/>
 <COLUMN SOURCE="2" NAME="lastname" xsi:type="SQLVARYCHAR"/>
 <COLUMN SOURCE="3" NAME="emailAddress" xsi:type="SQLVARYCHAR"/>
 <COLUMN SOURCE="4" NAME="phoneNumber" xsi:type="SQLVARYCHAR"/>
 <COLUMN SOURCE="5" NAME="addressStreet" xsi:type="SQLVARYCHAR"/>
 <COLUMN SOURCE="6" NAME="addressCity" xsi:type="SQLVARYCHAR"/>
 <COLUMN SOURCE="7" NAME="addressState" xsi:type="SQLVARYCHAR"/>
 <COLUMN SOURCE="8" NAME="addressPostalCode" xsi:type="SQLVARYCHAR"/>
 <COLUMN SOURCE="9" NAME="dateofBirth" xsi:type="SQLDATETIME"/>
 </ROW>
</BCPFORMAT>


 

Export a few trial records

Let us export a few records:

bcp "select top 10 [firstname],[lastname],[emailAddress],[phoneNumber],[addressStreet],[addressCity],[addressState],[addressPostalCode],[dateofBirth] from [dbo].[personTraditionalClustered] with (nolock)" queryout extractedData_v4.txt -c -S (local) -d DBLabInMemory -T

 

In the example above there are a couple of things one should note:

  • We are only taking out some data; just the first 10 records
  • We have also restricted the columns to the ones we have in our input file

 

Insert test data

Let us perform some trial runs

sqlcmd -S (local) -d DBLabInMemory -E -Q "truncate table [dbo].[personTraditionalClustered]"

bcp [dbo].[personTraditionalClustered] in datafile_v4.txt -f fileTemplate_v4.xml -S (local) -d DBLabInMemory -e errfile.log -T -F 2 -L 4

 

  • In our sample data file, we have added a header line, and so we need -F 2 to indicate that raw data starts at line 2 (a hypothetical sample file is sketched below)
  • We have also added -L 4 to indicate that the import should end at line 4
  • And so, in essence, we are only importing 3 lines
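
For illustration, a hypothetical datafile_v4.txt might look like the following: a header line (skipped via -F 2), || field separators per our customized format file, and \r\n row terminators; all values are made up:

firstname||lastname||emailAddress||phoneNumber||addressStreet||addressCity||addressState||addressPostalCode||dateofBirth
John||Doe||john.doe@example.com||555-0100||1 Main St||Springfield||CA||90210||1980-01-01
Jane||Smith||jane.smith@example.com||555-0101||2 Oak Ave||Shelbyville||CA||90211||1982-02-02
Jim||Brown||jim.brown@example.com||555-0102||3 Elm Rd||Ogdenville||CA||90212||1984-03-03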

 

Actual Run

A couple of changes we will be making in the actual run:

  • Added -b 50000 to set our batch size to 50,000 rows
  • Dropped the -F 2 and -L 4 trial limits, so that the full file is loaded

 

sqlcmd -S (local) -d DBLabInMemory -E -Q "truncate table [dbo].[personTraditionalClustered]"

bcp [dbo].[personTraditionalClustered] in datafile_v4.txt -f fileTemplate_v4.xml -S (local) -d DBLabInMemory -e errfile.log -T -b 50000

 

Output:

Starting copy...
50000 rows sent to SQL Server. Total sent: 50000
50000 rows sent to SQL Server. Total sent: 100000
50000 rows sent to SQL Server. Total sent: 150000
50000 rows sent to SQL Server. Total sent: 200000
50000 rows sent to SQL Server. Total sent: 250000
50000 rows sent to SQL Server. Total sent: 300000
50000 rows sent to SQL Server. Total sent: 350000
50000 rows sent to SQL Server. Total sent: 400000
50000 rows sent to SQL Server. Total sent: 450000
50000 rows sent to SQL Server. Total sent: 500000
50000 rows sent to SQL Server. Total sent: 550000
50000 rows sent to SQL Server. Total sent: 600000
50000 rows sent to SQL Server. Total sent: 650000
50000 rows sent to SQL Server. Total sent: 700000
50000 rows sent to SQL Server. Total sent: 750000
50000 rows sent to SQL Server. Total sent: 800000
50000 rows sent to SQL Server. Total sent: 850000
50000 rows sent to SQL Server. Total sent: 900000
50000 rows sent to SQL Server. Total sent: 950000
50000 rows sent to SQL Server. Total sent: 1000000
50000 rows sent to SQL Server. Total sent: 1050000
50000 rows sent to SQL Server. Total sent: 1100000
50000 rows sent to SQL Server. Total sent: 1150000

1160001 rows copied.
Network packet size (bytes): 4096
Clock Time (ms.) Total : 160759 Average : (7215.78 rows per sec.)
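
A quick sanity check against the row count bcp reported:

select [rowCount] = count(*) from [dbo].[personTraditionalClustered] with (nolock);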

 

Version Control

Version Control – Github

Files posted to github @ https://github.com/DanielAdeniji/MSSSQLBCPFormatFileXML

 

We had a little hiccup, as our data file exceeded the 100 MB limit.



[dadeniji@adelia XML]$ git push origin master
Counting objects: 10, done.
Delta compression using up to 2 threads.
Compressing objects: 100% (10/10), done.
Writing objects: 100% (10/10), 52.08 MiB | 692 KiB/s, done.
Total 10 (delta 1), reused 0 (delta 0)
remote: error: GH001: Large files detected.
remote: error: Trace: b9a986aaae560141bf94f2213cdb5c18
remote: error: See http://git.io/iEPt8g for more information.
remote: error: File datagenSimple.txt is 120.97 MB; this exceeds GitHub's file size limit of 100 MB
To git@github.com:DanielAdeniji/MSSSQLBCPFormatFileXML.git
! [remote rejected] master -> master (pre-receive hook declined)
error: failed to push some refs to 'git@github.com:DanielAdeniji/MSSSQLBCPFormatFileXML.git'


 

To address this, we tarred the file and removed the original file from our local git repository.

Tarred file

tar -zcvf datagenSimple.tar.gz datagenSimple.txt

Remove file

git rm --cached datagenSimple.txt

 

Added file (datagenSimple.txt) to .gitignore

vi .gitignore
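
The .gitignore entry is just the bare file name:

datagenSimple.txt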

 

Added file (datagenSimple.tar.gz) to local git

git add datagenSimple.tar.gz

 

Commit file

git commit

 

Output:

[dadeniji@adelia XML]$ git commit -a -m 'First commit'
[master ca1d17a] First commit
2 files changed, 0 insertions(+), 1160002 deletions(-)
create mode 100644 datagenSimple.tar.gz
delete mode 100644 datagenSimple.txt

 

Push to remote repository

git push origin master

 

But, it was still trying to push the removed file.

 

Remove file from history

We used David Underhill’s script (git-remove-history.sh) to permanently remove the oversized file from history.  BTW, the shell script is available @ http://dound.com/2009/04/git-forever-remove-files-or-folders-from-history/

 

sh /tmp/git-remove-history.sh datagenSimple.txt

Output:

[dadeniji@adelia XML]$ sh /tmp/git-remove-history.sh datagenSimple.txt
Rewrite 32420099006159bf2413542475cdc556167503d7 (1/2)rm 'datagenSimple.txt'
Rewrite ca1d17a66e273b7ceca218ff2e24392913b36d22 (2/2)
Ref 'refs/heads/master' was rewritten
Counting objects: 16, done.
Delta compression using up to 2 threads.
Compressing objects: 100% (16/16), done.
Writing objects: 100% (16/16), done.
Total 16 (delta 5), reused 0 (delta 0)

 

Push to git

Syntax:


[dadeniji@adelia XML]$ git push origin master

Output:


[dadeniji@adelia XML]$ git push origin master
Counting objects: 12, done.
Delta compression using up to 2 threads.
Compressing objects: 100% (10/10), done.
Writing objects: 100% (12/12), 52.04 MiB | 691 KiB/s, done.
Total 12 (delta 2), reused 11 (delta 2)
remote: warning: GH001: Large files detected.
remote: warning: See http://git.io/iEPt8g for more information.
remote: warning: File datagenSimple.tar.gz is 52.02 MB; 
this is larger than GitHub's recommended maximum file size of 50 MB
To git@github.com:DanielAdeniji/MSSSQLBCPFormatFileXML.git
* [new branch] master -> master

 
