There are no white flags here. We are mostly going to repeat work already done and placed in the public domain by others.
As our cornerstone, we will rest on the work by Michael J. Swart. Michael gave credit to earlier work done by Sam Saffron.
The vexing problem was how to avoid Primary Key Violations errors in UPSERT transactions.
What are UPSERT statements?
With Wikipedia as a fountain of knowledge, there is very little rationale to think about definitions.
Here is it’s definition of Upserts:
A relational database management system uses SQL MERGE (also called upsert) statements to INSERT new records or UPDATE existing records depending on whether or not a condition matches. It was officially introduced in the SQL:2003 standard, and expanded in the SQL:2008 standard.
Why Michael J. Swart’s Work?
Michael is a rare bread has he covers both Database ( SQL Server ) and Application Development, in this case C#, with clear and concise approaches.
He provides ample background on the problem, the more common and accessible solution, why that solution might not work, along with what works.
The blog posting that we will reference is :
Mythbusting: Concurrent Update/Insert Solutions
Does Method X really perform Insert/Update queries concurrently, accurately and without errors?
What is the problem?
The scenario is that data comes in and if the data primary key exists, we update the data. If the data does not exist, we add it.
Unfortunately, a database is a shared repository and things might have changed between the time the question of existence was answered and when the the actual operation, INSERT or UPDATE, occurs.
We are assuming that an UPDATE is an atomic operation and that we are simply adding to existing data in the DB.
On the other hand, with inserts, we determined a matching primary key does not exist and it is thus safe to add it. But, in between the time we checked, another session checked, as well. And, now two or more sessions feel it is safe to insert a record bearing the same primary key.
Also, notice that if the key is automatically generated by the system, i.e. Identity, NEWID, etc, the system is mostly responsibly to ensure uniqueness and ever-increasing data.
- Specifies that update locks are to be taken and held until the transaction completes. UPDLOCK takes update locks for read operations only at the row-level or page-level. If UPDLOCK is combined with TABLOCK, or a table-level lock is taken for some other reason, an exclusive (X) lock will be taken instead.
- Occurs when a task is waiting to acquire a NULL lock on the current key value, and an Insert Range lock between the current and previous key. A NULL lock on the key is an instant release lock.
- Read Configuration file
- Start Stopwatch
- Clean up database table
- Loop Start
- Create local DB Connection
- Open DB Connection
- Create local DB Command
- Set DB Command Connection handle to DB Connection (opened earlier)
- Set DB Connection payload to our payload
- Set DB Connection.CommandType to Stored Procedure
- Add Parameter (@method) = MethodID
- Add Parameter (@method) = Iterator / 10
- If Client side Transaction requested, initiate client-side transaction handling
- Begin Asynchronous Execution of query, pass along
- Callback procedure
- Command Object
- Callback Procedure
- Accept Command Object
- End Execution of Command Object
- If Command Object Processing failed:
- Increment Counter
- Close Command Object
- Increment Number of Attempts
- Loop End
- End Stopwatch
- Get Time Elapsed
- Query DB for number of successfully processed records
- Display summary
Code availed in Github @ https://github.com/DanielAdeniji/SQLServerLockingAndConcurrency.
here are the revisions we made to Michael J. Swart’s original code
- Adder timer to timer various iterations
- Client side transaction support
Application Operating Modes
Here is our metrics …
- Our most stable and performant are
- Serializable along with an UPDLOCK Lock on the Select statement
- Read Committed with UPDLOCK and HOLDLOCK on the Select statement
- Our least performant is the serializable without the UPDLOCK Lock
- Our most stable and performant are
- We experienced quite a bit of deadlocks with every other path when we ommitted the UPDLOCK hint
Dynamic Management Views
To get a quick tally of Transaction Isolation levels currently in-use.
select dmvDES.transaction_isolation_level , [literal] = case dmvDES.transaction_isolation_level when 0 then 'Unspecified' when 1 then 'Read Uncomitted' when 2 then 'Read Comitted' when 3 then 'Repeatable' when 4 then 'Serializable' when 5 then 'Snapshot' end , [Count] = count(*) from sys.dm_exec_sessions dmvDES --leave out background sessions where dmvDES.session_id > 50 group by dmvDES.transaction_isolation_level
/* Rohit Garg EXPLORE & EXPERIENCE THE MSSQL DMV-3 : What is currently going on ?……..sys.dm_exec_requests http://mssqlfun.com/2013/04/01/dmv-3-what-is-currently-going-on-sys-dm_exec_requests-2/ */ SELECT R.SESSION_ID , R.[STATUS] , R.TRANSACTION_ISOLATION_LEVEL , TRANSACTION_ISOLATION_LEVEL_NAME = CASE R.TRANSACTION_ISOLATION_LEVEL WHEN 0 THEN 'UNSPECIFIED' WHEN 1 THEN 'READUNCOMITTED' WHEN 2 THEN 'READCOMMITTED' WHEN 3 THEN 'REPEATABLE' WHEN 4 THEN 'SERIALIZABLE' WHEN 5 THEN 'SNAPSHOT' ELSE CAST(R.TRANSACTION_ISOLATION_LEVEL AS VARCHAR(32)) END , CURRENT_WAIT_TYPE =R.WAIT_TYPE , R.LAST_WAIT_TYPE , DB_NAME(R.DATABASE_ID) AS DATABASE_NAME , R.COMMAND , QUERY_TEXT = [ST].[TEXT] , XML_QUERY_PLAN = QP.QUERY_PLAN , R.BLOCKING_SESSION_ID , R.ROW_COUNT , R.GRANTED_QUERY_MEMORY , R.OPEN_TRANSACTION_COUNT , S.HOST_NAME , C.CLIENT_NET_ADDRESS , LOGIN_NAME = CASE WHEN S.LOGIN_NAME = S.ORIGINAL_LOGIN_NAME THEN S.LOGIN_NAME ELSE S.LOGIN_NAME + ' (' + S.ORIGINAL_LOGIN_NAME + ')' END , S.PROGRAM_NAME , R.USER_ID , R.PERCENT_COMPLETE FROM SYS.DM_EXEC_REQUESTS R LEFT OUTER JOIN SYS.DM_EXEC_SESSIONS S ON S.SESSION_ID = R.SESSION_ID LEFT OUTER JOIN SYS.DM_EXEC_CONNECTIONS C ON C.CONNECTION_ID = R.CONNECTION_ID CROSS APPLY SYS.DM_EXEC_SQL_TEXT(R.SQL_HANDLE) ST CROSS APPLY SYS.DM_EXEC_QUERY_PLAN(R.PLAN_HANDLE) QP WHERE R.STATUS NOT IN ('BACKGROUND','SLEEPING') AND R.SESSION_ID != @@SPID
Other Databases Implementation
Other Vendors such as IBM DB/2 and Oracle, have similar workarounds. It is called “Select for Update“.
It is also known as pessimistic locking. In essence, it says I will like to read records, but I want to keep the records locked for a little as I will be updating same records.
SELECT t.Id FROM Table t WHERE t.Id IN ( SELECT o.Id FROM Table1 o, Table2 x WHERE [...] ) FOR UPDATE WITH RS USE AND KEEP UPDATE LOCKS
- RS says to utilize “Read Stability”
- And, to keep the locks
Oracle has continually tried to fined tune their “Select for Update”. The changes that we will cover is how long we will wait to get the locks.
- Select for update
- Enables you to wait until the lock was released or exit immediately with an error
- Select for update with nowait
- Attempt to lock, if records are unavailable, then error out immediately
- Select for update with wait n
- Wait for n seconds
The error raised is “ORA-30006: resource busy; acquire with WAIT timeout expired“.
SELECT counter_field FROM child_codes FOR UPDATE; UPDATE child_codes SET counter_field = counter_field + 1;
Late last week through the grace of a repeatable clean room experiment, my director and I were able to simulate the same errors we were experiencing in our Production environment.
We have been making the change in each impacted Stored Procedure. And, the results have been been a total obliteration of the deadlock errors we have occasionally experienced during the processing of the UPSERT statements.
Unfortunately, we are now experiencing problems with timeouts in other Stored Procedures. It seems the reason is that adding UPDLOCK and HOLDLOCK Lock Hints is causing Locks to be held longer than before.
Once upon a time our concerns were deadlocks, but now it is timeouts.
Unfortunately, we are not set up to measure overall system throughput.
Thankfully, the queries that are timing out are being logged through exception handling in a single repository.
It is my hope that our next task is to correlate that list with the tables that we are populating with UPSERT statements. With this new bench-marking framework, it should not be too hard to add querying workloads as secondary threads.
Listening to The Paper Kites – Bloom
- sys.dm_os_wait_stats (Transact-SQL)
Execution Related Dynamic Management Views and Functions (Transact-SQL)
- sys.dm_exec_sessions (Transact-SQL)
- Conditional Insert/Update Race Condition
- DB2 reference information>DB2 SQL>Queries>select-statement>isolation-clause
- How to do a safe select for update with a where condition over multiple tables
- Unintended Consequences ( by mbobak )
MySQL – INNODB
- SELECT … FOR UPDATE and SELECT … LOCK IN SHARE MODE Locking Reads
- Running SAP Applications on the Microsoft Platform