Transact SQL – String Splitting Using XML

Background

Unfortunately splitting a String into rows was lacking in Transact SQL for a while.

MSFT changed that in Version 2016.

Split_String is capable and handy.

Here is the link.

 

Prior to v2016

Prior to Version 2016, there are many tools out in the wild for accomplishing same.

There are CLR Functions.  There are functions that use dbo.Numbers

 

XML

And, there are functions that rely on XML’s inherent ability to split an input into nodes.

 

XML

Standing On

Found a nice working model on sqlhints.com.

BTW, sqlHints.com is Basavaraj Biradar.

And, here is the particular post that we will be using.

It is titled “How to Split comma or any other character delimited string into a Table in Sql Server.”

 

Objective

  1. Look for separator
    • Replace separator with endNode and beginNode
      • Close out endNode
      • Start new node with beginNode
  2. Begin & End fragment
    • Begin Fragment with beginNode
    • End Fragment with endNode
  3. Start and end node fragment
    • XML needs root node
      • And, so we start string with rootNode
      • And, end with rootNode
  4. Use XQuery to parse node
    • Parse node using XQuery
    • Nodes ( “//node”)

Code

[stringSplit].[itvf_stringSplitUsingXML]

 


use [master]
go


if schema_id('stringSplit') is null
begin

	exec('create schema [stringSplit] authorization [dbo] ')

end
go

if object_id('[stringSplit].[itvf_stringSplitUsingXML]') is null
begin

	exec
	(
		'create function [stringSplit].[itvf_stringSplitUsingXML]
			()
			RETURNS @RESULT TABLE
			(
				Value VARCHAR(MAX)
			)
		as
		begin

			return

		end

		'
	) 

end
go


ALTER FUNCTION [stringSplit].[itvf_stringSplitUsingXML]
(
      @string  VARCHAR(MAX)
	, @separator CHAR(1) = ','
)
RETURNS @tblResult TABLE
(

	  [rowNumber] smallint not null identity(1,1)
	, [value]	  VARCHAR(600)
	, [xml]		  xml
	, [element]	  xml

)
AS
BEGIN

	/*

		a) sqlhints.com
		   http://sqlhints.com/tag/split-comma-separated-values-in-sql/
	
	*/    

	 DECLARE @xml XML
	 DECLARE @xmlAsString nvarchar(max)

	 declare @rootBegin varchar(10)
	 declare @rootEnd varchar(10)

	 declare @node      varchar(10)
	 declare @nodeBegin varchar(10)
	 declare @nodeEnd varchar(10)

	 declare @xmlNodeAsString as varchar(600)

	 set @rootBegin = '<root>'
	 set @rootEnd = '</root>'

	 set @node = 'node'
	 set @nodeBegin = '<node>'
	 set @nodeEnd = '</node>'

	 /*

		Look for separator, when found
			end earlier node by replacing separator
				 with [nodeEnd] 
				 and starting new node with [nodeBegin]

	 */
	 set @xmlNodeAsString = REPLACE
								(
									  @string
									, @separator
									, @nodeEnd + @nodeBegin
								)

	/*
		Bracket fragment with begin and end node
	*/
 	 SET @xmlNodeAsString =  @nodeBegin 		
							+ @xmlNodeAsString
							+ @nodeEnd

	/*
		Root node
		 separated out contents
		 End Root Node
	*/
 	 SET @xmlAsString = @rootBegin
							+ @xmlNodeAsString
							+ @rootEnd

	/*
		Convert to XML
	*/
	set @xml = @xmlAsString
 

	INSERT INTO @tblRESULT
	(
		  [value]
		, [xml]
		, [element]

	)
	SELECT 

		  [value]
		  = 
			(
				t.i.value
				(
						'(.)[1]'
					, 'varchar(max)'
				)
			)

		, [xml]
			= @xml

		 , [element]
		  = 
			(
				t.i.query
				('.')
			)
	  
	from @xml.nodes('//node') AS t(i)

	RETURN

END
go


 

Invoke

 


	declare @data varchar(600)

	set @data = 'sammie,bobbie,jackie'

	select *

	from   [master].[stringSplit].[itvf_stringSplitUsingXML]
			(
				  @data
				, default
			)

 

Output

Transact SQL – Identify In-Memory Objects

Introduction

Let us identify In Memory Objects in our SQL Server Instance.

 

User Tables – sys.tables

Overview

In the sample code below, we identify In Memory Objects by checking the is_memory_optimized column in the sys.tables DMV.

Code



select
		 [table]
			= quotename(tblSS.[name])
				+ '.'
				+ quotename(tblST.[name])

from   sys.tables tblST

inner join sys.schemas tblSS

		on tblST.[schema_id] = tblSS.[schema_id]

where  tblST.is_memory_optimized = 1

order by
		[table] asc


Output

identifyinmemorytables-systables

User Tables & Indexes – Based on FileGroups

Overview

In the sample code below, we identify In Memory Objects by seeking out objects that are located on In-Memory Filegroups.

Code



select 

		  [object]
			= tblSS.[name]
				+ '.'
				+ tblSO.[name]

		, [objectType]
			= tblSO.[type_desc]


		, [indexName]
			= tblSI.[name]


		, [isPrimaryKey]
			= case
				when tblSI.[is_primary_key] = 1 then 'Yes'
				else 'No'
			  end

		, [indexType]
			= tblSI.[type_desc]
	
		, [fileGroup]
			= tblSFG.[name]

		, [fileGroupType]
			= tblSFG.[type_desc]

from   sys.objects tblSO

INNER JOIN sys.schemas tblSS

	ON tblSO.schema_id = tblSS.schema_id

inner join sys.indexes tblSI

	on tblSO.object_id = tblSI.object_id

inner join sys.partitions tblSP

	on  tblSI.object_id = tblSP.object_id
	and tblSI.index_id = tblSP.index_id

		
INNER JOIN sys.allocation_units tblSAU

	on tblSAU.container_id = tblSP.hobt_id
		 
INNER JOIN sys.filegroups tblSFG
 
	ON tblSFG.data_space_id = tblSAU.data_space_id 

where tblSFG.[type] = 'FX'



Output

identifyinmemoryobjects

 

 

User Tables

DML Stats

Overview

Review Statistics of DML Operations against In-Memory Tables.

Code


SELECT
	  [object]
		= quoteName(tblS.[name])
		   + '.'
		   + quoteName(tblO.[name])

        , tblXTPOS.*

FROM sys.dm_db_xtp_object_stats tblXTPOS

INNER JOIN sys.objects tblO

	ON tblXTPOS.object_id = tblO.object_id

INNER JOIN sys.schemas tblS

	ON tblO.schema_id = tblS.schema_id


 

Output

dmlstats

 

Index Stats

Overview

Review Statistics of DML Operations against In-Memory Tables.

Code





SELECT
        [object]
        = quoteName(tblS.[name])
           + '.'
           + quoteName(tblO.[name])
 
		, [index]
			= tblSI.[name]

		, [indexType]
			= tblSI.[type_desc]

		, [isPrimaryKey]
			= case
				when ( tblSI.[is_primary_key] = 1 ) then 'Y'
				else 'N'
			  end

        , [scansStarted]
			= tblXTPIS.scans_started	

		, [scansRetried]
			= tblXTPIS.scans_retries
			
		, [rowsReturned]
			= tblXTPIS.rows_returned

		, [rowsTouched]
			= tblXTPIS.rows_touched	

		/*

		--- Expiring -----
		, tblXTPIS.rows_expiring
		, tblXTPIS.rows_expired
		, tblXTPIS.rows_expired_removed	
		--- Expiring -----

		-- Phantom ----
		, tblXTPIS.phantom_scans_started	
		, tblXTPIS.phantom_scans_retries	
		, tblXTPIS.phantom_rows_touched	
		, tblXTPIS.phantom_expiring_rows_encountered	
		--, phantom_expired_rows_encountered	
		, tblXTPIS.phantom_expired_removed_rows_encountered	
		, tblXTPIS.phantom_expired_rows_removed
		-- Phantom ----

		*/

FROM sys.dm_db_xtp_index_stats tblXTPIS
 
INNER JOIN sys.objects tblO
 
    ON tblXTPIS.object_id = tblO.object_id

INNER JOIN sys.indexes tblSI
 
    ON  tblXTPIS.object_id = tblSI.object_id 
    AND tblXTPIS.index_id = tblSI.index_id 

INNER JOIN sys.schemas tblS
 
    ON tblO.schema_id = tblS.schema_id

order by
		  tblS.[name]
		, tblO.[name]
		, tblSI.[name]

Output

imtabledmlindexstats

Explanation:

  1. In-Memory tables does not support Clustered Indexes as all data is stored in memory
    • If Clustered,then memory will have to be continuously shuffled to ensure proper sequence
  2. It makes sense to review ScanStarted
    • Doing so is informative in terms of which indexes are actually being used
    • And, to gauge popularity
  3. The RowsReturned Column
    • Reviewing the RowsReturned column is also important to follow trend in terms of access method

Memory Usage Stats

Memory Usage Stats – Object

Overview

Review Memory allocated and in-use by In-Memory Tables.

Code


SELECT

	[object]
		= quotename(tblSS.name)
			+'.'
			+ quotename(tblSO.name)

    , [allocatedMB]
		= SUM(tblXTPMC.[allocated_bytes]) / (1024* 1024)

	, [usedMB]
		= SUM(tblXTPMC.[used_bytes]) / (1024 * 1024)

FROM  sys.objects tblSO

INNER JOIN sys.schemas tblSS
 
	ON tblSO.schema_id = tblSS.schema_id 

INNER JOIN sys.indexes tblSI
 
	ON tblSO.object_id = tblSI.object_id 

INNER JOIN sys.dm_db_xtp_memory_consumers tblXTPMC 

	ON  tblSI.object_id = tblXTPMC.object_id
	AND tblSI.index_id  = tblXTPMC.index_id

GROUP BY 
		  quotename(tblSS.name)
		, tblSO.schema_id
		, tblSO.object_id
		, tblSO.[name]
		
ORDER BY 
		  quotename(tblSS.name)
		, tblSO.[name]
;


Output

memoryallocatedandinusedbyobject

Memory Usage Stats – Index

Overview

Review Memory allocated and in-use by In-Memory Table Indexes.

Code


SELECT

	[object]
		= quotename(tblSS.name)
			+'.'
			+ quotename(tblSO.name)

	, [index]
		= tblSI.name

	, [indexType]
		= tblSI.[type_desc]

    , [isPrimaryKey]
        = case
            when tblSI.[is_primary_key] = 1 then 'Yes'
            else 'No'
            end

    , [allocatedMB]
		= SUM(tblXTPMC.[allocated_bytes]) / (1024* 1024)

	, [usedMB]
		= SUM(tblXTPMC.[used_bytes]) / (1024 * 1024)

FROM  sys.objects tblSO

INNER JOIN sys.schemas tblSS
 
	ON tblSO.schema_id = tblSS.schema_id 

INNER JOIN sys.indexes tblSI
 
	ON tblSO.object_id = tblSI.object_id 

INNER JOIN sys.dm_db_xtp_memory_consumers tblXTPMC 

	ON  tblSI.object_id = tblXTPMC.object_id
	AND tblSI.index_id  = tblXTPMC.index_id

GROUP BY 
		  quotename(tblSS.name)
		, tblSO.schema_id
		, tblSO.object_id
		, tblSO.[name]
		, tblSI.name
		, tblSI.[type_desc]
		, tblSI.[is_primary_key]
		
ORDER BY 
		  quotename(tblSS.name)
		, tblSO.[name]
		, tblSI.name


Output

memoryallocatedandinusedbyobjectandindex

Programmable Objects – sys.modules

Overview

In the sample code below, we identify In Memory Objects by checking the is_memory_optimized column in the sys.tables DMV.

Code



/*

	sys.sql_modules

		execute_as_principal_id

			Value of -2 indicates that the batch submitted does not depend on implicit name resolution and can be shared among different users. 
			This is the preferred method. Any other value represents the user ID of the user submitting the query in the database.

*/

select 
		  [object]
			= quoteName(tblSS.[name])
				+ '.'
				+ quoteName(object_name(tblSSM.[object_id]))

		, [type]
			= tblSO.type_desc

		, [createDate]
			= tblSO.[create_date]

		, [isSchemaBound]
			= case
					when tblSSM.is_schema_bound = 1 then 'Y'
					else 'N'
			  end	

		, [principal]
			= case
			 
					when ( tblSSM.execute_as_principal_id is null ) 
						then tblSS.[name]
			  
					when  ( tblSSM.execute_as_principal_id = -2) 
						then '--Shared--'

					else user_name(tblSSM.execute_as_principal_id)

						  
			  end			 
		      
		, [definition]
			= tblSSM.[definition]

from  sys.objects tblSO

inner join sys.schemas tblSS

		on tblSO.[schema_id] = tblSS.[schema_id]

inner join sys.sql_modules tblSSM

		on tblSO.[object_id] = tblSSM.[object_id]

where tblSSM.[uses_native_compilation] = 1

Output

identifynativecompiledobjects

Loaded Modules – sys.dm_os_loaded_modules

Overview

In the sample code below, we identify In Memory Objects by reviewing loaded OS Modules.

The relevant DMVs are sys.dm_os_loaded_modules and sys.dm_os_virtual_address_dump.

 

Code



; with cteOSLM
(
	  [modulename]
	, [description]
    , [base_address]
	, [filenameFull]
	, [filename]
	, [filenameMassaged]
)
as
(

	SELECT
			  [name]

			, [description]

			, [base_address]

			, [filenameFull] = tblOSLM.[name]

			, [filename]
				= reverse(left(reverse(tblOSLM.[name]),
                    charindex('\',reverse(tblOSLM.[name]), 1) - 1))

			, [filenameMassaged]
				= replace(
							replace(
									 reverse(left(reverse(tblOSLM.[name]),
										charindex('\',reverse(tblOSLM.[name]), 1) - 1))
										, '_'
									, '.'
									)
							, '.dll'
							, ''
						)				

	
	FROM   sys.dm_os_loaded_modules tblOSLM 

	WHERE  tblOSLM.[description] = 'XTP Native DLL'

	
)

, cteOSLMObject
(
	  [modulename]
	, [description]
    , [base_address]
	, [filenameFull]
	, [filename]
	, [filenameMassaged]

	, [objectID] 
	, [databaseID] 
	, [objectType]

)
as
(

	SELECT 

		  	  [modulename]
			, [description]
			, [base_address]
			, [filenameFull]
			, [filename]
			, [filenameMassaged]


			, [objectID] 
				= PARSENAME([filenameMassaged], 1)

			, [databaseID] 
				= PARSENAME([filenameMassaged], 2)
			
			, [objectType]
				= case PARSENAME([filenameMassaged], 3)
						when 't' then 'Table'
						when 'p' then 'Procedure'
						when 'f' then 'Function'
						else PARSENAME([filenameMassaged], 3)
				  end 

	from   cteOSLM

)
, cteVirtualAddress
(
	  [region_allocation_base_address]
	, [regionSizeInBytes]
)
as
(
	select 
			  tblOSVAD.[region_allocation_base_address]
			, [regionSizeInBytes]
				= sum(tblOSVAD.region_size_in_bytes)
	
	from   sys.dm_os_virtual_address_dump tblOSVAD
	
	group by
	 
			tblOSVAD.[region_allocation_base_address]

)	 
SELECT 

		  tblOSLM.[description]

		, tblOSLM.[modulename]

		, tblOSLM.[filename]


		, [database]
			= case
				when tblOSLM.[databaseID] = 32767 then 'Resource DB'
				else db_name(tblOSLM.[databaseID])
			   end

		, [objectName]
			= quoteName
				(
				    object_schema_name
					(
					  tblOSLM.objectID
					, tblOSLM.databaseID
					)
				)
				+ '.'
				+ quoteName
				(
					object_name
					(
					   tblOSLM.objectID
					 , tblOSLM.databaseID
					)
				)

		, tblOSLM.[objectType]

		, [sizeInKB]
			= (tblOSVAD.[regionSizeInBytes])
				/ ( 1024 )

FROM   cteOSLMObject tblOSLM 

INNER JOIN cteVirtualAddress tblOSVAD

		on tblOSLM.[base_address] = tblOSVAD.[region_allocation_base_address]

order by

			  [database]
			, [objectName]


Output

osloadedmodules

 

Explanation

  1. We can see that we can the SQL Server Engine generates Database Object specific modules (dlls)
  2. The modules are aptly named once one know where to look
  3. And, they are very small in size

 

Summary

In summary, to identify In Memory Objects we have a couple of tracks we can take.

Those pathways includes:

  1. Checking Table – sys.tables
    • Column :- is_memory_optimized
  2. Checking objects that are sitting on Memory Optimized Filegroup by relying on the sys.filegroups.

We reviewed user usage by digging into the sys.dm_db_xtp_object_stats & sys.dm_db_xtp_index_stats.

We also touched on memory consumed by In Memory Objects and Indexes by exploring the sys.dm_db_xtp_memory_consumers view.

To identify natively compiled objects we narrow in on the uses_native_compilation column of the sys.sql_modules view

To dig a bit deeper and identify OS Modules that are produced once In-Memory tables are compiled, we look into sys.dm_os_loaded_modules and to size up them up we check the region_size_in_bytes column in the sys.dm_os_virtual_address_dump table.

 

Source Code Repository

GitHub

Blogs are not where to keep code, and so for easier and more polish consumption shipped out to GitHub.

Here is the Repository.

 

Dedicated

Again, the good thing about blogging is that one publicly acknowledges the source of Information.

Here is mine:

  1. SqlHints.com, Basavaraj Biradar
    • Working with In-Memory OLTP (a.k.a. Hekaton) enabled Databases, Memory Optimized Tables and Natively Compiled Stored Procedures and it’s Internals with extensive list of Examples – Sql Server 2014
      Link
  2.  Help: SQL Server, Balmukund
    • A-Z of In-Memory OLTP : Behind the scenes
      Link

 

References

Microsoft

    1. Dynamic Management Views and Objects
      • Transact-SQL Reference (Database Engine) > System Views (Transact-SQL) > Dynamic Management Views and Functions (Transact-SQL)
        Memory-Optimized Table Dynamic Management Views (Transact-SQL)
        Link

        • sys.dm_db_xtp_memory_consumers (Transact-SQL)
          Link
        • sys.dm_db_xtp_index_stats ( Transact SQL )
          Link
        • sys.dm_db_xtp_object_stats ( Transact-SQL )
          Link
        • sys.memory_optimized_tables_internal_attributes ( Transact-SQL )
          Link
    2. Developer Network
      • Database Features > In-Memory OLTP (In-Memory Optimization)  > Memory-Optimized Tables
        Native Compilation of Tables and Stored Procedures
        Link