MySQL – Identify potential duplicate Indexes

Background

A little while ago I was trying to get a quick education on the Information schema in MySQL.

Shlomi Noach

Ran into a nice post by Shlomi Noach.

The post is titled “Useful database analysis queries with INFORMATION_SCHEMA” and published here.

Code

Table

Test Table

CREATE TABLE `City`
(
  `ID` int(11) NOT NULL auto_increment,
  `Name` char(35) character set utf8 NOT NULL default '',
  `CountryCode` char(3) NOT NULL default '',
  `District` char(20) NOT NULL default '',
  `Population` int(11) NOT NULL default '0',
  PRIMARY KEY  (`ID`),
  UNIQUE KEY `ID` (`ID`),
  KEY `Population` (`Population`),
  KEY `Population_2` (`Population`,`CountryCode`)
)
-- ENGINE=MyISAM
ENGINE=INNODB
AUTO_INCREMENT=4080
DEFAULT CHARSET=latin1

Outline By Shlomi Noach

We can see that the Population_2 index covers the Population index, so the latter is redundant and should be removed. We also see that the ID index is redundant, since there is a PRIMARY KEY on ID, which is in itself a unique key. How can we test such cases by querying the INFORMATION_SCHEMA? Turns out we can do that using the STATISTICS table.

[Update: thanks to Roland Bouman’s comments. The following queries only consider BTREE indexes, and do not verify FULLTEXT or HASH indexes]

 

Query

Original


SELECT * FROM (
  SELECT TABLE_SCHEMA, TABLE_NAME, INDEX_NAME,
    GROUP_CONCAT(COLUMN_NAME ORDER BY SEQ_IN_INDEX) AS columns
  FROM `information_schema`.`STATISTICS`
  WHERE TABLE_SCHEMA NOT IN ('mysql', 'INFORMATION_SCHEMA')
    AND NON_UNIQUE = 1 AND INDEX_TYPE='BTREE'
  GROUP BY TABLE_SCHEMA, TABLE_NAME, INDEX_NAME
) AS i1 INNER JOIN (
  SELECT TABLE_SCHEMA, TABLE_NAME, INDEX_NAME,
    GROUP_CONCAT(COLUMN_NAME ORDER BY SEQ_IN_INDEX) AS columns
  FROM `information_schema`.`STATISTICS`
  WHERE INDEX_TYPE='BTREE'
  GROUP BY TABLE_SCHEMA, TABLE_NAME, INDEX_NAME
) AS i2
USING (TABLE_SCHEMA, TABLE_NAME)
WHERE i1.columns != i2.columns
AND LOCATE(CONCAT(i1.columns, ','), i2.columns) = 1

Revised


SELECT 

          i1.TABLE_SCHEMA as `schema`
        , i1.TABLE_NAME as `table`
        , i1.INDEX_NAME as `index1`
        , replace(i1.`columns`, '~', '') as `columns1`
        , i2.INDEX_NAME as `index2`
        , replace(i2.`columns`, '~', '') as `columns2`          

FROM
  (
        SELECT
                  TABLE_SCHEMA
                , TABLE_NAME
                , INDEX_NAME
                , GROUP_CONCAT(CONCAT('~', COLUMN_NAME, '~')
                                ORDER BY SEQ_IN_INDEX
                               ) AS columns

        FROM `information_schema`.`STATISTICS`

        WHERE TABLE_SCHEMA NOT IN ('mysql', 'INFORMATION_SCHEMA')

        -- AND NON_UNIQUE = 1 

        AND INDEX_TYPE='BTREE' 

        GROUP BY
              TABLE_SCHEMA
            , TABLE_NAME
            , INDEX_NAME
    ) AS i1 

    INNER JOIN 

    (

        SELECT
              TABLE_SCHEMA
            , TABLE_NAME
            , INDEX_NAME
            , GROUP_CONCAT(CONCAT('~', COLUMN_NAME, '~')
                            ORDER BY SEQ_IN_INDEX
                           ) AS columns

        FROM `information_schema`.`STATISTICS`

        WHERE INDEX_TYPE='BTREE' 

        GROUP BY
              TABLE_SCHEMA
            , TABLE_NAME
            , INDEX_NAME
    ) AS i2

/*
 *
USING (TABLE_SCHEMA, TABLE_NAME)

WHERE i1.columns != i2.columns 

AND LOCATE(CONCAT(i1.columns, ','), i2.columns) = 1 

*/  

  /*
   * Same Schema and Table
  */
  on i1.TABLE_SCHEMA = i2.TABLE_SCHEMA

  and i1.TABLE_NAME = i2.TABLE_NAME

  /* Different Index */
  and i1.INDEX_NAME != i2.INDEX_NAME

  and
      (

            /*
             Same exact column list
             */
            (
                   ( i1.columns = i2.columns )
               and ( i1.INDEX_NAME < i2.INDEX_NAME )
            )

             /*
                Different column list
                One column list is fully menetioned in the other
             */
             or (

                         ( i1.columns  i2.columns )

                    and  (
                            locate( i1.columns, i2.columns ) > 0
                         )   

                )    

      )

order by      

          `schema`
        , `table`
        , `index1`
        , `columns1`

.

Dedication

Dedicated to Shlomi Noach.

In time we all get to recognize a giving humble spirit.

The code is by no means perfect.

But, he is like I, think this is a good building block.  And, here are the the things I considered and the goal I have in mind.

Anyone can share when things are perfect and no one holds it against you if you choose not to share.

To me it takes a lot of confidence in far & disparate areas to place things in the public space when one is just starting out.

And that is the spirit of blogging, posting things on YouTube or doing a start-up.

Is one humble enough to grow in the public space.  And, can the relationships supplicate thoughtful innovation.

Sometimes in life we all enter those seasons, when one does not want to keep things as they are.

 

DBeaver – Table Relationships

Background

Just wanting to make sure that I am properly tracking table relationships in MySQL.

 

Code

INFORMATION_SCHEMA

INFORMATION_SCHEMA.KEY_COLUMN_USAGE

Code

Sample


set @schema := 'sakila';
set @table := 'inventory';

select 

         case

            when tblKCU.TABLE_NAME = @table then 'References'
            when REFERENCED_TABLE_NAME = @table then 'Referenced'

         end as 'isReferencingOrReferenced'
       , tblKCU.CONSTRAINT_NAME as 'constraint'
       , tblKCU.TABLE_NAME as 'table'
       , tblKCU.COLUMN_NAME as 'column'

       , tblKCU.REFERENCED_TABLE_NAME as 'refTable'
       , tblKCU.REFERENCED_COLUMN_NAME as 'refColumn'

       , tblKCU.ORDINAL_POSITION as 'colPos'

from   information_schema.KEY_COLUMN_USAGE tblKCU

where  tblKCU.TABLE_SCHEMA = @schema

/* Foreign Key Relationship */
and    (

             ( tblKCU.TABLE_NAME = @table )
          or ( tblKCU.REFERENCED_TABLE_NAME = @table )

       )

/* Skip Primary Key */
and    (

              ( tblKCU.TABLE_NAME is not null )
          and ( tblKCU.REFERENCED_TABLE_NAME is not null )

       )

order by       

         tblKCU.TABLE_NAME
       , tblKCU.COLUMN_NAME

       , tblKCU.REFERENCED_TABLE_NAME
       , tblKCU.REFERENCED_COLUMN_NAME

       , tblKCU.CONSTRAINT_NAME       

       , tblKCU.ORDINAL_POSITION

;

Output

 

Tools

DBeaver

Outline

Steps to follow to track visually through DBeaver

  1. Launch DBeaver
  2. Connect to MySQL Instance
  3. Navigate to Database Tables <Specific Table>
  4. Double Click on the selected table
  5. On the right panel, choose the “ER Diagram” Tab
  6. Review the shown tables and their relationships to each other

Image

MySQL – information_schema.statistics

Background

As a quick follow-up to our discussion of metadata on tables, let us discuss indexes.

BTW, the discussion on tables is here.

INFORMATION_SCHEMA.STATISTICS

Code


select

		tblStat.TABLE_NAME as 'table'

	  , tblStat.INDEX_NAME as 'index'

	  , case tblStat.NON_UNIQUE
			   when 0 then 'No'
			   when 1 then 'Yes'
			   else '?'
		end as 'Unique'


	  , GROUP_CONCAT( COLUMN_NAME order by SEQ_IN_INDEX ) as colList


	  , tblStat.CARDINALITY


from  INFORMATION_SCHEMA.STATISTICS tblStat


where  tblStat.TABLE_SCHEMA = 'drupal'


group by

		 tblStat.TABLE_SCHEMA
	   , tblStat.TABLE_NAME
	   , tblStat.INDEX_NAME


order by

		 tblStat.TABLE_SCHEMA
	  ,  tblStat.TABLE_NAME
	  ,  tblStat.INDEX_NAME


LIMIT 25

;


Output

 

Comparison with Other Databases

  1. GROUP_CONCAT
    • The group_concat is very, very powerful and has few equal in other database platforms
    • Function Documentation

 

SHOW INDEX

To get a bit more information, specifically cardinality ( Number of unique values in each column combination ), we can issue “show index” against the table.

Code

Syntax


show index from [table] from [database];

Sample


show index from search_index from drupal;

Output

References

  1. Reference
    • INFORMATION_SCHEMA Tables
      • INFORMATION_SCHEMA STATISTICS Table
    • Show Index
    •  Group_Function
      • Group By Function
        Link
    • Group_Concat

MySQL – information_schema.tables

Background

With SQL being a lingua-franca for quering data it should not be a far stretch to note that the various governing bodies have agreed on standards on how to expose dictionary.

 

Mysql

For instance, if we want to see which tables are in a database we can launch our favorite GUI SQL Editor or just access MySQL’s Client query tool eponymous named mysql.

 

information_schema.tables

SQL

Sample


select 
       TABLE_CATALOG, TABLE_SCHEMA, TABLE_NAME
     , TABLE_TYPE, ENGINE
     , TABLE_ROWS, AVG_ROW_LENGTH 

from information_schema.tables 

where TABLE_TYPE not in ("SYSTEM VIEW")  

LIMIT 10


 

Output

Explanation

  1. Database
    • def
  2. Schema
    • audit_repository
  3. Table
  4. Table Type
    • Available Options
      • SYSTEM VIEW
      • BASE TABLE
  5. Engine
    • Available Options
      • Memory – System Tables
      • InnoDB
      • MyISAM
  6. TABLE ROWS
    • Number of records in table
  7. AVG ROW LENGTH
    • Based on column data type and used size allocation

 

Comparison with Other Databases

SQL Server

  1. Able to use double-quotes in query
    • We are able to use double-quotes (“) to delimit column values
  2. Limit N
    • In Transact SQL we use top N.
    • On the other hand, in MySQL we use LIMIT N

 

show tables

Objective

List tables and views that fits specified name format.

SQL

Syntax


show full tables from [database] like [name] where TABLE_TYPE in [BASE_TABLE | VIEW ];

Sample


show full tables from drupal like 'role%';

 

Output

 

describe

Objective

Display table’s structure.

SQL

Syntax


describe [table];

Sample


describe drupal.role;

Output

 

References

  1. MySQL
    • Show Tables
      • Show Tables Syntax
        Link
    • Show Commands
      • Extensions to SHOW Statements
        Link

Column Stores Indexes – Query Comparison – Unpartitioned VS Partitioned

Objective

We had Column Store Tables created and now have tracking partitioned tables created, as well.

It is now time to see whether we benefit by partitioning the tables.

From our original set of tables, we only created tracking Partition tables for the Row Store Clustered Index and the Column Store Clustered Index.

For brevity sake, we skipped the Column Store Non-Clustered Index.

 

Query

Scenario

Our sample query will retrieve the top two sales within a date range.

SQL

Sample

Here is what the query likes when we target the [Fact].[SaleLargeRowStore] table.

BTW, that table is the unpartitioned Row Store table.

It has a Clustered Index on it’s primary key, Sale Key.


declare @dateBegin datetime
declare @dateEnd   datetime


set @dateBegin = '2013-04-15';
set @dateEnd = '2013-05-15';

; with cteYearCustomer --[Fact].[SaleLargeRowStore]
(
      [CustomerKey]
    , [profit]
)
as
(
    select  
              [CustomerKey] = tblFS.[Customer Key]
            , [profit] = max(tblFS.[profit])

    from  [Fact].[SaleLargeRowStore] tblFS

    where tblFS.[Invoice Date Key] between @dateBegin and @dateEnd

    group by
             tblFS.[Customer Key]

)

select top 2 *

from    cteYearCustomer cteYC

order by
        [profit] desc


Full


dbcc dropcleanbuffers with no_infomsgs
go

use [WideWorldImportersDW]
go

set nocount on
go

set statistics io on
go

declare @dateBegin datetime
declare @dateEnd   datetime


set @dateBegin = '2013-04-15';
set @dateEnd = '2013-05-15';

-- [Fact].[SaleLargeRowStore] 
declare @dateRSCBegin datetime
declare @dateRSCEnd datetime

declare @dateRSCPIndexIsChosenByEngineBegin datetime
declare @dateRSCPIndexIsChosenByEngineEnd datetime

declare @dateRSCPIndexIsClusteredBegin datetime
declare @dateRSCPIndexIsClusteredEnd datetime

declare @dateCSCIndexBegin datetime
declare @dateCSCIndexEnd datetime

declare @dateCSCPIndexBegin datetime
declare @dateCSCPIndexEnd datetime

declare @dateCSNCIndexBegin datetime
declare @dateCSNCIndexEnd datetime


declare @datediffSum bigint 

declare @NUMBER_OF_CHARS_IN_DIVIDER int

set @NUMBER_OF_CHARS_IN_DIVIDER = 180

set @dateRSCBegin = getdate();

print '@@version :- '
		+ cast(@@version as varchar(4000))

print replicate('=', @NUMBER_OF_CHARS_IN_DIVIDER);

print '[Fact].[SaleLargeRowStore]'
print '**************************'

-- [Fact].[SaleLargeRowStore]
; with cteYearCustomer --[Fact].[SaleLargeRowStore]
(
	  [CustomerKey]
	, [profit]
)
as
(
	select  
			  [CustomerKey] = tblFS.[Customer Key]
			, [profit] = max(tblFS.[profit])

	from  [Fact].[SaleLargeRowStore] tblFS

	where tblFS.[Invoice Date Key] between @dateBegin and @dateEnd

	group by
			 tblFS.[Customer Key]

)

select top 2 *

from    cteYearCustomer cteYC

order by
		[profit] desc

set @dateRSCEnd = getdate()

print replicate('=', @NUMBER_OF_CHARS_IN_DIVIDER);



/*
	 [Fact].[SaleLargeRowStoreClusteredPartition]
*/

print '[Fact].[SaleLargeRowStoreClusteredPartition] -- Clustered Index'
print '***************************************************************'

set @dateRSCPIndexIsClusteredBegin = getdate()

; with cteYearCustomer --[Fact].[SaleLargeRowStoreClusteredPartition] --Clustered Index
(
	  [CustomerKey]
	, [profit]
)
as
(
	select  
			  [CustomerKey] = tblFS.[Customer Key]
			, [profit] = max(tblFS.[profit])

	from [Fact].[SaleLargeRowStoreClusteredPartition]  tblFS with ( INDEX = 1)

	where tblFS.[Invoice Date Key] between @dateBegin and @dateEnd

	group by
			 tblFS.[Customer Key]

)

select top 2 *

from    cteYearCustomer cteYC

order by
		[profit] desc

set @dateRSCPIndexIsClusteredEnd = getdate()

print replicate('=', @NUMBER_OF_CHARS_IN_DIVIDER);

exec('dbcc dropcleanbuffers with no_infomsgs')

/*
	 [Fact].[SaleLargeRowStoreClusteredPartition]
*/

print '[Fact].[SaleLargeRowStoreClusteredPartition] -- Chosen by SQL Engine'
print '********************************************************************'

set @dateRSCPIndexIsChosenByEngineBegin = getdate()

; with cteYearCustomer --[Fact].[SaleLargeRowStoreClusteredPartition] -- Chosen by SQL Engine
(
	  [CustomerKey]
	, [profit]
)
as
(
	select  
			  [CustomerKey] = tblFS.[Customer Key]
			, [profit] = max(tblFS.[profit])

	from [Fact].[SaleLargeRowStoreClusteredPartition]  tblFS --with ( INDEX = 1)

	where tblFS.[Invoice Date Key] between @dateBegin and @dateEnd

	group by
			 tblFS.[Customer Key]

)

select top 2 *

from    cteYearCustomer cteYC

order by
		[profit] desc

set @dateRSCPIndexIsChosenByEngineEnd = getdate()

print replicate('=', @NUMBER_OF_CHARS_IN_DIVIDER);



/*
	[Fact].[SaleLargeColumnStoreClustered]
*/
print '[Fact].[SaleLargeColumnStoreClustered]'
print '**************************************'

set @dateCSCIndexBegin = getdate()

; with cteYearCustomer -- [Fact].[SaleLargeColumnStoreClustered]
(
	  [CustomerKey]
	, [profit]
)
as
(
	select  
			  [CustomerKey] = tblFS.[Customer Key]
			, [profit] = max(tblFS.[profit])

	from  [Fact].[SaleLargeColumnStoreClustered] tblFS

	where tblFS.[Invoice Date Key] between @dateBegin and @dateEnd

	group by
			--  tblY.[year]
			 tblFS.[Customer Key]

)

select top 2 *

from    cteYearCustomer cteYC

order by
		[profit] desc

set @dateCSCIndexEnd = getdate()

print replicate('=', @NUMBER_OF_CHARS_IN_DIVIDER);

/*
	[Fact].[SaleLargeColumnStoreClusteredPartition] 
*/

print '[Fact].[SaleLargeColumnStoreClusteredPartition]'
print '**********************************************'

set @dateCSCPIndexBegin = getdate()

; with cteYearCustomer -- [Fact].[SaleLargeColumnStoreClusteredPartition] 
(
	  [CustomerKey]
	, [profit]
)
as
(
	select  
			  [CustomerKey] = tblFS.[Customer Key]
			, [profit] = max(tblFS.[profit])

	from  [Fact].[SaleLargeColumnStoreClusteredPartition] tblFS

	where tblFS.[Invoice Date Key] between @dateBegin and @dateEnd

	group by
			 tblFS.[Customer Key]

)

select top 2 *

from    cteYearCustomer cteYC

order by
		[profit] desc

set @dateCSCPIndexEnd = getdate()

print replicate('=', @NUMBER_OF_CHARS_IN_DIVIDER)

/*
	[Fact].[SaleLargeColumnStoreNonClustered] 
*/

print '[Fact].[SaleLargeColumnStoreNonClustered]'
print '*****************************************'

set @dateCSNCIndexBegin = getdate()

; with cteYearCustomer -- [Fact].[SaleLargeColumnStoreNonClustered 
(
	  [CustomerKey]
	, [profit]
)
as
(
	select  
			  [CustomerKey] = tblFS.[Customer Key]
			, [profit] = max(tblFS.[profit])

	from  [Fact].[SaleLargeColumnStoreNonClustered] tblFS

	where tblFS.[Invoice Date Key] between @dateBegin and @dateEnd

	group by
			 tblFS.[Customer Key]

)

select top 2 *

from    cteYearCustomer cteYC

order by
		[profit] desc

set @dateCSNCIndexEnd = getdate()


print replicate('=', @NUMBER_OF_CHARS_IN_DIVIDER)

set @datediffSum =
			        isNull(datediff(millisecond, @dateRSCBegin, @dateRSCEnd), 0)
				  + isNull(datediff(millisecond, @dateRSCPIndexIsChosenByEngineBegin, @dateRSCPIndexIsChosenByEngineEnd), 0)
				  + isNull(datediff(millisecond, @dateRSCPIndexIsClusteredBegin, @dateRSCPIndexIsClusteredEnd), 0)
				  + isNull(datediff(millisecond, @dateCSCIndexBegin, @dateCSCIndexEnd), 0)
				  + isNull(datediff(millisecond, @dateCSCPIndexBegin, @dateCSCPIndexEnd), 0)
				  + isNull(datediff(millisecond, @dateCSNCIndexBegin, @dateCSNCIndexEnd), 0)


select 

		  [sourceID] = [sourceID]
		,  =  
		, [dateStart] = dateStart
		, [dateEnd] = dateEnd
		, [durationMilliseconds] =  [duration] 
		, [%] = cast([%] as decimal(6, 2))

from   
	(

		select 
				  [sourceID] = 1
				,  = '[Fact].[SaleLargeRowStore]'
				, [dateStart] = @dateRSCBegin
				, [dateEnd] = @dateRSCEnd
				, [duration] = datediff(millisecond, @dateRSCBegin, @dateRSCEnd)
				, [%] = (datediff(millisecond, @dateRSCBegin, @dateRSCEnd)) * 100.00 / @datediffSum


		union all

		select 
				  [sourceID] = 2.1
				,  = '[Fact].[SaleLargeRowStoreClusteredPartition] - Clustered Index'
				, [dateStart] = @dateRSCPIndexIsClusteredBegin
				, [dateEnd] = @dateRSCPIndexIsClusteredEnd
				, [duration] = datediff(millisecond, @dateRSCPIndexIsClusteredBegin, @dateRSCPIndexIsClusteredEnd)
				, [%] = (datediff(millisecond, @dateRSCPIndexIsClusteredBegin, @dateRSCPIndexIsClusteredEnd)) * 100.00 / @datediffSum

		union all

		select 
				  [sourceID] = 2.2
				,  = '[Fact].[SaleLargeRowStoreClusteredPartition] - Chosen by Engine'
				, [dateStart] = @dateRSCPIndexIsChosenByEngineBegin
				, [dateEnd] = @dateRSCPIndexIsChosenByEngineBegin
				, [duration] = datediff(millisecond, @dateRSCPIndexIsChosenByEngineBegin, @dateRSCPIndexIsChosenByEngineEnd)
				, [%] = (datediff(millisecond, @dateRSCPIndexIsChosenByEngineBegin, @dateRSCPIndexIsChosenByEngineEnd)) * 100.00 / @datediffSum

		union all

		select 
				  [sourceID] = 3
				,  = '[Fact].[SaleLargeColumnStoreClustered]'
				, [dateStart] = @dateCSCIndexBegin
				, [dateEnd] = @dateCSCIndexEnd
				, [duration] = datediff(millisecond, @dateCSCIndexBegin, @dateCSCIndexEnd)
				, [%] = (datediff(millisecond, @dateCSCIndexBegin, @dateCSCIndexEnd)) * 100.00 / @datediffSum


		union all

		select 
				  [sourceID] = 4
				,  = '[Fact].[SaleLargeColumnStoreClusteredPartition]'
				, [dateStart] = @dateCSCPIndexBegin
				, [dateEnd] = @dateCSCPIndexEnd
				, [duration] = datediff(millisecond, @dateCSCPIndexBegin, @dateCSCPIndexEnd)
				, [%] = (datediff(millisecond, @dateCSCPIndexBegin, @dateCSCPIndexEnd)) * 100.00 / @datediffSum

		union all

		select 
				  [sourceID] = 5
				,  = '[Fact].[SaleLargeColumnStoreNonClustered]'
				, [dateStart] = @dateCSNCIndexBegin
				, [dateEnd] = @dateCSNCIndexEnd
				, [duration] = datediff(millisecond, @dateCSNCIndexBegin, @dateCSNCIndexEnd)
				, [%] = (datediff(millisecond, @dateCSNCIndexBegin, @dateCSNCIndexEnd)) * 100.00 / @datediffSum
		

	) tblA

order by

	  [sourceID] asc


Query Plan

Query Plan – v2014

Row Store – Unpartition Table ( Fact.SaleLargeRowStore )

Image

Explanation
  1. Missing Index
    • Improvement Projection
      • The Query Processor estimates that implementing the following index could improve the query cost by 93.1978%.
    • Index Statement
      • CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>] ON [Fact].[SaleLargeRowStore] ([Invoice Date Key]) INCLUDE ([Customer Key],[Profit])

Row Store – Partition Table ( Fact.SaleLargeRowStoreClusteredPartition ) – Index Chosen By Engine

Image

Explanation
  1. Missing Index
    • Improvement Projection
      • The Query Processor estimates that implementing the following index could improve the query cost by 93.1978%.
    • Index Statement
      • CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>] ON [Fact].[SaleLargeRowStore] ([Invoice Date Key]) INCLUDE ([Customer Key],[Profit])

Row Store – Partition Table ( Fact.SaleLargeRowStoreClusteredPartition ) – Index Chosen By Engine

Image

Column Store – Unpartitioned Table ( Fact.SaleLargeColumnStoreClustered )

Image

  1. Column Index Scan
    • With a Clustered Column Store Index, the Clustering Index is inclusive of all columns

Column Store – Clustered – Partitioned Table ( Fact.SaleLargeColumnStoreClusteredPartition )

Image

 

Explanation
  1. Column Index Scan
    • With a Clustered Column Store Index, the Clustering Index is inclusive of all columns

 

Column Store – Non Clustered – Non-Partitioned Table –  ( Fact.SaleLargeColumnStoreNonClustered)

Image

Explanation
  1. Table Scan
    • We do not have a targeted Index

 

Statistics I/O

Image

Statistics – v2014

Statistics I/O – v2016

Tabulated

 

Table v2014 v2016
[Fact].[SaleLargeRowStore]
Table ‘Worktable’. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table ‘Worktable’. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table ‘Workfile’. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table ‘Workfile’. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table ‘SaleLargeRowStore’. Scan count 1, logical reads 344481, physical reads 2, read-ahead reads 344467, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table ‘SaleLargeRowStore’. Scan count 5, logical reads 344945, physical reads 3, read-ahead reads 344456, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
[Fact].[SaleLargeRowStoreClusteredPartition] – Clustered Index ( Hardcoded to use clustered Index )
Table ‘Worktable’. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table ‘Worktable’. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table ‘Workfile’. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table ‘Workfile’. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table ‘SaleLargeRowStoreClusteredPartition’. Scan count 13, logical reads 344557, physical reads 12, read-ahead reads 344531, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table ‘SaleLargeRowStoreClusteredPartition’. Scan count 13, logical reads 344611, physical reads 10, read-ahead reads 344547, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
[Fact].[SaleLargeRowStoreClusteredPartition] – Chosen by SQL Engine
Table ‘Worktable’. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table ‘Worktable’. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table ‘Workfile’. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table ‘Workfile’. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table ‘SaleLargeRowStoreClusteredPartition’. Scan count 13, logical reads 37, physical reads 6, read-ahead reads 26, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table ‘SaleLargeRowStoreClusteredPartition’. Scan count 13, logical reads 37, physical reads 6, read-ahead reads 54, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
[Fact].[SaleLargeColumnStoreClustered]
Table ‘Worktable’. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.  Table ‘Worktable’. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table ‘Workfile’. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.  ( No References to workfile )
Table ‘SaleLargeColumnStoreClustered’. Scan count 1, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 1032, lob physical reads 5, lob read-ahead reads 2706.  Table ‘SaleLargeColumnStoreClustered’. Scan count 2, logical reads 4, physical reads 2, read-ahead reads 8, lob logical reads 268, lob physical reads 5, lob read-ahead reads 497.

Table ‘SaleLargeColumnStoreClustered’. Segment reads 1, segment skipped 0.

[Fact].[SaleLargeColumnStoreClusteredPartition]
Table ‘Worktable’. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.  Table ‘Worktable’. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table ‘Workfile’. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.  ( No References to workfile )
Table ‘SaleLargeColumnStoreClusteredPartition’. Scan count 1, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 78, lob physical reads 1, lob read-ahead reads 90.  Table ‘SaleLargeColumnStoreClusteredPartition’. Scan count 1, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
[Fact].[SaleLargeColumnStoreNonClustered]
Table ‘Worktable’. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table ‘Worktable’. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table ‘Workfile’. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.  ( No References to workfile )
Table ‘SaleLargeColumnStoreNonClustered’. Scan count 1, logical reads 344132, physical reads 0, read-ahead reads 344098, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table ‘SaleLargeColumnStoreNonClustered’. Scan count 5, logical reads 344005, physical reads 0, read-ahead reads 343533, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Summary

Here a couple of findings:

  1. Vertical partitioning
    • Helpful
      • Especially when queries are date driven
      • Please do some prep work and determine which date columns users often target
    • Not as Helpful
      • In a multi-column Index, not as helpful when the Date Column is not the first column referenced in the Index
  2. SQL Server Version 2016/Enterprise Versus 2014/Developer Edition
    • Optimization
      • In v2016, Enterprise Edition
        • Workfile not referenced in Statistics IO
        • Segment Reads and Segments Skipped explicitly Stated
          • Table ‘SaleLargeColumnStoreClustered’. Segment reads 1, segment skipped 0.

Column Stores Indexes & Partitioning – Partition Tables

Objective

In a recent post, we created the infrastructure objects ( File Groups, Files, Partition Functions, and Partition Schemes).

In this post, we will create tables that will utilize those objects.

 

Outline

 

Table Row or Column Store Partition Index Columns
Fact.SaleLargeRowStore Row Store [Sale Key]
Fact.SaleLargeRowStoreClusteredPartition Row Store InvoiceDateKeyYear [Sale Key]

[Invoice Date Key]

[Invoice Date Key]

[Customer Key]*
[Profit]*

 Fact.SaleLargeColumnStoreClustered  Column Store
 Fact.SaleLargeColumnStoreClusteredPartition  Column Store  InvoiceDateKeyYear
 Fact.SaleLargeColumnStoreNonClustered  Column Store

 

 

Exercise

Tables

[Fact].[SaleLargeRowStoreClusteredPartition]

SQL

Code

USE [WideWorldImportersDW]
GO

/*

    DROP TABLE [Fact].[SaleLargeRowStoreClusteredPartition]

*/

/*

    exec sp_helpindex '[Fact].[SaleLargeRowStoreClusteredPartition]'

*/

if object_id('[Fact].[SaleLargeRowStoreClusteredPartition]') is null
begin

    CREATE TABLE [Fact].[SaleLargeRowStoreClusteredPartition]
    (
        [Sale Key] [bigint] IDENTITY(1,1) NOT NULL,
        [City Key] [int] NOT NULL,
        [Customer Key] [int] NOT NULL,
        [Bill To Customer Key] [int] NOT NULL,
        [Stock Item Key] [int] NOT NULL,
        
        [Invoice Date Key] [date] NOT NULL,

        [InvoiceDateKeyYear] [int] NOT NULL,
        
        [Delivery Date Key] [date] NULL,
        
        [DeliveryDateKeyYear] [int] NULL,
        
        [Salesperson Key] [int] NOT NULL,
        [WWI Invoice ID] [int] NOT NULL,
        [Description] [nvarchar](100) NOT NULL,
        [Package] [nvarchar](50) NOT NULL,
        [Quantity] [int] NOT NULL,
        [Unit Price] [decimal](18, 2) NOT NULL,
        [Tax Rate] [decimal](18, 3) NOT NULL,
        [Total Excluding Tax] [decimal](18, 2) NOT NULL,
        [Tax Amount] [decimal](18, 2) NOT NULL,
        [Profit] [decimal](18, 2) NOT NULL,
        [Total Including Tax] [decimal](18, 2) NOT NULL,
        [Total Dry Items] [int] NOT NULL,
        [Total Chiller Items] [int] NOT NULL,
        [Lineage Key] [int] NOT NULL
    

        ) 

        --ON [USERDATA] 
        ON [pschemeYear]
        (
            [Invoice Date Key]
        )

end
go

declare @object sysname
declare @objectID int
declare @constraint sysname
declare @index  sysname

declare @addPartitionedIndex   bit
declare @addUnpartitionedIndex bit

set @object = '[Fact].[SaleLargeRowStoreClusteredPartition]'
set @constraint = 'PK_Fact_SaleLargeRowStoreClusteredPartition'

set @objectID = object_id(@object)

if object_id(@object) is not null
begin

    /*

        alter table [Fact].[SaleLargeRowStoreClusteredPartition]
            drop CONSTRAINT [PK_Fact_SaleLargeRowStoreClusteredPartition] 

    */
    if not exists
        (

            select *
            from  sys.key_constraints tblSKC
            where tblSKC.parent_object_id = @objectID
            and   tblSKC.[name] = @constraint

        )
    begin

        print 'Create Primary Key ' 
                + @object 
                + '.' 
                + quoteName(@constraint) 
                + ' ...'

        alter table [Fact].[SaleLargeRowStoreClusteredPartition]
        ADD CONSTRAINT [PK_Fact_SaleLargeRowStoreClusteredPartition] 
        PRIMARY KEY CLUSTERED 
            (

                [Sale Key] ASC
                /*
                Msg 1908, Level 16, State 1, Line 19
                Column 'Invoice Date Key' is partitioning column of the index 'PK_Fact_SaleLargeRowStoreClusteredPartition'. Partition columns for a unique index must be a subset of the index key.
                Msg 1750, Level 16, State 0, Line 19
                Could not create constraint or index. See previous errors.
                */
            , [Invoice Date Key]

            )

        --ON [USERDATA]
        ON [pschemeYear]
        (
            [Invoice Date Key]
        )
            
        update statistics [Fact].[SaleLargeRowStoreClusteredPartition]
            [PK_Fact_SaleLargeRowStoreClusteredPartition] 
            with FULLSCAN;


        print 'Created Constraint - Primary Key'
                + @object + '.' 
                + quoteName(@constraint)

    end



    /*
        drop index [Fact].[SaleLargeRowStoreClusteredPartition].[INDX_InvoiceDateKey_WithIncludedColumns_UnPartition]
    */
    set @addPartitionedIndex = 0;
    set @addPartitionedIndex = 1;
    set @index = 'INDX_InvoiceDateKey_WithIncludedColumns_Partition';

    if not exists
        (
            select *
            from  sys.indexes tblSI
            where tblSI.object_id = @objectID
            and   tblSI.[name] = @index
        )
        and (@addPartitionedIndex = 1)
    begin

        print 'Create Index ' + @object + '.' + quoteName(@index) + ' ...'


        CREATE NONCLUSTERED INDEX [INDX_InvoiceDateKey_WithIncludedColumns_Partition]
        ON [Fact].[SaleLargeRowStoreClusteredPartition] 
        (
            [Invoice Date Key]
        )
        INCLUDE 
        (
              [Customer Key]
            , [Profit]
        )
        with
        (
            FILLFACTOR=100
        )
        /*
        ON [USERDATA]
        */
        ON [pschemeYear]
        (
            [Invoice Date Key]
        )


        update statistics [Fact].[SaleLargeRowStoreClusteredPartition]
            [INDX_InvoiceDateKey_WithIncludedColumns_Partition]
            with FULLSCAN;


        print 'Created Index ' + @object + '.' + quoteName(@index)

    end

end


go

 

[Fact].[SaleLargeColumnStoreClusteredPartition]

SQL

Code

SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

USE [WideWorldImportersDW]
GO

/*

	DROP TABLE [Fact].[SaleLargeColumnStoreClusteredPartition]

	exec sp_helpindex '[Fact].[SaleLargeColumnStoreClusteredPartition]'

*/
if object_id('[Fact].[SaleLargeColumnStoreClusteredPartition]') is null
begin


	CREATE TABLE [Fact].[SaleLargeColumnStoreClusteredPartition]
	(
		[Sale Key] [bigint] IDENTITY(1,1) NOT NULL,
		[City Key] [int] NOT NULL,
		[Customer Key] [int] NOT NULL,
		[Bill To Customer Key] [int] NOT NULL,
		[Stock Item Key] [int] NOT NULL,
	
		[Invoice Date Key] [date] NOT NULL,
		[InvoiceDateKeyYear] [int] NOT NULL,
			
		[Delivery Date Key] [date] NULL,
		[DeliveryDateKeyYear] [int] NULL,
			
		[Salesperson Key] [int] NOT NULL,
		[WWI Invoice ID] [int] NOT NULL,
		[Description] [nvarchar](100) NOT NULL,
		[Package] [nvarchar](50) NOT NULL,
		[Quantity] [int] NOT NULL,
		[Unit Price] [decimal](18, 2) NOT NULL,
		[Tax Rate] [decimal](18, 3) NOT NULL,
		[Total Excluding Tax] [decimal](18, 2) NOT NULL,
		[Tax Amount] [decimal](18, 2) NOT NULL,
		[Profit] [decimal](18, 2) NOT NULL,
		[Total Including Tax] [decimal](18, 2) NOT NULL,
		[Total Dry Items] [int] NOT NULL,
		[Total Chiller Items] [int] NOT NULL,
		[Lineage Key] [int] NOT NULL
	
			

	) ON [USERDATA]

end


/*
	Object:  Index [INDX_FACT_SaleLargeColumnStoreClustered] 
*/

declare @table sysname
declare @index sysname

set @table = '[Fact].[SaleLargeColumnStoreClusteredPartition]';
set @index = 'INDX_FACT_SaleLargeColumnStoreClusteredPartition'

if object_id(@table) is not null
begin

	if not exists
	(
		select *
		from   sys.indexes tblSI
		where  tblSI.object_id = object_id(@table) 
		and    tblSI.[name] = @index
	) --and (1=0)
	begin

		print 'Create Index ' 
				+ @table
				+ ' - Index '
				+ @index
		
		CREATE CLUSTERED INDEX [INDX_FACT_SaleLargeColumnStoreClusteredPartition] 
		ON [Fact].[SaleLargeColumnStoreClusteredPartition] 
		(
			[Invoice Date Key]
		)
		WITH 
		(
				DROP_EXISTING = OFF
			--, FILLFACTOR = 100
			--, COMPRESSION_DELAY = 0
		) 
		--ON [USERDATA]
		ON [pschemeYear]
			(
				[Invoice Date Key]
			)
		
	end


	if exists
	(
		select *
		from   sys.indexes tblSI
		where  tblSI.object_id = object_id(@table) 
		and    tblSI.[name] = @index
		and    tblSI.type_desc != 'CLUSTERED COLUMNSTORE'
	)
	begin

		print 'Create Index ' 
				+ @table
				+ ' - Index '
				+ @index

		CREATE CLUSTERED COLUMNSTORE INDEX [INDX_FACT_SaleLargeColumnStoreClusteredPartition] 
		ON [Fact].[SaleLargeColumnStoreClusteredPartition] 
		WITH 
		(
			DROP_EXISTING = ON
		) 
		--ON [USERDATA]
		ON [pschemeYear]
		(
			[Invoice Date Key]
		)


	end

end
GO


Source Code Control

GitHub

The Code for creating the tables is posted on Github here

 

Summary

We will add corresponding data to our newly created partitioned tables and run a few queries and see the effect of the partitioning.

 

Column Stores Indexes & Partitioning – Foundational Work

Background

As data grows bigger and ages it becomes prudent to look at partitioning as an avenue to better reduce the overall footprint of data read from disk into memory.

Column Store by its very definition delivers on the promises of Vertical Partitioning.

Let us see review the steps we have to take to achieve Range Partitioning.

 

Design Goals

Our goal is to divvy up our data into years.

In each table that we will be targeting we will ensure that a date column exists.

And, hope to have one of those date columns track the occurrence date.

 

Overview

Here are the plumbing areas:

  1. Create new Filegroups and files within the new file groups
  2. Create Partitioning Function
  3. Create Partitioning Scheme

 

Steps

Platform

File Groups & Files

Objective

Create File groups for each business year.

And, create a single file in each file group.

SQL


USE [WideWorldImportersDW]   
go

-- Year NULL

if not exists  
(      
	select *      
	from   sys.filegroups tblSFG      
	where  tblSFG.name = 'fg_Partition_Year_Null'  
)  
begin      

	ALTER DATABASE [WideWorldImportersDW] 
		ADD FILEGROUP  [fg_Partition_Year_Null]   
		
end 
go

if not exists  
	(      
		select *      
		from   sys.master_files tblMF      
		where  tblMF.[database_id] = db_id('WideWorldImportersDW')      
		and    tblMF.name = 'fg_Partition_Year_Null__01'  
	)  
begin 

   ALTER DATABASE [WideWorldImportersDW]  
   ADD FILE   
   (      
      NAME = [fg_Partition_Year_Null__01]   
      , FILENAME = 'Z:\Microsoft\SQLServer\DataFiles\WideWorldImportersDW_fg_Partition_Year_Null_file_001.ndf'  
   )  
   TO FILEGROUP [fg_Partition_Year_Null]  
	
end 


-- Year 2010
if not exists  
(      
    select *      
    from   sys.filegroups tblSFG      
    where  tblSFG.name = 'fg_Partition_Year_2010'  
)  
begin

   ALTER DATABASE [WideWorldImportersDW]         
      ADD FILEGROUP  [fg_Partition_Year_2010]   

end 
    
if not exists  
(      
   select *      
   from   sys.master_files tblMF      
   where  tblMF.[database_id] = db_id('WideWorldImportersDW')      
   and    tblMF.name = 'fg_Partition_Year_2010__01'  
)  
begin
   
   ALTER DATABASE [WideWorldImportersDW]  
   ADD FILE   
   (      
         NAME = [fg_Partition_Year_2010__01]   
       , FILENAME = 'Z:\Microsoft\SQLServer\DataFiles\WideWorldImportersDW_fg_Partition_Year_2010_file_001.ndf'  
  )  
  TO FILEGROUP [fg_Partition_Year_2010]  

end 


-- Year 2011
if not exists  
(      
    select *      
    from   sys.filegroups tblSFG      
    where  tblSFG.name = 'fg_Partition_Year_2011'  
)  
begin

   ALTER DATABASE [WideWorldImportersDW]         
      ADD FILEGROUP  [fg_Partition_Year_2011]   

end 
    
if not exists  
(      
   select *      
   from   sys.master_files tblMF      
   where  tblMF.[database_id] = db_id('WideWorldImportersDW')      
   and    tblMF.name = 'fg_Partition_Year_2011__01'  
)  
begin
   
   ALTER DATABASE [WideWorldImportersDW]  
   ADD FILE   
   (      
         NAME = [fg_Partition_Year_2011__01]   
       , FILENAME = 'Z:\Microsoft\SQLServer\DataFiles\WideWorldImportersDW_fg_Partition_Year_2011_file_001.ndf'  
  )  
  TO FILEGROUP [fg_Partition_Year_2011]  

end 


-- Year 2012
if not exists  
(      
    select *      
    from   sys.filegroups tblSFG      
    where  tblSFG.name = 'fg_Partition_Year_2012'  
)  
begin

   ALTER DATABASE [WideWorldImportersDW]         
      ADD FILEGROUP  [fg_Partition_Year_2012]   

end 
    
if not exists  
(      
   select *      
   from   sys.master_files tblMF      
   where  tblMF.[database_id] = db_id('WideWorldImportersDW')      
   and    tblMF.name = 'fg_Partition_Year_2012__01'  
)  
begin
   
   ALTER DATABASE [WideWorldImportersDW]  
   ADD FILE   
   (      
         NAME = [fg_Partition_Year_2012__01]   
       , FILENAME = 'Z:\Microsoft\SQLServer\DataFiles\WideWorldImportersDW_fg_Partition_Year_2012_file_001.ndf'  
  )  
  TO FILEGROUP [fg_Partition_Year_2012]  

end 


-- Year 2013
if not exists  
(      
    select *      
    from   sys.filegroups tblSFG      
    where  tblSFG.name = 'fg_Partition_Year_2013'  
)  
begin

   ALTER DATABASE [WideWorldImportersDW]         
      ADD FILEGROUP  [fg_Partition_Year_2013]   

end 
    
if not exists  
(      
   select *      
   from   sys.master_files tblMF      
   where  tblMF.[database_id] = db_id('WideWorldImportersDW')      
   and    tblMF.name = 'fg_Partition_Year_2013__01'  
)  
begin
   
   ALTER DATABASE [WideWorldImportersDW]  
   ADD FILE   
   (      
         NAME = [fg_Partition_Year_2013__01]   
       , FILENAME = 'Z:\Microsoft\SQLServer\DataFiles\WideWorldImportersDW_fg_Partition_Year_2013_file_001.ndf'  
  )  
  TO FILEGROUP [fg_Partition_Year_2013]  

end 


-- Year 2014
if not exists  
(      
    select *      
    from   sys.filegroups tblSFG      
    where  tblSFG.name = 'fg_Partition_Year_2014'  
)  
begin

   ALTER DATABASE [WideWorldImportersDW]         
      ADD FILEGROUP  [fg_Partition_Year_2014]   

end 
    
if not exists  
(      
   select *      
   from   sys.master_files tblMF      
   where  tblMF.[database_id] = db_id('WideWorldImportersDW')      
   and    tblMF.name = 'fg_Partition_Year_2014__01'  
)  
begin
   
   ALTER DATABASE [WideWorldImportersDW]  
   ADD FILE   
   (      
         NAME = [fg_Partition_Year_2014__01]   
       , FILENAME = 'Z:\Microsoft\SQLServer\DataFiles\WideWorldImportersDW_fg_Partition_Year_2014_file_001.ndf'  
  )  
  TO FILEGROUP [fg_Partition_Year_2014]  

end 


-- Year 2015
if not exists  
(      
    select *      
    from   sys.filegroups tblSFG      
    where  tblSFG.name = 'fg_Partition_Year_2015'  
)  
begin

   ALTER DATABASE [WideWorldImportersDW]         
      ADD FILEGROUP  [fg_Partition_Year_2015]   

end 
    
if not exists  
(      
   select *      
   from   sys.master_files tblMF      
   where  tblMF.[database_id] = db_id('WideWorldImportersDW')      
   and    tblMF.name = 'fg_Partition_Year_2015__01'  
)  
begin
   
   ALTER DATABASE [WideWorldImportersDW]  
   ADD FILE   
   (      
         NAME = [fg_Partition_Year_2015__01]   
       , FILENAME = 'Z:\Microsoft\SQLServer\DataFiles\WideWorldImportersDW_fg_Partition_Year_2015_file_001.ndf'  
  )  
  TO FILEGROUP [fg_Partition_Year_2015]  

end 


-- Year 2016
if not exists  
(      
    select *      
    from   sys.filegroups tblSFG      
    where  tblSFG.name = 'fg_Partition_Year_2016'  
)  
begin

   ALTER DATABASE [WideWorldImportersDW]         
      ADD FILEGROUP  [fg_Partition_Year_2016]   

end 
    
if not exists  
(      
   select *      
   from   sys.master_files tblMF      
   where  tblMF.[database_id] = db_id('WideWorldImportersDW')      
   and    tblMF.name = 'fg_Partition_Year_2016__01'  
)  
begin
   
   ALTER DATABASE [WideWorldImportersDW]  
   ADD FILE   
   (      
         NAME = [fg_Partition_Year_2016__01]   
       , FILENAME = 'Z:\Microsoft\SQLServer\DataFiles\WideWorldImportersDW_fg_Partition_Year_2016_file_001.ndf'  
  )  
  TO FILEGROUP [fg_Partition_Year_2016]  

end 

-- Year 2017
if not exists  
(      
    select *      
    from   sys.filegroups tblSFG      
    where  tblSFG.name = 'fg_Partition_Year_2017'  
)  
begin

   ALTER DATABASE [WideWorldImportersDW]         
      ADD FILEGROUP  [fg_Partition_Year_2017]   

end 
    
if not exists  
(      
   select *      
   from   sys.master_files tblMF      
   where  tblMF.[database_id] = db_id('WideWorldImportersDW')      
   and    tblMF.name = 'fg_Partition_Year_2017__01'  
)  
begin
   
   ALTER DATABASE [WideWorldImportersDW]  
   ADD FILE   
   (      
         NAME = [fg_Partition_Year_2017__01]   
       , FILENAME = 'Z:\Microsoft\SQLServer\DataFiles\WideWorldImportersDW_fg_Partition_Year_2017_file_001.ndf'  
  )  
  TO FILEGROUP [fg_Partition_Year_2017]  

end 

Partition Function

Objective

  1. Create a new Partition Function
  2. Indicate Column’s data type
    • date
  3. Range Type
    • Left or Right
      • Right Values
  4. Values
    • null
      • When date is null
      • When null entries are accounted for, please be sure to use convert function
      • Else
        • Msg 7705, Level 16, State 1
        • Could not implicitly convert range values type specified at ordinal 1 to partition function parameter type

SQL

Code

 CREATE PARTITION FUNCTION [pfnYear]
 ( 
    date 
 )     
 AS RANGE RIGHT FOR VALUES    
 (       
	  convert ( date , null ) 
	, '2010-01-01'
	, '2011-01-01'
	, '2012-01-01'
	, '2013-01-01'
	, '2014-01-01'
	, '2015-01-01'
	, '2016-01-01'
	, '2017-01-01'
	, '2018-01-01'
	, '2019-01-01'
	, '2020-01-01'  
) 

Partition Scheme

SQL



CREATE PARTITION SCHEME [pschemeYear]   
AS PARTITION [pfnYear]    
TO  
(       
	  [fg_Partition_Year_Null]
	, [fg_Partition_Year_2010]
	, [fg_Partition_Year_2011]
	, [fg_Partition_Year_2012]
	, [fg_Partition_Year_2013]
	, [fg_Partition_Year_2014]
	, [fg_Partition_Year_2015]
	, [fg_Partition_Year_2016]
	, [fg_Partition_Year_2017]
	, [fg_Partition_Year_2018]
	, [fg_Partition_Year_2019]
	, [fg_Partition_Year_2020]
	, [fg_Partition_Year_Next]  
	
) 

Summary

To make it easier to come back and edit our documentation, we will address using the underlying Partitioning components in a secondary post