Technical: Microsoft – SQL Server – Plan Cache Bloat and Single Use Plans
Several practices can cause SQL Server Procedure Cache Bloat.
In general, it is not optimal to continue to interact with the database and not help it a bit by sharing that identical database statements will be submitted.
Without this guidance, the database will continue to treat each submitted SQL Statement as new and novel.
What can cause them:
- Use of embedded SQL
- Use of Object Relational Mapping Tool (ORM) Tools – Jboss Hibernate and Hibernate.Net and Microsoft’s Entity Framework
- Use of SQL exec statements
- Use of Scaler functions
Why worry ?
Query Processor Steps
The Query processor steps can be broken down into the following:
The steps are described in a bit of detail by Benjamin Nevarez:
Benjamin Nevarez – The SQL Server Query Optimizer
Parsing makes sure that the T-SQL query has a valid syntax, and translates the SQL query into an initial tree representation: specifically, a tree of logical operators representing the high-level steps required to execute the query in question. Initially, these logical operators will be closely related to the original syntax of the query, and will include such logical operations as “get data from the Customer table”, “get data from the Contact table”, “perform an inner join”, and so on. Different tree representations of the query will be used throughout the optimization process, and this logical tree will receive different names until it is finally used to initialize the Memo structure, as will be discussed later.
Binding is mostly concerned with name resolution. During the binding operation, SQL Server makes sure that all the object names do exist, and associates every table and column name on the parse tree with their corresponding object in the system catalog. The output of this second process is called an algebrized tree, which is then sent to the Query Optimizer.
The next step is the optimization process, which is basically the generation of candidate execution plans and the selection of the best of these plans according to their cost. As has already been mentioned, SQL Server uses a cost-based optimizer, and uses a cost estimation model to estimate the cost of each of the candidate plans.
As an aside, Benjamin Nevarez’s shares extensively on his blog ( http://www.benjaminnevarez.com/ ).
It is immediately obvious that the steps outlined above take a bit of time and that resources such as CPU and Memory are expounded during the process.
Code – Identify SQL Use Plans
Here is small query for identifying Single Use Plans
use [master] go ;with cteDEQS ( [queryHash] , [queryPlanHash] , [sqlHandleSample] , [sqlPlanHandle] , [numberofQueryPlans] , [executionCount] ) as ( SELECT tbldeqs.query_hash queryPlan , tbldeqs.query_plan_hash as queryPlanHash , max(tbldeqs.[sql_handle]) as sqlHandleSample , max(tbldeqs.plan_handle) as sqlPlanHandle , COUNT(*) numberofQueryPlans , SUM(tbldeqs.execution_count) as executionCount FROM sys.dm_exec_query_stats tbldeqs GROUP BY tbldeqs.query_hash , tbldeqs.query_plan_hash ) select tblDEQS.numberofQueryPlans , tblDEQS.[executionCount] , db_name(tblSQLText.dbid) as [databaseName] , object_name(tblSQLText.objectid, tblSQLText.dbid) as [objectName] , tblSQLText.text , tblDECP.cacheobjtype , tblDECP.objtype , tblDECP.usecounts , tblDECP.refcounts , tblDECP.size_in_bytes from cteDEQS tblDEQS outer apply sys.dm_exec_sql_text(tblDEQS.[sqlHandleSample]) tblSQLText left outer join sys.dm_exec_cached_plans as tblDECP on tblDEQS.sqlPlanHandle = tblDECP.plan_handle where ( --single use plans (tblDECP.usecounts =1) ) order by tblDEQS.[numberofQueryPlans] desc , tblDEQS.[executionCount] desc
- The query above is useful whether ordered by NumberofQueryPlans or executionCount and I will suggest that you review each numeric column and weigh what ordering by that column means
Code – Summary
Measure impact of single Use Plans
Using a broadly covered script shared by Kimberly Tripps’s ( http://www.sqlskills.com/blogs/kimberly/plan-cache-and-optimizing-for-adhoc-workloads/), here is a summarized tally:
select tblCachePlan.objType , count_big(*) as [NumberofPlans] , sum(tblCache.usecounts) as [useCounts] , cast((sum(cast(size_in_bytes as bigint)) / 1024 / 1024) as decimal(20,2)) as totalInUseMB , sum( case when (usecounts = 1) then 1 else 0 end ) as [NumberofSingleInstancePlans] , sum ( case when (usecounts =1) then cast(size_in_bytes as bigint) else 0 end ) /1024 / 1024 as [singleUsePlanMB] from sys.dm_exec_cached_plans tblCachePlan group by tblCachePlan.objType order by [totalInUseMB] desc
- From the screen output above, we can see that Adhoc plans by far out number and use up more resource (storage) than other object types
- Single Use Plans take up 2 GB of RAM
Application Code Fixes:
Depending on the Application Framework being used, there are some avenues to fix this problem.
Microsoft – Linq – Compiled Queries
- How to improve your Linq query performance
Microsoft – .Net – SQLClient
Microsoft – .Net – OleDb
- Executing preparing statements
Java – JDBC – Prepared Statement
- Using Prepared Statements
Plan re-use is a goal that we should reach for.
If you take the time to identify and tackle query plans that are not being re-used, you will likely find code lines that can benefit from peer review.
So called Prepared Statements are widely supported by all modern database interface paradigms.
Personally, I think the Microsoft LINQ implementation is one of the more difficult to follow.
- Microsoft SQL Server Query Processor Internals and Architecture
- SQL Server Query Optimizer
- Query Processor