As a DBA, we will face lots of performance issues because poorly configured server configuration, database settings, missing indexes, inefficient query, fragmented index etc.
In this blog, we will look into the wait stats CXPACKETS occurrence and analysing whether it improving or degrading our performance.
Most of the CXPACKETS, appears because of the parallel processing.
In Parallel processing query is divided into multiple thread and executing in parallel is called the Parallel Processing.
Why do we need parallel processing?
For example, suppose we have query that takes 5 Min to completes with the help of single CPU, if that same queries are divided into multiple threads and then executed parallel on more than one CPU it may improve the performance.
Once the process is get divided into multiple threads, then there will be one main thread 0 is take care of remaining thread created at parallel execution.
This thread 0 make sure that remaining thread are completed their task and involve in final delivery.
There are lots of things needs to be considered in the parallel processing. Some of them are shown below
-
CPU Cores
-
I/O rates of disk
-
Indexing
-
Page Fragmentation
1. CPU Cores:
In parallelism multiple thread are created for particular process and get executed on the CPU cores based on the MAX DOP and Cost threshold for parallelism values.
DOP defines how many CPU needs to be utilized. Default value is Zero (Use all CPU). Max DOP needs to be set based on the CPU core available and the baseline analysis.
Cost threshold for Parallelism defines which process are needs to be executed parallel. Based on the baseline analysis of query cost and CPU workload we have defined its value. Default value is 5.
Sometimes a queries which runs on single CPU can be perform well compared to the multiple CPU.
Below example show how the same queries executed on single and multiple CPU. How much CPU time it has been utilized and occurrence of CXPACKETS.
At first we will check the values of Max DOP and Cost threshold for parallelism.
Now we start execute query on the adventure works2012 database. Before executing be sure that the actual execution plan is enabled.
Set statistics IO ON
Set statistics Time ON
Use AdventureWorks2012
Select * from sales.salesorderdetails order by lineTotal Desc
GO
Actual Execution plan
As you see the parallelism happens for our query as the cost threshold for parallelism is more than 5.
In the above execution plan, clustered Index scan is responsible for fetching the rows, hence i selected that and pressed F4 option to view the properties window.
As you can see as per my CPU core 4 threads are created and query executed in parallel.
Now we will check the CXPACKETS wait type using the command shown below.
Select * from sys.dm_os_wait_stats Where wait_type = ‘ cxpackets’
As you see that the parallelism happened and wait time is about 8705 ms.
CPU Execution time
Now we will check how it behaves for single CPU. For that I am adding a hint of max DOP to 1.
Before executing, reset CXPackets counter values by using below command.
Please note: Don’t execute this on production environment.
DBCC SQLPERF(‘sys.dm_os_wait_stats’,clear)
Use AdventureWorks2012
select * from sales.SalesOrderDetail order by LineTotal desc option (maxdop 1)
Actual Execution Plan
As you see parallelism not happened as I have forced the optimizer to use single CPU by setting the MAX DOP to 1.
Now we will check the CXPACKETS wait type using the command shown below.
Select * from sys.dm_os_wait_stats Where wait_type = ‘ cxpackets’
As you see that no CXPackets happened. If you find more numbers of CXpackets then something needs to analysed.
CPU Execution Time
As you see that CPU time is gradually decreased while executing on single processor. Here total elapsed time is same for both. But CPU work load is high on parallelism.
2. I/O rates of disk:
In multiple storage, design according that both the physical storage system has the same transfer rate.
If one of storage is performs faster and one couldn’t perform good, this will create more CXPACKETS wait.
In this case thread that is accessing the faster storage can get work done quickly then thread accessing the slow storage.
CXPACKETS wait type says that its waiting for remaining threads to complete their task.
3. Indexing:
As we know that creating index can improve the IO performance.
For Example, if we are joining the multiple tables, if one of the table doesn’t have index or index not created. Thread which was accessing that object may take some time fetch the row. So rest of the thread needs to be waiting for that particular thread to complete its job.
So creating a proper index plays a role high CXpackets wait time.
4. Page Fragmentation:
Due to high volume of high update and delete transaction, page splits happens more which leads data to be scattered. It may highly degrade the IO performance, memory utilization.
The Multi thread process which access the object affect by the fragmentation can causes the more CXPACKETS wait.
Reference : Amit bansal https://www.youtube.com/user/SQLMaestros/videos