Aug 30, 2011

UNION Vs UNION ALL

This issue came up during one of the training I conducted recently. There is a popular belief that UNION will ALWAYS perform a unique filter on the data.  While it is true that UNION Clause will ALWAYS RETURN unique set of data, filtering operation either sort or distinct select is not guaranteed.
Take this example:


SELECT SalesOrderID, SalesOrderDetailID, OrderQty, ProductID, UnitPrice, UnitPriceDiscount FROM SalesLT.SalesOrderDetail WHERE ProductID = 799
UNION
SELECT SalesOrderID, SalesOrderDetailID, OrderQty, ProductID, UnitPrice, UnitPriceDiscount FROM SalesLT.SalesOrderDetail WHERE ProductID = 989
SELECT SalesOrderID, SalesOrderDetailID, OrderQty, ProductID, UnitPrice, UnitPriceDiscount FROM SalesLT.SalesOrderDetail WHERE ProductID = 799
UNION ALL
SELECT SalesOrderID, SalesOrderDetailID, OrderQty, ProductID, UnitPrice, UnitPriceDiscount FROM SalesLT.SalesOrderDetail WHERE ProductID = 989

The results will be unique. Irrespective of whether you use UNION or UNION ALL, the results are same. More than that, the execution plan shows that there is no sort operation involved.




Is that mean that we do not have to worry about UNION ALL? Will SQL Server use the right choice all the time, Irrespective of whether we use UNON or UNION ALL?  NO!
There are cases where UNION ALL and UNION turns out to the same execution plan,  it is not guaranteed that the plan will be same.  For example, when some complexity is added to the script, SQL Server will not understand the uniqueness and will turn to filter the data.
Consider this script: 


SELECT SalesOrderID, SalesOrderDetailID, OrderQty, ProductID, UnitPrice, UnitPriceDiscount FROM SalesLT.SalesOrderDetail WHERE ProductID = 799
UNION
SELECT SalesOrderID, SalesOrderDetailID, OrderQty, ProductID, UnitPrice, UnitPriceDiscount FROM SalesLT.SalesOrderDetail WHERE ProductID = 989
UNION
SELECT SalesOrderID, SalesOrderDetailID, OrderQty, ProductID, UnitPrice, UnitPriceDiscount FROM SalesLT.SalesOrderDetail WHERE ProductID = 868
Just an additional SELECT is added.

















The new compiled plan is completely different. More than that, it introduces MERGE JOIN (UNION) which is used to sort and join the data, against the CONCATENATION operation in the previous query.  
The bottom line is use UNION ALL whenever possible.  It does remove the ambiguity. 

No comments:

Post a Comment