Whenever possible use UNION ALL instead of UNION. The difference is that UNION has a “side-effect” of eliminating all duplicate rows and sorting results, which UNION ALL doesn’t do. Selecting a distinct result requires building a temporary worktable, storing all rows in it and sorting before producing the output. In some cases that’s exactly what you need to do, then UNION is good to use. But if you don’t expect any duplicate rows in the result set, then use UNION ALL. It simply selects from one table or a join, and then selects from another, attaching results to the bottom of the first result set. UNION ALL requires no worktable and no sorting (unless other unrelated conditions cause that). In most cases it’s much more efficient. One more potential problem with UNION is the danger of flooding tempdb database with a huge worktable. It may happen if you expect a large result set from a UNION query.
Example
The following queries select ID for all stores in the sales table, and the ID for all stores in the big_sales table, a version of the sales table that has been populated with over 70,000 rows. The only difference between the two solutions is the use of UNION versus UNION ALL. But the addition of the ALL keyword makes a big difference in the query plan. The first solution requires stream aggregation and sorting the results before they are returned to the client. The second query is much more efficient, especially for large tables. In this example both queries return the same result set, though in a different order.
Although the result sets in this example are interchangeable, you can see that the UNION ALL statement consumed less than half of the resources that the UNION statement consumed (UNION is doing 463 logical reads on g_sales table which takes more resources of SQL Server. UNION ALL is doing only 224 logical read and makes it fast). So be sure to anticipate your result sets and in those that are already distinct, use the UNION ALL clause.
No comments:
Post a Comment