Execution Plan Operators Reference Guide
SQL Server is an incredibly complex system. One of the core tools in a SQL Server professional’s arsenal is the ability to read and understand execution plans. These graphical representations provide a roadmap of how SQL Server intends to retrieve and process the data for a given query.
This post serves as a reference guide as to what each of the operators in a SQL Server execution plan is and what it does.
What is an Execution Plan?
An execution plan is essentially a representation of the series of steps and processes SQL Server employs to execute a SQL query. Think of it as a “recipe” that the server follows, detailing how to access and manipulate the data to return the results you’ve requested.
Core Operators in Execution Plans
Table Scan: When SQL Server reads every row from a table to find the ones that meet the query conditions, it’s performing a Table Scan. This is often seen as inefficient for larger tables because it reads the entire table data, regardless of how many rows the query needs.
- Clustered Index Scan: Similar to a table scan but focuses on the clustered index. If a table has a clustered index and SQL Server still decides to read all of the rows from that index, a Clustered Index Scan happens. This can happen even when a WHERE clause is in use if SQL believes that scanning the whole index is faster than seeking just a few rows.
- Clustered Index Seek: One of the most efficient ways to retrieve rows. SQL Server directly seeks to the relevant data pages using the clustered index, making the retrieval much faster than scanning the entire index.
- Non-Clustered Index Seek: This occurs on non-clustered indexes and is analogous to a Clustered Index Seek. SQL Server uses the non-clustered index to find the specific rows or range of rows efficiently.
- RID Lookup: A RID (Row Identifier) lookup happens when a query needs columns that aren’t in the non-clustered index. SQL Server then has to look up the heap (a table without a clustered index) using the RID to fetch the required columns.
- Key Lookup: This is similar to RID Lookup but for tables with a clustered index. The non-clustered index contains a “key” to the clustered index, which SQL Server uses to retrieve the additional columns it needs.
- Sort: As the name suggests, this operator reorders the data. If data needs to be presented in a different order than it’s stored on disk or if an ORDER BY clause is present, this operator comes into play.
- Aggregate: This operator performs aggregation based on commands like COUNT, SUM, AVG, etc. It groups data based on certain columns and then calculates an aggregate for each group.
- Compute Scalar: Sometimes, the query might need to compute an expression for each row, for example, calculating a tax amount or converting units. The Compute Scalar operator performs these computations.
- Concatenation: This operator is used when SQL Server needs to combine rows from two or more inputs into a single output stream.
- Assert: SQL Server uses the Assert operator to ensure a particular condition is always true. For instance, it might be used to enforce a CHECK constraint on a table.
- Hash Match Join: One of the more complex operators, a Hash Match Join, breaks down the join process into building a hash table from one of the input tables (usually the smaller one) and then probing that table using the larger input table.
- Hash Match Aggregate: Like the Hash Match Join but for aggregation. SQL Server builds a hash table of the input rows and then aggregates them based on the hashed values.
- Merge Join: An efficient join mechanism, but both inputs must be sorted on the join columns. The Merge Join processes the two sorted inputs one row at a time, advancing each input as necessary.
- Nested Loop: Best suited for small data sets. For every row in the outer table, SQL Server scans the inner table to find matching rows.
- Segment: Useful for dividing datasets. For instance, if a query involves calculating subtotals for different groups in a result set, SQL Server might use the Segment operator to identify when a new group starts in a sorted dataset.
- Table Spool (Lazy Spool): The Lazy Spool operator reads its input only once and caches it in tempdb. Future requirements for the data are then served from this saved data rather than re-reading the original input.
- Merge Interval: Used to merge overlapping or touching intervals.
- Filter: This operator filters out rows that don’t match the query criteria. While many filtering operations happen earlier in the query process, the Filter operator is used for conditions that can’t be applied earlier.
- Online Index Insert: Part of the mechanism SQL Server uses to allow tables to remain accessible (online) while indexes are being rebuilt or reorganized.
- Sequence Project: Adds a sequence to a stream of rows, like ROW_NUMBER.
- Eager Spool: Unlike Lazy Spool, the Eager Spool reads its input immediately and caches it.
- Parallelism: Often seen on multi-core systems, the Parallelism operator takes a stream of rows and breaks it into multiple streams for processing on different CPU cores. Share on: