Exam 70-461 Notes for Chapter 5 Grouping and Windowing

Although exam 70-461 is based on SQL Server 2012, the code here is based on the Wide World Importers database and SQL Server 2016.

Chapter 5 is about data analysis using sub-queries and aggregate queries. Lessons include: 1. Writing Grouped Queries, 2. Pivoting and Unpivoting, and 3. Using Window Functions.

Use grouped queries to group data and then perform analysis computations on each group. The set of attributes your data is grouped on is called the grouping set. You can define one or more grouping sets in a single query.

Single Grouping Set example:

A query becomes a grouped query when you use a group function (i.e. COUNT(*) function), a GROUP BY clause, or both. Grouped queries return one result row per group.

Explicit GROUP BY clause example:

The above query generates the following output.

Example of a grouping set with multiple elements. Recall from Chapter 1 that a result set is non-relational when an ORDER BY clause is used.

The above query generates the following output.

To filter at the group level, use the HAVING clause, which uses a predicate that evaluates per group as opposed to per row; the WHERE clause evaluates per row. It’s important to note that general set functions ignore NULLs. For instance, COUNT([column_name]) ignores NULLs whereas COUNT(*) includes NULLs. This is not applicable in the next example because I’m querying purchase orders, not shipments. If you

T-SQL aggregate functions include COUNT, SUM, AVG, MIN, and MAX.

Example of a Grouped Query using aggregate functions, plus both WHERE and HAVING clauses:

The above query generates the following output:

Grouped Query with Aggregates
Results of Grouped Query with Aggregates

Additionally, with general set functions, you can apply the DISTINCT clause before the expression. For instance, if you wanted to count the distinct dates when purchases or sales were made, as in the following example:

The DISTINCT clause is available to other general set function, but is commonly used with COUNT.

Recall from chapter 2 that the logical query processing phases are as follows:

  1. FROM
  2. WHERE
  3. GROUP BY
  4. HAVING
  5. SELECT
  6. ORDER BY

The “keyed-in order” of the SELECT statement, however, is:

  1. SELECT
  2. FROM
  3. WHERE
  4. GROUP BY
  5. HAVING
  6. ORDER BY

From a logical query processing perspective, the GROUP BY clause is evaluated after the FROM and WHERE clauses, before the HAVING, SELECT, and ORDER BY clauses. Each group is represented by only one row in the result set; therefore, all expressions that appear in those clauses must guarantee a single result value per group. In order to refer to elements from the underlying tables that do not appear in the GROUP BY list, you must apply an aggregate function to each of them. Otherwise, you will get the following error, for instance, if column ShipperName was not included in the GROUP BY clause:

Msg 8120, Level 16, State 1, Line 4
Column ‘Purchasing.Suppliers.SupplierName’ is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.

Because the Purchasing.Suppliers.SupplierName column neither appears in the GROUP BY list nor is it contained in an aggregate function, it is not permitted in the HAVING, SELECT, and ORDER BY clauses.

Working with Multiple Grouping Sets

T-SQL supports three clauses that can be used to define multiple grouping sets: GROUPING SETS, CUBE, and ROLLUP. Use them in the GROUP BY clause in the next example. Grouping sets enable the use of a single query to group data in more than one way.

Leave a Reply

Your email address will not be published. Required fields are marked *