10 GROUP BY SQL Practice Exercises with Solutions

GROUP BY is a powerful SQL clause that allows you to create groups of records and then calculate summary metrics (such as averages) for those groups. However, GROUP BY is often challenging for SQL learners to master. Yet, practicing GROUP BY is very important if you’re planning on using SQL.

In this article, we’ve collected several GROUP BY practice exercises to help you solidify your understanding of how this clause works.

GROUP BY Practice

This article contains 10 GROUP BY exercises to help you practice. Learning SQL by doing exercises is one of the best ways to sharpen your skills.

The exercises in this article come from our SQL Practice Set. It has more than 80 interactive SQL exercises covering topics like simple queries, JOINs, subqueries, – and, of course, GROUP BY . We also offer other practice sets, including:

Now that you know what SQL practice resources are available, let’s dig into the GROUP BY clause. If you don’t know anything about GROUP BY , start by reading GROUP BY in SQL Explained and 5 Examples of GROUP BY .

The GROUP BY Clause

In these GROUP BY practice exercises, we’ll be using a table called games that stores information about video games. We need to emphasize the fact that although the names of the games are real, the other fields of the table contain data that’s completely made up. Below is a partial view of the table:

titlecompanytypeproduction_yearsystemproduction_costrevenuerating
Blasting BoxesSimone Gamesaction adventure1998PC1000002000007
Run Run Run!13 Mad Bitsshooter2011PS335000006500003
Duck n’Go13 Mad Bitsshooter2012Xbox300000015000005
SQL Wars!Vertabelowargames2017Xbox50000002500000010
Tap Tap Hex!PixelGaming Inc.rhythm2006PS2250000035000007
NoRiskSimone Gamesaction adventure2004PS2140000034000008

Let’s briefly review the GROUP BY clause. This clause allows us to create groups of records and calculate different metrics for each group (such as the average, minimum, or maximum values in each set). Below is a simple query that uses GROUP BY to calculate the number of games produced by each company:

SELECT company, COUNT(*) FROM games GROUP BY company;

This query tells the database to create groups of rows from the table games that have the same value in the company column. (In the image below, the rows with the same color are in the same group, i.e. they are made by the same company). Then the function COUNT(*) is used to count the quantity of rows in each group; this returns the number of games produced by each company.

Here’s the table with the color-coded rows. Every row with the same color belongs to the same group:

titlecompanytypeproduction_yearsystemproduction_costrevenuerating
Blasting BoxesSimone Gamesaction adventure1998PC1000002000007
Run Run Run!13 Mad Bitsshooter2011PS335000006500003
Duck n’Go13 Mad Bitsshooter2012Xbox300000015000005
SQL Wars!Vertabelowargames2017Xbox50000002500000010
Tap Tap Hex!PixelGaming Inc.rhythm2006PS2250000035000007
NoRiskSimone Gamesaction adventure2004PS2140000034000008

And here’s the result:

CompanyCOUNT(*)
Simone Games2
13 Mad Bits2
Vertabelo1
PixelGaming Inc.1

Exercise 1: Total Revenue for Each Company

Exercise:

Obtain the name and total revenue for each company.

Solution:

SELECT company, SUM(revenue) FROM games GROUP BY company;

Explanation:

In the query, we use the GROUP BY company clause to create groups of rows with the same value in the company column. Then, the function SUM(revenue) is executed for each group of rows and the result is shown along with the company name.

Each row of the result is associated with a group of rows. This is an important point when using GROUP BY : the individual row details are collapsed into one row per group, and the data returned is about the group of rows.

Before going to the next exercise, I would like to recommend the course SQL Monthly Practice, where you can find an additional source of SQL practice exercises. We release a new SQL practice course every month.

Exercise 2: Games Produced per Year with Average Revenue and Cost

Exercise:

Generate a report with the production year and the number of games released this year (named count), the average of production cost for all games produced in this year (named avg_cost ) and the average revenue for that year (named avg_revenue ).

Solution:

SELECT production_year, COUNT(*) AS count, AVG(production_cost) AS avg_cost, AVG(revenue) AS avg_revenue FROM games GROUP BY production_year;

Explanation:

This exercise is similar to the previous one. Here, we use aggregate functions COUNT() and AVG() to calculate the metrics. Moreover, we rename the columns in the report to appropriately describe their content ( as count, avg_cost and avg_revenue ). To rename a column, we use the AS clause followed with the name to assign to the column, e.g.:

AVG(production_cost) AS avg_cost

Exercise 3: Number of Profitable Games of Each Game Type

Exercise:

Count how many games of a given type are profitable (i.e. the revenue was greater than the production cost). Show the game type and the number of profitable games (named number_of_games ) for each type.

Solution:

SELECT type, COUNT(*) AS number_of_games FROM games WHERE revenue > production_cost GROUP BY type;

Explanation:

In this exercise, the WHERE condition is executed first. It determines which games are profitable (have greater revenue than production cost). After that, GROUP BY groups rows (games) of the same type . Finally the COUNT(*) function is applied to each group of rows to obtain the number of profitable games of each type.

Exercise 4: Total Revenue per Game Type in PS2 and PS3 Systems

Exercise:

Obtain the type of games and the total revenue generated for games with a production_year after 2010 and with a PS2 or PS3 system . Order the result so the types with the highest revenue come first.

Solution:

SELECT type, SUM(revenue) AS total_revenue FROM games WHERE production_year > 2010 AND systems IN ('PS2','PS3') GROUP BY type;

Explanation:

Like the previous exercise, we first filter the rows; this time, the conditions are production_year > 2010 and system IN (‘PS2’, ‘PS3’) . Once we filtered the rows, we proceeded to create groups of rows using the GROUP BY type clause. Finally, we apply the function SUM(revenue) to each group of rows to obtain the total revenue in this group.

Exercise 5: Total Gross Profit per Company

Exercise:

For all companies present in the table, obtain their names and the sum of gross profit over all years. (Assume that gross profit = revenue - cost of production). Name this column gross_profit_sum . Order the results by gross profit, in descending order.

Solution:

SELECT company, SUM(revenue - production_cost) AS gross_profit_sum FROM games GROUP BY company ORDER BY gross_profit_sum DESC;

Explanation:

In the query, we use the expression revenue - production_cost to calculate the gross profit for each game. If we create groups using the company column and then sum the gross profit of all games by that company, we obtain the company’s global gross profit.

Notice the ORDER BY gross_profit_sum DESC clause. The DESC keyword indicates descending order; in other words, the highest value of gross_profit_sum appears first in the result.

Exercise 6: Yearly Gross Profit per Company

Exercise:

Obtain the yearly gross profit of each company. In other words, we want a report with the company name, the year, and the gross profit for that year. Order the report by company name and year.

Solution:

SELECT company, production_year, SUM(revenue - production_cost) AS gross_profit_sum FROM games GROUP BY company, production_year ORDER BY company, production_year;

Explanation:

Since we want to obtain gross profit broken down by year, we need to use the GROUP BY company, production_year clause . This creates a group of rows for each pair of company and production_year values. For each group, we then calculate the expression SUM(revenue - production_cost) ; this is the gross profit.

Exercise 7: Difference Between Yearly Gross Profits

Exercise:

Generate report to show the yearly gross profit for each company, the gross profit of the previous year, and the difference between both years. I suggest using the previous query as a starting point.e

Solution:

WITH company_gross_profit AS ( SELECT company, production_year AS year, SUM(revenue - production_cost) AS gross_profit FROM games GROUP BY company, production_year ORDER BY company, production_year ) SELECT cur.company, cur.year, cur.gross_profit, prev.gross_profit, cur.gross_profit - prev.gross_profit AS profit_delta FROM company_gross_profit AS cur, LEFT JOIN company_gross_profit AS prev ON cur.company = prev.company AND cur.year = prev.year + 1;

Explanation:

This query is based on a CTE called company_gross_profit , which is created using the WITH clause in the first part of the query. After that, the main query refers to the CTE table company_gross_profit two times in a self-join; the ‘left’ or ‘first’ table is referred as cur (current) and the other is referred as prev (previous). Then we join two rows of the same company but from contiguous years. The condition to do that is:

cur.company = prev.company AND cur.year = prev.year + 1

Finally, the column profit_delta is a calculated field. It’s obtained using the difference between the current year’s gross profit and the previous year’s gross profit:

cur.gross_profit - prev.gross_profit AS profit_delta

CTEs, or common table expressions, are an advanced feature of SQL. I recommend the articles A Guide to SQL Common Table Expressions, SQL CTEs Explained with Examples, and How to Learn SQL Common Table Expressions (CTEs) if you want to learn more about CTEs.

Exercise 8: Companies Producing More Than One Game

Exercise:

For each company, select its name, the number of games it’s produced (as the number_of_games column), and the average cost of production (as the avg_cost column). Show only companies producing more than one game.

Solution:

SELECT company, COUNT(company) AS number_of_games, AVG(production_cost) AS avg_cost FROM games GROUP BY company HAVING COUNT(*) > 1;

Explanation:

In this query, we use the HAVING clause instead of the WHERE clause. While WHERE filters individual records, HAVING is used to apply filters to groups of records.

In this query, we want a report showing only companies that produced more than one game. After we GROUP BY company, we can use the condition COUNT(*) > 1 to discard companies having only one game.

In SQL, you can only directly use conditions involving aggregate functions in the HAVING clause; in the WHERE clause, you’d have to use a subquery. This is because WHERE is processed before aggregate functions in the SQL order of operations.

If you’re uncertain about the HAVING clause, I recommend our article The SQL HAVING Clause Explained, where we discuss how to use this clause in detail.

Exercise 9: Companies Producing ‘Good’ Games with 4M+ Revenue

Exercise:

We are interested in good games produced between 2000 and 2009. A good game is a profitable game with a rating higher than 6. For each company, show the company name, its total revenue from good games produced between 2000 and 2009 (as the revenue_sum column), and the number of good games it produced in this period (as the number_of_games column). Only show companies with good-game revenue over 4 000 000.

Solution:

SELECT company, COUNT(company) AS number_of_games, SUM(revenue) AS revenue_sum FROM games WHERE production_year BETWEEN 2000 AND 2009 AND rating > 6 AND revenue - production_cost > 0 GROUP BY company HAVING SUM(revenue) > 4000000;

Explanation:

In the query, we use GROUP BY company because the metrics we want to obtain ( number_of_games and revenue_sum ) are at the company level. The interesting point in this query are the filters we apply to data. Some of them are at row level and must be in the WHERE clause:

However there is another filter at the group level, which must be put in the HAVING column:

Exercise 10: Leader Companies By Game Type

Exercise:

Return a list of those companies and types of games where the company is a market leader. A market leader for a game type is a company that has a total revenue for that type of game that exceeds the total revenue of all other companies for that type of game.

Show the company name, the type of game, and that company’s total revenue for that type of game.

Solution:

SELECT company, type, SUM(revenue) as total_revenue FROM games g1 GROUP BY company, type HAVING SUM(revenue) > ( SELECT SUM(revenue) FROM games g2 WHERE g2.type = g1.type AND g2.company <> g1.company );

Explanation:

First of all, note the clause GROUP BY company, type . We use it because we want the company’s revenue for all the games of the same type.

However, the most interesting point is the use of a subquery in the HAVING clause. We want companies with a sum of revenue greater than the total revenue of the rest of the companies; we use a subquery to obtain the total revenue of the other companies..

The subquery references the g1.company and g1.type columns; these are the columns company and type in the outer query. These references in the subquery allows us to obtain the total revenue for the rest of the companies for the same type of game.

Exercise 11: Continue to Practice Your SQL skills

We’ve covered ten GROUP BY exercises with different levels of complexity, from simple aggregates to CTEs and subqueries. Learning GROUP BY using practice exercises is one of the best approaches to master this clause. The article TOP 9 SQL GROUP BY Interview Questions walks you through the most common questions asked during a SQL job interview; this is also a great way to practice if you haven’t got an interview coming up.

Our All Forever SQL Package is another way to practice SQL – and a very comprehensive one! This bundle contains all our 70+ current SQL courses in four different SQL dialects (and access to the future courses we create). All courses on our platform are practical and interactive. So, choose your practice path and start being awesome at using GROUP BY !