# Answering Simple Questions in SQL Coding Interviews

### Requirements

Any database server will do. For this instance, MySQL is used.

SELECT @@version_comment AS Version,
@@version_compile_machine AS Version Compile Machine,
@@innodb_version AS InnoDB Version,
@@version_compile_os AS Version Compile OS;
Table 1: Server Info
VersionVersion Compile MachineInnoDB VersionVersion Compile OS
MySQL Community Server (GPL)x86_645.7.21Linux

### Introduction

In Interviewing ‘SQL Developers’ (and failling!), author pankswork frustratingly writes about having interviewers fail what seems to be very simple SQL questions at a rate of about 50%. “Is my test too hard,” they wonder.

When I read this post I immeidately wondered if they were trick questions I wasn’t picking up on. They all seemed very easy to answer. They were a little tricky, but certainly not unreasonably complicated. Indeed, most commenters to the Reddit post agreed.

I wanted to use the opportunity to demonstrate how to go about solving the interview problems, which are:

1. Select all data from both tables.

2. What team has the most wins?

3. What is the combined salary for each team?

4. What player does not belong to a team?

I slightly reworded some questions to avoid confusion but the results will still be the same, along with the methods of obtaining them.

### Database Setup

First, I’ll create the database and build the Players and Teams tables.

DROP DATABASE IF EXISTS PlayersTeams;
CREATE DATABASE PlayersTeams;
USE PlayersTeams;
CREATE TABLE Players (
PlayerID TINYINT(1)  UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
Salary   SMALLINT(4) UNSIGNED NOT NULL,
TeamID   TINYINT(1)  UNSIGNED NOT NULL
);
INSERT INTO Players
(Salary, TeamID)
VALUES
(1500, 1),
(1359, 1),
(1070, 1),
(1165, 3),
(1474, 2),
(1411, 1),
(1211, 2),
(1334, 1),
(1486, 4),
(1223, 2);
CREATE TABLE Teams (
TeamID   TINYINT(1)  UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
TeamName VARCHAR(10)          NOT NULL,
Wins     TINYINT(2)  UNSIGNED NOT NULL,
Losses   TINYINT(2)  UNSIGNED NOT NULL
);
INSERT INTO Teams
(TeamName, Wins, Losses)
VALUES
("Jets",   10,  4),
("Giants",  4, 10),
("Eagles",  7,  7);

### Select all data from both tables

What we are being asked to do here is effectly do a SELECT * from both tables which we join on some type of JOIN statement; but, which one? The dataset is small enough for us to easily see we should have 10 rows in final result set.

If we use a LEFT JOIN with Players as the left table, Teams as the right table and TeamID as the foreign key, then every record within Players will be returned along with associated Teams. If we have an instance of a player that does not have an associated team then that record will be NULL.

SELECT
*
FROM
Players
LEFT JOIN
Teams
ON
Teams.TeamID = Players.TeamID
ORDER BY
Players.PlayerID;
Table 2: 10 records
PlayerIDSalaryTeamIDTeamIDTeamNameWinsLosses
1150011Jets104
2135911Jets104
3107011Jets104
4116533Eagles77
5147422Giants410
6141111Jets104
7121122Giants410
8133411Jets104
914864NANANANA
10122322Giants410

If we were to use a RIGHT JOIN then our server would instead take all Teams.TeamID and find corresponding Players per id. However, we have a Player.TeamID value of 4 which does not exist in Teams.TeamID. Therefore, our result set would only contain nine records.

SELECT
*
FROM
Players
RIGHT JOIN
Teams
ON
Teams.TeamID = Players.TeamID
ORDER BY
Players.PlayerID;
Table 3: 9 records
PlayerIDSalaryTeamIDTeamIDTeamNameWinsLosses
1150011Jets104
2135911Jets104
3107011Jets104
4116533Eagles77
5147422Giants410
6141111Jets104
7121122Giants410
8133411Jets104
10122322Giants410

However, if we used RIGHT JOIN and reversed the order of our tables (make Teams our primary table, joining Players) then we would get the correct results.

SELECT
*
FROM
Teams
RIGHT JOIN
Players
ON
Players.TeamID = Teams.TeamID
ORDER BY
Players.PlayerID;
Table 4: 10 records
TeamIDTeamNameWinsLossesPlayerIDSalaryTeamID
1Jets104115001
1Jets104213591
1Jets104310701
3Eagles77411653
2Giants410514742
1Jets104614111
2Giants410712112
1Jets104813341
NANANANA914864
2Giants4101012232

Notice also how the column order of our result set changes. In the previous RIGHT JOIN, the associated columns from Teams were on the right side. Now, they are on the left. This is our visual cue of the query we created.

I suspect what’s happening here is those doing the interview get twisted up on the differences not recognizing they’re missing some data.

### What team has the most wins?

This question does not require us to do fancy joins. All of the data we need is contained within the Teams table. One way some may be going about this problem is simply using MAX().

SELECT
MAX(Wins)
FROM
Teams;
Table 5: 1 records
MAX(Wins)
10

Notice though our question asked “What team”, not “what is the maximum number of Wins”. We need to return the TeamName (or TeamID). But, you cannot just add this column into the SELECT statement without issue.

SELECT
TeamName,
MAX(Wins)
FROM
Teams;

This query will generate an error on most all newer database servers. In this case, what MySQL is telling us is that we either must remove TeamName from the query or move it to a GROUP BY statement. See MySQL Handling of GROUP BY for more info.

For our question, all of this is unnecessary anyway. We want to find the maximum Wins and return the corresponding TeamName. Just sort the dataset by Wins, largest to smallest, and return the first record.

SELECT
TeamName
FROM
Teams
ORDER BY
Wins DESC
LIMIT
1;
Table 6: 1 records
TeamName
Jets

### What is the combined salary for each team?

In the original post, this question was phrased as “How much does each team make?” The author acknowledged this was intended to be a trick question; the teams do not have a “profit” or “revenue”-type column.

However, every player does have a Salary and it is reasonable to assume the question is most likely looking to find the total salary of all players per team.

We can do the first part of this query very simply by summing Salary with a grouping on TeamID.

SELECT
SUM(Salary),
TeamID
FROM
Players
GROUP BY
TeamID;
Table 7: 4 records
SUM(Salary)TeamID
66741
39082
11653
14864

Realize we would have recevied the group-by-handling error again had I not used the GROUP BY statement; this applies for SUM as well.

At this point I now have the total Salary per team. And, yes, I can probably just leave it at that - question answered. But, what fun is that?

Let’s get the TeamName. To do this I LEFT JOIN Teams again by TeamID. I also modify my original SELECT statement to return TeamName rather than TeamID. Finally, I modify the GROUP BY clause to append the table name. Without this last correction my query will return an Error 1052 that TeamID is ambiguous.

SELECT
SUM(Salary),
Teams.TeamName
FROM
Players
LEFT JOIN
Teams
ON
Teams.TeamID = Players.TeamID
GROUP BY
Players.TeamID;
Table 8: 4 records
SUM(Salary)TeamName
6674Jets
3908Giants
1165Eagles
1486NA

### What player does not belong to a team?

This question, I think, is the easiest of all. Actually, we already answered it when solving our first question (2). Recall we had one record of NULL values; this was for a player in our Players table that did not have a corresponding team in the Teams table.

This was our query for those results:

SELECT
*
FROM
Players
LEFT JOIN
Teams
ON
Teams.TeamID = Players.TeamID
ORDER BY
Players.PlayerID;

What I need to do here is return only the records where Teams.TeamID IS NULL. And, since the question is asking for “player” and not any additional information, we should only return Players.PlayerID. Our final query is as follows:

SELECT
Players.PlayerID
FROM
Players
LEFT JOIN
Teams
ON
Teams.TeamID = Players.TeamID
WHERE
Teams.TeamID IS NULL
ORDER BY
Players.PlayerID;
Table 9: 1 records
PlayerID
9

### Summary

I think if 50% of people are failing this first step then they have seriously manipulated their resume or perhaps other screening methods need to be improved. I could understand if occasionally someone tripped up on one of these questions. But, not half of an interview class.

Most importantly for those doing code interviews, in my opinion, is to not just study code and how to perform certain tasks but also understand the errors generated and how to resolve them. When working on this article I made an error and it took me 30 minutes to figure out what it was; I had written “UNISIGNED”, rather than “UNSIGNED”.

People will make mistakes and there’s nothing wrong with that. The best we can do on these job interviews is to just try to prepare the best we can.