Publish Date - January 26th, 2023
Last Modified - March 7th, 2023
Joins, joins, joins: nothing but joins! Four tables need to be joined together and filtered for specific conditions, so really think about how you want to structure your joins. This one was not as hard as Occupations, and it’s easier than Placements. Overall, these queries are great ways to learn more complex querying in MySQL, SQL Server or Oracle.
The problem
Julia just finished conducting a coding contest, and she needs your help assembling the leaderboard! Write a query to print the respective hacker_id and name of hackers who achieved full scores for more than one challenge. Order your output in descending order by the total number of challenges in which the hacker earned a full score. If more than one hacker received full scores in same number of challenges, then sort them by ascending hacker_id.
Input Format
The following tables contain contest data:
- Hackers: The hacker_id is the id of the hacker, and name is the name of the hacker.
- Difficulty: The difficulty_level is the level of difficulty of the challenge, and score is the score of the challenge for the difficulty level.
- Challenges: The challenge_id is the id of the challenge, the hacker_id is the id of the hacker who created the challenge, and difficulty_level is the level of difficulty of the challenge.
- Submissions: The submission_id is the id of the submission, hacker_id is the id of the hacker who made the submission, challenge_id is the id of the challenge that the submission belongs to, and score is the score of the submission.
Sample Input
[Sample tables: Hackers, Difficulty, Challenges, Submissions]
Sample Output
90411 Joe
Explanation
Hacker 86870 got a score of 30 for challenge 71055 with a difficulty level of 2, so 86870 earned a full score for this challenge.
Hacker 90411 got a score of 30 for challenge 71055 with a difficulty level of 2, so 90411 earned a full score for this challenge.
Hacker 90411 got a score of 100 for challenge 66730 with a difficulty level of 6, so 90411 earned a full score for this challenge.
Only hacker 90411 managed to earn a full score for more than one challenge, so we print their hacker_id and name as space-separated values.
The solution
The solution is an interesting one because you need to make sure you’re joining the four tables on the right columns.
My query
/*
hacker_id, name of hackers
WHERE submission score = max(score) for the challenge's difficulty
HAVING COUNT of full-score challenges > 1
ORDER BY COUNT of full-score challenges DESC
if the COUNT is the same, sort by ASC hacker_id
t1 = Hackers
t2 = Submissions
t3 = Challenges
t4 = Difficulty
*/
SELECT t1.hacker_id, t1.name
FROM hackers AS t1
JOIN submissions AS t2 ON t1.hacker_id = t2.hacker_id
JOIN challenges AS t3 ON t3.challenge_id = t2.challenge_id
JOIN difficulty AS t4 ON t4.difficulty_level = t3.difficulty_level
WHERE t4.score = t2.score
GROUP BY t1.hacker_id, t1.name
HAVING COUNT(*) > 1
ORDER BY COUNT(*) DESC, t1.hacker_id ASC;
As always, I start with some pseudo code to outline everything.
/*
hacker_id, name of hackers
WHERE submission score = max(score) for the challenge's difficulty // 1
GROUP BY hacker, keep COUNT of full-score challenges > 1 // 2
ORDER BY that COUNT DESC, ties sorted by ASC hacker_id // 3
t1 = Hackers
t2 = Submissions
t3 = Challenges
t4 = Difficulty
*/
Note how I label my table prefixes (almost like variables) and sketch the logic for the WHERE, GROUP BY and ORDER BY clauses at the end (marked 1, 2, 3 respectively).
Next, here are the JOINs. You need to figure out a logical way to JOIN these tables together, since no single key is shared by all four of them.
SELECT t1.hacker_id, t1.name
FROM hackers AS t1
JOIN submissions AS t2 ON t1.hacker_id = t2.hacker_id
JOIN challenges AS t3 ON t3.challenge_id = t2.challenge_id
JOIN difficulty AS t4 ON t4.difficulty_level = t3.difficulty_level
With hackers as your main table:
- Submissions joins on hackers by hacker_id
- Challenges joins on submissions by challenge_id
- Difficulty joins on challenges by difficulty_level
By chaining the joins this way, every submission row carries the hacker’s name and the full score for its challenge’s difficulty level.
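If you want to see what that chain of joins produces without a MySQL server, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in. The rows are modelled on the Explanation section, but the name "Pat", the challenge creator ids, the submission ids and the 45-point submission are made up for illustration.

```python
import sqlite3

# Hypothetical data: only the hacker ids, challenge ids, difficulty levels,
# full scores and the name "Joe" come from the problem's explanation.
SETUP = """
CREATE TABLE hackers (hacker_id INTEGER, name TEXT);
CREATE TABLE difficulty (difficulty_level INTEGER, score INTEGER);
CREATE TABLE challenges (challenge_id INTEGER, hacker_id INTEGER, difficulty_level INTEGER);
CREATE TABLE submissions (submission_id INTEGER, hacker_id INTEGER, challenge_id INTEGER, score INTEGER);
INSERT INTO hackers VALUES (86870, 'Pat'), (90411, 'Joe');
INSERT INTO difficulty VALUES (2, 30), (6, 100);
INSERT INTO challenges VALUES (71055, 1, 2), (66730, 2, 6);
INSERT INTO submissions VALUES
    (1, 86870, 71055, 30),
    (2, 90411, 71055, 30),
    (3, 90411, 66730, 100),
    (4, 86870, 66730, 45);
"""

# The same three joins as in the post, with extra columns selected so the
# submitted score and the full score sit side by side.
JOINED = """
SELECT t1.hacker_id, t1.name, t2.challenge_id,
       t2.score AS submitted, t4.score AS full_score
FROM hackers AS t1
JOIN submissions AS t2 ON t1.hacker_id = t2.hacker_id
JOIN challenges AS t3 ON t3.challenge_id = t2.challenge_id
JOIN difficulty AS t4 ON t4.difficulty_level = t3.difficulty_level
ORDER BY t2.submission_id
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SETUP)
joined_rows = conn.execute(JOINED).fetchall()
for row in joined_rows:
    print(row)
```

Each of the four submissions comes back as one row, and the made-up 45-point submission is the only one where submitted and full_score differ.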
After that the crucial point is this:
WHERE t4.score = t2.score
GROUP BY t1.hacker_id, t1.name
HAVING COUNT(*) > 1
ORDER BY COUNT(*) DESC, t1.hacker_id ASC;
With those joins in place, you can ask MySQL to match the score in the submissions table against the score in the difficulty table. That keeps only the rows where a hacker earned a full score on a challenge, since t2.score equals the maximum score possible for that challenge’s difficulty level.
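To watch the WHERE filter on its own, here is a small sketch, again using Python's sqlite3 as a stand-in for MySQL. The data is hypothetical: it follows the Explanation section, with the name "Pat", the creator ids, the submission ids and one 45-point submission invented so that there is something to filter out.

```python
import sqlite3

# Hypothetical rows modelled on the problem's explanation (see lead-in).
SETUP = """
CREATE TABLE hackers (hacker_id INTEGER, name TEXT);
CREATE TABLE difficulty (difficulty_level INTEGER, score INTEGER);
CREATE TABLE challenges (challenge_id INTEGER, hacker_id INTEGER, difficulty_level INTEGER);
CREATE TABLE submissions (submission_id INTEGER, hacker_id INTEGER, challenge_id INTEGER, score INTEGER);
INSERT INTO hackers VALUES (86870, 'Pat'), (90411, 'Joe');
INSERT INTO difficulty VALUES (2, 30), (6, 100);
INSERT INTO challenges VALUES (71055, 1, 2), (66730, 2, 6);
INSERT INTO submissions VALUES
    (1, 86870, 71055, 30),
    (2, 90411, 71055, 30),
    (3, 90411, 66730, 100),
    (4, 86870, 66730, 45);
"""

# Joins from the post plus the WHERE clause; grouping is left out on purpose
# so the individual full-score submissions stay visible.
FULL_SCORES = """
SELECT t1.hacker_id, t2.challenge_id, t2.score
FROM hackers AS t1
JOIN submissions AS t2 ON t1.hacker_id = t2.hacker_id
JOIN challenges AS t3 ON t3.challenge_id = t2.challenge_id
JOIN difficulty AS t4 ON t4.difficulty_level = t3.difficulty_level
WHERE t4.score = t2.score
ORDER BY t2.submission_id
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SETUP)
full_score_rows = conn.execute(FULL_SCORES).fetchall()
print(full_score_rows)
```

The 45-point submission drops out, leaving the three full-score rows described in the problem's explanation.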
The rest of the query cleans up and organizes that data: GROUP BY collects the rows per hacker, the HAVING clause keeps only the hacker_ids and names that appear more than once, and ORDER BY sorts them by that count (descending) and then by hacker_id (ascending).
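Here is a sketch of the whole query running against a tiny hypothetical dataset in Python's sqlite3 (a stand-in for MySQL). It also tries a variant that is not part of my solution: COUNT(DISTINCT t2.challenge_id), which only matters if a hacker can submit a full score for the same challenge more than once. Whether that happens in HackerRank's data is an assumption on my part. The name "Pat", the creator ids, the submission ids, the 45-point submission and the repeat submission are all invented.

```python
import sqlite3

# Hypothetical rows modelled on the problem's explanation (see lead-in).
SETUP = """
CREATE TABLE hackers (hacker_id INTEGER, name TEXT);
CREATE TABLE difficulty (difficulty_level INTEGER, score INTEGER);
CREATE TABLE challenges (challenge_id INTEGER, hacker_id INTEGER, difficulty_level INTEGER);
CREATE TABLE submissions (submission_id INTEGER, hacker_id INTEGER, challenge_id INTEGER, score INTEGER);
INSERT INTO hackers VALUES (86870, 'Pat'), (90411, 'Joe');
INSERT INTO difficulty VALUES (2, 30), (6, 100);
INSERT INTO challenges VALUES (71055, 1, 2), (66730, 2, 6);
INSERT INTO submissions VALUES
    (1, 86870, 71055, 30),
    (2, 90411, 71055, 30),
    (3, 90411, 66730, 100),
    (4, 86870, 66730, 45);
"""

# The query from the post, unchanged apart from whitespace.
POST_QUERY = """
SELECT t1.hacker_id, t1.name
FROM hackers AS t1
JOIN submissions AS t2 ON t1.hacker_id = t2.hacker_id
JOIN challenges AS t3 ON t3.challenge_id = t2.challenge_id
JOIN difficulty AS t4 ON t4.difficulty_level = t3.difficulty_level
WHERE t4.score = t2.score
GROUP BY t1.hacker_id, t1.name
HAVING COUNT(*) > 1
ORDER BY COUNT(*) DESC, t1.hacker_id ASC
"""

# Variant (an assumption, not the post's query): count each challenge once.
DISTINCT_QUERY = POST_QUERY.replace("COUNT(*)", "COUNT(DISTINCT t2.challenge_id)")


def run(query, extra_submissions=()):
    """Build a fresh in-memory database, add optional rows, run the query."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SETUP)
    conn.executemany("INSERT INTO submissions VALUES (?, ?, ?, ?)", extra_submissions)
    return conn.execute(query).fetchall()


leaderboard = run(POST_QUERY)

# Invented case: hacker 86870 submits a second full score for challenge 71055.
repeat = [(5, 86870, 71055, 30)]
with_repeat = run(POST_QUERY, repeat)
with_repeat_distinct = run(DISTINCT_QUERY, repeat)

print(leaderboard)
print(with_repeat)
print(with_repeat_distinct)
```

On the base data only hacker 90411 comes back, matching the sample output. With the invented repeat submission, COUNT(*) also lets 86870 through (two full-score rows, one challenge), while the DISTINCT variant still returns only 90411. The tie between the two hackers in the middle result is broken by ascending hacker_id.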
Here’s the final query result:
27232 Phillip
28614 Willie
15719 Christina
43892 Roy
14246 David
14372 Michelle
18330 Lawrence
26133 Jacqueline
26253 John
30128 Brandon
35583 Norma
13944 Victor
17295 Elizabeth
19076 Matthew
26895 Evelyn
32172 Jonathan
41293 Robin
45386 Christina
45785 Jesse
49652 Christine
13391 Robin
14366 Donna
14777 Gerald
16259 Brandon
17762 Joseph
28275 Debra
36228 Nancy
37704 Keith
40226 Anna
49307 Brian
12539 Paul
14363 Joyce
14658 Stephanie
19448 Jesse
20504 John
20534 Martha
22196 Anthony
23678 Kimberly
28299 David
30721 Ann
32254 Dorothy
46205 Joyce
47641 Patricia
13122 James
13762 Gloria
14863 Walter
18690 Marilyn
18983 Lori
21212 Timothy
25732 Antonio
28250 Evelyn
30755 Emily
38852 Benjamin
42052 Andrew
44188 Diana
48984 Gregory
13380 Kelly
13523 Ralph
21463 Christine
24663 Louise
26243 Diana
26289 Dorothy
39277 Charles
23278 Paula
25184 Martin
32121 Dorothy
36322 Andrew
39782 Tammy
40257 James
41319 Jean
10857 Kevin
25238 Paul
34242 Marilyn
39771 Alan
49789 Lillian
57947 Justin
74413 Harry
Conclusion
If you’re working on a potato of a computer like I am, your query may take a long time to run (those JOINs and ORDER BY clauses are RAM killers). However, this was a nice change of pace from other MySQL courses on HackerRank, as it’s very similar to queries you may encounter in data science or on the job.