LeetCode 185: Department Top Three Salaries
June 25, 2022
Let's solve LeetCode 185, the Department Top Three Salaries problem, in FlipDB, our array-oriented, APL-based DSL. We will use FlipDB directly in the APL session.
We are given two tables, a department table and an employee table:
d←s.Get '/Databases/LeetCode185'
d.Display 0
── LeetCode185.Department
┌ID──┐ ┌Name───┐
↓1 │ ↓IT │
│2 │ │Sales │
└Int8┘ └Char(5)┘
── 2 rows by 2 columns
── LeetCode185.Employee ─────────────────────
┌ID──┐ ┌Name───┐ ┌Salary┐ ┌DepartmentID┐
↓1 │ ↓Joe │ ↓85,000│ ↓1 │
│2 │ │Henry │ │80,000│ │2 │
│3 │ │Sam │ │60,000│ │2 │
│4 │ │Max │ │90,000│ │1 │
│5 │ │Janet │ │69,000│ │1 │
│6 │ │Randy │ │85,000│ │1 │
│7 │ │Will │ │70,000│ │1 │
└Int8┘ └Char(5)┘ └Int32─┘ └Int8────────┘
── 7 rows by 4 columns ──────────────────────
The problem is:
A company's executives are interested in seeing who earns the most money in each of the company's departments. A high earner in a department is an employee who has a salary in the top three unique salaries for that department. Write an SQL query to find the employees who are high earners in each of the departments.
The Department table does not add anything to the problem. It simply gives a nice description for each department, and we can dispense with it.
We create a new query on the Employee table:
t←d.GetTable 'Employee'
q←t.Query ''
The task is simply to select some rows. There is no computation or aggregation required in the result, so the only thing we need to specify is a where clause, making use of the rankDown function:
q.Where←'3 > 1 rankDown by Salary (group DepartmentID)'
r←q.Execute 0
r.Display 0
── Key:ID ───────────────────────────────────
┌ID──┐ ┌Name───┐ ┌Salary┐ ┌DepartmentID┐
↓1 │ ↓Joe │ ↓85,000│ ↓1 │
│2 │ │Henry │ │80,000│ │2 │
│3 │ │Sam │ │60,000│ │2 │
│4 │ │Max │ │90,000│ │1 │
│6 │ │Randy │ │85,000│ │1 │
│7 │ │Will │ │70,000│ │1 │
└Int8┘ └Char(5)┘ └Int32─┘ └Int8────────┘
── 6 rows by 4 columns ──────────────────────
Note first that the rankDown
function is in ⎕IO←0
, so we are selecting salaries ranked 0, 1, or 2. Note further that rankDown
takes a left argument of 0 1 or 2, for rank, dense rank, and average rank, respectively, so in this case we are applying dense rank. Finally note that rankDown
is passed to the by
operator to apply it within each department.
We can pick apart this where clause expression and see how it works, right in the APL session:
disp←{⍵.Display 0}
Salary←85 80 60 90 69 85 70
Department←1 2 2 1 1 1 1
The group
function returns the indices for each unique value:
disp group Department
┌───────────┐
↓[0,3,4,5,6]│
│[1,2] │
└Int8───────┘
The rankDown
function, with a left argument of 1 for dense, applied to Salary directly:
disp 1 rankDown Salary
┌────┐
↓1 │
│2 │
│5 │
│0 │
│4 │
│1 │
│3 │
└Int8┘
And now applied via the by
operator, so that we rank within each group:
disp 1 rankDown by Salary (group Department)
┌────┐
↓1 │
│0 │
│1 │
│0 │
│3 │
│1 │
│2 │
└Int8┘
The by
operator groups the data and applies its operand function to each group, and then ungroups the data, restoring it to its original order. It is particularly useful with uniform functions like rankDown
, but it is also useful with aggregate and structural functions.
Finally we flag the top 3 ranks:
disp 3 gt 1 rankDown by Salary(group Department)
┌───────┐
↓1 │
│1 │
│1 │
│1 │
│0 │
│1 │
│1 │
└Boolean┘
Design Issue
There is a tradeoff when designing a DSL, between more functions with fewer arguments, and fewer functions with more arguments. In the case of rank, FlipDB currently has two functions, rankUp
and rankDown
, each taking three potential values as a left argument for an effective total combination of 6 functions:
rankUp
rankDown
denseRankUp
denseRankDown
averageRankUp
averageRankDown
We could of course have only one function, rank
, and provide yet another argument for the direction. The idea behind building in the ordering into the name was to make it similar to the related APL primitives grade up
and grade down
. Two functions seemed good, 6 functions seemed overkill at the time, but I'm reconsidering. It would be easier to read denseRankDown Salary
than 1 rankDown Salary
. The latter requires a trip to the documentation, while the former is discoverable with autocomplete.