Interesting Query (1) --- Find Second Highest Salary

George
Jul 19
12.5k
0
9

Article

Extend the article: SQL - Interesting Queries as a series of articles (12/22/2022):

Introduction

SQL, the Structural Query Language, seems simple, but often with some tricks in them. This article will discuss some interesting, and maybe complex queries, and list the regular ones as references at the bottom.

The content of this article:

Introduction
Questions
- Question 1
- Question 2
Setup Table and Initial Data
Answers
1. for Question 1
  1. Answer 1-1
  2. Answer 1-2
  3. Answer 2-3
  4. Answer 1-4
2. for Question 2
Summary

Questions

Question 1
- Query to find Second Highest Salary of Employee?
Question 2
- Find and Remove Duplicate Rows from a Database Table?

Set up Table and initial Data

Set up Test Database Environment

USE [TestDB]  
GO  
/****** Object: Table [dbo].[Employees] Script Date: 5/28/2021 7:42:51 PM ******/  
SET ANSI_NULLS ON  
GO  
SET QUOTED_IDENTIFIER ON  
GO  
CREATE TABLE [dbo].[Employees](  
[Id] [int] IDENTITY(1,1) NOT NULL,  
[Name] [nvarchar](max) NOT NULL,  
[Salary] [int] NOT NULL,  
[ManagerId] [int] NOT NULL,  
CONSTRAINT [PK_dbo.Employees] PRIMARY KEY CLUSTERED  
(  
   [Id] ASC  
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]  
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]  
GO

Insert initial Data

insert dbo.Employees  
Values ('Greg', 100000, 1),  
       ('George', 150000, 1),  
       ('Helen', 130000, 1),  
       ('Tom', 120000, 2),  
       ('Kevin', 110000, 2),  
       ('David', 120000, 3),  
       ('Geek', 110000,3),  
       ('Tesla', 120000,3),
       ('David', 120000,3),
       ('Kevin', 110000,2)

The result will be,

Ansers

There are two questions.

Answer for Question 1

Query to find Second Highest Salary of Employee?

This is a very popular SQL question, one can find out it on different sites with multiple solutions.

Answer 1-1

SELECT MAX(salary) [Second Highest Salary]  
FROM employees  
WHERE salary NOT IN (SELECT MAX(salary) FROM employees)

Result

This solution is most effiecient and is the best solution. It is short and easy to understand.

We use the SUBQUERY in the WHERE Clause. The solution is straight forward,

SubQuery (Inner Query in the graph) SELECT the MAX salary from the employees table.
Main Query (Outer Query in the graph) SELECT the MAX salary from the employees table without the MAX salary.

In this article, How to find Second Highest salary in SQL?, it gives more than 10 ways for this question, but we are not interested in getting the same solution with more complex logic.

There ARE some equivalent solutions to what we have in Answer 1, and some with advantages for the extended question such as find Third (Nth) Highest Salary of Employee?

Answer 1-2,

SELECT MAX(salary) [Second Highest Salary]  
FROM employees  
WHERE salary < (SELECT MAX(salary) FROM employees)

This is exactly equivalent to Answer 1.

Answer 1-3,

SELECT MAX(salary) [Second Highest Salary]  
FROM employees  
WHERE salary NOT IN (SELECT DISTINCT TOP 1 salary   
                     FROM employees  
                     ORDER BY salary DESC)

The logic of this solution is similar to Answer 1, but SubQuery uses SELECT TOP Clause. The advantage for this is that if we replace 1 in SubQuery by 2, we can get the solution for finding the Third Highest Salary of Employee, such as

SELECT MAX(salary)  
FROM employees  
WHERE salary NOT IN (SELECT DISTINCT TOP 2 salary   
                     FROM employees  
                     ORDER BY salary DESC)

For which, if we try to use Answer 1 to get the solution, it is possible, but a bit complex,

SELECT MAX(salary) [Second Highest Salary]  
FROM employees  
WHERE salary NOT IN (  
                    SELECT MAX(salary)   
                    FROM employees  
                    )  
AND salary NOT IN (  
                    SELECT MAX(salary)   
                    FROM employees  
                    WHERE salary NOT IN (  
                                        SELECT MAX(salary)   
                                        FROM employees  
                                        )  
                    )

where,

NOT IN the first SubQuery --- get rid of the top max value as Answer 1 did, and
the second SubQuery is just the Answer 1 itself
NOT IN the second SubQuery --- get rid of the second max value

Therefore, we got the third-highest salary, however, with quite complex logic --- this is what we do not like.

Furthermore, if we want the Nth Highest Salary of Employee, it will be hard by Answer 1 or Answer 1-2, but we can easy to get it by Answer 1-3 by replacing 1 by N-1, then we have,

SELECT MAX(salary) [Nth Highest Salary]  
FROM employees  
WHERE salary NOT IN (SELECT DISTINCT TOP (N-1) salary   
                     FROM employees  
                     ORDER BY salary DESC)

Note (12/20/2022 added),

we can get the Nth Highest Salary of Employee, by the way below, that is equivalent to the way above (think about the logic):

SELECT MIN(salary) [Nth Highest Salary]  
FROM employees  
WHERE salary     IN (SELECT DISTINCT TOP N salary   
                     FROM employees  
                     ORDER BY salary DESC)

Answer 1-4,

SELECT DISTINCT salary [Second Highest Salary]   
FROM employees e1   
WHERE 2 = (SELECT COUNT(DISTINCT salary) FROM employees e2   
WHERE e1.salary <= e2.salary);

This is equivalent to Answer 1-3, and also can be used to get Nth Highest Salary of Employee. However, it is a bit hard to understand its logic. Let us discuss it step by step,

First, the Inner Query,

SELECT COUNT(DISTINCT salary) FROM employees

Return - 5, that is the category of salary:

SELECT salary FROM employees  
GROUP BY salary

Return

Then we add the condition into the Sub Query, and let it == 2,

2 = (SELECT COUNT(DISTINCT salary) FROM employees e2     
WHERE e1.salary <= e2.salary)

Satisfy the condition for e1.salary from the Major (Outer) Query must be the 130000, the second-highest salary as required.

Answer for Question 2

Find and Remove Duplicate Rows from a Database Table?

Finding Duplicate Rows is relatively easy to do while locating the positions for the Duplicate Rows, and therefore Removing them is a bit difficult.

Finding Duplicate Rows

SELECT name, salary, managerid, COUNT(salary) Records 
FROM employees  
GROUP BY name, salary, managerid  
HAVING COUNT(salary) > 1

The result,

We group all records except the identity or primary key that must be different for a different records.

Removing Duplicate Rows with an identity[ref]

We first need to locate the position of the duplicate records, which is easier when the table has an identity on it.

Use the following query to select the duplicate records,

SELECT *   
FROM employees   
WHERE Id NOT IN (SELECT MIN(Id)   
FROM employees  
GROUP BY name, salary, ManagerId)

Those are,

Then Delete them,

DELETE  
FROM employees   
WHERE Id NOT IN (SELECT MIN(Id)   
FROM employees  
GROUP BY name, salary, ManagerId)

Result

Records 9, 10 are deleted,

Removing Duplicate Rows withOUT an identity --- Oracle[ref]

For the Oracle database, it is easier because Oracle has a ROWID pseudo column for a table, then we can do the job as.

Locate the duplicate Records,

SELECT *  
FROM employees  
WHERE ROWID NOT IN (SELECT MIN(ROWID)  
FROM employees  
GROUP BY name, salary, ManagerId)

Delete the duplicate Records,

DELETE  
FROM employees  
WHERE ROWID NOT IN (SELECT MIN(ROWID)  
FROM employees  
GROUP BY name, salary, ManagerId)

Removing Duplicate Rows withOUT an identity --- SQL Server

We can use CTE (Common Table Expression) with ROW_NUMBER[ref] function.

Locate the duplicate Records[ref],

;WITH CTE AS  
(  
   SELECT name, salary,managerid,  
   ROW_NUMBER() OVER (PARTITION BY name, salary, managerid 
   Order by name) AS Rnum  
   FROM employees  
)  
SELECT * From CTE

Result

Delete the duplicate Records: Rnum > 1,

;WITH CTE AS    
(    
   SELECT name, salary,managerid,    
   ROW_NUMBER() OVER (PARTITION BY name, salary, managerid   
   Order by name) AS Rnum    
   FROM employees    
)    
--SELECT * From CTE    
DELETE FROM CTE WHERE Rnum <> 1

We can also do the job without using CTE[ref],

DELETE DUP    
FROM    
(    
    SELECT ROW_NUMBER() OVER (PARTITION BY name, salary, managerid   
    Order by name) AS Rnum    
    FROM employees    
) DUP    
WHERE DUP.Rnum > 1;

Summary

We choose two queries at this first time trying,

Query to find Second Highest Salary of Employee?
Find and Remove Duplicate Rows from a Database Table?

If I find some other interesting Query, I will add them to this article.

References

Complex SQL Queries --- complexsql.com
Top 10 Frequently asked SQL Query Interview Questions --- java67.com
Complex Queries in SQL ( Oracle ) --- bullraider.com
Advanced SQL - Subqueries and Complex Joins --- ocw.mit.edu
41 Essential SQL Interview Questions --- toptal.com
50 SQL Query Questions and Answers for Practice --- techbeamers.com
Top 30 SQL Query Interview Questions You Must Practice In 2019 --- edureka.co
SQL Query Interview Questions and Answers --- artoftesting.com
Rowid vs Rownum vs Row_Number() in oracle --- interviewsansar.com
How to Remove Duplicates from a Table in SQL Server --- MS
T-SQL: Remove Duplicate Rows From A Table Using Query Without Common Table Expression --- MS
Selecting Only EVEN or ODD Records from Tables – MS SQL SERVER --- learnwithsundar.com
What are SQL Set Operators?| Intersect and Minus in SQL --- complexsql.com

Concept

Top 50 SQL Interview Questions and Answers (2021 Update) --- guru99.com
SQL Interview Questions --- interviewbit.com
Top 90 SQL Interview Questions and Answers (LATEST) --- softwaretestinghelp.com