In this article, I am going to explain various methods to delete duplicate rows from a table in SQL Server. This detailed article will cover the following topics as follows.
- Introduction
- Various methods to remove duplicate rows from a table in SQL Server
- Points to Remember
- Conclusion
First of all, we are going to create a new database using the SQL Server. You can still do this step if you already have an existing database.
Create a new Database
The following SQL query creates a new database and a table. Copy this query and execute it in Query Explorer or the command line.
-- Execute the following query to create the database...
IF (DB_ID('OnkarSharma_DeleteDuplicateRows') IS NOT NULL)
BEGIN
USE master
PRINT 'Database exists'
DROP DATABASE OnkarSharma_DeleteDuplicateRows
PRINT 'Database Dropped...'
END
GO
CREATE DATABASE OnkarSharma_DeleteDuplicateRows
PRINT 'New Database ''OnkarSharma_DeleteDuplicateRows'' Created'
GO
USE [OnkarSharma_DeleteDuplicateRows]
GO
-- Employee Table
CREATE TABLE [dbo].[Employee] (
EmployeeID INT IDENTITY(31100, 1),
EmployerID BIGINT NOT NULL DEFAULT 228866,
FirstName VARCHAR(50) NOT NULL,
LastName VARCHAR(50) NOT NULL,
Email VARCHAR(255) NOT NULL,
DepartmentID VARCHAR(100) NOT NULL,
Age INT NOT NULL,
GrossSalary BIGINT NOT NULL,
PerformanceBonus BIGINT,
ContactNo VARCHAR(25),
PRIMARY KEY (EmployeeID)
);
Next, you can insert data to the tables using the SQL INSERT statement or by adding data directly to the tables in SSMS.
Let's check our table using the following query.
To get the data from the "Employee" table, use the following query.
SELECT * FROM OnkarSharma_DeleteDuplicateRows..Employee
3 Ways to Delete Duplicate Rows From A Table In SQL Server
Here are 3 common methods that you can use to delete duplicate records from a table In SQL Server.
Method 1. Using GROUP BY and Having Clause.
In this method, the SQL GROUP BY clause is used to identify and remove duplicate rows from a table.
Syntax
DELETE FROM <Table_Name>
WHERE ID NOT IN
(
SELECT MAX(ID) AS MaxRecordID
FROM <Table_Name>
GROUP BY column1, columns2, ...
);
Example
DELETE FROM [Employee]
WHERE EmployeeID NOT IN
(
SELECT MAX(EmployeeID) AS MaxRecordID
FROM [Employee]
GROUP BY [EmployerID], [FirstName], [LastName], [Email], [DepartmentID], [Age], [GrossSalary], [PerformanceBonus], [ContactNo]
);
To verify the deletion, use the following query.
SELECT * FROM OnkarSharma_DeleteDuplicateRows..Employee
Method 2. Using CTE (Common Table Expression).
CTE (Common Table Expression) can also be used to remove duplicate rows from a table in SQL Server.
Syntax
WITH CTE AS (
SELECT
column1,
column2,
...
ROW_NUMBER() OVER (
PARTITION BY column1, column2, ...
ORDER BY column1, column2, ...
) AS RowNumber
FROM
<Table_Name>
)
DELETE FROM CTE
WHERE RowNumber > 1;
Example
WITH CTE AS
(
SELECT
[EmployeeID],
[EmployerID],
[FirstName],
[LastName],
[Email],
[DepartmentID],
[Age],
[GrossSalary],
[PerformanceBonus],
[ContactNo],
ROW_NUMBER() OVER (
PARTITION BY [EmployerID], [FirstName], [LastName], [Email], [DepartmentID], [Age], [GrossSalary], [PerformanceBonus], [ContactNo]
ORDER BY [EmployerID], [FirstName], [LastName], [Email], [DepartmentID], [Age], [GrossSalary], [PerformanceBonus], [ContactNo]
) AS RowNumber
FROM
Employee
)
DELETE FROM CTE
WHERE RowNumber > 1;
To verify the deletion, use the following query.
SELECT * FROM OnkarSharma_DeleteDuplicateRows..Employee
Method 3. Using Rank Function.
The RANK function with PARTITION BY can also be used to remove duplicate rows from a table in SQL Server.
Syntax
DELETE E
FROM <Table_Name> E
INNER JOIN
(
SELECT *,
RANK() OVER(PARTITION BY column1, column2, ...
ORDER BY ID) rank
FROM <Table_Name>
) T ON E.ID = t.ID
WHERE rank > 1;
Example
DELETE E
FROM [Employee] E
INNER JOIN
(
SELECT *,
RANK() OVER (
PARTITION BY [EmployerID], [FirstName], [LastName], [Email], [DepartmentID], [Age], [GrossSalary], [PerformanceBonus], [ContactNo]
ORDER BY [EmployeeID]
) AS rank
FROM [Employee]
) T ON E.[EmployeeID] = T.[EmployeeID]
WHERE rank > 1;
To verify the deletion, use the following query.
SELECT *
FROM OnkarSharma_DeleteDuplicateRows..Employee;
Points to Remember
I would recommend you follow the points given below before deleting any type of record.
- Back up your data.
- Be sure to test your DELETE query with a SELECT statement.
- Choose an effective method as per the requirement to remove duplicate rows.
See you in the next article, till then, take care and be happy learning.
You may also visit my other articles on SQL Server.
You can connect with me @.
Conclusion
In this article, we have discussed various methods to delete duplicate rows from a table in SQL Server
I hope you enjoyed this article. Follow C# Corner to learn more new and amazing things about SQL Server.
Thanks for reading.