How To Delete Duplicate Rows From A Table In SQL Server

Onkar Sharma
1y
2.1k
0
12

Article

DeleteDuplicateRowsfromatableInSQLServer.zip

In this article, I am going to explain various methods to delete duplicate rows from a table in SQL Server. This detailed article will cover the following topics as follows.

Introduction
Various methods to remove duplicate rows from a table in SQL Server
Points to Remember
Conclusion

First of all, we are going to create a new database using the SQL Server. You can still do this step if you already have an existing database.

Create a new Database

The following SQL query creates a new database and a table. Copy this query and execute it in Query Explorer or the command line.

-- Execute the following query to create the database...
IF (DB_ID('OnkarSharma_DeleteDuplicateRows') IS NOT NULL)
BEGIN
    USE master
    PRINT 'Database exists'
    DROP DATABASE OnkarSharma_DeleteDuplicateRows
    PRINT 'Database Dropped...'
END
GO
CREATE DATABASE OnkarSharma_DeleteDuplicateRows
PRINT 'New Database ''OnkarSharma_DeleteDuplicateRows'' Created'
GO
USE [OnkarSharma_DeleteDuplicateRows]
GO
-- Employee Table
CREATE TABLE [dbo].[Employee] (
    EmployeeID INT IDENTITY(31100, 1),
    EmployerID BIGINT NOT NULL DEFAULT 228866,
    FirstName VARCHAR(50) NOT NULL,
    LastName VARCHAR(50) NOT NULL,
    Email VARCHAR(255) NOT NULL,
    DepartmentID VARCHAR(100) NOT NULL,
    Age INT NOT NULL,
    GrossSalary BIGINT NOT NULL,
    PerformanceBonus BIGINT,
    ContactNo VARCHAR(25),
    PRIMARY KEY (EmployeeID)
);

Next, you can insert data to the tables using the SQL INSERT statement or by adding data directly to the tables in SSMS.

Let's check our table using the following query.

To get the data from the "Employee" table, use the following query.

SELECT * FROM OnkarSharma_DeleteDuplicateRows..Employee

Employee

3 Ways to Delete Duplicate Rows From A Table In SQL Server

Here are 3 common methods that you can use to delete duplicate records from a table In SQL Server.

Method 1. Using GROUP BY and Having Clause.

In this method, the SQL GROUP BY clause is used to identify and remove duplicate rows from a table.

Syntax

DELETE FROM <Table_Name>
WHERE ID NOT IN
(
    SELECT MAX(ID) AS MaxRecordID
    FROM <Table_Name>
    GROUP BY column1, columns2, ...
);

Example

DELETE FROM [Employee]
    WHERE EmployeeID NOT IN
    (
        SELECT MAX(EmployeeID) AS MaxRecordID
        FROM [Employee]
        GROUP BY [EmployerID], [FirstName], [LastName], [Email], [DepartmentID], [Age], [GrossSalary], [PerformanceBonus], [ContactNo] 
    );

To verify the deletion, use the following query.

SELECT * FROM OnkarSharma_DeleteDuplicateRows..Employee

Method 2. Using CTE (Common Table Expression).

CTE (Common Table Expression) can also be used to remove duplicate rows from a table in SQL Server.

Syntax

WITH CTE AS (
    SELECT
        column1,
        column2,
        ...
        ROW_NUMBER() OVER (
            PARTITION BY column1, column2, ...
            ORDER BY column1, column2, ...
        ) AS RowNumber
    FROM
        <Table_Name>
)
DELETE FROM CTE
WHERE RowNumber > 1;

Example

WITH CTE AS
(
    SELECT
        [EmployeeID], 
        [EmployerID], 
        [FirstName], 
        [LastName], 
        [Email], 
        [DepartmentID], 
        [Age], 
        [GrossSalary], 
        [PerformanceBonus], 
        [ContactNo],
        ROW_NUMBER() OVER (
            PARTITION BY [EmployerID], [FirstName], [LastName], [Email], [DepartmentID], [Age], [GrossSalary], [PerformanceBonus], [ContactNo]
            ORDER BY [EmployerID], [FirstName], [LastName], [Email], [DepartmentID], [Age], [GrossSalary], [PerformanceBonus], [ContactNo]
        ) AS RowNumber
    FROM
        Employee
)
DELETE FROM CTE 
WHERE RowNumber > 1;

To verify the deletion, use the following query.

SELECT * FROM OnkarSharma_DeleteDuplicateRows..Employee

Method 3. Using Rank Function.

The RANK function with PARTITION BY can also be used to remove duplicate rows from a table in SQL Server.

Syntax

DELETE E
	FROM <Table_Name> E
	  INNER JOIN
	(
	 SELECT *, 
			RANK() OVER(PARTITION BY column1, column2, ...
			ORDER BY ID) rank
	 FROM <Table_Name>
	) T ON E.ID = t.ID
	WHERE rank > 1;

Example

DELETE E
FROM [Employee] E
INNER JOIN
(
    SELECT *, 
           RANK() OVER (
               PARTITION BY [EmployerID], [FirstName], [LastName], [Email], [DepartmentID], [Age], [GrossSalary], [PerformanceBonus], [ContactNo]
               ORDER BY [EmployeeID]
           ) AS rank
    FROM [Employee]
) T ON E.[EmployeeID] = T.[EmployeeID]
WHERE rank > 1;

To verify the deletion, use the following query.

SELECT *
FROM OnkarSharma_DeleteDuplicateRows..Employee;

Example

Points to Remember

I would recommend you follow the points given below before deleting any type of record.

Back up your data.
Be sure to test your DELETE query with a SELECT statement.
Choose an effective method as per the requirement to remove duplicate rows.

See you in the next article, till then, take care and be happy learning.

You may also visit my other articles on SQL Server.

You can connect with me @.

Conclusion

In this article, we have discussed various methods to delete duplicate rows from a table in SQL Server

I hope you enjoyed this article. Follow C# Corner to learn more new and amazing things about SQL Server.

Thanks for reading.