SQL Server tables should never contain duplicate rows, if there is a key
constraints. Table with no key constraints allow the duplicates. It happens in
many ways such as import data from other resources.
Nothing to worry, write a query and make your database without duplicates.
First we need a table and some data to explain the process. Use the following
script to create a table and insert a record.
CREATE TABLE Dup_Students (ID INT, FirstName varchar(25), Department Char(2))
CREATE
TABLE Dup_Students (ID
INT, FirstName
varchar(25),
Department Char(2))
INSERT
INTO Dup_Students
VALUES(1,
'Jack',
'IT')
INSERT
INTO Dup_Students
VALUES(2,
'Alice',
'ME')
INSERT
INTO Dup_Students
VALUES(3,
'James',
'EE')
INSERT
INTO Dup_Students
VALUES(4,
'Nickle',
'CE')
INSERT
INTO Dup_Students
VALUES(5,
'George',
'IT')
--
INSERT
INTO Dup_Students
VALUES(1,
'Jack',
'IT')
INSERT
INTO Dup_Students
VALUES(2,
'Alice',
'ME')
INSERT
INTO Dup_Students
VALUES(3,
'James',
'EE')
INSERT
INTO Dup_Students
VALUES(4,
'Nickle',
'CE')
--
INSERT
INTO Dup_Students
VALUES(1,
'Jack',
'IT')
INSERT
INTO Dup_Students
VALUES(2,
'Alice',
'ME')
Now you can view the number of duplicate records in each row.
SELECT
ID, FirstName,
Department, Count(*)
as DuplicateCount
From
Dup_Students
group
by ID,FirstName,
Department
Before deleting you can view the records with row number for the duplicates.
This result set is going to act as a temp table for the delete process.
SELECT
ID, FirstName,
Department,
ROW_NUMBER() OVER(PARTITION
BY ID, FirstName
ORDER BY ID)
AS DuplicateRowCount
FROM
Dup_Students
We are going to delete all the duplicates using CTE (Common table expression)
and ROW_NUMBER(), which is a new in SQL server 2005.
WITH
Dup_Students_CTE (ID,
FirstName, Department,
DuplicateRowCount)
AS
(
SELECT
ID, FirstName,
Department,
ROW_NUMBER() OVER(PARTITION
BY ID, FirstName
ORDER BY ID)
AS DuplicateRowCount
FROM
Dup_Students
)
DELETE
FROM Dup_Students_CTE
WHERE
DuplicateRowCount > 1
GO