Delete duplicate rows from a Table Using SQL

Introduction 

in this tutorial, you will learn how to delete duplicate rows from a table in SQL Server.


Suppose your SQL table contains duplicate rows and you want to remove those duplicate rows. 

The Problem:

For this demo, we need a sample table with few duplicate records. 

The Solution:

Let’s create a sample student table and insert a few records in it.


tblStudent

CREATE TABLE tblStudent
(
[ID] INT identity(1,1),
[FirstName] Varchar(100),
[LastName] Varchar(100),
[Country] Varchar(100),
)

Create Table SQL

Insert Data


Insert into tblStudent ([FirstName],[LastName],[Country] ) values('Sarafu','Sha','India');
Insert into tblStudent ([FirstName],[LastName],[Country] ) values ('Sarafu','Sha','India');
Insert into tblStudent ([FirstName],[LastName],[Country] ) values ('Raj','Kumar','NZ');
Insert into tblStudent ([FirstName],[LastName],[Country] ) values ('James','Smith','UK');
Insert into tblStudent ([FirstName],[LastName],[Country] ) values('James','Smith','UK');
Insert into tblStudent ([FirstName],[LastName],[Country] ) values ('James','Smith','UK');

Create Table SQL

List of Steps

Step 1: we use the SQL GROUP BY clause to identify the duplicate rows


SELECT
 [FirstName],
 [LastName],
 [Country],
 COUNT(*) AS Total_Rows

FROM [dbSample].[dbo].[tblStudent]
GROUP BY
 [FirstName],
 [LastName],
 [Country];

Step 2: get those records having occurrence greater than 1 in the Employee table


SELECT
  [FirstName],
  [LastName],
  [Country],
  COUNT(*) AS Total_Rows

FROM [dbSample].[dbo].[tblStudent]
GROUP BY
  [FirstName],
  [LastName],
  [Country]

HAVING COUNT(*) > 1;

Step 3: We use the SQL MAX function to calculate the max id of each data row.

SELECT * FROM [dbSample].[dbo].[tblStudent]
WHERE ID NOT IN
(
  SELECT MAX(ID)
  FROM [dbSample].[dbo].[tblStudent]
  GROUP BY
  [FirstName],
  [LastName],
  [Country]
);

Use Max function in SQL

Delete it: 

To remove this data, replace the first Select with the SQL delete statement.


DELETE FROM [dbSample].[dbo].[tblStudent]
WHERE ID NOT IN
(
  SELECT MAX(ID) AS MaxRecordID
  FROM [dbSample].[dbo].[tblStudent]
  GROUP BY [FirstName],
   [LastName],
  [Country]
);

Delete Duplicate records using max function

Leave a Reply