Removing duplicates in Excel

Duplication management in Excel Spreadsheets

Finding duplicate information in large amounts of data can be difficult and time consuming if you try and check each and every piece of data by hand. Fortunately Excel gives you several tools to find and remove them, depending on whether you want a one-off clean, a filtered list of unique values or a dynamic solution that updates as data changes.

Remove duplicates from list

The best way to permanently remove duplicate rows from a table or range is to use the in-built Excel function.

1. Select your data (include headers if you want).

2. Go to the Data Tab and then select Remove Duplicates.

        

3. In the dialog box tick My data has headers if you included it in your selection.

4. Choose the columns to check for duplicates:

  • All columns: rows must match exactly to another row to be considered a duplicate.
  • Specific columns: Only the data in those columns must be duplicated for the row to be removed. Therefore if you only select one column to check for duplicates, even if the other columns are not duplicated, the row will be removed.

5. Click OK. Excel deletes dupes and reports how many were removed.

In the following example, only rows which match entirely have been removed. Therefore only the last row has been removed from the original as BOTH columns matched the row at the top.

In this example we asked for the dedupe to occur only against the Name so the last two instances of “Tom” were removed.

Tips

  • This keeps the first occurrence and removes subsequent ones.
  • It’s case-insensitive (“ABC” equals “abc”).

Keep originals intact

If you want to keep the original list and want to produce another deduped list of data then do the following:

  1. Select the data. You must include a Header row for this to work.
  2. From the Data tab select Advanced (in the Sort & Filter group).

3. From the form which appears, select “Copy to another location”

4. In the “Copy To” box enter the location you want to put the new deduped list making sure the area you want to data to appear in does not already have data in it.

5. Make sure the “Unique records only” box is ticked


However it is generally easier to copy the table of data you want to dedupe, and then perform dedupe method 1 on this table than use this method.

Dynamic function (Microsoft 365)

If you have Excel 365 then the “UNIQUE” function can provide a unique list. Simply enter a formula using the UNIQUE function. Note that because it is a dynamic function, it will change size depending if the original list values change. So make sure nothing is below where you write this formula

The formula is created using the following parameters

=UNIQUE(array, [by_cols], [exactly_once])

Eg “=UNIQUE(I12:J16, FALSE, FALSE)”

The “array” section is the range of data you want to make unique

The [by_cols] part indicates if you want to filter by column (TRUE) or rows (FALSE). Rows is the default and probably what you will use most often.

If [exactly_once] is set to FALSE then duplicates will be removed. If it is set to TRUE then the resulting table will only show data which appears once. In other words ALL duplicates will be removed and only rows (or columns) which do not have any duplicates will remain.

Because it’s formula-driven, your list updates automatically as data changes—great for dashboards.

Highlight Duplicates

If you want to see duplicates before deleting:

1. Select the range.

2. From the “Home” tab select conditional formatting, then Highlight Cell Rules and then Duplicate values.

      

3. Choose a format and click OK

PivotTable

PivotTables naturally collapse duplicates and can be a quick way to remove duplicates from different combinations of column

1. Select your data

2. From the Insert menu select Pivot Table.

3. Fill in the form which appears (eg choose if you want the pivot table on the same sheet or another sheet)

4. In the Pivot table put the field(s) you want to dedupe into Rows only.text

Common pitfalls & clean-up tips

  • Hidden spaces / non-printing characters: Use the =TRIM() and =CLEAN() functions before deduping. A trailing space makes two entries look identical but they will be treated as distinct and will both appear in the output.
  • Mixed case expectations: Excel’s native tools are case-insensitive. If you need case-sensitive uniqueness then you can use a formula like =SUMPRODUCT(--EXACT(A2,$A$2:$A$100)) to flag true duplicates. Add this formula to a column in the table of data and include it when deduping.
  • Merged cells: Avoid them; they break sorting and removing duplicates. Unmerge first (Home → Merge & Center → Unmerge).
  • Data types: Standardize formats (dates, numbers vs. text) so that equivalent values compare equal.
  • Backups: For one-time deletions, duplicate the sheet or save a copy first.

Conclusion

There are several different ways available to achieve the same dedupe result. As most of them are fairly simple to use, a lot of the time it comes down to personal preference to which one you use.

The contact form requires that you configure reCAPTCHA in the site configuration.

The contact form requires that you select an email template.