Find and remove duplicates

References
* [http://www.extendoffice.com/documents/excel/1499-count-duplicate-values-in-column.html How to count duplicate values in a column in Excel?] {{access | date = 2015-08-25}} Use {{kbd | key = COUNTIF(range, criteria)}} or '''Pivot Tables''' (樞紐分析表) to find values that occur two or more times.
==== Marking Duplicate Titles with Sequential Numbers ====
'''Practical Example:'''
{| class="wikitable"
|-
! Row !! Title !! Formula (Sequential) !! Formula Result (Sequential) !! Formula Result (Flag Only)
|-
| A2 || Duplicate Title || =COUNTIF($A$2:A2, A2)-1 || 0 || (blank)
|-
| A3 || Other Title || =COUNTIF($A$2:A3, A3)-1 || 0 || (blank)
|-
| A4 || Duplicate Title || =COUNTIF($A$2:A4, A4)-1 || 1 || 1
|-
| A5 || Duplicate Title || =COUNTIF($A$2:A5, A5)-1 || 2 || 1
|}
Example formula for numbering each occurrence of a duplicate (enter it in row 2 and copy it down):
<pre>
=COUNTIF($A$2:A2, A2)-1
</pre>
This formula marks duplicate entries with sequential numbers (0 for the first occurrence, 1 for the second, 2 for the third, and so on). Here is how it works:
* $A$2:A2: This is an expanding range that combines an absolute and a relative reference. The starting point ($A$2) is fixed by the absolute reference, while the ending point (A2) is a relative reference that moves down one row each time the formula is copied down ($A$2:A3, $A$2:A4, ...).
* COUNTIF($A$2:A2, A2): This counts how many times the value in the current row appears from the start of the range up to and including the current row. For a first occurrence in row 2, the range is a single cell ($A$2:A2), so it returns 1. For the second occurrence, the range has expanded (for example, to $A$2:A4 in the table above) and it returns 2.
* -1: Subtracting 1 converts the count to a zero-based index. First occurrence = 1-1 = 0, second occurrence = 2-1 = 1, third occurrence = 3-1 = 2.
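The running count can also be checked outside Excel. The following is a minimal Python sketch of the same idea, not a replacement for the formula; the function name <code>occurrence_index</code> and the sample list are hypothetical and chosen only for illustration. Note that it compares values exactly, whereas COUNTIF ignores letter case.

```python
from collections import Counter


def occurrence_index(values):
    """Return a zero-based occurrence number for each value.

    Mirrors the effect of =COUNTIF($A$2:A2, A2)-1 copied down a column.
    """
    seen = Counter()  # how many times each value has appeared so far
    result = []
    for value in values:
        result.append(seen[value])  # occurrences before the current row
        seen[value] += 1
    return result


# Same data as the table above (rows A2 to A5).
titles = ["Duplicate Title", "Other Title", "Duplicate Title", "Duplicate Title"]
print(occurrence_index(titles))  # [0, 0, 1, 2]
```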
'''Marking Only Duplicate Entries'''
Example formula for flagging duplicates only:
<pre>
=IF(COUNTIF($A$2:A2, A2)>1, 1, "")
</pre>
This formula marks only duplicate entries (the second occurrence onwards) with 1 and leaves first occurrences blank. Here is how it works:
* COUNTIF($A$2:A2, A2)>1: This checks whether the current value appears more than once in the expanding range from the start up to and including the current row.
* IF(..., 1, ""): If the condition is true (meaning this row is a duplicate), the formula returns 1. Otherwise, it returns an empty string (""), so the cell appears blank.
* Result: The first occurrence shows nothing; the second and subsequent occurrences show 1.
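As a cross-check of the flag-only logic, here is a minimal Python sketch under the same assumptions as the formula; the function name <code>flag_duplicates</code> and the sample data are hypothetical. It uses exact comparison, whereas COUNTIF ignores letter case.

```python
def flag_duplicates(values):
    """Return 1 for second and later occurrences, "" for first occurrences.

    Mirrors the effect of =IF(COUNTIF($A$2:A2, A2)>1, 1, "") copied down a column.
    """
    seen = set()  # values already encountered in earlier rows
    flags = []
    for value in values:
        flags.append(1 if value in seen else "")
        seen.add(value)
    return flags


# Same data as the table above (rows A2 to A5).
titles = ["Duplicate Title", "Other Title", "Duplicate Title", "Duplicate Title"]
print(flag_duplicates(titles))  # ['', '', 1, 1]
```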


==== Finding duplicate rows that differ in multiple columns ====
