Data cleaning: Difference between revisions

83 bytes added, 19 October 2015
== duplicate data ==
* EXCEL:  
** one-column data: [http://www.extendoffice.com/documents/excel/1499-count-duplicate-values-in-column.html How to count duplicate values in a column in Excel?] Using {{kbd | key = COUNTIF(range, criteria)}} {{access | date = 2015-08-25}}, or using '''Pivot Tables''' (樞紐分析表) to find values that occur two or more times
** two-column data: [https://support.microsoft.com/en-us/kb/213367 How to compare data in two columns to find duplicates in Excel] {{access | date = 2015-06-16}} {{exclaim}} This may take too long (more than one hour) if the number of records exceeds 1,000,000
* PHP: [http://php.net/manual/en/function.array-unique.php PHP: array_unique], [http://php.net/manual/en/function.array-intersect.php PHP: array_intersect]
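
The two checks listed above (counting repeated values in one column, and finding values shared by two columns) can also be sketched in a few lines of script. The following is a minimal Python illustration, not taken from the linked pages; the function names <code>count_duplicates</code> and <code>common_values</code> and the sample data are hypothetical. It assumes each column has already been loaded into a list.

```python
from collections import Counter


def count_duplicates(values):
    """Return {value: count} for every value that occurs two or more times."""
    counts = Counter(values)
    return {value: n for value, n in counts.items() if n >= 2}


def common_values(column_a, column_b):
    """Return values present in both columns, in column_a order, without repeats."""
    lookup = set(column_b)  # set membership tests avoid rescanning column_b
    seen = set()
    result = []
    for value in column_a:
        if value in lookup and value not in seen:
            seen.add(value)
            result.append(value)
    return result


# Hypothetical sample data standing in for two spreadsheet columns.
emails = ["a@example.com", "b@example.com", "a@example.com",
          "c@example.com", "a@example.com"]
other = ["c@example.com", "d@example.com", "a@example.com"]

print(count_duplicates(emails))
print(common_values(emails, other))
```

The first function plays the role of the COUNTIF or pivot-table count with a threshold of 2; the second is roughly comparable to <code>array_intersect</code> followed by <code>array_unique</code>. Because set membership tests take roughly constant time on average, the two-column comparison scales about linearly with the number of records.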
