Data cleaning: Difference between revisions

397 bytes added, 19 September 2019


=== Counting the number of duplicate occurrences ===
MySQL: count the ids that occur in both list_a and list_b, assuming both tables use the same primary key column {{kbd | key = id}}:
* {{kbd | key = SELECT count(DISTINCT(`id`)) FROM `list_a` WHERE `id` IN (SELECT DISTINCT(`id`) FROM `list_b`) ; }}
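
The query above can be tried without a MySQL server. The following is a minimal sketch that runs the same statement against an in-memory SQLite database through Python's standard <code>sqlite3</code> module. SQLite standing in for MySQL, the helper name <code>count_shared_ids</code>, and the sample values are all assumptions made for illustration. The tables are created without a primary key so that the effect of <code>DISTINCT</code> is visible.

```python
import sqlite3


def count_shared_ids(ids_a, ids_b):
    """Count the distinct id values that appear in both list_a and list_b.

    Hypothetical helper: loads the two lists into throwaway SQLite tables
    and runs the same SELECT as the MySQL one-liner.
    """
    conn = sqlite3.connect(":memory:")
    try:
        # No PRIMARY KEY here, so repeated ids are allowed in the sample data.
        conn.execute("CREATE TABLE list_a (id INTEGER)")
        conn.execute("CREATE TABLE list_b (id INTEGER)")
        conn.executemany("INSERT INTO list_a (id) VALUES (?)", [(i,) for i in ids_a])
        conn.executemany("INSERT INTO list_b (id) VALUES (?)", [(i,) for i in ids_b])
        row = conn.execute(
            "SELECT count(DISTINCT id) FROM list_a "
            "WHERE id IN (SELECT DISTINCT id FROM list_b)"
        ).fetchone()
        return row[0]
    finally:
        conn.close()


# ids 2 and 3 occur in both lists, so the count is 2.
print(count_shared_ids([1, 2, 2, 3, 5], [2, 3, 3, 4]))
```

The result counts each shared id once, no matter how often it repeats in either list.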
 
Excel:  
* [http://superuser.com/questions/307837/how-to-count-number-of-repeat-occurrences microsoft excel - How to count number of repeat occurrences - Super User] {{exclaim}} long number issue: [https://superuser.com/questions/783840/countif-incorrectly-matches-long-number microsoft excel - Countif incorrectly matches long number - Super User]
 
Cygwin:
* [https://www.computerhope.com/unix/uuniq.htm uniq command] on Cygwin for {{Win}} or on {{Linux}}: {{kbd | key=<nowiki>sort <file.txt> | uniq -c > <number_of_occurrence_and_item.txt></nowiki>}} (sort the file first, because uniq only counts adjacent repeated lines)<ref>[https://unix.stackexchange.com/questions/52534/how-to-print-only-the-duplicate-values-from-a-text-file shell - How to print only the duplicate values from a text file? - Unix & Linux Stack Exchange]</ref>
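
Where neither Cygwin nor a Linux shell is available, the same per-item count can be approximated in Python. This is a minimal sketch, assuming one item per line. The function names <code>count_occurrences</code> and <code>duplicates_only</code> and the sample data are hypothetical. Unlike <code>uniq -c</code>, which only counts runs of adjacent identical lines, <code>collections.Counter</code> counts items regardless of their order.

```python
from collections import Counter


def count_occurrences(lines):
    """Return a Counter mapping each item to the number of times it occurs."""
    # Strip the trailing newline so "apple\n" and "apple" count as the same item.
    return Counter(line.rstrip("\n") for line in lines)


def duplicates_only(lines):
    """Return only the items that occur more than once, with their counts."""
    return {item: n for item, n in count_occurrences(lines).items() if n > 1}


sample = ["apple\n", "pear\n", "apple\n", "fig\n", "pear\n", "apple\n"]

# Print "count item" pairs, loosely mirroring the layout of uniq -c.
for item, n in sorted(duplicates_only(sample).items()):
    print(n, item)
```

To process a real file, the open file object can be passed in place of <code>sample</code>, since iterating over it yields one line at a time.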


=== Other ===
