Data cleaning: Difference between revisions

58 bytes added, 20 September 2019
m
Line 433: Line 433:
* Google spreadsheet add-on: [https://www.ablebits.com/google-sheets-add-ons/remove-duplicates/howto.php Remove Duplicates for Google Sheets help]


=== Counting number of occurrences (or frequency) ===
=== Counting the number of duplicate occurrences ===
MySQL: find the number of duplicate occurrences between list_a and list_b, which share the same primary key column {{kbd | key = id}}
* {{kbd | key = SELECT count(DISTINCT(`id`)) FROM `list_a` WHERE `id` IN (SELECT DISTINCT(`id`) FROM `list_b`) ; }}
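
The query above can be tried without a MySQL server. The following is a minimal sketch, assuming an in-memory SQLite database stands in for MySQL; the helper name <code>count_shared_ids</code> and the sample ids are hypothetical, and the <code>id</code> columns are deliberately not declared as primary keys so that repeated values can show the effect of <code>DISTINCT</code>.

```python
import sqlite3


def count_shared_ids(ids_a, ids_b):
    """Count the distinct ids that appear in both list_a and list_b."""
    conn = sqlite3.connect(":memory:")
    try:
        # Assumption: plain INTEGER columns (not primary keys), so that
        # repeated ids are allowed in this illustration.
        conn.execute("CREATE TABLE list_a (id INTEGER)")
        conn.execute("CREATE TABLE list_b (id INTEGER)")
        conn.executemany("INSERT INTO list_a (id) VALUES (?)", [(i,) for i in ids_a])
        conn.executemany("INSERT INTO list_b (id) VALUES (?)", [(i,) for i in ids_b])
        # Same shape as the MySQL query: count each id of list_a once
        # if it also occurs in list_b.
        (count,) = conn.execute(
            "SELECT COUNT(DISTINCT id) FROM list_a "
            "WHERE id IN (SELECT DISTINCT id FROM list_b)"
        ).fetchone()
        return count
    finally:
        conn.close()


# ids 2 and 3 occur in both lists, so the result is 2.
shared = count_shared_ids([1, 2, 2, 3, 5], [2, 3, 3, 4])
print(shared)
```

Because the outer query counts distinct ids, repeated values on either side do not inflate the result.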
Line 440: Line 440:
* [http://superuser.com/questions/307837/how-to-count-number-of-repeat-occurrences microsoft excel - How to count number of repeat occurrences - Super User] {{exclaim}} long number issue: [https://superuser.com/questions/783840/countif-incorrectly-matches-long-number microsoft excel - Countif incorrectly matches long number - Super User]


=== Counting number of occurrences (or frequency) of string ===
Cygwin
* (1) Put each string on its own line, separated by a [[Return symbol | return_symbol]]. (2) Run the [https://www.computerhope.com/unix/uuniq.htm uniq command] in Cygwin on {{Win}} or on {{Linux}}: {{kbd | key=<nowiki>sort <file.txt> | uniq -c</nowiki>}}<ref>[https://unix.stackexchange.com/questions/134446/counting-the-occurrences-of-the-string text processing - Counting the occurrences of the string - Unix & Linux Stack Exchange]</ref>
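
Where Cygwin or a Linux shell is not available, the same per-line frequency count can be approximated in Python. This is a rough sketch, assuming the input is already one string per line; the function name <code>count_occurrences</code> and the sample text are hypothetical.

```python
from collections import Counter


def count_occurrences(text):
    """Count how often each line occurs, similar in spirit to `sort | uniq -c`."""
    # splitlines() drops the line terminators, so each string is compared
    # without its trailing return symbol.
    return Counter(text.splitlines())


# Hypothetical sample input: one string per line.
sample = "apple\nbanana\napple\ncherry\napple\nbanana\n"
counts = count_occurrences(sample)

# Print the count first and the string second, ordered by string.
for value in sorted(counts):
    print(f"{counts[value]:7d} {value}")
```

Calling <code>counts.most_common()</code> instead of sorting by string lists the most frequent strings first.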
