Data cleaning: Difference between revisions

254 bytes added, 19 November 2019
(counting)
Line 448:
Cygwin
* To count how often each string occurs: (1) put each string on its own line, separated by the [[Return symbol | return_symbol]]; (2) sort the file and pipe it into the [https://www.computerhope.com/unix/uuniq.htm uniq command] on Cygwin ({{Win}}) or on {{Linux}}: {{kbd | key=<nowiki>sort file.txt | uniq -c</nowiki>}}<ref>[https://unix.stackexchange.com/questions/134446/counting-the-occurrences-of-the-string text processing - Counting the occurrences of the string - Unix & Linux Stack Exchange]</ref> Sorting first matters because uniq only merges identical lines that are adjacent.
Example input file: test.txt
<pre>
#apple
#追劇
#電影
#綜藝
#Apple
#藍芽
</pre>
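Result of running {{kbd | key=<nowiki>sort test.txt | uniq -ic | sort -nr</nowiki>}} (the -i option ignores case, so #apple and #Apple are counted together; the final sort -nr lists the highest counts first):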
<pre>
  2 #Apple
  1 #電影
  1 #追劇
  1 #藍芽
  1 #綜藝
</pre>
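Where Cygwin or a Unix shell is not available, the same case-insensitive count can be approximated in Python. The following is only a minimal sketch under stated assumptions: the function name <code>count_case_insensitive</code> and the inline sample list are hypothetical and stand in for reading test.txt. Unlike the shell pipeline, it reports the first spelling seen in the input (here #apple rather than #Apple) and keeps ties in input order.

```python
from collections import Counter


def count_case_insensitive(lines):
    """Count strings ignoring case; report the first-seen spelling of each.

    Hypothetical helper for illustration, not part of any library.
    """
    counts = Counter()
    first_seen = {}
    for raw in lines:
        text = raw.strip()
        if not text:
            continue  # skip blank lines
        key = text.casefold()  # case-insensitive comparison key
        counts[key] += 1
        first_seen.setdefault(key, text)
    # most_common() orders by descending count; equal counts keep input order
    return [(first_seen[key], n) for key, n in counts.most_common()]


# Assumed sample data mirroring test.txt above
sample = ["#apple", "#追劇", "#電影", "#綜藝", "#Apple", "#藍芽"]

for text, n in count_case_insensitive(sample):
    print(f"{n:7d} {text}")
```

To read from a real file instead, pass an open file object, for example <code>count_case_insensitive(open("test.txt", encoding="utf-8"))</code>; the encoding is an assumption about how the file was saved.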


== Outlier / Anomaly detection ==
