Count occurrences of a word in string: Difference between revisions

Count occurrences of a word in string (edit)

Revision as of 19:25, 5 September 2025

966 bytes added , 5 September 2025

→‎BASH

Planetoid

Bureaucrats, Administrators

14,990

edits

@@ Line 59: / Line 59: @@
 * (4) execute the following command {{kbd | key=<nowiki>sort <file.txt> | uniq -ic | sort -nr</nowiki>}}<ref>[https://unix.stackexchange.com/questions/134446/counting-the-occurrences-of-the-string text processing - Counting the occurrences of the string - Unix & Linux Stack Exchange]</ref><ref>[https://unix.stackexchange.com/questions/170043/sort-and-count-number-of-occurrence-of-lines Sort and count number of occurrence of lines - Unix & Linux Stack Exchange]</ref>
 * (5) Remove the leading whitespace from each line: in a [[Text editor with support for regular expression | text editor]] that supports [[Regular expression|regular expressions]], replace {{kbd | key=<nowiki>^\s+(\d+)\s+</nowiki>}} with {{kbd | key=<nowiki>\1\t</nowiki>}}
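Steps (4) and (5) can also be combined into one pipeline, so that no text editor is needed. The following is only a minimal sketch under a few assumptions: the function name <code>count_lines</code> is hypothetical, <code>sort -f</code> is added here so that lines differing only in letter case end up adjacent before <code>uniq -ci</code> merges them, and the regular expression is rewritten with POSIX character classes because <code>\s</code> and <code>\d</code> are not supported by every <code>sed</code>.

```shell
#!/bin/sh
# count_lines is a hypothetical helper name, not part of the original page.
# Usage: count_lines FILE
# Prints "count<TAB>line" for each distinct line, most frequent first.
count_lines() {
  # A literal tab, because "\t" in a sed replacement is not portable.
  tab=$(printf '\t')
  # sort -f   : assumption; folds case so case variants become adjacent
  # uniq -ci  : count adjacent duplicates, ignoring case
  # sort -nr  : highest count first
  # sed       : strip the padding uniq adds and put a tab after the count
  sort -f "$1" | uniq -ci | sort -nr |
    sed -E "s/^[[:space:]]*([0-9]+)[[:space:]]+/\1${tab}/"
}
```

Because only the first run of whitespace after the count is replaced, terms that themselves contain spaces are left intact.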
+=== Input Format A: One term per line ===
+{{exclaim}} Each line contains only one term/keyword
 file: test.txt
@@ Line 70: / Line 73: @@
 </pre>
-=== Output format I: occurrence & keyword ===
+==== Output format I: count followed by keyword ====
 {{exclaim}} The term on each line of the input file may contain whitespace.
@@ Line 92: / Line 95: @@
 </pre>
+==== Output format II: keyword followed by count ====
-=== Output format II: keyword & occurrence ===
 {{exclaim}} The term on each line of the input file should '''not''' contain whitespace.
@@ Line 115: / Line 117: @@
 </pre>
+=== Input Format B: Multiple terms per line ===
+{{exclaim}} Each line contains multiple terms/keywords separated by spaces
+file: input.txt
+<pre>
+電影 追劇 綜藝
+藍芽 apple 電影
+電影 綜藝
+</pre>
+==== Method using awk for word frequency counting ====
+{{kbd | key=<nowiki>awk '{for(i=1;i<=NF;i++) count[$i]++} END {for(word in count) print count[word], word}' input.txt | sort -nr</nowiki>}}
+Output:
+<pre>
+3 電影
+2 綜藝
+1 追劇
+1 藍芽
+1 apple
+</pre>
+How it works:
+* {{kbd | key=<nowiki>{for(i=1;i<=NF;i++) count[$i]++}</nowiki>}} - Loop through each field (word) in each line and increment its count
+* {{kbd | key=<nowiki>END {for(word in count) print count[word], word}</nowiki>}} - After processing all lines, print count and word for each unique word
+* {{kbd | key=<nowiki>sort -nr</nowiki>}} - Sort numerically in descending order
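The same technique can be wrapped in a small shell function. This is a sketch rather than the page's own method: the name <code>word_freq</code> is hypothetical, and the explicit tie-break (<code>LC_ALL=C sort -k1,1nr -k2,2</code>) is an assumption added here so that words with equal counts always come out in the same byte order regardless of locale.

```shell
#!/bin/sh
# word_freq is a hypothetical helper name, not part of the original page.
# Usage: word_freq [FILE...]   (reads standard input when no file is given)
word_freq() {
  awk '
    # Tally every whitespace-separated field on every line.
    { for (i = 1; i <= NF; i++) count[$i]++ }
    # After the last line, print "count word" for each distinct word.
    END { for (word in count) print count[word], word }
  ' "$@" |
    # Assumption: sort by count descending, then by word in byte order.
    LC_ALL=C sort -k1,1nr -k2,2
}
```

For example, <code>word_freq input.txt</code> lists the most frequent word first. The order of words with equal counts can differ from plain <code>sort -nr</code>, since ties are broken in ascending byte order here.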
 === Verification of count occurrence ===
