Merge multiple text files into one file
Revision as of 16:29, 30 January 2020
Steps
Step 1: Check that each text file ends with a newline[1]
- tail -c 1 file.txt on Linux. Parameter "-c number: The location is number bytes." quoted from the commands manual (http://man7.org/linux/man-pages/man1/tail.1.html). The command prints the last byte of the file, so if the file ends with a newline the output looks empty; any visible character means the final newline is missing. Open question: how to check multiple files at once?
- (optional) If the file does not end with a newline, add one manually. For details, see "bash - How to add a newline to the end of a file? - Unix & Linux Stack Exchange".
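One possible answer to the multiple-files question is a loop over the files. The following is only a sketch: the temporary directory and the two demo file names are made up for illustration, and the append step assumes it is acceptable to modify the files in place.

```shell
# Demo data in a throwaway directory (hypothetical file names).
workdir=$(mktemp -d)
printf 'a\nb\n' > "$workdir/ends_with_newline.txt"
printf 'c\nd'   > "$workdir/no_newline.txt"

missing=""
for f in "$workdir"/*.txt; do
    # Command substitution strips a trailing newline, so the result is
    # empty when the last byte is a newline. Empty files are skipped (-s).
    if [ -s "$f" ] && [ -n "$(tail -c 1 "$f")" ]; then
        echo "no trailing newline: $f"
        missing="$missing $(basename "$f")"
        # Append the missing newline so the next file starts on its own line.
        printf '\n' >> "$f"
    fi
done
```

After the loop, every non-empty file in the directory ends with a newline, and the variable missing lists the files that were changed.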
Step 2: Merge the content
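- copy /b *.txt bundle.txt or copy /b file1.txt + file2.txt bundle.txt on Win. (copy takes the destination as its last argument rather than through a ">" redirect; copy file1.txt file2.txt would overwrite file2.txt.)
- cat *.txt > bundle.txt or cat file1.txt file2.txt > bundle.txt on Linux[2][3]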
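A small sketch of the cat variant follows. The demo files and the output name bundle.out are assumptions made for illustration. A different extension is used for the output because, once bundle.txt exists from an earlier run, the *.txt pattern matches it as well.

```shell
# Demo data in a throwaway directory (hypothetical file names).
workdir=$(mktemp -d)
printf 'alpha\n' > "$workdir/file1.txt"
printf 'beta\n'  > "$workdir/file2.txt"

# The glob expands in sorted order: file1.txt, then file2.txt.
# Writing to bundle.out keeps the output out of reach of the *.txt pattern.
cat "$workdir"/*.txt > "$workdir/bundle.out"

cat "$workdir/bundle.out"
```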
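Step 3: (optional) Remove duplicate lines
- sort -us -o bundle_unique.txt bundle.txt[4] on Linux, or under Cygwin on Win. "-u means Unique keys; -s means stable sort; -o means output" quoted from the sort manual (https://www.computerhope.com/unix/usort.htm). Note that the output is sorted, so the original line order is not preserved.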
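The sketch below runs the sort command on a small made-up file. It also shows an awk one-liner as an alternative for cases where the original line order should be kept; that alternative is not part of the source page. LC_ALL=C is set only to make the sort order predictable in the demo.

```shell
# Demo data in a throwaway directory (hypothetical content).
workdir=$(mktemp -d)
printf 'pear\napple\npear\nbanana\napple\n' > "$workdir/bundle.txt"

# Variant from the page: unique lines, written in sorted order.
LC_ALL=C sort -us -o "$workdir/bundle_unique.txt" "$workdir/bundle.txt"

# Alternative (not from the page): keep the first occurrence of each
# line and preserve the original order.
awk '!seen[$0]++' "$workdir/bundle.txt" > "$workdir/bundle_ordered.txt"

cat "$workdir/bundle_unique.txt"
cat "$workdir/bundle_ordered.txt"
```

The awk variant holds every distinct line in memory, so for very large files the sort variant may be the safer choice.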
Step 4: (optional) Remove the header row of the CSV file
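The page gives no command for this step. One possible reading, sketched below, is that each CSV file starts with the same single-line header and only the first copy should survive the merge. The file names and the output name merged.out are assumptions for illustration.

```shell
# Demo data in a throwaway directory (hypothetical file names).
workdir=$(mktemp -d)
printf 'id,name\n1,Ann\n' > "$workdir/part1.csv"
printf 'id,name\n2,Bob\n' > "$workdir/part2.csv"

# Drop the header of a single file: print from line 2 onward.
tail -n +2 "$workdir/part1.csv" > "$workdir/part1_body.out"

# Merge several files and keep only the first header:
# FNR is the line number within the current file, NR the overall one,
# so "FNR == 1 && NR != 1" matches the first line of every later file.
awk 'FNR == 1 && NR != 1 { next } { print }' "$workdir"/part*.csv > "$workdir/merged.out"

cat "$workdir/merged.out"
```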
Step 5: Verify the merge
- Count the number of lines: wc -l filename[5]
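One way to use the line count as a check is to compare the sum over the input files with the count of the merged file. This is a sketch with made-up file names; it only applies when no duplicate or header lines were removed in between.

```shell
# Demo data in a throwaway directory (hypothetical file names).
workdir=$(mktemp -d)
printf 'a\nb\n' > "$workdir/file1.txt"
printf 'c\n'    > "$workdir/file2.txt"
cat "$workdir"/file*.txt > "$workdir/bundle.out"

# Sum the line counts of the inputs. Reading from stdin makes wc print
# only the number, without the file name.
expected=0
for f in "$workdir"/file*.txt; do
    n=$(wc -l < "$f")
    expected=$((expected + n))
done

# Arithmetic expansion also trims any padding some wc versions add.
actual=$(( $(wc -l < "$workdir/bundle.out") ))

if [ "$expected" -eq "$actual" ]; then
    echo "line counts match: $actual"
else
    echo "mismatch: expected $expected, got $actual"
fi
```

wc -l counts newline characters, so a file whose last line lacks a newline is counted one line short; this is one reason the check in Step 1 matters.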
References
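- [1] eof - Linux - check if there is a newline at the end of a file - Stack Overflow, https://stackoverflow.com/questions/34943632/linux-check-if-there-is-a-newline-at-the-end-of-a-file
- [2] Terminal 101: Join Multiple Files Together with Cat
- [3] How to merge all (text) files in a directory into one? - Unix & Linux Stack Exchange
- [4] linux - How to remove duplicate lines in a large multi-GB textfile? - Unix & Linux Stack Exchange, http://unix.stackexchange.com/questions/19641/how-to-remove-duplicate-lines-in-a-large-multi-gb-textfile
- [5] 6 WC Command Examples to Count Number of Lines, Words, Characters in Linux, https://www.tecmint.com/wc-command-examples/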