You may need to remove empty lines from a text file to make it more compact, to prepare it as input for another utility, or just to improve readability. The file could be of any kind: a badly formatted HTML file, a log file with lots of empty or blank lines, or even a source code file with double spacing.
Using text editors
If these are files of manageable length and size, and there are not a whole lot of them, then you could probably use a text editor such as vi or kate to perform the task.
The vi (or vim) editor is one of the most commonly used editors in Linux. It runs from a command prompt without needing an X-based graphical system, which is often useful. Vi is a very powerful tool that supports many of the expressions and command line syntaxes discussed later in the post.
This flexibility means that there are several different options available in the editor. All of them are commands entered from the escape mode of the editor. To enter escape mode in vi, press the Escape key at any time from within the editor, then type the command, starting with the colon.
:g/^$/d
The 'g' in the expression represents global mode and executes the command on all matching lines of the file. The second part is the regular expression that needs to match. The '^' matches the start of the line and '$' matches the end of the line. Together, '^$' matches an empty line. The 'd' at the end stands for the delete command. This will delete all empty lines, but will not remove blank lines, that is, lines that contain only spaces or other whitespace.
:g/^\s*$/d
Modifying the regular expression as above to match zero or more whitespace characters between the start and the end of the line makes the command delete the empty lines as well as the blank lines.
:v/./d
This is actually the negation of the first command. The 'v' stands for the reverse global mode, which means the command will execute on lines that do not match the expression. The expression '.' matches any line with at least one character in it, so only the empty lines fail to match. The 'd' at the end is the same as before and stands for the delete command.
There are almost limitless options here: you can delete all lines that match three blanks, condense multiple blank lines into a single blank line, and so on. The expressions used above, or variations of them, are also used later in the post with the command line utilities.
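As an illustration of the "condense multiple blank lines into a single blank line" case, here is a minimal sketch using awk from the shell rather than vi. The function name squeeze_blank is made up for this example, and the sketch treats whitespace-only lines the same as empty ones.

```shell
# squeeze_blank: collapse each run of empty or whitespace-only lines
# into a single empty line (hypothetical helper name).
squeeze_blank() {
    awk '
        NF     { print; blank = 0; next }   # non-blank line: print it, reset the flag
        !blank { print ""; blank = 1 }      # first blank line of a run: keep one empty line
    ' "$@"
}

# Example: the three blank lines between "a" and "b" become one.
printf 'a\n\n \n\nb\n' | squeeze_blank
```

With no file arguments the function reads standard input, so it can sit in the middle of a pipeline; with file names it reads those files instead.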
Kate is a graphical editor. It is more or less the default text editor for KDE. You can of course delete the blank lines manually one after another as in any text editor, but Kate does provide some easy to use commands to let you do this.
Kate supports an Editor Component Command Line feature that you can access from View->Switch to Command Line in the menu bar or by using the shortcut F7.
rmblank: In the command line, type in the command rmblank to remove all empty lines in the current document. This will not, however, remove any lines that contain only spaces or other whitespace.
rtrim: This will trim the lines and get rid of any trailing spaces at the end of the lines. A line that contains only whitespace becomes an empty line after trimming, so running rtrim before rmblank will allow you to remove both the empty lines and the blank lines.
Many other advanced text editors such as UltraEdit support search and replace using regular expressions. It is very likely that you can remove empty lines with a regular expression replacement in these editors as well.
Using command line utilities
Although using a text editor is a good option, using a command line utility is often faster and easier. It is also very useful if you have many different files to process. There are several utilities in Linux that can be used to remove empty lines in text files. The most popular and commonly used ones are awk, sed, and grep.
bash$ awk '/./' filename.txt > newfile.txt
awk is a very versatile command. The pattern /./ matches any line that contains at least one character, and awk prints every matching line to standard output by default, so the empty lines are dropped. Redirecting the output saves the result as a new file: the command above reads filename.txt and writes the result to newfile.txt.
The above command will not remove lines that contain only whitespace, such as spaces or tabs. If you want to remove both the empty lines and the whitespace-only lines, then try the following command.
bash$ awk NF filename.txt > newfile.txt
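To see the difference between the two awk commands, it helps to run them on a small sample. The sketch below is only an illustration: the sample function and its contents are invented for this example.

```shell
# Sample input: "one", an empty line, a line holding only spaces,
# a line holding only a tab, then "two" (hypothetical test data).
sample() { printf 'one\n\n   \n\t\ntwo\n'; }

# /./ keeps any line with at least one character, so the two
# whitespace-only lines survive: 4 lines remain.
sample | awk '/./' | wc -l

# NF is the number of fields on the line. It is 0 (false) for empty
# and whitespace-only lines, so only "one" and "two" remain.
sample | awk NF
```

Because NF counts whitespace-separated fields, a line made up of nothing but spaces and tabs has zero fields and is skipped, which is why the shorter command covers both cases.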
sed is another popular utility for processing text. It follows a pattern similar to the one described earlier in the vi editor section. The expressions are very similar because sed, like most Linux text utilities, uses regular expressions to match the input text.
bash$ sed '/^$/d' filename.txt > newfile.txt
The above command will remove empty lines but not the lines that contain only whitespace, such as spaces or tabs. In order to remove the empty lines as well as the blank lines, use the following sed command.
bash$ sed '/^\s*$/d' filename.txt > newfile.txt
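The \s shorthand is a GNU extension, so it may not be recognised by every sed implementation. A more portable sketch uses the POSIX character class [[:space:]] instead. The function name remove_blank_lines is made up for this example.

```shell
# remove_blank_lines: delete empty and whitespace-only lines using the
# POSIX character class instead of \s (hypothetical helper name).
remove_blank_lines() {
    sed '/^[[:space:]]*$/d' "$@"
}

# Example: the empty line and the line of spaces and a tab are removed.
printf 'first\n\n \t \nsecond\n' | remove_blank_lines
```

The same substitution of [[:space:]] for \s should work in the grep expressions as well.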
The following command is actually just a negation of the first sed command. The expression /./ matches every line with at least one character in it, and the !d deletes only the lines that do not match it, which are the empty lines.
bash$ sed '/./!d' filename.txt > newfile.txt
grep is more of a matching utility compared to the sed command mentioned earlier. Both take similar regular expressions to match the empty and blank lines in a file. With grep you can either match the lines you want to keep, or match the lines you want to drop and use the -v option to negate the expression and print out the non-matching lines.
The example below will remove all the empty lines in the text file filename.txt, but not the ones that contain only whitespace.
bash$ grep '.' filename.txt > newfile.txt
In order to remove the empty lines as well as all the lines with just whitespace in them, use the following grep command.
bash$ grep -v '^\s*$' filename.txt > newfile.txt
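The start of this section mentions that command line utilities help when there are many files to process. A minimal sketch of that idea is a loop that rewrites each file through a temporary file. The function name strip_blank_in_place and the file names in the usage line are invented for this example, and the sketch assumes the mktemp utility is available.

```shell
# strip_blank_in_place: rewrite each named file without its empty or
# whitespace-only lines (hypothetical helper name).
strip_blank_in_place() {
    for f in "$@"; do
        tmp=$(mktemp) || return 1
        # Write to a temporary file first: redirecting straight back to
        # "$f" would truncate it before grep had a chance to read it.
        grep -v '^[[:space:]]*$' "$f" > "$tmp"
        status=$?
        # grep exits with 1 when it selects no lines (the file was all
        # blank) and with 2 or more on a real error.
        if [ "$status" -le 1 ]; then
            cat "$tmp" > "$f"    # replace the contents, keep the file's permissions
        fi
        rm -f "$tmp"
    done
}

# Usage (hypothetical file names):
# strip_blank_in_place notes.txt access.log
```

Since this overwrites the originals, it is worth trying it on copies of the files first.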
All of the above utilities get the job done in much the same way, using very similar expressions to match the empty or blank lines in the file. You can extend these expressions to remove any matching lines, not just the empty ones. For example, if you wish to remove all comments from a script file, you can use the following variation.
Assuming that each comment in the script file occupies a line by itself and starts with a # (hash or pound), optionally preceded by whitespace, the following command will remove all of those lines.
bash$ sed '/^\s*#/d' filename.txt > newfile.txt
You may further modify the expression to suit your needs.
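One such modification, sketched here under assumptions: the comment-removing expression also deletes an interpreter line such as #!/bin/sh, because that line starts with a # too. The variation below skips any line beginning with #! before applying the delete. The function name strip_comments is made up for this example.

```shell
# strip_comments: delete comment-only lines but keep lines that start
# with "#!" (hypothetical helper name).
strip_comments() {
    # The first expression branches to the end of the script for "#!"
    # lines, so the delete in the second expression never sees them.
    sed -e '/^#!/b' -e '/^[[:space:]]*#/d' "$@"
}

# Example: the interpreter line and the code line are kept; the two
# comment-only lines are removed. A trailing comment on a code line stays.
printf '#!/bin/sh\n# setup\necho hi  # trailing\n   # indented\n' | strip_comments
```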