Linux: Stream EDitor (SED)

There are many useful and comprehensive manuals on the Internet for the GNU Stream EDitor (sed); this post is just a reminder of some of the more useful variations.

Basic usage

As a reminder, s stands for substitute and g stands for globally, which means: substitute all occurrences of the ORIGINAL string on each line with the REPLACEMENT string. Without g, only the first occurrence on each line is replaced.

sed -e 's/ORIGINAL/REPLACEMENT/g' inputFile

Of course, on Unix-based systems, the “pipe” can be leveraged as well:

cat inputFile | sed -e 's/ORIGINAL/REPLACEMENT/g'

To save the results directly in the same file, use:

sed -i 's/ORIGINAL/REPLACEMENT/g' /path/to/the/file
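Because -i overwrites the file, it can be worth keeping a backup while experimenting. The following is a minimal sketch, assuming GNU sed and a throwaway file created with mktemp; the file contents and the .bak suffix are only illustrative.

```shell
# Create a scratch file so nothing real gets overwritten (illustrative data).
tmpfile=$(mktemp)
printf 'ORIGINAL text, ORIGINAL again\n' > "$tmpfile"

# -i.bak edits the file in place and keeps the old version as "$tmpfile.bak".
sed -i.bak 's/ORIGINAL/REPLACEMENT/g' "$tmpfile"

edited=$(cat "$tmpfile")       # the substituted content
backup=$(cat "$tmpfile.bak")   # the untouched original
echo "$edited"
echo "$backup"

# Clean up the scratch files.
rm -f "$tmpfile" "$tmpfile.bak"
```

The first echo should show the substituted line and the second the original one.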

Advanced usage

To delete the first three rows (1 to 3), use:

cat inputFile | sed '1,3d'

To print only the fifth row (that is, delete every row except the fifth), use:

cat inputFile | sed '5!d'

To skip lines containing a specific STRING only (and print the rest):

cat inputFile | sed '/STRING/d'

To print lines containing a specific STRING only:

cat inputFile | sed '/STRING/!d'
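The same selection can also be written with the -n option and the p command, which some people find easier to read. A small sketch with made-up input lines, comparing both forms:

```shell
# Three illustrative input lines; only the second contains STRING.
input=$(printf 'first line\nline with STRING\nlast line\n')

# Delete everything that does NOT match (the form used above).
kept_by_delete=$(printf '%s\n' "$input" | sed '/STRING/!d')

# Suppress automatic printing (-n) and print only matching lines (p).
kept_by_print=$(printf '%s\n' "$input" | sed -n '/STRING/p')

echo "$kept_by_delete"
echo "$kept_by_print"
```

Both commands should print the same single line.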

To remove empty lines or lines containing only spaces:

cat inputFile | sed -e '/^ *$/d'

or

cat inputFile | sed -e '/^\s*$/d'

The following metacharacters were used:

  • ^ the caret matches the beginning of the line
  • $ the dollar sign matches the end of the line
  • . the dot matches any single character
  • * the asterisk matches zero or more occurrences of the previous character
  • [] the brackets match any one of the characters listed in them (e.g., [a-z], [0-9], etc.)
  • & the ampersand, used in the replacement, holds the matched string
  • ! the exclamation mark negates the address, so the command applies to the lines that do not match
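A couple of these metacharacters are easier to remember when seen in action. The sketch below uses made-up input strings; the variable names are only for illustration.

```shell
# & in the replacement stands for whatever the pattern matched.
wrapped=$(echo "order 66 shipped" | sed 's/[0-9][0-9]*/<&>/')
echo "$wrapped"

# ^ and $ anchor the pattern to the whole line; . matches any one character,
# so only three-letter lines of the form c?t are printed.
three_chars=$(printf 'cat\ncart\ncut\n' | sed -n '/^c.t$/p')
echo "$three_chars"
```

The first command should wrap the number in angle brackets, and the second should print cat and cut but not cart.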

To remove all characters from the beginning of a line until a specific STRING:

sed -i 's/^.*STRING/STRING/g' /path/to/the/file

To remove all characters from a specific STRING until the end of the line:

sed -i 's/STRING.*/STRING/g' /path/to/the/file

To replace a STRING with a newline (sed has to receive a backslash followed by a literal newline, which bash's $'...' quoting can produce):

sed -e $'s/STRING/\\\n/g'

To replace the last occurrence of a character X on each line by character Y:

sed -i 's/\(.*\)X/\1Y/' /path/to/the/file

To remove everything from the last occurrence of a character X until the end of the line:

sed -i 's/\(.*\)X.*/\1/' /path/to/the/file

To rearrange fixed-width groups of characters, for instance to change a date from MMDDYYYY to YYYYMMDD format:

echo "MMDDYYYY" | sed -e 's_\(..\)\(..\)\(....\)_\3\1\2_g'
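With a concrete date, the rearrangement looks like this. This is a sketch assuming an eight-digit MMDDYYYY value (the sample date is made up). In sed's basic regular expression syntax the group parentheses and the group numbers in the replacement must each be preceded by a backslash.

```shell
# Sample date in MMDDYYYY form (illustrative value).
us_date="12312023"

# Capture month, day and year, then emit them as year, month, day.
# The underscore is used as the delimiter instead of the usual slash.
iso_date=$(echo "$us_date" | sed -e 's_\(..\)\(..\)\(....\)_\3\1\2_')
echo "$iso_date"
```

The command should print 20231231.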

To insert leading zeros to a list of numbers:

echo "5 17 3 351 1" | sed -e 's/\b[0-9]\b/00&/g; s/\b[0-9]\{2\}\b/0&/g'

The script works in two phases. The first prefixes single-digit numbers with two leading zeros, and the second prefixes two-digit numbers with one leading zero, so every number ends up at least three digits wide. Note that \b (word boundary) is a GNU sed extension.
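To see both phases separately, the two substitutions can be run one after the other. A small sketch, assuming GNU sed (for the \b word-boundary escape); the variable names are only illustrative.

```shell
numbers="5 17 3 351 1"

# Phase 1: single-digit numbers get two leading zeros.
phase1=$(echo "$numbers" | sed -e 's/\b[0-9]\b/00&/g')
echo "$phase1"

# Phase 2: two-digit numbers get one leading zero. The numbers padded in
# phase 1 are now three digits long, so they are not touched again.
phase2=$(echo "$phase1" | sed -e 's/\b[0-9]\{2\}\b/0&/g')
echo "$phase2"
```

After the second phase, every number in the list should be three digits wide.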

To calculate the total number of rows, use:

sed -n '$=' FileToCountRows
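Here $ addresses the last line and = prints the current line number, while -n suppresses the normal output, so only the number of the last line is shown. A minimal sketch on a scratch file with made-up contents:

```shell
# Build a three-line scratch file (illustrative data).
tmpfile=$(mktemp)
printf 'one\ntwo\nthree\n' > "$tmpfile"

# Print the line number of the last line, i.e. the row count.
count=$(sed -n '$=' "$tmpfile")
echo "$count"

rm -f "$tmpfile"
```

For this file the command should print 3.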

To remove all the HTML tags from the given file, including tags that span several lines, use:

cat file.html | sed '/</{
:loop
s/<[^<]*>//g
/</{
N
b loop
}
}'
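The loop is there for tags that are split across lines: as long as an unclosed < remains, N appends the next line and the substitution is retried. A sketch with a made-up HTML snippet, assuming GNU sed and that the substitution inside the loop is s/<[^<]*>//g:

```shell
# Illustrative input: the <a ...> tag is split across two lines.
html=$(printf '<p>Hello <b>bold</b>\nand <a\nhref="x">link</a> text</p>\n')

plain=$(printf '%s\n' "$html" | sed '/</{
:loop
s/<[^<]*>//g
/</{
N
b loop
}
}')
echo "$plain"
```

The output should contain only the text, with the line that held the second half of the split tag joined to the one before it.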

To replace a specific STRING with a newline using GNU sed's \n escape, use:

sed -e "s/STRING/\n/g"

To convert a multiline string into an array in bash, use:

mapfile -t STRINGARRAY <<< "$MULTILINESTRING"

An Example File to Play with

There are many funny limericks on the Internet. This one can be used as a source file to test the above examples:

cat man_from_esser 
There once was an old man of Esser,
whose knowledge grew lesser and lesser.
It at last grew so small,
he knew nothing at all.
And now, he's a college professor.
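As a quick check, a few of the commands from this post can be run against the limerick. The sketch below recreates the file in a temporary directory (an assumption made so that nothing existing is overwritten) and captures the results in illustrative variables.

```shell
# Recreate the limerick in a scratch directory (file name taken from above).
workdir=$(mktemp -d)
cat > "$workdir/man_from_esser" <<'EOF'
There once was an old man of Esser,
whose knowledge grew lesser and lesser.
It at last grew so small,
he knew nothing at all.
And now, he's a college professor.
EOF

# Count the rows.
line_count=$(sed -n '$=' "$workdir/man_from_esser")

# Print only the fifth row.
fifth=$(sed '5!d' "$workdir/man_from_esser")

# Substitute globally, but only on the second row, and print just that row.
second=$(sed -n '2s/lesser/greater/gp' "$workdir/man_from_esser")

echo "$line_count"
echo "$fifth"
echo "$second"

rm -rf "$workdir"
```

The three echo commands should print the row count, the last line of the limerick, and the second line with both occurrences of "lesser" replaced.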