Saturday, March 6, 2010

File Preprocessing with awk and Perl

In the last week I have been writing bash, Perl, Python, sed, and awk scripts. I have even been writing scripts to write my scripts. These are the tricks I want to save for later...

- How to get rid of the blank lines in a file
awk '$1 !~ /(^$)/' someFile > someNewFile

This does the same for completely empty lines (the awk version also drops lines that contain only whitespace):

grep -v '^$' someFile > someNewFile
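To see how the two commands treat a line that holds only spaces, here is a small sketch. The sample file and the variable names are made up for illustration, and the third command is just one possible stricter grep pattern, not the only way to do it.

```shell
# Build a throwaway sample: one empty line and one whitespace-only line.
sample=$(mktemp)
printf 'first\n\n   \nsecond\n' > "$sample"

# awk: the first field is empty on both empty and whitespace-only lines,
# so both kinds of line are dropped.
awk_out=$(awk '$1 !~ /(^$)/' "$sample")

# grep: only lines with zero characters match ^$,
# so the whitespace-only line is kept.
grep_out=$(grep -v '^$' "$sample")

# A stricter grep pattern that also drops whitespace-only lines.
grep_ws_out=$(grep -v '^[[:space:]]*$' "$sample")

printf 'awk:\n%s\n\ngrep:\n%s\n' "$awk_out" "$grep_out"
rm -f "$sample"
```

If the file may contain lines with stray spaces or tabs, the awk version or the stricter grep pattern is probably the one you want.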

Another problem I have run into is that line endings differ between Unix and other systems: Unix uses "\n", while other systems use "\r" (old Mac OS) or "\r\n" (Windows). The worst part is when you try to read a file in bash and nothing happens, and then you check it with

od -c

and realize that all the newlines are "\r". I found that the best way to solve this is a simple Perl one-liner:

perl -pe 's/\r\n|\n|\r/\n/g'

which I found on the newline wiki page.