
Unix

Everybody already knows what great file-handling tools *NIX operating systems have. I’m constantly amazed by how much you can get done in a single command line. For example, imagine a process that does some data processing on a remote machine and logs all the communication between the client and the server in one file. Now imagine this process running for 30+ hours and producing a 220+ MB log file.

After the process is done, your boss wants some kind of report: how many entries were processed, how many were processed successfully, how many errors there were, and which errors they were. Not much of a problem when working on a UNIX machine:

A:    cat out.txt | grep 'COMMAND SUCCESS' | wc -l
B:    cat out.txt | grep 'COMMAND FAILED' | wc -l

… and just to make sure, let's check that A + B = C:

C:    cat in.txt | wc -l
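
The A + B = C check can also be scripted instead of compared by eye. This is only a minimal sketch, assuming every entry in in.txt produces exactly one 'COMMAND SUCCESS' or 'COMMAND FAILED' line in out.txt; the function name check_totals is made up for illustration:

```shell
# check_totals OUT_FILE IN_FILE
# Hypothetical helper: compares success + failure counts with the input size.
check_totals() {
    out_file=$1
    in_file=$2
    # grep -c prints the number of matching lines (like grep ... | wc -l).
    # It exits non-zero when the count is 0, hence the "|| true".
    ok=$(grep -c 'COMMAND SUCCESS' "$out_file" || true)
    failed=$(grep -c 'COMMAND FAILED' "$out_file" || true)
    # The arithmetic expansion strips any padding some wc versions add.
    total=$(( $(wc -l < "$in_file") ))
    if [ $((ok + failed)) -eq "$total" ]; then
        echo "OK: $ok + $failed = $total"
    else
        echo "MISMATCH: $ok + $failed != $total"
        return 1
    fi
}
```

Called as `check_totals out.txt in.txt`, it prints the three numbers and returns a non-zero status when they do not add up.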

Now, let's report on the errors:

cat out.txt | grep 'ERROR_CODE:' | sort | uniq

returns a list of errors:

ERROR_CODE: 10065
ERROR_CODE: 11245
ERROR_CODE: 19543

and now let's find out how many of each we got:

cat out.txt | grep 'ERROR_CODE: 10065' | wc -l
cat out.txt | grep 'ERROR_CODE: 11245' | wc -l
cat out.txt | grep 'ERROR_CODE: 19543' | wc -l
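
The separate counts can also be collapsed into one pass, since uniq -c prefixes each distinct line with the number of times it occurred. A small sketch, assuming (as the list above suggests) that each ERROR_CODE line contains nothing but the code itself; count_errors is a made-up name:

```shell
# count_errors LOG_FILE
# Hypothetical helper: prints each distinct ERROR_CODE line with its count,
# most frequent first. uniq -c needs sorted input, hence the first sort.
count_errors() {
    grep 'ERROR_CODE:' "$1" | sort | uniq -c | sort -rn
}
```

Running `count_errors out.txt` gives one line per error code, so new codes show up without anyone having to add another grep.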

Email to the boss, and we’re done. All good? Great. But, another email comes in: “Could you please send me a list of all the IDs that caused an error #11245”. Sure, no problem:

cat out.txt | grep -B 7 'ERROR_CODE: 11245' | grep 'REQUEST_ID' | awk '{ print $2; }' | sed 's/REQUEST_ID:\([0-9]*\)/\1/g' > ids.txt

Let's explain this one a bit:

  • the initial request that was sent to the system was logged 7 lines before the ERROR_CODE (therefore the “-B 7”)
  • the line with the request had the following format:
    START REQUEST_ID:XXXXX SOME_OTHER_STUFF
    (therefore the awk part)
  • with sed we just extracted the number from the request_id column
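
The fixed “-B 7” offset only works while the request line sits exactly 7 lines above the error. As a possible alternative (not the method used above), a single awk pass can remember the most recent REQUEST_ID and print it when the wanted error shows up. This sketch assumes requests are logged one after another without interleaving; ids_for_error is a made-up name:

```shell
# ids_for_error CODE LOG_FILE
# Hypothetical helper: prints the REQUEST_ID of every request that was
# followed by the given ERROR_CODE.
ids_for_error() {
    awk -v code="$1" '
        # Remember the number from the most recent REQUEST_ID:NNN field.
        # "REQUEST_ID:" is 11 characters long, so skip those.
        match($0, /REQUEST_ID:[0-9]+/) {
            id = substr($0, RSTART + 11, RLENGTH - 11)
        }
        # On a line containing the wanted error, print the remembered ID.
        index($0, "ERROR_CODE: " code) && id != "" { print id }
    ' "$2"
}
```

With that in place, `ids_for_error 11245 out.txt > ids.txt` would produce the same kind of list, regardless of how many lines separate the request from its error.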

Can it get any more powerful than this?

Leave your comment
  • I never thought that a day would come where I could say: I understand stuff like written above. Lo and behold, the day has come.

    Love the sed,awk,grep and Unix in a Nutshell – the book that made them accessible to me 🙂

  • tnovak

    > All good? Great. But, another email comes in: “Could you please
    > send me a list of all the IDs that caused an error #11245”. Sure, no problem:

    Alternatively, awk could be avoided with a tiny modification to the regex:

    … | grep REQUEST_ID | sed -r 's/^.*REQUEST_ID:([0-9]+).*$/\1/g'

    or even:

    … | perl -ne 'print "$1\n" if /REQUEST_ID:(\d+)/'

    😉

  • Frank

    Love thy unix, but also love thy ports…

    http://unxutils.sourceforge.net/
