Quick Guide to nawk

Here is a quick guide to nawk. It covers nawk rather than the original awk because nawk has more functionality. Most systems now have both programs installed.

To run nawk

  • From command line : nawk 'program' inputfile1 inputfile2 …
  • From a file : nawk -f programfile inputfile1 inputfile2 …
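The two ways of running a program can be sketched as below. This is a minimal example, not part of the original guide: it calls awk on the assumption that a POSIX awk is on the PATH under that name (substitute nawk where it is installed), and the temporary program file is hypothetical.

```shell
# Assumption: `awk` is a POSIX awk; replace it with `nawk` where available.

# 1. Program given on the command line.
inline=$(printf 'first\nsecond\n' | awk '{ print NR ": " $0 }')

# 2. The same program stored in a (hypothetical) temporary file and run with -f.
prog=$(mktemp)
printf '{ print NR ": " $0 }\n' > "$prog"
fromfile=$(printf 'first\nsecond\n' | awk -f "$prog")
rm "$prog"

echo "$inline"
echo "$fromfile"
```

Both forms number the input lines and should print the same two lines.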

Structure of nawk program

  • A nawk program can consist of three sections: nawk 'BEGIN{…}{… BODY …}END{…}' inputfile
  • Both the BEGIN and END blocks are optional, and each is executed only once: BEGIN before any input is read, END after the last line has been processed.
  • The body is executed for each line in the input file.
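A minimal sketch of the three sections working together, assuming a POSIX awk is available as awk (use nawk where installed). The variable names n and total are made up for this example.

```shell
# Count the input lines and add up the first field.
result=$(printf '10\n20\n30\n' | awk '
BEGIN { print "start" }                     # runs once, before any input is read
      { n++; total += $1 }                  # runs once for every input line
END   { print n " lines, total " total }    # runs once, after the last line
')
echo "$result"
# prints:
# start
# 3 lines, total 60
```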

Field Separators

  • The following example makes ‘=’ a field separator in addition to the blank space, so that any run of spaces and ‘=’ characters counts as one separator: nawk 'BEGIN{FS = "[ =]+"}{print $2}' inputfile
  • For example, if the input file contains the line “Total = 500”, then the output will be 500.
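A quick way to try this out, assuming a POSIX awk is available as awk (use nawk where installed). The bracket expression [ =]+ is one possible choice of separator for this input, not the only one; the -F option sets the same separator from the command line.

```shell
# FS set in a BEGIN block: runs of spaces and "=" act as one separator.
value=$(echo "Total = 500" | awk 'BEGIN { FS = "[ =]+" } { print $2 }')

# The same separator given with the -F option.
value2=$(echo "Total = 500" | awk -F'[ =]+' '{ print $2 }')

echo "$value" "$value2"    # prints: 500 500
```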

Printing Environment Variables

  • The following example prepends the current path to a list of filenames/directories:
    ls -alg | nawk '{print ENVIRON["PWD"] "/" $8}'
  • ENVIRON is an array of environment variables indexed by the individual variable name.
  • The variable FILENAME is a string that holds the name of the file nawk is currently reading.
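A small sketch of both variables in use, assuming a POSIX awk is available as awk (use nawk where installed). The environment variable GUIDE_DEMO and the file sample.txt are hypothetical and exist only for this example. Note that the index is the bare variable name, without a leading $.

```shell
tmpdir=$(mktemp -d)
printf 'alpha\nbeta\n' > "$tmpdir/sample.txt"

# Hypothetical variable; it must be exported to show up in ENVIRON.
export GUIDE_DEMO="hello"

# Print the variable, the file being read, and the first field of each line.
out=$(awk '{ print ENVIRON["GUIDE_DEMO"], FILENAME, $1 }' "$tmpdir/sample.txt")
echo "$out"

rm -r "$tmpdir"
```

Each output line starts with hello, followed by the path to sample.txt and the first word of the input line.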

Examples of usage

  • To kill all the jobs of the current user : kill -9 `ps -ef | grep $LOGNAME | nawk '{print $2}'`

Multi-dimensional array

  • To use a 2D or multi-dimensional array, separate the array indices with commas: matrix[3, 5] = $(i+5)
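A minimal sketch of a 2D array, assuming a POSIX awk is available as awk (use nawk where installed). In POSIX awk the comma-separated indices are joined into a single string key with the built-in SUBSEP variable; the names matrix and idx are chosen for this example.

```shell
cells=$(printf '1 2 3\n4 5 6\n' | awk '
{ for (c = 1; c <= NF; c++) matrix[NR, c] = $c }   # key is NR SUBSEP c
END {
    print matrix[2, 3]                        # row 2, column 3
    if ((1, 2) in matrix) print "has (1,2)"   # membership test on a 2D key
    split("2" SUBSEP "1", idx, SUBSEP)        # a joined key can be split again
    print idx[1], idx[2], matrix[idx[1], idx[2]]
}')
echo "$cells"
# prints:
# 6
# has (1,2)
# 2 1 4
```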

Another example

  • The example below calculates the averages for 16 items from 10 sets of readings.
  • Example of an input line the program is trying to match: Total elapsed time is 560
BEGIN{
  printf("--------- Execution Time -----------\n");
  item=16;
  set=10;
}
{
  # all new variables are initialized to 0
  for(;j < set;j++)
    for(i=0;i < item; i++)
    {
      # skip input until the second word matches "elapsed"
      while($2 != "elapsed")
        getline;
      # notice the use of array without declaring its
      # dimension
      sum[i]+=$5;
      getline;
    }

  if(j==set)
  {
    for(i=0;i < item;i++)
    {
      # this and the next 2 lines are comments
      # you can use either print or printf for output
      # print sum[i]/set;

      printf("Set %d : %6.3f\n",i,sum[i]/set);
    }
    j++;
  }
}
END{
  printf("-------------- End --------------");
}
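The same averaging can also be sketched with a pattern-action rule instead of explicit getline calls. This is an alternative under stated assumptions, not the program above: it assumes a POSIX awk is available as awk (use nawk where installed), that matching lines arrive item by item within each set, and it uses a small stand-in input of 2 items and 3 sets passed in with -v. The counter n is a name chosen for this example.

```shell
# Stand-in input: 2 items per set, 3 sets, plus one line that must be skipped.
avg=$(printf '%s\n' \
  "Total elapsed time is 10" "Total elapsed time is 100" \
  "noise line" \
  "Total elapsed time is 20" "Total elapsed time is 200" \
  "Total elapsed time is 30" "Total elapsed time is 300" |
awk -v item=2 -v set=3 '
$2 == "elapsed" { sum[n % item] += $5; n++ }   # n counts matching lines only
END {
    for (i = 0; i < item; i++)
        printf("Set %d : %6.3f\n", i, sum[i] / set)
}')
echo "$avg"
# prints:
# Set 0 : 20.000
# Set 1 : 200.000
```

Because the pattern selects the matching lines, the program never has to read ahead, and it stops cleanly at the end of the input even when the file is shorter than expected.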

Examples from the man page

  • Write to the standard output all input lines for which field 3 is greater than 5:
    $3 > 5
  • Write every tenth line:
    (NR % 10) == 0
  • Write any line with a substring matching the regular expression:
    /(G|D)(2[0-9][[:alpha:]]*)/
  • Print any line with a substring containing a G or D, followed by a sequence of digits and characters:
    /(G|D)([[:digit:][:alpha:]]*)/
  • Write any line in which the second field contains a backslash:
    $2 ~ /\\/
  • Write any line in which the second field contains a backslash (alternate method). Note that backslash escapes are interpreted twice, once in lexical processing of the string and once in processing the regular expression.
    $2 ~ "\\\\"
  • Write the second to the last and the last field in each line, separating the fields by a colon:
    {OFS=":";print $(NF-1), $NF}
  • Write lines longer than 72 characters:
    length($0) > 72
  • Write the first two fields in opposite order separated by the OFS:
    { print $2, $1 }
  • Same, with input fields separated by comma or space and tab characters, or both:
    BEGIN { FS = ",[ \t]*|[ \t]+" }{ print $2, $1 }
  • Add up first column, print sum and average:
    {s += $1 }END{print "sum is ", s, " average is", s/NR}
  • Write fields in reverse order, one per line (many lines out for each line in):
    { for (i = NF; i > 0; --i) print $i }
  • Write all lines between occurrences of the strings “start” and “stop”:
    /start/, /stop/
  • Write all lines whose first field is different from the previous one:
    $1 != prev { print; prev = $1 }
  • Simulate the echo command:
    BEGIN { for (i = 1; i < ARGC; ++i) printf "%s%s", ARGV[i], i==ARGC-1?"\n":" "}
  • Write the path prefixes contained in the PATH environment variable, one per line:
    BEGIN{n = split (ENVIRON["PATH"], path, ":"); for (i = 1; i <= n; ++i) print path[i]}
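A few of these one-liners can be tried on small inline inputs, as sketched below. This assumes a POSIX awk is available as awk (use nawk where installed); the sample inputs are made up for this example.

```shell
# Lines between "start" and "stop", inclusive.
between=$(printf 'a\nstart\nb\nstop\nc\n' | awk '/start/, /stop/')

# Lines whose first field differs from the previous line.
changed=$(printf 'x 1\nx 2\ny 3\n' | awk '$1 != prev { print; prev = $1 }')

# First two fields swapped, with comma or blanks as the input separator.
swapped=$(echo "one,two" | awk 'BEGIN { FS = ",[ \t]*|[ \t]+" } { print $2, $1 }')

# Fields in reverse order, one per line.
reversed=$(echo "a b c" | awk '{ for (i = NF; i > 0; --i) print $i }')

echo "$between"
echo "$changed"
echo "$swapped"     # prints: two one
echo "$reversed"
```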