
UNIX Utilities - awk



awk [program|-f programfile] [flags/variables] [files]

function name(argument(s), localvar(s)) {
    return value
}
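
As a small illustration of the function syntax, the sketch below defines a hypothetical function clamp() that limits a number to a range. The function name, its parameters, and the sample input are invented for this example. The variable r is local: awk has no declaration keyword, so a local is created by listing it as an extra parameter that callers never pass.

```shell
# clamp() and the sample numbers are hypothetical, chosen only to show the syntax.
result=$(printf '5\n-3\n42\n' | awk '
function clamp(x, lo, hi,    r) {   # r is local: an extra, unpassed parameter
    r = x
    if (r < lo) r = lo
    if (r > hi) r = hi
    return r
}
{ print clamp($1, 0, 10) }')
echo "$result"    # prints 5, 0 and 10 on separate lines
```

The extra spaces before r in the parameter list are only a convention that separates real arguments from locals.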



{  split($1,t,":")
   $1 = (t[1]*60+t[2])*60+t[3]
   print
}

Replaces an HH:MM:SS time stamp in the first field with a seconds-since-midnight value, which is easier to plot or to use in calculations.
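
To see the conversion at work, the program can be fed a couple of sample lines. The input below is invented for illustration; only the first field is changed, and the rest of each line is passed through.

```shell
# Sample input lines are invented for illustration.
converted=$(printf '01:02:03 start\n23:59:59 stop\n' | awk '
{  split($1,t,":")                  # t[1]=hours, t[2]=minutes, t[3]=seconds
   $1 = (t[1]*60+t[2])*60+t[3]      # assigning to $1 rebuilds the record
   print
}')
echo "$converted"
# 3723 start
# 86399 stop
```

Because assigning to $1 makes awk rebuild the line, the fields of the output are joined by single blanks even if the input used tabs or runs of spaces.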



     {  for(i = 1; i<=NF; i++) ct[$i] += 1 }
END  {  for(w in ct) {
          printf("%6d %s\n",ct[w],w)
        }
     }

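This reads a file of text and prints each unique word along with the number of times it occurs in the text. The order in which for(w in ct) visits the words is unspecified, so pipe the output through sort if a particular order is needed.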
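
A short run on invented sample text shows the shape of the output. Sorting by the word with sort -k2 is an addition made here only so that the result is reproducible.

```shell
# Sample text is invented; sort is added only to get a stable order.
counts=$(printf 'the cat the dog\nthe end\n' | awk '
     {  for(i = 1; i<=NF; i++) ct[$i] += 1 }    # tally every field as a word
END  {  for(w in ct) printf("%6d %s\n", ct[w], w) }
' | LC_ALL=C sort -k2)
echo "$counts"
#      1 cat
#      1 dog
#      1 end
#      3 the
```

Words are whatever awk treats as fields, so "dog" and "dog." would be counted separately.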



NR==1 { t0=$1; tp = $1; for(i=1;i<=nv;i++) dp[i] = $(i+1);next}
      { dt=$1-tp;
        tp = $1
        printf("%d ",$1-t0)
        for(i=1;i<=nv;i++) {
          printf("%d ",($(i+1)-dp[i])/dt)
          dp[i] = $(i+1)
        }
        printf("\n")
      }

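Takes a set of time-stamped data and converts it from absolute times and counts to times relative to the first record and average count rates over each interval. The number of data columns must be supplied in the variable nv on the awk command line. The data are presumed to be amenable to treatment as integers; if not, a format other than %d must be used.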
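
The arithmetic is easiest to follow on a tiny data set. In the sketch below the three input records and the choice nv=2 are invented; the first record only establishes the baseline, and each later record yields the elapsed time followed by (count - previous count) / (time - previous time) for every column.

```shell
# Three invented records: a time stamp followed by nv=2 counters.
rates=$(printf '100 10 20\n110 30 40\n130 70 60\n' | awk -v nv=2 '
NR==1 { t0=$1; tp=$1; for(i=1;i<=nv;i++) dp[i]=$(i+1); next }   # baseline
      { dt=$1-tp; tp=$1
        printf("%d ", $1-t0)                   # time relative to first record
        for(i=1;i<=nv;i++) {
          printf("%d ", ($(i+1)-dp[i])/dt)     # average rate over the interval
          dp[i]=$(i+1)
        }
        printf("\n")
      }')
echo "$rates"
# 10 2 2
# 30 2 1
```

Each output line ends with a trailing blank because every value is printed with "%d ".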



BEGIN{  printf("set term postscript\n") > "plots"
        printf("set output '|lpr -Php'\n") > "plots" }
     {  if(system("test -s " $1 ".r") == 0) {   # status 0: file exists, not empty
          print "process1 " $1 ".r " $2
          printf("plot '%s.data' using 2:5 title '%s'\n",\
                  $1,$3)  > "plots"
        }
     }
END  { close("plots"); print "gnuplot < plots" }

Writes a pair of set lines to a file called plots. For each input line, if a non-empty file exists whose name is the first field with .r appended, writes a command to standard output containing that file name and the second field, and also writes a plot statement to plots using the third field as the title. After all input has been processed, writes a gnuplot command to standard output. If the output is piped to sh or csh, the commands will be executed.
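
The file test depends on how system() reports results: it returns the exit status of the command, and by shell convention a status of 0 means success. The sketch below checks this with test -s on an empty and then a non-empty file. The temporary file created with mktemp is an assumption of the example, not part of the program above.

```shell
# mktemp and the variable names are assumptions made for this demonstration.
tmp=$(mktemp)                                   # a new, empty file
status_empty=$(awk -v f="$tmp" 'BEGIN { print system("test -s " f) }')
echo data > "$tmp"                              # now the file has contents
status_full=$(awk -v f="$tmp" 'BEGIN { print system("test -s " f) }')
rm -f "$tmp"
echo "empty: $status_empty  non-empty: $status_full"
```

Because 0 counts as false in an awk condition, a bare if(system(...)) would be true when the test fails, which is why an explicit comparison with 0 is the safer way to write the check.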



BEGIN  { l[1]=25; l[2]=20; l[3]=50 }
/^[ABC]/ {
         i = index("ABC", substr($0,1,1))
         a=$0 sprintf("%50s","")     # pad with 50 blanks
         print substr(a,1,l[i])
         next                        # skip the plain print below
       }
       { print }

Pads or truncates lines whose first character is 'A', 'B', or 'C' to exactly 25, 20, and 50 bytes respectively, changing no other lines.
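
The same effect can be had by letting sprintf do the padding. The sketch below is a variation on the program above, not the original method: it builds a format such as "%-25s" from the table of lengths and then cuts the result to size. The sample lines are invented.

```shell
# Variation using sprintf for padding; sample lines are invented.
padded=$(printf 'Apple\nBanana split is long enough to be cut\nzzz\n' | awk '
BEGIN    { l[1]=25; l[2]=20; l[3]=50 }
/^[ABC]/ { i = index("ABC", substr($0,1,1))
           fmt = "%-" l[i] "s"                     # e.g. "%-25s": pad on the right
           print substr(sprintf(fmt, $0), 1, l[i]) # then cut to the exact length
           next }
         { print }')
lengths=$(printf '%s\n' "$padded" | awk '{ print length($0) }')
echo "$lengths"    # prints 25, 20 and 3 on separate lines
```

The line beginning with "z" keeps its original length of 3 because it matches none of the three prefixes.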



/^\+/ { hold = hold "\r" substr($0,2); next}
      { if( unfirst ) print hold
        hold =""
      }
/^1/	{ hold = "\f" }
/^0/	{ hold = "\n" }
/^-/	{ hold = "\n\n" }
      { unfirst = 1
        hold = hold substr($0,2)
      }
END	{ if(unfirst) print hold }

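This routine will take FORTRAN-type output with leading ANSI vertical motion indicators in column 1 (1 for a new page, 0 for double spacing, - for triple spacing, + for overprinting the previous line, blank for single spacing) and convert it to a stream with ASCII printer control sequences in it.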
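
A four-line sample makes the translation visible. The input below is invented; the result is piped through sed -n l only so that the form feed and carriage return show up as \f and \r instead of acting on the terminal.

```shell
# Invented sample: new page, single space, double space, overprint.
out=$(printf '1Title\n line two\n0line three\n+overstrike\n' | awk '
/^\+/ { hold = hold "\r" substr($0,2); next }   # overprint: stay on same line
      { if (unfirst) print hold                 # flush the previous line
        hold = "" }
/^1/  { hold = "\f" }                           # new page
/^0/  { hold = "\n" }                           # one blank line first
/^-/  { hold = "\n\n" }                         # two blank lines first
      { unfirst = 1
        hold = hold substr($0,2) }              # drop the control column
END   { if (unfirst) print hold }')
printf '%s\n' "$out" | sed -n l
```

Each line is held back until the next one is read because a following + line has to be appended to it before it is printed.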



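BEGIN  { b=""; if(ll==0) ll=72 }
NF==0  { print b; b=""; print ""; next }
       { if(substr(b,length(b),1)=="-") {
              b=substr(b,1,length(b)-1) $0 }
         else b=(b=="" ? $0 : b " " $0)
         while(length(b)>ll) {
            i = ll
            while(i>0 && substr(b,i,1)!=" ") i--   # back up to a blank
            if(i==0) i = index(b," ")   # none found: break after the long word
            if(i==0) break              # one word longer than ll: keep it whole
            print substr(b,1,i-1)
            b = substr(b,i+1)
         }
       }
END    { print b; print "" }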

This will take an arbitrary stream of text (where paragraphs are separated by consecutive \n, that is, by blank lines) and make all the lines approximately the same length. The default output line length is 72, but it may be changed by setting the variable ll on the awk command line (for example, ll=60). A line ending in a hyphen is joined to the next line with the hyphen removed. Both long and short lines are taken care of, but extra spaces and tabs within the text are not handled correctly.
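
The following run, with an invented two-line paragraph and the line length set to 20 through -v ll=20, shows how the text is refilled. The listing includes guards against a word longer than ll, which are an assumption about how such input should be treated.

```shell
# Invented sample text, refilled to lines of about 20 characters.
filled=$(printf 'one two three four five six seven\neight nine ten\n' |
awk -v ll=20 '
BEGIN  { b=""; if(ll==0) ll=72 }
NF==0  { print b; b=""; print ""; next }        # blank line ends a paragraph
       { if(substr(b,length(b),1)=="-")
              b=substr(b,1,length(b)-1) $0      # rejoin a hyphenated word
         else b=(b=="" ? $0 : b " " $0)
         while(length(b)>ll) {
            i = ll
            while(i>0 && substr(b,i,1)!=" ") i--
            if(i==0) i = index(b," ")
            if(i==0) break
            print substr(b,1,i-1)
            b = substr(b,i+1)
         }
       }
END    { print b; print "" }')
echo "$filled"
# one two three four
# five six seven
# eight nine ten
```

The search for a break point starts at column ll, so a word that would end exactly at that column is moved to the next line; this is one reason the line lengths are only approximately equal.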



BEGIN {	FS = "\t"   # make tab the field separator
         printf("%10s %6s %5s   %s\n\n",
                "COUNTRY", "AREA", "POP", "CONTINENT")
      }
      { printf("%10s %6d %5d   %s\n", $1, $2, $3, $4)
        area = area +$2
        pop = pop + $3
      }
END	{ printf("\n%10s %6d %5d\n", "TOTAL", area, pop) }

This will take a variable-width table of data with four tab-separated fields and print it as a fixed-width table with headings and totals.



Operators

" "              The blank is the concatenation operator
+ - * / %        All of the usual C arithmetic
                 operators, add, subtract, multiply,
                 divide and mod.
== != < <= > >=  All of the usual C relational
                 operators: equal, not equal, less
                 than, less than or equal, greater
                 than, greater than or equal
&& ||            The C boolean operators AND and OR
= += -= *= /= %= The C assignment operators
~ !~             Matches and doesn't match
?:               C conditional value operator
^                Exponentiation
++ --            Variable increment/decrement
      Note the absence of the C bit operators &, |, << and >>
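
A few of these operators can be tried directly in a BEGIN block, which runs without reading any input. The variable names and values in this sketch are arbitrary.

```shell
# Arbitrary values chosen to exercise several operators from the table.
ops=$(awk 'BEGIN {
    s = "a" "b"                                 # concatenation: "ab"
    n = 7 % 3                                   # modulus: 1
    n += 4                                      # assignment operator: 5
    p = 2 ^ 10                                  # exponentiation: 1024
    m = ("abc" ~ /^a/) ? "match" : "no match"   # regex match and ?: operator
    print s, n, p, m
}')
echo "$ops"    # prints: ab 5 1024 match
```

The commas in the print statement separate the values with the output field separator, a single blank by default, whereas writing the values side by side would concatenate them.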

[s]printf format items

Format strings in the printf statement and sprintf function consist of three different types of items: literal characters, escaped literal characters and format items. Literal characters are just that: characters which will print as themselves. Escaped literal characters begin with a backslash (\) and are used to represent control characters; the common ones are: \n for new line, \t for tab and \r for return. Format items are used to describe how program variables are to be printed.

All format items begin with a percent sign (%). The next part is an optional length and precision field. The length is an integer indicating the minimum field width of the item, negative if the data is to be left-justified within the field. If the length field begins with a zero (0), then instead of padding the value with leading blanks, the item will be padded with leading 0s. The precision is a decimal point followed by the number of decimal digits to be displayed for the various floating point representations. Next is an optional source field size modifier, usually 'l' (ell). The last item is the actual source data type, commonly one of the list below:

     d     Integer
     f     Floating point in fixed point format
     e     Floating point in exponential format
     g     Floating point in "best fit" format; integer, fixed
           point, or exponential; depending on exact value
     s     Character string
     c     Integer to be interpreted as a character
     x     Integer to be printed as hexadecimal

Examples:

   %-20s   Print a string in the left portion of a 20 character
           field
   %d      Print an integer in however many spaces it takes
   %6d     Print an integer in at least 6 spaces; used to format
           pretty output
   %9ld    Print a long integer in at least 9 spaces
   %09ld   Print a long integer in at least 9 spaces with leading
           0s, not blanks
   %.6f    Print a float with 6 digits after the decimal and as
           many before it as needed
   %10.6f  Print a float in a 10 space field with 6 digits after
           the decimal
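
The format items can be checked quickly from the command line. In the sketch below the brackets are literal characters added only to make the field widths visible, and the values are arbitrary.

```shell
# Brackets are literal text that marks the edges of each field.
formatted=$(awk 'BEGIN {
    printf("[%-8s]\n", "awk")       # [awk     ]     left-justified string
    printf("[%6d]\n", 42)           # [    42]       blank-padded integer
    printf("[%06d]\n", 42)          # [000042]       zero-padded integer
    printf("[%.3f]\n", 3.14159)     # [3.142]        3 digits after the point
    printf("[%10.4f]\n", 3.14159)   # [    3.1416]   width 10, 4 digits
    printf("[%e]\n", 1234.5)        # [1.234500e+03] exponential format
    printf("[%x]\n", 255)           # [ff]           hexadecimal
    printf("[%c]\n", 65)            # [A]            integer as a character
}')
echo "$formatted"
```

The field width is a minimum: a value that needs more room than the width allows is printed in full and not truncated.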
