UC Irvine, Information and Computer Science Department Winter 2000

ICS 54: Tools: Filters and Utilities: Brief notes for Chapter 14


Introduction

UNIX allows you to compose programs interactively or from a command file.

The shell provides the user interface for programs.

This encourages (and in practice dictates) keeping programs simple, each with a single purpose.

To use this capability effectively, you must learn


Simple Filters

Filters are programs that transform an input stream to produce an output stream.

pg,   more, and   less
For paging through a file. Examples:
  pg file
  .... | more
  less  <  ...
head [ -n ]
Display first n lines of a file or standard input. Examples:
  head -100 file
  .... | head -100
  head -100  <  ...
tail [ -n ]
Display last n lines of a file.
tail [ +n ]
Display line n to the end of a file or standard input.
tail [ -f ]
Follow. After the initial "tail" on a file, keep reading and pass through any later additions to the file.
Useful in monitoring a file that is being written (e.g., a log file).
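
On current POSIX systems the count is normally given with -n, and the historical "tail +n" form is written "tail -n +n"; some implementations no longer accept the older spellings. A minimal sketch, assuming a POSIX shell and a scratch file whose name (numbers.txt) is made up for illustration:

```shell
# Build a ten-line scratch file (hypothetical name) in a temporary directory.
tmpdir=$(mktemp -d)
seq_file="$tmpdir/numbers.txt"
i=1
while [ "$i" -le 10 ]; do
  echo "line $i"
  i=$((i + 1))
done > "$seq_file"

first3=$(head -n 3 "$seq_file")    # first 3 lines
last2=$(tail -n 2 "$seq_file")     # last 2 lines
from9=$(tail -n +9 "$seq_file")    # line 9 through the end of the file

printf '%s\n' "$first3" "$last2" "$from9"
rm -rf "$tmpdir"
```

Note that "tail -n 2" counts back from the end while "tail -n +9" counts forward from the start, so on this ten-line file both select the same two lines.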
wc [ -Clw ]
Count Words (well, actually Characters, Lines, and Words) in a file or input stream.
uniq [ -cdu ]
Report or filter out repeated lines in a file or input stream, producing an output stream of unique lines. Options:
-c  =  Label each line with its replication count.
-d  =  Show only duplicated (repeated) lines.
-u  =  Show only unique (non-repeated) lines.
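
uniq compares adjacent lines only, so its input is usually sorted first. The sketch below uses an invented word list to show the three options together with wc; treat it as an illustration rather than a recipe.

```shell
# Sample input (made up for illustration); note the non-adjacent repeats.
words=$(printf '%s\n' pear apple pear fig apple pear)

# Sort first so that repeated lines become adjacent.
counts=$(printf '%s\n' "$words" | sort | uniq -c | sort -rn)   # count, most frequent first
dups=$(printf '%s\n' "$words" | sort | uniq -d)                # lines that repeat
singles=$(printf '%s\n' "$words" | sort | uniq -u)             # lines that do not repeat
nlines=$(printf '%s\n' "$words" | wc -l)                       # number of input lines

printf '%s\n' "$counts"
```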
spell
List misspelled words in a file or standard input.
tee file
Copy standard input to standard output but also put a copy of it into file. Examples:
    ... | tee A.copy | ...
   spell memo | tee misspelled | more
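
A runnable variant of the same idea, assuming a POSIX shell; the file name raw.copy is hypothetical. tee saves the stream as it arrives, while the rest of the pipeline goes on to change it:

```shell
tmpdir=$(mktemp -d)

# Keep the unsorted stream in raw.copy while sort processes the same data.
sorted=$(printf '%s\n' b a c | tee "$tmpdir/raw.copy" | sort)
saved=$(cat "$tmpdir/raw.copy")

printf 'sorted:\n%s\nsaved:\n%s\n' "$sorted" "$saved"
rm -rf "$tmpdir"
```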
expand
Change tabs to spaces
unexpand
Change spaces to tabs
fmt [ -w ]
Simple text formatting for filling and joining lines to make them a maximum width of w
Works great in vi. To format a paragraph to have maximum line width of 64:   !}fmt -64
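
Outside vi, fmt and expand behave like any other filter. The sketch below assumes an fmt that accepts the width as "-w 30" (GNU and BSD versions do) and uses an invented sentence as input:

```shell
long='the quick brown fox jumps over the lazy dog and keeps running far away'

# Refill the text so that no output line is wider than 30 columns.
filled=$(printf '%s\n' "$long" | fmt -w 30)

# Measure the widest line that fmt produced.
widest=0
while IFS= read -r line; do
  if [ "${#line}" -gt "$widest" ]; then widest=${#line}; fi
done <<EOF
$filled
EOF

# expand replaces each tab with spaces up to the next tab stop
# (every 8 columns by default).
spaced=$(printf 'a\tb\n' | expand)

printf '%s\n' "$filled"
echo "widest line: $widest"
```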


More Filters: tr

tr [ -cds ]   [ str1 [ str2 ] ]
Translate corresponding characters in str1 into those in str2. For example, to convert upper case to lower:
  tr   '[A-Z]' '[a-z]' < file1 > file2
-c   option
Instead of using str1, use the Complement of it (every character not in str1). For example, to replace every character which is not a letter or a number by #:
  tr  -c  '[A-Z][a-z][0-9]'   '[#*]'
-d   option
Delete all input characters in str1
-s   option
Squeeze all strings of repeated output characters in str2 to single characters. The classic example transforms the input into a list of words, one to a line:
  tr  -cs  '[A-Z][a-z]'   '[\012*]'
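
The bracketed ranges above follow the System V form of tr. Many current implementations also accept ranges without brackets and '\n' for newline, and that is the form assumed in this sketch. The sample sentence is invented.

```shell
text='The cat and the hat. The END!'

# Complement (-c) the letters, map everything else to newline,
# and squeeze (-s) each run of newlines to one: one word per line.
wordlist=$(printf '%s\n' "$text" | tr -cs 'A-Za-z' '\n')

# Fold upper case to lower, then count how often each word occurs.
freq=$(printf '%s\n' "$wordlist" | tr 'A-Z' 'a-z' | sort | uniq -c | sort -rn)

# -d deletes every input character listed in str1.
novowels=$(printf '%s\n' "$text" | tr -d 'aeiou')

printf '%s\n' "$wordlist"
```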


More Filters: sort

sort [ -tchar ] [ -frn ] [ +pos1 [ -pos2 ] ]
Sort on fields separated by char, using the key that runs from position pos1 to position pos2. Each position has the form w.c, where w is a field offset from the start of the line and c is a character offset within that field (both counted from zero).
-f   option
Ignore case, Folding upper and lower together.
-r   option
Sort in Reverse order
-n   option
Sort in numeric order
Example: Sort ThisFile in reverse numeric order on the field after the fourth colon (":"):
sort   -t:   +4nr   ThisFile
Example: List a directory in reverse order of file size:
ls   -l   |  sort   +4nr
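
The +pos notation is the historical form, and some current versions of sort no longer accept it. The POSIX replacement is -k, which counts fields from one, so +4nr corresponds to -k5nr. A minimal sketch with invented colon-separated records:

```shell
# Colon-separated sample records (invented); the fifth field is numeric.
data=$(printf '%s\n' 'a:x:y:z:30' 'b:x:y:z:4' 'c:x:y:z:200')

# -k5 starts the key at field 5 (the obsolete +4 counted fields from zero);
# n = numeric comparison, r = reverse order.
bysize=$(printf '%s\n' "$data" | sort -t: -k5nr)

# Without n the key is compared as text, so in reverse order
# "4" comes before "30" and "200".
lexical=$(printf '%s\n' "$data" | sort -t: -k5r)

printf '%s\n' "$bysize"
```

By the same translation, the directory listing example would become "ls -l | sort -k5nr", assuming the size is the fifth column of your ls -l output.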


More Filters: The grep family

grep     fgrep     egrep
Search for a string or pattern in files or an input stream.
Normally, matching lines are output, but options can change this default behavior.
grep [ options ] pattern
pattern   can be a fixed string or a (limited) form of   regular expressions   as in vi. See Table 14-1 in the text, but beware of the typo that says "^^" when it should be "^".
.        Match any character
^        Match start of line
$        Match end of line
[...]    Match any character in brackets
         Example: [abcA-Z7]
[^...]   Match any character except those in brackets
         Example: [^abcA-Z7]
*        Match 0 or more repetitions of previous item
fgrep [ options ] string
Fixed strings only, no regular expressions allowed.
But can look for multiple strings using the -f option to read these strings from a file, one string per line.
egrep [ options ] pattern
Allows full   regular expressions   as in awk. See Table 14-2 in the text, but beware of the typo that says "++" when it should be "+".
+        Match 1 or more repetitions of previous item
?        Match 0 or 1 repetitions of previous item
(...)    Treat enclosed text as a group/item
|        Separator for items which are considered alternatives
         Example: (NY|LA|SF)
-c   option
Display Count of matches
-i   option
Ignore case of letters
-n   option
Put line Numbers in front of each match
-l   option
List only the names of files containing a match, not each of the lines.
-v   option
Display only lines that do not match the string/pattern.
-w   option
Match only as Words not as (sub)string within a word
-f file   option
Take strings/patterns to be matched from   file
-e expression   option
Used to indicate Explicitly that the   expression   (string/pattern) follows.
Useful when the   expression   begins with a "-".
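
A few of these options in action. POSIX specifies "grep -E" and "grep -F" as the equivalents of egrep and fgrep, and the sketch below uses those spellings; the sample lines are invented for illustration.

```shell
# Sample input (invented for illustration).
cities=$(printf '%s\n' 'NY 10' 'la 20' 'SF 30' 'Boston 40' 'a.b' 'axb' '-v flag')

alt=$(printf '%s\n' "$cities" | grep -E '^(NY|LA|SF) ')       # alternation
alt_i=$(printf '%s\n' "$cities" | grep -Eic '^(NY|LA|SF) ')   # -i ignore case, -c count
fixed=$(printf '%s\n' "$cities" | grep -F 'a.b')              # "." taken literally
regex=$(printf '%s\n' "$cities" | grep -c 'a.b')              # "." matches any character
numbered=$(printf '%s\n' "$cities" | grep -n 'Boston')        # prefix the line number
dash=$(printf '%s\n' "$cities" | grep -e '-v')                # pattern begins with "-"

printf '%s\n' "$alt" "$numbered" "$dash"
```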


Utilities

script [ typescript ]
Start a shell and keep a transcript (in typescript) of everything printed on your screen.
Exit with exit and clean up the control characters in typescript.
Default value of typescript is typescript.
bc
Your Basic Calculator, but setting obase=12 and then ibase=8 is a great way to convert from base 8 to base 12 if you ever have to. (Set obase first: once ibase=8 is in effect, a later 12 is itself read as octal.)
cal [ [month] year ]
Display a calendar
echo
Both a built-in part of various shells and a separate program.
Extremely useful, but can cause portability problems.
date [ +format ]
Current date (and time) in just about any format you can want.
find   path   expression  
Find files by recursively descending the directory hierarchy from each path given by  path,  selecting the files that match the criteria given by  expression.
Example:
    find . -name '*.html' -mtime -3 -exec ls -l {} \;
This lists each file whose name ends in ".html" and which has been modified in the last 3 days.
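
To try this without touching real files, the sketch below builds a small scratch tree (all names hypothetical) and runs similar expressions against it. It assumes a POSIX find.

```shell
tmpdir=$(mktemp -d)
mkdir "$tmpdir/sub"
touch "$tmpdir/index.html" "$tmpdir/sub/page.html" "$tmpdir/notes.txt"

# Quote the pattern so the shell passes it to find unexpanded.
html=$(cd "$tmpdir" && find . -name '*.html' -mtime -3 | sort)

# -type f restricts the match to regular files; tests listed
# side by side are ANDed together.
count=$(find "$tmpdir" -type f -name '*.html' -exec echo {} \; | wc -l)

printf '%s\n' "$html"
rm -rf "$tmpdir"
```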
Want more examples? Want more of an explanation of exec? Want more details about expressions including complex expressions?
     "Read The Friendly Manual" or another reference.

Comments are welcome.
Current as of 28 January 2000