Landmark Graphics Corporation World Wide Technology Conference Houston, Texas December 1, 1994 A perl Tutorial by: Will Morse BHP Petroleum Copyright 1994 Will Morse. Permission is given to freely copy and distribute this paper as long as there is no charge except real and actual mechanical copying costs and as long as this notice is kept with each copy so others may copy it as well. The opinions expressed are the author's own and do not necessarily reflect the opinions or policies of The Broken Hill Proprietary Company, Limited, or its various divisions. Your milage may vary. INTRODUCTION: perl is the "Swiss Army Chainsaw of Systems Administration". This tutorial covers: * What is perl? page 2 * What does a perl program look like? page 4 * A brief, hopelessly incomplete, overview of perl syntax and features. - Variable naming page 5 - Assignments page 6 - Arithmetic page 7 - If-then-else page 8 - Loops page 10 - Files page 11 - Special functions page 12 - Regular expressions page 14 - Report Writer page 16 - Debugger page 16 * An example using an exported horizon file from SeisWorks. page 17 * Information on how to get perl. page 20 * A list of books on perl. page 21 * Internet contacts and support for perl. page 22 * Some handy Landmark-related perl programs. page 23 histograms, SEG-Y Headers, nulls to spaces, HPGL * A little about perl version 5. page 26 * Other free software you should know about. page 27 expect, tcl/tk, GNUtar, gzip This paper specifically addresses perl 4.x. perl 5.x is out now, but most people still use version 4. The current books and documentation for perl all address version 4. There are versions of perl, such as tkperl and oraperl, that address X-windows programming, Oracle database access, and other specific features. There is not time enough to cover these versions in this short tutorial. WHAT IS PERL? perl stands for practical extraction and reporting language. 
perl is a high level programming language combining elements of C-shell, awk, and many other programming languages and utilities. perl is free. There are no license fees. There are no royalties. It is NOT "Public Domain". When you get perl, you will find a README file that explains your license. If you give perl to someone else, you have to give them this README file. Please read and understand this file. There are at least two publicly available books explaining the use of perl in detail (see page 21). There is excellent support for perl on the internet in the comp.lang.perl newsgroup (see page 22). perl is better than shell script programming and awk or sed programming because: * It does not have all the shell initiation overhead, which makes it faster. * It can read and write binary data files. * It can have many files for input or output at one time. * It has a report writer. * It has extended regular expressions. * It has both linear and associative arrays. * It has powerful defaults that simplify programming. * It can process very large files without record size limits. perl is better than C, C++, or Fortran programming (for sysadmin and data admin tasks) because: * It does not have visible compile or link stages. The program is kept in a source module, like a shell script. * There is a very rich set of character string manipulation and array handling commands. * It has a more forgiving and easier to use syntax. * It has both linear and associative arrays. * It has a report writer. * Some versions of perl include X-windows features, tkperl, or access to Oracle databases, oraperl. In fairness, perl has some down points. * perl programs typically take 1.7 times as long to execute as an equivalent C program. This is okay for utility programs, but would be a problem for a program like SeisWorks. * There are a number of "gotcha's" that are pretty typical Unix characteristics, but would confuse non-programmers. 
For instance, a number starting with 0 is assumed to be octal. $num = 010; print "The number is $num. \n"; will print The number is 8. * Mostly of interest to professional programmers, perl does not have a CASE construct. It does not have pointers or let you take the address of anything (until perl 5). * perl has a dozen different ways to do anything. Some people don't like this, others do. * perl has a lot of capabilities to do things most sysadmins and data admins will not understand. Don't worry about it, just use what you need. You don't have to know everything about a computer language to make good use of it. perl does not come on the standard SunOS 4.x distribution, but it is a very standard language. perl is listed in the job descriptions compiled by the Systems Administrators Guild (SAGE) of Usenix, the Unix User's Group. There are currently negotiations to include perl in the Solaris 2.5 release. WHAT DOES A SIMPLE PERL PROGRAM LOOK LIKE? This simple program copies records from a file and prefixes each line with a sequential number. The italic numbers are not in the program, they just help the explanation. 1 #! /usr/local/bin/perl 2 while (<>) 3 { 4 print STDOUT ++$i, $_; 5 } Explanation: 1 #! is the Unix method for specifying a shell program. /usr/local/bin/perl is the standard place to put perl. 2 while () {} creates a loop that continues while the statement in the () is true. The statements in the loop are enclosed in {}. <> is a special default. It tells perl to look at the calling command line to see if any files are specified. If they are, read each file in turn. If no files are specified, read from standard input. In either case, put the characters read into the special variable $_. When <> reaches end-of-file, it returns false, which terminates the while loop. 4 print is a simple, unformatted, printing method. STDOUT is the standard filehandle for Standard Output. Filehandles are specified in all caps in perl. 
++$i says to increment the value of $i and make that value available to the print statement. All scalar values (anything but a command, linear array, associative array, filehandle, or procedure name) start with $. $_ is the default operand of any command. In this case, $_ contains the last record read by the <> statement. ; terminates each command in perl.

A BRIEF OVERVIEW OF PERL SYNTAX:

In perl, as in Unix generally, character case is significant. X and x are not the same character. It is common to name variables and other items in mixed case: $thisIsMixedCase. It is also permissible to use underscores: $variable_with_underscores. Do not use names that start with a number, as these are often perl special symbols, $1, $2, etc. All perl commands end with a semicolon, ;.

Variables:

perl identifies each type of variable - or data name - with a prefix character or identifying style. These are:

    $     scalar             a single number (integer or real) or character string
    @     linear array       an array referenced by an index number
    %     associative array  an array referenced by a textual key
    UC    file-handle        a file handle is uppercase
    &     procedure          a subroutine
    xx:   label              object of goto, or marker for escape from a loop

"Subscripts" enclosed in [] apply to linear arrays. @items refers to the entire array items. $items[1] refers to the scalar value which is the second item in the array items. Linear arrays start with the index 0. $#items is the index of the last item in @items, which is one less than the number of items because counting starts from 0.

Subscripts enclosed in {} apply to associative arrays. %items refers to the entire associative array items. $items{"x"} refers to the scalar value matching the key "x".

Values enclosed in () are lists. Lists are often used as arguments to a subroutine or built-in function call. It is not necessary to enclose arguments in () if there is only one argument or the program knows the limit of the list.

There can be completely separate and unrelated variables $x, @x, %x, and &x, not to mention $X, @X, %X and &X.
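The naming rules above can be sketched in a few lines. This is only an illustration: the name "items" and all of the values are made up for the example, and the style is the perl 4 style used in the rest of this paper.

```perl
#! /usr/local/bin/perl
# Hypothetical names and values, chosen only for illustration.

# Three unrelated variables that happen to share the name "items".
$items = "a plain scalar";
@items = ("zero", "one", "two");
%items = ("x", "ex", "y", "why");

$second = $items[1];     # second element of @items (indexes start at 0)
$last   = $#items;       # index of the last element of @items
$count  = @items;        # an array assigned to a scalar gives the item count
$value  = $items{"x"};   # value stored under the key "x" in %items

print "scalar: $items\n";
print "second: $second, last index: $last, count: $count\n";
print "value for key x: $value\n";
```

The prefix character and the style of bracket together tell perl which of the three variables is meant, so $items, $items[1], and $items{"x"} never collide.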
There are special variables, the most important of which are $_, @_, and @ARGV. $_ is the default scalar value. If you do not specify a variable name in a function where a scalar variable goes, the variable $_ will be used. This is a very heavily used feature of perl. @_ is the list of arguments to a subroutine. @ARGV is the list of arguments specified on the command line when the program is executed.

Basic Commands and Control:

Braces, {}, are used to contain a block of program statements. It is possible to have local variables within a block. Blocks are used for the objects of most control commands.

Simple Assignment:

Simple, scalar, assignment is what you might expect:

    $var = 1;
    $str = "This is a string.";

One can also assign lists of scalars in one statement:

    ($rock, $jock, $crock) = ("Plymouth", "Warren Moon", "Solaris 2.x");

One can assign a list to an array:

    @items = (1, 2, "Cambodia", 4);

or an array to a list:

    ($a, $b, $c, $d) = @items;

Associative arrays need a key, but otherwise work as you would expect:

    $aa{"able"} = "x";
    %aa = ("able", "x", "baker", "y", "aardvark", "z");

Assigning an ARRAY to a SCALAR will give the number of items in the ARRAY.

    @items = (10, 20, 30);
    $i = @items;
    print "$i";

will print "3".

Arithmetic Operations:

perl has the usual operations, and many more:

    $c = $a + $b     addition
    $c = $a - $b     subtraction
    $c = $a * $b     multiplication
    $c = $a / $b     division
    $c = $a % $b     remainder
    $c = $a ** $b    exponentiation
    $c = $a . $b     concatenation
    ++$a, $a++       increment by 1
    --$a, $a--       decrement by 1
    $a += $b         increment by $b
    $a -= $b         decrement by $b
    $a .= $b         append $b to $a
    $c = "*" x $b    make $b *'s

Of course, there are many more. There are also modifiers that work inside double-quoted strings, like these:

    $a = "Big And Little";
    $c = "\L$a";
    print $c;

prints "big and little".
    \l    convert the next character to lower case
    \u    convert the next character to upper case
    \L    lowercase until \E
    \U    uppercase until \E
    \E    end case modification

There are functions for math including:

    log($x)   exp($x)   sqrt($x)   sin($x)   cos($x)   atan2($y,$x)

The only trig functions are sin, cos, and atan2; however, these can easily be used to compute the others. The ERUUG Unix Cookbook (see page 21) has a list of the formulas for the conversions.

If-Then-Else:

The basic if-then-else command is fairly typical of all computer languages.

    if ( condition )
    {
        true branch
    }
    else
    {
        false branch
    }

There is also

    if (condition) {commands}
    elsif (condition) {commands}
    elsif (condition) {commands}

which simplifies a lot of complex nested if statements. Note that it is elsif, not elseif or else if. Both the true and false branches may contain any number of nested if statements. There is also another form of if statement:

    unless (condition)
    {
        commands
    }

which runs the commands only when the condition is false.

The condition has a wide range of comparison operators. It is important to observe the distinction between numeric comparisons and string comparisons.

    numeric   string   meaning
    ==        eq       equals
    !=        ne       not equal
    >         gt       greater than
    <         lt       less than

Strings that do not consist of numbers have a value of zero.

    if ("abc" == "def")

is TRUE, because the strings are numerically zeros. To make this work right you have to have

    if ("abc" eq "def")

perl has file test operators like shell scripts. perl has an extended set of tests such as:

    -T    true if file is text
    -B    true if file is binary
    -M    days since file modified
    -A    days since file accessed
    -C    days since file inode changed

Other forms of the if-command are not common in other computer languages, but can be quite useful. A good example is the postfix if.

    next if $var == 1;

A useful form of logic uses || or && in a command:

    open (IN,"<$filein") || die "No file $filein $!\n";

Files:

Opening:

Files are opened with open (FILEHANDLE,"XFY"), where F is the file or program name and X or Y is one of the following:

    X = <     to open file F for read only.
    X = >     to open file F for write only.
    X = >>    to append to file F.
    X = |     to WRITE to a pipe to PROGRAM F.
    Y = |     to READ from a pipe from PROGRAM F.

If only the filename is provided, the file is opened for reading.
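The file opening modes can be tried out with a short script. This is a minimal sketch, not part of the original programs: the scratch file name is hypothetical, the script assumes a Unix system where /tmp is writable and the echo command exists, and it removes the file when it is done.

```perl
#! /usr/local/bin/perl
# Hypothetical scratch file; $$ is the process id, which keeps the name unique.
$file = "/tmp/open_modes_demo.$$";

# ">" opens for write only, creating or truncating the file.
open(OUT, ">$file") || die "Cannot create $file: $!\n";
print OUT "first line\n";
close(OUT);

# ">>" appends to the existing file.
open(OUT, ">>$file") || die "Cannot append to $file: $!\n";
print OUT "second line\n";
close(OUT);

# "<" opens for read only.
open(IN, "<$file") || die "Cannot read $file: $!\n";
$count = 0;
while (<IN>) { $count++; }
close(IN);

# A trailing "|" reads from a pipe from a program.
open(PIPE, "echo from a pipe |") || die "Cannot start echo: $!\n";
$fromPipe = <PIPE>;
close(PIPE);

unlink($file);
print "Read $count lines; the pipe said: $fromPipe";
```

Each open is followed by || die so that a failure is reported right away instead of showing up later as an empty read.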
Reading:

The most basic reading mechanism is to enclose the filehandle in <>, like this:

    $record = <IN>;

A special case of this goes like this:

    $record = <>;

This special case looks for filenames on the program command line and reads any files it finds, one after the other. If it finds no filenames on the program command line, <> reads from STDIN.

It is important NOT to use the array form:

    @record = <IN>;

as this will read the entire file into the array @record, which may take up an awful lot of memory.

Reading is often done using a while loop, like this:

    while (<IN>)
    {
        commands
    }

After the last record has been read, <IN> returns the value FALSE, which terminates the while loop. Since a scalar variable has not been supplied for the record, the record is stored in $_.

Writing:

Most writing is done using the print or the printf commands. These commands are used to write to files even if the results are never actually printed on a hardcopy device.

print writes a line with default line spacing. It is used when the output has no particular column spacing to comply with:

    print STDOUT "The X is $x and Y is $y\n";

printf is just like the printf in C and other similar languages. It is a formatted print. The first variable or string contains the format.

    $fmt = " X = %8.2f Y = %8.2f Flag = %s\n";
    printf STDOUT ($fmt, $x, $y, $flag);

The \n is the new line character. The % indicates the beginning of a format character, the f is the format for floating point numbers. The 8.2 indicates the number is 8 characters long with a decimal point in the sixth character, and two decimal places in the seventh and eighth characters. The %s is a character string with no length specified.

Closing:

perl will automatically close any open files when it exits. There are some occasions where it is useful to close a file before perl exits, so there is an explicit close.

    close FILEHANDLE;

Other Important Functions:

Error Messages:

die is used to print an error message and then exit.
warn is used to print an error message, but continue. String Handling: split is used to split tokens (fields) from a character string into an array. If you have a line: $line = "Now is the time for all good men"; you can put each word into an array with the command: @token = split(/\s+/,$line); sort sorts a list or array. study, an instruction I issue many times to my 12-year- old, optimizes string operations. Binary Encoding: pack packs values into a string using a template. $pi = pack("f",3.1415926); puts pi into a floating point number. unpack extracts values from a string using a template. $pi2 = unpack("f",$pi); There is a long list of templates you can use. You can use more than one template at a time to build up or extract binary data from a record. l long 32 bit signed integer L long 32 bit unsigned integer s short 16 bit signed integer S short 16 bit unsigned integer f float 32 bit floating point d double 64 bit floating point A ASCII ASCII string c char a single byte (character) System: There are many system oriented functions including: chmod change file permissions fcntl sets file control options fork creates an independent sub-process. mkdir make a directory Regular Expressions: Regular expressions and pattern matching are an important part of all Unix programming. perl adds a set of extended regular expression characters to the standard set. There are two ways regular expressions are used: Match m/regexp/ m is optional, you can use /regexp/ next if m/^\s*$/; will skip blank lines. Substitute s/regexp/new/ If the regexp matches, replace it with new. s/\s*$//; will trim trailing spaces from a line. Standard Set (not complete) a match a a* match zero or more character a's . match any character .* match zero or more of . 
    [a-m]     match characters a through m only
    [^n-z]    do not match letters n to z
    [a-m]*    match zero or more letters a to m
    ^         match the beginning of the line
    $         match the end of the line
    \t        matches a tab character

perl extensions (not complete)

    \d        same as [0-9]
    \D        same as [^0-9]
    \s        matches white space (space or tab)
    \S        matches anything but white space
    \w        same as [0-9a-zA-Z_] (word characters)
    \W        same as [^0-9a-zA-Z_]
    .+        same as ..*
    [a-m]+    match one or more letters a to m
    a{n,m}    at least n a's, not more than m a's
    a?        zero or one a, not more than one
    \cD       matches control-D

An important use of regular expressions is the use of () to select subsets of the regular expression. This is actually a standard part of regular expressions and can be used in vi, awk, sed, and anywhere regular expressions are found. perl makes it especially easy to use the ().

For instance, if you had the character string:

    	"SeisWorks 3D" "s3d 2> /dev/null"

as is found in launcher.dat, you could use the regular expression:

    $_ = <IN>;
    if ( m/^\t"(.+)"\s*"(\S+)\s+2>\s*(.+)"$/ )
    {
        ($title, $program, $errorFile) = ($1, $2, $3);
    }

to extract the title, program name, and the error file name. The way this works is:

    $_ = <IN>;   reads a record and stores it in $_. (Only the while (<IN>) form fills in $_ automatically; a bare <IN>; on its own would read the record and throw it away.)
    m/.../       matches a regular expression. Since it doesn't say what variable to use, it uses $_.
    ^            matches the beginning of the line
    \t           matches the initial tab.
    "            matches the first "
    (            starts the first extracted string
    .+           matches one or more of any character
    )            closes the first extraction, placing it in $1
    "            matches the second "
    \s*          matches zero or more spaces or tabs
    "            matches the third "
    (            starts the second extraction
    \S+          matches any characters but space or tab
    )            closes the second extraction, placing it in $2
    \s+          matches one or more spaces or tabs.
    2            matches 2
    >            matches >
    \s*          matches zero or more spaces or tabs
    (            starts the third extraction
    .+           matches one or more characters
    )            closes the third extraction, placing it in $3
    "            matches the fourth "
    $            matches the end of the line

    $title = $1; puts the value from $1 into $title.

Report Writer:

The report writer feature lets you define how your page should look and do all the necessary assignments with a single command. The report writer takes care of page breaks, page numbers, and other issues for you.

    format STDOUT_TOP =
    Projects Using Too Much Disk                  page @##
                                                       $%
    Project        Owner    Last Used            Cost
    -------------- -------- ------------- -----------
    .
    format STDOUT =
    @<<<<<<<<<<<<< @<<<<<<< @>>>>>>>>>>>> @#######.##
    $project,      $owner,  $lastUsed,    $cost
    .

    while (<>)
    {
        ($project, $owner, $lastUsed, $cost) = split;
        write;
    }

    _TOP       indicates a heading
    .          ends a format description
    $%         is the page number
    @<<<<      is a left justified field
    @>>>>      is a right justified field
    @###.##    is a right justified, two decimal number

The line of variables that fills in a picture line must come directly after that picture line, which is why $% sits immediately under the line containing @##.

Debugger:

perl has a built-in debugging system. To use the debugger, all you have to do is add a -d to the first line of the program.

    #! /usr/local/bin/perl -d
    commands

When you run the program, it will start in debug mode. You then have many debugging commands you can use including:

    h          help on debugger
    s          step
    c          continue to next break
    c line     continue until line
    n          next (does not step into subroutines)
    l range    list program statements in the range
    b line     sets a breakpoint at line
    p expr     prints expr, which is usually a variable

AN EXAMPLE USING AN EXPORTED HORIZON FILE:

Background:

A typical exported horizon file from SeisWorks is in the form:

    Line Trace X Y Z

where Z is often the time, but in this example, we are going to export the amplitude as Z. What we want to do in this example is to clip the amplitudes to some specific range of values. Anything below the range will be set to the lowest value in the range, anything above will be clipped back to the highest value in the range. This can also be done using bcm.
We chose this example because it is easy to follow and can be extended to do things bcm cannot do. This simplified example is taken from a program used by BHP Petroleum (Americas) to suppress tuning effects resulting from a formation thickness being close to the size of a seismic wavelength. This example is also kept simple. An experienced perl programmer would use more sophisticated programming to write a shorter, faster program.

Usage:

Before using this program the first time, you must use

    chmod +x horizonClip

to make it an executable file. There is no compile step or link step as in C, Fortran or other languages. You have to extract the file using the data export feature of SeisWorks. The program is called by typing:

    horizonClip low high filein fileout

You can then re-import the horizon using the data import feature of SeisWorks.

Program:

Note: The line numbers do not appear in the file or in the program, they are just used in this paper to help you follow the program:

     1  #! /usr/local/bin/perl
     2  die "Usage: horizonClip low high in out\n"
     3      if $#ARGV != 3;
     4  $low = $ARGV[0];
     5  $high = $ARGV[1];
     6  if ($low > $high)
     7  {
     8      $tmp = $low;
     9      $low = $high;
    10      $high = $tmp;
    11  }
    12  $filein = $ARGV[2];
    13  $fileout = $ARGV[3];
    14  if ($filein eq "-")
    15  {
    16      open (IN,"<&STDIN");
    17  }
    18  else
    19  {
    20      open (IN,"<$filein")
    21          || die "No file $filein $!\n";
    22  }
    23  if ($fileout eq "-")
    24  {
    25      open (OUT,">&STDOUT");
    26  }
    27  else
    28  {
    29      open (OUT,">$fileout")
    30          || die "Cannot make $fileout $! \n";
    31  }
    32  while (<IN>)
    33  {
    34      ($line, $trace, $x, $y, $z) = split(' ');
    35      if ($z < $low) {$z = $low;}
    36      if ($z > $high) {$z = $high;}
    37      printf OUT ("%20s %12s %12s %12s %12.2f\n",
    38          $line, $trace, $x, $y, $z);
    39      $count++;
    40  }
    41  print STDOUT "Processed $count records\n";

Details:

    1       The first line of all perl programs (on Unix platforms).
    2 - 3   Post-fix if. There are four arguments. $#ARGV is 3 because @ARGV starts at 0.
    4 - 5   We could have as easily said: ($low, $high, $filein, $fileout) = @ARGV;
    16, 25  We can assign filehandles to other filehandles (merge the output of the filehandles) using the open statement and the &.
    20, 29  These are more standard opens.
    32 - 40 The while loop continues until <IN> becomes false.
    39      $count++ adds one to the value of $count.

HOW TO GET PERL:

perl is free, but it is NOT PUBLIC DOMAIN. Public Domain means there is no identifiable owner or the public at large is the owner. perl is owned by Larry Wall. Larry gives everybody a free LICENSE to use perl. That is not the same as ownership. If Larry let you use his lawnmower for free, you wouldn't own it. There is a license file that comes with perl. READ AND UNDERSTAND THE LICENSE. Read it again before selling any software based on perl.

Many of the CD-ROMs available have perl on them. In many cases you can get the perl executable binary so you don't even have to compile it. These CD-ROMs are advertised in most Unix trade magazines. Some Walnut Creek and some Prime Time Freeware CD-ROMs have perl in SunOS executable form. There is a book available at BookStop and other bookstores called:

    Prime Time Freeware for Unix, $60.00
    ISBN 1-881957-04-7

The CD-ROM in the book Unix Power Tools has perl on it, and is available from several bookstores in the Houston area.

    Unix Power Tools, $59.95
    ISBN 0-679-79073-X

The best way to get perl is via the Internet. This will get you the latest version with the latest bug fixes. One place to get perl on the Internet is:

    ftp ftp.uu.net
    login: anonymous
    password: your-internet-name@your-internet-site
    ftp> cd /gnu
    ftp> binary          ----- DON'T FORGET THIS LINE
    ftp> get perl-4xxxx.tar.Z
    ftp> bye

When you get it back to your machine, you will need to uncompress it, un-tar it, and execute the make command. It is a good idea to get gcc (also free) rather than using the bundled C compiler on SunOS. gcc will make a much faster executable of perl.
BOOKS ON PERL

The main reference is Programming Perl, usually called "The Camel Book", by Larry Wall and Randal Schwartz. Published by O'Reilly & Associates, ISBN 0-937175-64-1.

A more tutorial, but less complete, book is Learning Perl, usually called "The Llama Book", by Randal Schwartz. Published by O'Reilly & Associates, ISBN 1-56592-042-2.

A book giving examples of perl for systems administration is supposed to come out soon, but I have been unable to get details about it.

The Energy Related Unix User's Group (ERUUG) Unix Cookbook has several example programs related to petroleum. This book is available to members of ERUUG, and is available to guests. It is also available on the Internet at this world wide web location:

    http://www.glg.ed.ac.uk/

SUPPORT FOR PERL:

The main source of support for perl is the Internet newsgroup

    comp.lang.perl

All the big names in perl follow this newsgroup and many people on the net will answer questions. I usually get an answer in a few hours. That is better than any computer department, vendor help desk, or on-site support representative I have ever dealt with.

Some big names on the Internet for perl include:

    Larry Wall (author of perl)   lwall@netlabs.com
    Randal Schwartz               merlyn@stonehenge.com
    Tom Christiansen              tchrist@perl.com   (303) 444-3212

Randal Schwartz and Tom Christiansen are consultants.

The Energy Related Unix User's Group (ERUUG) has several members with at least some perl experience.

There are several consulting services that can install and support perl. Sometimes MIS Departments insist that all programs acquired must be paid for and have paid support. Of course no-one supports most programs in Unix, particularly not awk or the bundled C compiler, but most MIS Departments are still learning about Unix and want to run things the way they did on the VAX or IBM mainframe. You can usually get around the "MIS shuffle" by buying the CD.
Cygnus Support sells support for many free software packages, and unlike most software vendors supporting their own packages, Cygnus Support actually provides support.

APPENDIX I

SOME USEFUL PERL SCRIPTS:

These scripts have been written to illustrate points made in this paper. They are not always the most efficient, compact, or best way to write the particular script.

    Correct bcm2d histogram:          page 23
    Dumping SEG-Y Headers:            page 24
    Change nulls to spaces in file    page 25
    Splitting an HPGL file:           page 25

Program to correct bcm histogram:

The histogram feature of the bcm program has a small but annoying round off error. It also does not make a visual histogram. This program reads a bcm listing, selects the histogram portion, recalculates the percentages and draws a histogram to the side.

    #! /usr/local/bin/perl
    if ($#ARGV != 1)
    {
        print STDERR "Usage: histofix in.file out.file\n";
        exit;
    }
    open (IN, "<$ARGV[0]");
    open (OUT, ">$ARGV[1]");
    # copy everything up to and including the summary heading
    while (<IN>)
    {
        print OUT;
        last if /\*\*\* *\.STATS *: *Summary/;
    }
    $skip = <IN>; print OUT "$skip\n";
    $skip = <IN>; print OUT "$skip\n";
    # collect the interval and count columns until a blank line
    while (<IN>)
    {
        last if m/^ *$/;
        m/^\s*(\d+\.\d*)\s+(\d+) /;
        ($interval[$i],$count[$i]) = ($1, $2);
        $total += $count[$i];
        $big = $count[$i] if $count[$i] > $big;
        $i++;
    }
    # recalculate the percentages and draw the histogram
    while ($i > $j)
    {
        $pc = ($count[$j] * 100.0) / $total;
        $pct += $pc;
        $graph = "X" x int((($count[$j] / $big) * 20) + 1);
        printf OUT "%12.4f %15.0f %7.2f %7.2f %s\n",
            $interval[$j], $count[$j], $pc, $pct, $graph;
        $j++;
    }
    # copy the rest of the listing unchanged
    while (<IN>) {print OUT;}

Dumping SEG-Y Headers:

This program is around 500 lines long and thus too long to include here in its entirety. The program reads the EBCDIC, Binary, and first Trace header of a SEG-Y file on disk or tape.

    #! /usr/local/bin/perl
    for $i (0..255) {$ebcdic{$i} = "_";}
    $ebcdic{ 0} = "~";
    $ebcdic{ 64} = " ";
    ...
    $ebcdic{129} = "a";
    ...
    $ebcdic{193} = "A";
    ...
    $ebcdic{249} = "9";
    $binHeadTemplate = "l3s25s170";
    $traceHeadTemplate = "l7s4l8s2l4s13S2s31f5ss17";
    ...
sysread (IN,$ebcdicHeader,3200); sysread (IN,$binaryHeader,400); sysread (IN,$traceHeader,240); print STDOUT "--------------EBCDIC---------"; for $i (0..3199) {substr($asciiHeader,$i) = $ebcdic{ord(substr($ebcdicHeader,$i,1))} }; for $i(0..39) { $line = substr($asciiHeader,$i*80,80); print STDOUT "$line\n"; } ( $jobid, $lineid, $reel, ... $vibratoryPolarity ) = unpack($binHeadTemplate,$binaryHeader); print STDOUT "--------------Binary---------"; print STDOUT "jobid $jobid \n"; print STDOUT "lineid $lineid \n"; print STDOUT "reel $reel \n"; ... print STDOUT "vibratory polarity $vibratoryPolarity \n"; ( $traceLine, $traceReel, ... $overTravelTaper, ) = unpack($traceHeadTemplate,$traceHeader); print STDOUT "--------------Binary---------"; print STDOUT "Trace Line $traceLine \n"; .... print STDOUT "Over Travel Taper $overTravelTaper \n"; exit; The information here should give anyone who is familiar with SEG-Y enough information to reconstruct the program. If you are not familiar with SEG-Y, obtain the Seismic Unix (SU) package from the Center for Wave Phenomenon at the Colorado School of Mines. This package contains more than enough information to complete this program. Program to convert nulls to spaces in bcm output: The output of bcm2d and bcm3d has some sloppy code that prints nulls instead of spaces. This is usually okay for vi and more, but interferes with the correct operation of aXe or Xless. It also makes it harder to read the file into a spreadsheet. This program looks in any file for null characters and changes them to spaces: #! /usr/local/bin/perl open (IN,"<$ARGV[0]"); open (OUT,">$ARGV[1]"); while (!eof(IN)) { $c = getc(IN); $c = " " if ord($c) == 0; print OUT $c; } Converting a monolithic HPGL file to records: It is common to find HPGL and other plotter control files given as one long record with no new-lines. These files are hard to troubleshoot or transfer between programs. 
The fold command can split the file into arbitrary length records, but what you want is to be able to make sense of the commands. HPGL files contain plotter commands separated by semicolons. HPGL ignores embedded new-lines. A perl program to fix this can be as simple as: #! /usr/local/bin/perl while (<>) { s/;/;\n/g; print; } It could actually be done as simply as: perl -pi.bak -e 's/;/;\n/g' hpgl.file APPENDIX II PERL 5: perl 5 was released just as this report was being prepared. These are a few new features we expect to see in perl 5. * awk-like BEGIN and END sections. * Better access to system function calls. * Pointers and structures * Object-oriented programming features * Additional regular expression features. Tkperl 5: Tk is a set of libraries and functions to create X-windows "widgets", picture elements such as scroll bars, pull down menus, and even "canvas" graphics. Tkperl 4 used embedded Tcl (Tool Command Language) to use Tk. Tkperl 5 has native access to Tk. APPENDIX III OTHER FREE SOFTWARE YOU SHOULD KNOW ABOUT: expect: expect is a program that lets you run the kind of programs that ask you stupid questions every fifteen minutes or so. An expect script can read what the program prints and give it an answer according to your instructions. A good example is bcm3d. Part way through the program, bcm3d asks you for a real number. Later on in the program, it asks if you want to run the program. This makes it hard to put together a shell script of ten or twenty bcm3d jobs and run them overnight. expect can anticipate these questions and answer them for you. You can start the job overnight and go home to your family. This is an example: #! 
/usr/local/bin/expect -f # disable timeouts set timeout -1 # start the bcm3d program using the # parameter specifying a .pcl file spawn bcm3d [lindex $argv 0] # wait for the reel number question expect "*number :*" exec sleep 1 send 1\r # wait for the ready question expect "*Ready, or A to Abort :*" exec sleep 1 send r\r # wait for completion message expect "*ended normally*" exec sleep 1 exit There is a book coming out about expect that you will want to read. Exploring Expect by Don Libes Published by O'Reilly & Associates, ISBN: 1-56592-090-2 Tcl/Tk: Tcl/Tk is a shell script like language for writing X-windows applications. It is not terribly easy, but is much easier than writing C, C++, Motif and X programs. There is a new book out about Tcl/Tk. Tcl and the Tk Toolkit John Ousterhout Published by Addison-Wesley ISBN: 0-201-63337-X GNUtar: GNUtar is like regular tar except: * It can write tapes across the network (no more RFS or dd to worry about). * It can compress files as it backs them up. * It strips the leading slash off the path, so you don't have absolute paths in your tar. If you get nothing else, get GNUtar. gzip / gunzip: gzip, gunzip, and znew are programs that work basically like the standard SunOS compress and uncompress, except that they typically get 40% more compression. znew takes a file compressed with compress and turns it into a gzip file. Files compressed with gzip have a .gz extension rather than the .Z extension of compress.