New commands will be introduced here. Some will be explained very fully, but for others you are expected to do your own research using the "man pages". The "man pages" are the definitive source of information on any command.
file /bin/*sh
We could use any of these shells if we wanted to, but the shell we are going to use for writing shell scripts is the "bash" shell, a shell that is very similar to the Korn shell except that it is provided for free. At the time of writing, it is rapidly becoming the dominant shell of choice. This may not be the same shell that you use interactively on Unix. Your interactive shell is named in a system environment variable called "SHELL". You can find out the shell you use when you first log onto Unix by typing in the following command:
echo $SHELL
If you see “/bin/bash” then you are using the bash shell, if “/bin/sh” then the Bourne shell, if “/bin/ksh” then the Korn shell, if “/bin/ksh93” then the extended Korn shell (even more similar to bash), if “/bin/csh” then the C shell and if “/bin/tcsh” then the tc shell. There are others.
Now we know what a "shell" is, we are left with the meaning of the word "script". A "script" is just written text. In the case of a shell script, the text will be Unix commands and other things that the shell understands. A "script" could be a list of commands that you commonly use, put together in one file so you do not have to type them all each time. Usually it will be a list of commands that accept parameters such as file names and perhaps a choice of options. If you think of the "print" utility, you choose options such as the layout and the printer and lastly you give the file name you want to print. What you are doing is calling a "shell script". This is what you are going to learn how to write, using this document to help you.
(If you do not have the bash shell but have the Korn shell then you can still learn to write shell scripts using this document. Differences between the Korn and bash syntax are described at the end of this document).
You may think that shell scripts have got nothing to do with running SAS® software. However, you will see that a number of the shell scripts available to you do call SAS software. The coding of these scripts that call SAS software is much more difficult than the others, but there is a shell script called "sasunixskeleton" that can create a template shell script for you to make life easier. Using it, you then only have to fill in the SAS software code you need, plus make some other very minor changes at the indicated places.
Coding shell scripts is an exercise in keeping things simple. A script should do what its name suggests and do no more. No formatting of output. No titles or explanations with the output. No writing to files. No use of “more” to give you a page at a time. If the end user doesn’t like the look of the output then as a script writer, you don’t care, so long as the correct information is there in a simple format such as one item per line. If the end user wants the output to go to a file then they can redirect the output to a file themselves. You do not do it for them in your script. If they do not know how to redirect output then they can read the “Common Unix Commands” document or other manuals. The only “end user” you should care about is another script that hasn’t been written yet that might read in the output from your utility. You keep things as simple as possible for that script, not yet written, to pick up the output. The approach is almost the opposite of what you do when you write programs that use SAS software, so never be concerned that writing shell scripts is something complicated. If it ever becomes complicated then you know you are doing it wrong.
Here is an example script made up of a standard header (of the sort the in-house "shdr" utility generates):

#!/bin/bash
# Script     : testing
# Version    : 1.0
# Author     : Roland Rashleigh-Berry
# Date       : 02-Feb-2004
# Purpose    : testing testing
# SubScripts : none
# Notes      : none
# Usage      : testing xxxxxxxx
#              testing
#================================================================================
# PARAMETERS:
#-pos- -------------------------------description--------------------------------
#  1
#================================================================================
# AMENDMENT HISTORY:
# init --date-- mod-id ----------------------description-------------------------
#
#================================================================================
# Edit the test below for the number of expected parameters and
# the usage message that gets put out
Everything following a "#" in any position on a line is a comment, but the first such line is treated differently. The "!/bin/bash" in the first line tells the system which shell to use, in this case the "bash" shell in the "/bin" directory. This is the shell that gets invoked to run the script. If you did not specify a shell then the current shell would be used, but you should always specify the shell in this way.
Looking down the header, we come to the parameters. Script parameters do not have names. They have numbers. When you invoke a script that requires parameters, you list them after the script name, separated by spaces. The first string will be parameter 1 that you resolve in your script as $1 (note that if your parameter value contains a space then you should enclose it in quotes). The second will be parameter 2 that you resolve as $2. You can do this up to parameter 9 which used to be the limit for the original Bourne shell upon which the bash shell is based. The bash shell places no limit on referencing parameters, but beyond 9 you have to refer to the parameters using curly brace notation such as ${10}. It is worth mentioning at this point that $0 will resolve to the name of the script as invoked (with the path in front). Since it is possible that a symbolic link was used to access the script by perhaps a different file name, then it becomes useful to know the name of the script as originally invoked and have it stored in $0 for any error messages that might get put out mentioning the script name. For example, a call to “awk” is sometimes a symbolic link to “gawk” so if called as “awk” then any error messages should mention “awk” instead of “gawk”. This is made possible using $0 though we have to strip the path name from the front (which is easy to do using “basename”).
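As a sketch of how $0 is typically used, the following puts out an error message under whatever name the script was invoked by, with "basename" stripping the path (the message text is invented for illustration):

scriptname=$(basename "$0")
echo "$scriptname: something went wrong" 1>&2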
Just as you can refer to the single parameters with their number, such as $1 and $2, there are other ways to refer to the entire collection of parameters and the number of them. To refer to all of them you can use $* or better still $@ since if one or more of these parameters contained a space then "$@" would protect them by putting every parameter value in double quotes. Another important value is $# which is the number of parameters supplied. It is common practise to place a test for the expected number of parameters at the top of the script code and to exit with a non-zero return code if it is incorrect after putting out a usage message. Because testing for the number of parameters is so common, it is generated after the header to save you some editing work. See the “man” pages for “test” for an extensive description of various types of comparisons and also file properties testing.
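As a sketch of what the generated test might look like for a script expecting exactly one parameter:

if [ $# -ne 1 ] ; then
  echo "Usage: testing xxxxxxxx" 1>&2
  exit 1
fi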
Suppose you had a script named “print”. You could specify what file
or files to print and the printer. You could also specify the page orientation
and adjust the lines per page as well. If you were to implement this using
only parameters, then things would get messy. The files would have to go
at the end, since there may be more than one, so the other parameters would
precede it. Your call might look something like this:
print printer1 landscape 40 file1 file2
Now suppose the script got enhanced so that you could additionally specify
left, right, top and bottom margins. If you were using parameters only
then it might become:
print printer1 landscape 40 0.5in 0.6in 1in 1.1in file1 file2
Not only is it getting messy, but we will have a problem with existing scripts that call the "print" script and pass parameters. Now the files you want to print are in the wrong positions. If you run an old script again then it will think that the first file to print is the left margin size. And it has become almost impossible for people to remember which parameter does what. Also, you cannot leave off parameter values and allow them to default, because that would change the parameter positions. The answer to this problem is to use options for all but the files you want to print. You could leave the printer identity as the first parameter if you wanted to, but with so many of the others becoming options, the printer identity may as well be an option also. At least that way you could allow its value to default.
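With options, a call might instead look something like this (the option letters here are invented purely for illustration):

print -P printer1 -o landscape -l 40 -L 0.5in -R 0.6in -T 1in -B 1.1in file1 file2

Any option left off takes its default value, and the files always come last, so existing calls keep working when new options are added.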
Consider this command and ask yourself how many parameters "grep" receives:

grep string *.sas
If the answer you thought of was “2”, then you got it wrong. Before “grep” even gets invoked, the command parser does its work. It can recognise that “*.sas” is a file pattern and it will convert that pattern into a list of all files that match that pattern before invoking “grep”. By the time “grep” sees these parameters, there might be any number of files that end in “.sas” after “string”. You should always keep in mind what the command parser will do when you write a script.
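You can see the expansion happening by letting "wc" count the words that actually get passed along (a small demonstration; try it in a directory containing several ".sas" files):

echo string *.sas | wc -w     # the shell expands *.sas before echo runs
echo string '*.sas' | wc -w   # quoting stops the expansion, so this prints 2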
Here is a script with a header of the sort generated by the in-house "sohdr" utility for scripts that accept options:

#!/bin/bash
# Script     : ssss
# Version    : 1.0
# Author     : Roland Rashleigh-Berry
# Date       : 11-Feb-2004
# Purpose    : To test "sohdr" utility
# SubScripts : none
# Notes      : none
# Usage      : ssss xxxxxxxx
#              ssss
#================================================================================
# OPTIONS:
#-opt- -------------------------------description--------------------------------
#  a   (switch) apples
#  b   (value)  name of bear: bruno, pooh
#  c   (switch) cats
#  d   (value)  diameter
#================================================================================
# PARAMETERS:
#-pos- -------------------------------description--------------------------------
#  1
#================================================================================
# AMENDMENT HISTORY:
# init --date-- mod-id ----------------------description-------------------------
#
#================================================================================
# Define give-usage-message-then-exit function. Edit to add any
# useful option descriptions

# The following variable names and defaults have been generated for you

# "case" statement for action to take on selected options

# shift to bring first parameter to position 1

# USE THIS CODE FOR DEBUGGING AND THEN REMOVE

# Edit the test below for the number of expected parameters and
# call the usage function if it is not correct
Note that in this case the usage message is a function. It is effectively a sub-routine. Implementing it as a function allows it to be called at multiple points in the code. This function does not need any parameters. If you write a function that does, then it too uses numbered parameters like $1, $2 etc. and you pass it parameter values in the same way you pass parameters to a script or command. They are positional. You do not put function arguments enclosed in brackets like you do with SAS software functions.
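A sketch of what such a function might look like for the "ssss" script above (the message wording is invented):

usage () {
  echo "Usage: ssss [-a] [-b bearname] [-c] [-d diameter] xxxxxxxx" 1>&2
  exit 1
}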
Next in the code you see a group of variables that are related to the options. These are only suggested names and you are free to change the names to something else but you must remember to change them in all other places in the code if you do. Options that require a value have variable names that start “value” and those that are “switches” start with “switch”. Switch values default to 0 and are set to 1 if selected. Options that require an argument have a default assigned. You are asked what you want the default to be when you run the “sohdr” script.
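For the header above, the generated assignments might look like this (the default values are assumptions):

switcha=0        # -a switch: 0=off, set to 1 if selected
valueb=bruno     # -b value, with an assumed default of "bruno"
switchc=0        # -c switch: 0=off, set to 1 if selected
valued=1         # -d value, with an assumed default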
Next in the code comes the hard part. It is the standard code to handle the options. The inbuilt “getopts” utility is used to read in the options and action is taken for each one. A “case” statement is used for this. “getopts” uses a variable named OPTARG in which to place the supplied argument of an option requiring an argument. You assign the contents of this to your variable.
"getopts" also uses a variable named OPTIND. Options like "-p" are just ordinary parameters to a script. If "-p" directly followed a script name then that would be "$1". There is nothing special about options that makes them different from parameters. They are just handled differently. So after "getopts" has handled all the options, OPTIND points to the first parameter that is beyond these options. What you then do is "shift" the parameters out of the way by the number OPTIND - 1, and then the first parameter ($1) will be the first "true" parameter. This is what the code does in the next section.
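A sketch of that standard code for the four options declared in the header (the variable names follow the convention just described):

while getopts "ab:cd:" opt ; do
  case $opt in
    a)  switcha=1 ;;          # -a is a switch
    b)  valueb=$OPTARG ;;     # -b requires a value, supplied in OPTARG
    c)  switchc=1 ;;          # -c is a switch
    d)  valued=$OPTARG ;;     # -d requires a value
    \?) usage ;;              # unrecognised option: give usage message and exit
  esac
done
shift $(($OPTIND - 1))        # shift so that $1 is the first true parameter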
Next comes some code for debugging the script. When this code gets generated then it will actually work immediately and display the variable values that get set depending on what options you select and what values and parameters you supply.
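The generated debugging code might be as simple as the following sketch, which you would remove once you were happy:

echo "switcha=$switcha valueb=$valueb switchc=$switchc valued=$valued"
echo "number of parameters: $# , parameter values: $@"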
Lastly you will see code familiar as output by the “shdr” utility. It is a piece of code to test for the correct number of parameters.
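This time the test can call the usage function defined at the top of the script. A sketch, again assuming one expected parameter:

if [ $# -ne 1 ] ; then
  usage
fi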
Try entering this command:

echo $((2 + 3 * 4))
…and afterwards try this:
echo $(2 + 3 * 4)
In the last case it will try to run the command or script “2” and it will fail.
In your current session, set up three variables var2, var3 and var4
with the values “2”, “3” and “4” respectively and then set the variable
“tot” to:
tot=$(($var2 + $var3 * $var4))
Echo “tot” and you will see it has the value 14 in it. This is not a
very intuitive way of doing arithmetic. Here is a way that is simpler to
remember:
let tot=var2+var3*var4
This gives you the same result. Note that there are no spaces in the expression.
If you want to do floating point arithmetic within your script then
you should use the “bc” utility (see “man bc”) and you will probably need
the “-l” option. Try these commands:
x=3.5
y=10.5
z=$(echo "$x * $y" | bc -l)
echo $z
Bash has inbuilt facilities for handling strings. Try these commands:

var=abcdefgh
echo $var
echo ${var}
echo ${#var}      # this returns the length of the string
echo ${var:0:1}   # the first character in the string
echo ${var:1:3}   # three characters starting with the second character
echo ${var:2}     # all characters starting with the third character
Arrays can be “declared” in a bash shell script. Without going into
lengthy explanations, here is a shell script you can copy and play around
with. Don’t forget to make it executable.
#!/bin/bash
nums=137
declare -a names   # declare "names" as an array
i=0
while [ $i -lt ${#nums} ] ; do
  echo -n "How do you spell the number \"${nums:$i:1}\"? "
  read names[$i]
  let i=i+1
done
echo
echo ${names[*]}   # list the spellings that were entered
Script parameters are not in an array but they have numbered elements
just like arrays elements. Sometimes you will want to refer to them like
they were an array. Suppose you wanted to refer to the last parameter value
supplied, not knowing in advance how many parameters were supplied, then
you would need to know how to do it. It can again be done using curly brackets,
this time with the “!” character in front to resolve the value of a variable.
Here is a simple script to tell you how many parameters you entered and what
they are.
#!/bin/bash
echo Parameters supplied was $# and the last parameter value was: ${!#}
param=0
until [ $param -gt $# ] ; do
  echo "parameter $param value is: " ${!param}
  let param=param+1
done
One way to find the position of a character in a string is to use "expr" pattern matching, as in this script that reports the position of the first comma in parameter 1:

#!/bin/bash
ix=$(expr "$1" : '[^,]*,')
if [ $ix -gt 0 ] ; then
  echo "parameter 1 contains a comma at position $ix"
fi
The above method is not good for a target string. It only works for
single characters. The following will return the result 7 because not only
does it pick up the last occurrence of the string, but it adds the target
string as well to the length. The “7” says that the second pattern is matched
by the first 7 characters in the first string. Indeed it is, but that is
not what you want to know.
expr "abcdxcde" : '.*cd' |
If you need to know the first position of a target string then this
will give you the correct result, but partly by chance, as will be explained:
#!/bin/bash
echo "$1" | grep "$2" | sed "s/$2.*//" | wc -m
What the above does is first use "grep" to make sure the string ($1) contains the target string ($2). In the following "sed" step, that target string and everything after it is replaced with nothing. The result is then passed to "wc" to count the number of characters. If the target string starts at the very first character, the "sed" step replaces the whole line with nothing, but whatever reaches "wc" is still followed by the line feed generated by the "echo" command, which "wc" counts as one extra character. So a match on the first character gives the result "1", which is what you wanted. And if there is no match then nothing gets passed to "wc" (not even a line feed character) and so you get the result "0".
Because the above methods are so complicated, an in-house script named "index" has been written that works very much like the index() function in SAS software code. It uses the index() function of "awk". The code of the script is essentially the following:
#!/bin/bash
echo "$1" | awk '{print index($0,"'"$2"'")}'
Here is an example of using the script “index” to loop through a comma-delimited
list and print each element in turn. Note that “index” regards the first
position in a string as “1” whereas bash regards it as “0” so the expression
${str:$(index "$str" ',')} conveniently places us at the next position
in the string after each comma. If we had to add a “1” to it then it would
take the form ${str:$(index "$str" ',')+1}.
#!/bin/bash
# commaloop: will list out each element delimited by a comma
str="$1"
while [[ "$str" == *,* ]] ; do
  echo "$str" | cut -d, -f1
  str=${str:$(index "$str" ',')}
done
echo $str
If you do not need to know the position of a character in a string, just whether there is one or not, then it is a lot simpler. You can test a string for the presence of a character using the following method. In this case it is testing the first parameter for the presence of a comma with any number of characters on either side of it. Because it uses bash's inbuilt functionality, it is preferred to methods that call other commands.
#!/bin/bash
if [[ "$1" == *,* ]] ; then
  echo "string contains a comma"
fi
Here is the same test done by calling "grep":

#!/bin/bash
if echo "$1" | /usr/xpg4/bin/grep -q ',' ; then
  echo "grep detected that parameter 1 contained a comma"
fi
Note that “grep” is being called from a specific library. This is because
we need to use the “q” (“quiet”) option with “grep” but this option only
exists in some implementations of “grep”. When used with the “q” option,
“grep” works in “quiet” mode and does not produce any output. What it does
produce is a return code of 0 if something was selected. “0” means “true”
in bash programming (the opposite of how it works in SAS software code)
so if something is selected then the “0” gets returned to the “if” statement,
making it “true”, and so the message that “grep” found a comma is displayed.
These differences between utilities of identical names in different directories lead to a style of writing shell scripts. It is considered "good practise" to declare exactly which utilities you will use at the top of your code by setting up variables and then referencing those variables. So the same code as above, following "good practise", would become:
#!/bin/bash
GREP=/usr/xpg4/bin/grep
if echo "$1" | $GREP -q ',' ; then
  echo "$GREP detected that parameter 1 contained a comma"
fi
These “xpg4” utilities generally have more options than those in other directories. When you read the “man” pages on a command you will often see the “xpg4” versions listed at the top. If you use a command and you wished it had an extra option, then it is well worth checking the “man” pages for the command to see if there is an “xpg4” version that gives you what you want.
echo -n "Do you want to continue? (y/n): "
read ans if [ "$(echo ${ans:0:1} | tr [a-z] [A-Z])" != "Y" ] ; then echo "QUITTING" exit 1 else echo "CONTINUING" fi |
You could always write your own “upcase” function and call it in your
script but keep in mind that functions in bash work like commands and scripts
in the sense of having positional parameters. You do not put the argument
to a function in brackets like you do with SAS software functions. Here
is the above code again but with “upcase” as a function. Note the way the
“upcase” function is called.
upcase () {
  echo "$@" | tr [a-z] [A-Z]
}

echo -n "Do you want to continue? (y/n): "
read ans
if [ "$(upcase ${ans:0:1})" != "Y" ] ; then
  echo "QUITTING"
  exit 1
else
  echo "CONTINUING"
fi
Another way to test for a yes/no response is to use the “case” statement.
If you write scripts that use options then you will get to know the syntax
of the “case” statement. What you have is a list of values that it can
match, and if none of the values match then it will match with “*”. Doing
a simple test and exiting if the characters are not found can be very simple
in form as the following shows.
echo -n "Do you want to continue? (y/n): "
read ans case “${ans:0:1}” in [yY]) ;; *) exit 1 ;; esac |
…and the case statement can be made even simpler using pattern matching
like this:
case "$ans" in
  [yY]*) ;;
      *) exit 1 ;;
esac
The “case” statement above is using pattern matching. The “[yY]” at
the beginning is testing for the first character being either a “y” or
“Y” and the following “*” signifies any string of any length including
the null string (do not get pattern matching confused with regular expressions).
We could test for the same using an “if” statement, but the test has to
be enclosed in double square brackets. Single square brackets belong to
the “test” feature of the bash shell that was inherited from the Bourne
shell (see “man test”) whereas double square brackets belong to bash itself
(see “man bash”). You can use the features of “test” in double square brackets
since the bash shell emulates "test" itself, but you cannot use the pattern matching features of the bash shell in single square brackets. Here is the "if" test. Note that "!=" means "not equal" ("==" means "equal") and the pattern must not be enclosed in quotes or it will be treated as a literal string.
if [[ "$resp" != [yY]* ]] ; then exit 1; fi |
Brace expressions can also be used to test whether a string is numeric, as this example shows:

#!/bin/bash
echo -n "Enter string for numeric test: " ; read str
if [ -z "${str//[0-9.]/}" ] && [ -n "$str" ] ; then
  echo "numeric"
else
  echo "non-numeric"
fi
The way the previous example works is that the brace expression of the form ${var//pattern/string} replaces all occurrences of the pattern, which in this case is the range 0-9 plus a ".", with the string, which in this case is null. So all numeric digits and "." get removed. The "-z" test tests for a zero-length string and the "-n" test makes sure a null string was not entered. This method is not entirely accurate, as the string "12.34.56" would pass the test for numeric.
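To see the ${var//pattern/string} replacement in isolation, try this at the command line:

str=12a34.5b6
echo "${str//[0-9.]/}"   # prints "ab" - the digits and "." have been removed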
For more information on brace expressions, consult the “man” pages for “bash” and for more information on condition testing, consult the “man” pages for “test”.
Lastly, there is an in-house written script called numtest
for testing if a string is a valid number. It will accept negative numbers
as well. What you do is to call it and give it the string as a parameter
and test the return code $? after you call it (NB: testing return codes
is an important part of learning to write scripts). Try this out yourself
on the command line and try setting “str” to different values.
str=1234f.12345 ; numtest $str ; echo $?
“numtest” will return the value “0” (“true”) if it detects that a string
is numeric, so you can use it like this:
#!/bin/bash
if numtest "$1" ; then
  echo "parameter 1 is numeric"
else
  echo "parameter 1 is non-numeric"
fi
You could also test for “not true” using “!” like this:
#!/bin/bash
if ! numtest "$1" ; then
  echo "parameter 1 is not numeric"
else
  echo "parameter 1 is numeric"
fi
Suppose you enter a command like this:

grep 'pattern' *.sas
…then although you think you might have entered only 2 parameter values to "grep", what grep sees is the pattern value followed by multiple file names, each of which is a parameter. The same situation applies to any script you write. Suppose you wrote a script called "search" that did the same thing as above. You would want to store the pattern in a variable and then loop through each file name searching for the pattern. This is how you would do it:
#!/bin/bash
pattern=$1
shift
while [ $# -gt 0 ] ; do
  grep "$pattern" "$1"
  shift
done
Note that for the above, the parameters are "shifted" each time such that the file you are working on is always parameter 1, hence it is referred to as $1 in the "grep" command. And because it is possible that what $1 holds might contain a space, $1 is enclosed in double quotes. It makes sense to always enclose references to parameters or variables in double quotes, unless you know they cannot contain spaces, to avoid problems from embedded spaces.
There is another way to loop through parameters that is just as good
and is maybe more descriptive. Here it is again:
#!/bin/bash
pattern=$1
shift
for file in "$@" ; do
  grep "$pattern" "$file"
done
In this last case, "$@" represents the entire list of parameters, but because it is enclosed in double quotes it has a special meaning in that each element in the list is enclosed in quotes. (It is not generally true that when you enclose a resolved variable in double quotes it quotes all elements. It only works for "$@".) This protects parameter values that may have one or more spaces in them from being interpreted as multiple entries. So if you use this second method, make sure you use "$@" enclosed in double quotes. Also, since a parameter value might contain one or more spaces, the individual element (in this case $file) should be enclosed in double quotes.
Here is a variation that, for each file name supplied, searches the corresponding ".log" file instead, using the "%.*" notation to drop the existing extension:

#!/bin/bash
pattern=$1
shift
for file in "$@" ; do
  grep "$pattern" "${file%.*}.log"
done
Before leaving this section, it is worth pointing out that the “##”
notation is very good for dropping a leading path name. Compare the outputs
from these two commands:
echo $PWD
echo ${PWD##*/}
Note that if there is any possibility that a user will be running the same script simultaneously (and it might be hard to know if this is the case if it is used as a subscript) then you have to avoid a name clash with these temporary files. If you suffix your file name with "$$", like this: "$HOME/tempfile_$$.tmp", then the "$$" will resolve to your pid number and so your file name will be unique. The only trouble you might get with this technique is that if users continually interrupt the script then files with this "$$" suffix will be left over in their home directories, whereas if a straight name is used without the "$$" then even if a file gets left over, it will get overwritten the next time the script is run, and so will not fill up a person's home directory.
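A sketch of the technique (the file name is only an example):

tmpfile=$HOME/tempfile_$$.tmp   # $$ resolves to the pid so the name is unique
ls *.log > $tmpfile
# ... use the file ...
rm -f $tmpfile                  # tidy up so nothing is left behind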
Here is a script that lists all files with a “.log” and “.lst” extension
where no corresponding file with a “.sas” extension exists.
#!/bin/bash
ls *.sas | sed 's/\.sas$//' | sort > $HOME/saslist.tmp
ls *.log | sed 's/\.log$//' | sort > $HOME/loglist.tmp
ls *.lst | sed 's/\.lst$//' | sort > $HOME/lstlist.tmp
join -v 2 $HOME/saslist.tmp $HOME/loglist.tmp | awk '{print $0 ".log"}' > $HOME/bad.tmp
join -v 2 $HOME/saslist.tmp $HOME/lstlist.tmp | awk '{print $0 ".lst"}' >> $HOME/bad.tmp
sort $HOME/bad.tmp
rm -f $HOME/saslist.tmp $HOME/loglist.tmp $HOME/lstlist.tmp $HOME/bad.tmp
The numbers “1” for standard output and “2” for standard error are file
descriptor numbers. There is also a “0” for standard input. These three
files get opened automatically. But you can open files up to a number that
is currently set to the limit (that might be 256 - find out by using the
command "ulimit -n" or "ls /dev/fd" to see how many are listed). You can
use the “ls” command to get more information on these file descriptors.
The command:
ls -l /dev/std*
…will show you that “stdin” (standard input), “stdout” (standard output)
and “stderr” (standard error) are symbolic links to file descriptor locations.
And to list these out you can use:
ls -l /proc/self/fd/[0-2]
…and you will see that these files are “character special” files (see “man ls”) and that the “group” column has “tty” in it, which is short for your current terminal.
Here is something for you to try out in your terminal window. First
open up file 7 by entering this command:
exec 7>$HOME/seven.tmp
Next write a couple of lines to file 7 like this:
echo hello 1>&7
echo world 1>&7
Now play the lines back like this:
cat $HOME/seven.tmp
Next close file 7 like this:
exec 7>&-
Now try writing to it again and you will get an error message:
$ echo hello world 1>&7
bash: 7: Bad file number
So you see you have extra files that you can open and close and write
results to. A file can be opened to write to the same location as an existing
file. This is how you would open file 3 as having the same output as file
1. Then, in a sense, you will have stored the location of file 1 (standard
output).
exec 3>&1
You could then redirect standard output to a file like this:
exec 1>$HOME/output.tmp
…and after that, all your standard output would go to this file. When
you had finished writing to the file you could restore standard output
to its location stored in the file 3 descriptor like this:
exec 1>&3
…and then close file 3 like this:
exec 3>&-
To save some typing, the above 4 “exec” statements could be combined
into 2 (but note that the order is important) like this:
exec 3>&1 1>$HOME/output.tmp
exec 1>&3 3>&-
A format in a "printf" string is indicated by a "%" sign. What follows it is optionally a number for a minimum length, and it ends in an "s" (for "string") or a "d" (for "decimal"). If the length is preceded by a "-", the value will be left-aligned. A new line must be written in the string (usually at the end) as "\n". This last thing is something that you will often forget to do. Note also that the lengths are minimum lengths. If your number or string is longer then it will not be truncated. This means that if you were using "printf" to try to maintain alignment of a rightmost character, you would have to ensure the format length was long enough for any possible value you might encounter.
Try entering this command and see what happens when you change the lengths of the formats, omit the lengths, and remove the preceding "-" signs.
printf "Hello %-10s to %-8d you \n" abc 3 |
It is not possible to “pipe” into “printf” and yet sometimes you will
need to do so. But you can achieve it using the “printf” command within
a call to “awk” (or “nawk” or “gawk”). Here is an example of a call to
“printf” within “awk”. $1 is the first script parameter and is enclosed
in double and single quotes but $0, not enclosed in any quotes, has a special
meaning to “awk” as “the entire input line” ($1, not enclosed in quotes,
would be the first field of the input line). What it is doing in this case
is printing a file name in the first 30 positions (left-aligned) and following
that with a line from the file.
{printf "%-30s %s\n","'$1'",$0} |
#!/bin/bash
#-- open up an extra input file --
#-- read each file from standard input, prompting for file --
#-- close the extra input file --
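A minimal sketch of how such a script might be fleshed out, assuming file descriptor 4 is opened on the terminal ("/dev/tty") so that the script can prompt the user while standard input is busy supplying the file names:

#!/bin/bash
#-- open up an extra input file --
exec 4</dev/tty
#-- read each file from standard input, prompting for each file --
while read file ; do
  echo -n "Process file \"$file\"? (y/n): "
  read ans <&4             # read the reply from the terminal, not standard input
  if [[ "$ans" == [yY]* ]] ; then
    echo "processing $file"
  fi
done
#-- close the extra input file --
exec 4<&-

You might invoke it as "ls *.log | scriptname": standard input supplies the file names while file descriptor 4 supplies the user's replies.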
Go to a directory where you know there are log files (ending in the ".log" extension), list these files and redirect the listing to a file in your home area like this:
ls *.log > $HOME/loglist.tmp
Next edit the file you created in your home area and add spaces in the
middle of some of the file names. Then type in this command to list them:
for file in $(cat $HOME/loglist.tmp); do echo "$file"; done
You will discover that if you do this then the file names with spaces
in are treated as multiple files. So this method does not work with these
files and yet this is a very common technique used by people who write
shell scripts because they assume there will never be spaces in a file
name. It is best if you do not use this method where file names are concerned
and instead use a “while read” loop like this:
while read file; do echo "$file"; done < $HOME/loglist.tmp
If you use the “while read” method then you will not have a problem as “read” waits for a line feed (new line) and so it reads the file name including the spaces in it.
The "date" command can display the date and time in a format that you supply. For example:

date '+%a %b %e %T %Z %Y'
The “+” indicates to date that it should use the string as a format.
“%a” is the short day, “%b” is the short month name, etc.. To use long
day names and a long month name and leave off the time information then
use:
date '+%A %B %e %Y'
You can put any characters you like within the format string. For example, “date '+%e-%b-%Y'” will display the date in the form “26-Feb-2004”. Refer to “man date” for more information.
Putting out the date and time can be very useful sometimes. Suppose
you had written a script that ran a suite of programs. When it runs then
you would probably want to keep a log of messages and errors by redirecting
standard output and standard error like this:
runsuite 1>runsuite.msg 2>runsuite.err
It would be useful if the messages started off saying who ran the script
and when and maybe also when the script ended. You could put out a start
message like this. Try it out to see what you get.
echo "Started by $USER at $(date '+%T on %A, %e %B, %Y')" |
# cat the following to a file until "FINISH" is encountered
cat > $HOME/hdr.tmp << FINISH
/*
/ Program      : $progname
/ Author       : $author
/ Date         : $date
/ Drug         : $drug
/ Protocol/Inc : $prot / $inc
/ Purpose      : $purpose
/ Notes        :
/================================================================================
/ AMENDMENT HISTORY:
/ init --date-- mod-id ----------------------description-------------------------
/
/===============================================================================*/
FINISH
The string that stops the lines being written to $HOME/hdr.tmp is “FINISH”. You could choose any name you wanted but this would have to be put in the first column for it to work.
To compare two floating point numbers you can subtract one from the other using "bc" and examine the result. Try this:

var1=13.41 ; var2=13.4677777 ; echo "$var1 - $var2" | bc -l
You will see a minus sign at the start. To test for it you could do
it like this, using the minus sign as the start of a pattern:
if [[ "$(echo $var1 - $var2 | bc -l)" == -* ]] ; then echo "less than" ; fi |
Although this works, it is not very intuitive. An in-house utility named compfl has been written to make the comparison of floating point numbers easier.
You can read about the differences between the various Unix shells here:

http://www.faqs.org/faqs/unix-faq/shell/shell-differences/
To find out whether a user's system has bash installed, ask them to change to the root directory like this:

cd /
…and then do a full search for bash like this:
find . -name bash 2>/dev/null
If it did come back with something and the bash file was in a “bin”
directory then it would mean they have bash installed. If the search comes
back with nothing and you are sure the search was done from the root directory
then the next thing to do is to ensure that the Korn shell exists somewhere
by again doing a search from the root directory downwards like this:
find . -name ksh 2>/dev/null
It is extremely likely that they have the Korn shell somewhere. It needs
to be in a “bin” directory, otherwise we would have to wonder whether it
was another type of file with the same name. What we want is for it to
be located at /bin/ksh. If it is located in another "bin" directory, according
to the list returned, then ask the user to try to invoke the Korn shell
like this:
/bin/ksh
If the above returns a "no such file or directory" message, then we know not to use that location. If no message was returned then we know the location is good. Tell the user to enter the command "exit" to leave the Korn subshell. If /bin/ksh did not work, then of the other locations returned by the "find" utility, choose the one with the shortest file name. In other words, use "/usr/bin/ksh" instead of "/usr/local/bin/ksh" if both were returned. This, assuming /bin/ksh did not work, is the location that we will put in the first line of our Korn scripts.
echo -n "Enter script name: "
read scripname |
That way the input starts on the same line as the message. But the Korn
shell does not support this. It will not complain, but it will display
the “-n” and then the message and then start a new line. To hold the input
line where it is you need to use “printf” like this:
printf "Enter script name: "
read scripname
"print" on its own would start a new line, but "printf" holds on the current line unless you specifically cause a line feed by putting "\n" at the end. This is a little annoying, as "printf" means "print formatted", yet for the Korn shell it is used for printing ordinary messages and holding on the same line. If you have an interactive script then you will have to convert any "echo -n"s to "printf"s. A search and replace will do this quickly.
Another difference concerns bash's substring notation, as used in the following:

var2=${var1:0:1}
The above will give var2 the first character held in var1. Korn does not accept this. It will say “bad substitution” when it encounters this. But guess what? There is more than one Korn shell. There is an “enhanced” Korn shell, “ksh93”, that does accept this syntax. You can read about this “enhanced” Korn shell in the link below.
http://publib16.boulder.ibm.com/pseries/en_US/aixuser/usrosdev/ksh93.htm
Chances are you do not have ksh93 so you need to implement bash substring
in a different way. And you may be unlucky in that the substring operation
depends on the value of another variable like this:
var2=${var1:$pos:1}
You can achieve this in the Korn shell by using the “cut” command. You
have got to remember, however, that arrays in bash start with element 0
but the “cut” command considers the first character to be character 1.
The best solution would be to create a function in the Korn script called
“substr” and use that. Remember that parameters to a shell function are
positional.
substr ()
{
  if [ -n "$3" ] ; then
    echo $1 | cut -c $(($2+1))-$(($2+$3))
  else
    echo $1 | cut -c $(($2+1))-
  fi
}
Using this function in a script then “var2=${var1:$pos:1}” in the bash
shell script turns into this, instead, in the Korn shell script:
var2=${var1:$pos:1} # was this in the bash shell script
var2=$(substr $var1 $pos 1)  # becomes this in the Korn shell script
This is a nuisance but not a major problem. To recap, if you get the “bad substitution” message and it is clear that the bash shell script is using the ${var:pos:len} substring form, then add a function named “substr”, as shown above, at the start of the Korn script, and call that instead.
There are other differences between the shells, but you are unlikely to encounter them. There is a very condensed summary of the differences between the shells on the following Internet page:
http://www.unixguide.net/unix/bash/C2.shtml
One difference you will meet when converting a Korn shell script to bash concerns piping into "read". Try this:

echo hello world | read x   # works in the Korn shell
echo $x
This works fine in the Korn shell. When you echo $x you will see the
text “hello world”. But if you do that with the bash shell then you see
nothing. This is because “read x” was done in a sub-process and when it
has ended then the “x” variable contents are destroyed. To have the same
effect in the bash shell you would have to change it to this:
x=$(echo hello world)
echo $x |
So if you are converting a Korn shell script to bash then check all the piped actions to see if the results are being read into a variable at the end and make the appropriate change as shown above. You will find this common in Korn shell scripts. It is a pity it does not work in the bash shell, since piped actions are supposed to work like that. Many people consider this to be a fault with bash and it is one of the reasons why some people prefer the Korn 93 shell.
There are other differences between the Korn and bash shells. Pattern
matching operators are different and there are the double operators “&&”
and “||”. You can read more about it here:
http://www.llywelyn.net/docs/ref/korn.html#reg
Bash is a very good interactive shell as well as being very good for programming. It makes more sense to use a shell that is good for both so that a user can “progress” from being an interactive user to writing shell scripts without having to relearn some syntax. In the “battle of the shells”, the race has already been run, although we will probably have to wait a while for the winner to be announced.
There is a further topic that might interest you if you deal with large text files (and perhaps also postscript printing), and that is the powerful text-processing language "gawk". There is another document you might want to read, written by the author of this document, called "Writing gawk programs".
Use the "Back" button of your browser to return to the previous page
SAS and all other SAS Institute Inc. product or service names are registered
trademarks or trademarks of SAS Institute Inc. in the USA and other countries.
® indicates USA registration.