Understanding Unix shells and environment variables

How much do you know?

Summary
Unix shells and commands use shell variables. Understanding what these variables are, and how they're created and deployed, can be very useful. This month, Mo Budlong gives you a rundown on shell environment variables, and explains how you can get around some of their limitations. (2,200 words)

A shell variable is a memory storage area that can be used to hold a value, which can then be used by any built-in shell command within a single shell. An environment variable is a shell variable that has been exported or published to the environment by a shell command so that shells and shell scripts executed below the parent shell also have access to the variable.


One built-in shell command can set a shell variable value, while another can pick it up. In the following doecho script example, $PLACE is set in the first line and picked up in the second line by the built-in echo command.

Create this script and save it as doecho. Change the mode using chmod a+x doecho:

# doecho sample variable
PLACE=Hollywood
echo "doecho says Hello " $PLACE

Run the program as shown below.

In all of the following examples, I use the convention of ./command to execute a shell script in the current directory. You don't need to do this if your $PATH variable contains the . as one of the searched directories. The ./command method works for scripts in your current directory, even if the current directory isn't included on your path.

$ ./doecho
doecho says Hello Hollywood
$

In this first example, $PLACE is a shell variable.

Now, create another shell script called echoplace and change its mode to executable.

# echoplace echo $PLACE variable
echo "echoplace says Hello " $PLACE

Modify doecho to execute echoplace as its last step.

# doecho sample variable
PLACE=Hollywood
echo "doecho says Hello " $PLACE
./echoplace

Run the doecho script. The output is a bit surprising.

$ ./doecho
doecho says Hello Hollywood
echoplace says Hello
$

In this example, echoplace is run as the last command of doecho. It tries to echo the $PLACE variable but comes up blank. Say goodbye to Hollywood.
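If you'd rather not create two files just to see this effect, the sketch below reproduces it from a single prompt. It assumes a Bourne-compatible sh is on your path; sh -c simply starts a child shell and hands it one command to run. The unset at the top and the names parent_says and child_says are mine, there only to keep the example tidy.

```shell
# Start clean so an earlier export of PLACE can't leak into the demo.
unset PLACE

# A plain assignment creates a shell variable in this shell only.
PLACE=Hollywood
parent_says="parent says Hello $PLACE"

# sh -c starts a child shell; the single quotes keep this shell
# from expanding $PLACE before the child ever sees it.
child_says=$(sh -c 'echo "child says Hello $PLACE"')

echo "$parent_says"
echo "$child_says"
```

The parent's line ends in Hollywood, while the child's line ends right after Hello, just as echoplace did.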

To understand what happened here, you need to understand something about shell invocation -- the sequence of events that occur when you run a shell or shell script. When a shell begins to execute any command, it checks to see if the command is built-in (like echo), an executable program (like vi or grep), a user-defined function, or an executable shell script. If it's any of the first three, it directly executes the command, function, or program; but if the command is an executable shell script, the shell spawns another running copy of itself -- a child shell. The spawned child shell uses the shell script as an input file and reads it in line by line as commands to execute.

When you type ./doecho to execute the doecho script, you're actually executing a command that is something like one of the following, depending on which shell you're using. (See the Resources section at the end of this column for more information on redirection.)

$ sh < ./doecho
            (or)
$ ksh < ./doecho

The new shell, spawned as a child of your starting-level shell, opens doecho and begins reading commands from that file. It performs the same test on each command, looking for built-in commands, functions, programs, or shell scripts. Each time a shell script is encountered, another copy of the shell is spawned.
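One way to convince yourself that a separate shell really is involved is to compare process IDs. In this sketch, $$ expands to the process ID of the shell that is reading the command, and sh -c stands in for the child shell that would be started for a script. The variable names are mine, chosen for illustration.

```shell
# $$ is the process ID of the shell reading these lines.
parent_pid=$$

# The child shell reports its own process ID. The single quotes matter:
# they stop the parent from expanding $$ itself.
child_pid=$(sh -c 'echo $$')

echo "parent shell is process $parent_pid"
echo "child shell is process $child_pid"
```

The two numbers differ, because the second one belongs to a new process that exists only for as long as its command runs.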

I have repeated the running of doecho so you can follow it through the steps described below. The output of doecho is repeated here, with extra spacing and notes.

$ ./doecho                  <-the command typed in shell one
                              launches shell two reading doecho.
doecho says Hello Hollywood <-shell two sets $PLACE and echoes
                              Shell three starts echoplace
echoplace says Hello        <-shell three cannot find $PLACE and
                              echoes a blank
$                           <-shells three and two exit. Back at shell one

As you're looking at a prompt on the screen, you're actually running a top-level shell. If you've just logged on, this will be shell one, where you type the command ./doecho. Shell two is started as a child of shell one. Its job is to read and execute doecho. The doecho script is repeated below.

The first command in doecho creates the shell variable $PLACE and assigns the value "Hollywood" to it. At this point, the $PLACE variable only exists with this assignment inside shell two. The echo command on the next line will print out doecho says Hello Hollywood and move on to the last line. Shell two reads in the line containing ./echoplace and recognizes this as a shell script. Shell two launches shell three as a child process, and shell three begins reading the commands in echoplace.

# doecho sample variable
PLACE=Hollywood
echo "doecho says Hello " $PLACE
./echoplace

The echoplace shell script is repeated below. The only executable line in echoplace is a repeat of the echoed message. However, $PLACE only exists with the value "Hollywood" in shell two. Shell three sees the line to echo echoplace says Hello and the $PLACE variable, and cannot find any value for $PLACE. Shell three treats the unset $PLACE as an empty string, so when it echoes the variable, nothing is printed in its place.

# echoplace echo $PLACE variable
echo "echoplace says Hello " $PLACE
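If you want a script to tell an unset variable from one that was deliberately set to an empty string, the standard ${NAME+word} and ${NAME:-word} expansions can help. The following is a minimal sketch using sh -c as a stand-in for the child script; the fallback word nowhere is just a placeholder of mine.

```shell
# Make sure PLACE is not in the environment for this demo.
unset PLACE

# ${PLACE+set} expands to "set" only when PLACE exists, even if empty.
state=$(sh -c 'echo "${PLACE+set}"')

# ${PLACE:-nowhere} substitutes a fallback when PLACE is unset or empty.
greeting=$(sh -c 'echo "echoplace says Hello ${PLACE:-nowhere}"')

echo "state is [$state]"
echo "$greeting"
```

Here state comes back empty, which confirms that the child never received a $PLACE at all, and the greeting falls back to the placeholder.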

The assignment of "Hollywood" to $PLACE in shell two is only available inside shell two. If you type in a final command in shell one to echo $PLACE at the shell one level, you'll find that $PLACE is also blank in shell one.

$ echo "shell one says Hello " $PLACE
shell one says Hello
$

Thus far, you've only created and used a variable inside of a single shell level. You can, however, publish a shell variable to the environment, thereby creating an environment variable that's available both to the shell that published it and to all child shells started by the publishing shell. Use export in the Bourne and Korn shells.

$ PLACE=Hollywood; export PLACE
$

The Korn shell also has a command that both exports the variable and assigns a value to it.

$ export PLACE=Hollywood
$

The C shell uses a very different syntax for shell and environment variables. Assign a value to a shell variable by using set, then assign an environment variable using setenv. Note that setenv doesn't use the = operator.

> set PLACE=Hollywood
> setenv PLACE Hollywood

Back in the Korn or Bourne shells, if we revisit the doecho script and edit it to export the $PLACE variable, it becomes available in shell two (the publishing shell) and shell three (the child shell).

# doecho sample variable
PLACE=Hollywood; export PLACE
echo "doecho says Hello " $PLACE
./echoplace

When doecho is run, the output is changed. This happens because in shell three $PLACE is found as an environment variable that has been exported from shell two.

$ ./doecho
doecho says Hello Hollywood
echoplace says Hello Hollywood
$
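The difference that export makes can also be checked without editing any scripts. This is a sketch under the same assumption as before, that sh -c is a fair stand-in for a child script; before_export and after_export are names I made up for the comparison.

```shell
# Start clean so the demo doesn't inherit an exported PLACE.
unset PLACE

PLACE=Hollywood
# Not exported yet: the child gets nothing.
before_export=$(sh -c 'echo "[$PLACE]"')

# export publishes the existing shell variable to the environment.
export PLACE
after_export=$(sh -c 'echo "[$PLACE]"')

echo "child saw $before_export before the export"
echo "child saw $after_export after the export"
```

The child reports empty brackets the first time and [Hollywood] the second time.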

Assigning a value to $PLACE before you run doecho will help you verify its scope. After doecho is complete, echo the value of $PLACE at the shell one level. Notice that doecho in shell two and echoplace in shell three both see $PLACE's value as "Hollywood", but the top-level shell sees the value "Burbank". This is because $PLACE was exported in shell two. The environment variable $PLACE has scope in shell two and shell three, but not in shell one. Shell one creates its own local shell variable named $PLACE that is unaffected by shells two and three.

$ PLACE=Burbank
$ ./doecho
doecho says Hello Hollywood
echoplace says Hello Hollywood
$ echo "shell one says Hello " $PLACE
shell one says Hello Burbank
$

Once a shell variable has been exported and become an environment variable, a child shell can modify it, but only in its own copy. Every child shell receives a copy of the environment when it starts. A change made in the child is seen by that child and by any shells it starts afterward; it never travels back up to the parent.

Make some changes to doecho by adding a repeat of the echo line after the return from echoplace.

# doecho sample variable
PLACE=Hollywood; export PLACE
echo "doecho says Hello " $PLACE
./echoplace
echo "doecho says Hello " $PLACE

Modify echoplace to change the value of $PLACE after it has been echoed, then echo it again.

# echoplace echo $PLACE variable
echo "echoplace says Hello " $PLACE
PLACE=Pasadena
echo "echoplace says Hello " $PLACE

Retype the previous sequence of commands as shown below. Shell three alters the value of $PLACE, and the change appears in shell three. Back in shell two, after the return from echoplace, $PLACE is still "Hollywood". Once a variable is published to the environment, every shell below the publishing level gets its own copy to read and change, but no child can alter the value held by its parent.

$ PLACE=Burbank
$ ./doecho
doecho says Hello Hollywood
echoplace says Hello Hollywood
echoplace says Hello Pasadena
doecho says Hello Hollywood
$ echo "shell one says Hello " $PLACE
shell one says Hello Burbank
$
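The direction in which a change travels can be checked from a single prompt. In this sketch, which again uses sh -c as a stand-in for a child script, the child reassigns an exported $PLACE and the parent then looks at its own value. The names child_value and parent_value are mine. On every Bourne-style shell I'm aware of, the parent's value is untouched, because the child was only ever working on a copy.

```shell
PLACE=Hollywood; export PLACE

# The child changes its own copy of PLACE and reports it.
child_value=$(sh -c 'PLACE=Pasadena; echo "$PLACE"')

# Back in the parent, look at the value again.
parent_value=$PLACE

echo "child ended with $child_value"
echo "parent still has $parent_value"
```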

You have seen that the default action of a shell is to spawn a child shell whenever a shell script is encountered on the command line. Such behavior can be suppressed by using the dot command, which is a dot and a space placed before a command.

Execute doecho by starting it with a dot and a space, then echo the value of $PLACE when doecho is complete. In this example, shell one recognizes $PLACE as having been given the value "Hollywood", because the assignment in doecho was carried out by shell one itself. The echoplace script is still run in a child shell, so its change to "Pasadena" stays in that child.

$ . ./doecho
doecho says Hello Hollywood
echoplace says Hello Hollywood
echoplace says Hello Pasadena
doecho says Hello Hollywood
$ echo "shell one says Hello " $PLACE
shell one says Hello Hollywood
$

Normally, when a shell discovers that the command to execute is a shell script, it would spawn a child shell and that child would read in the script as commands. If the shell script is preceded by a dot and a space, however, the shell stops reading the current script or commands and starts reading in the new script or commands without starting a new subshell.

When you type in . ./doecho, shell one doesn't spawn a child shell, but instead switches gears and begins reading from doecho. The doecho script initializes and exports the $PLACE variable. Because the export now happens at the shell one level, $PLACE is available to shell one and to every child shell it starts from then on.
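To compare the two ways of running a script side by side, you can try something like the following sketch. It assumes the mktemp command is available on your system (it is common, though not universal), and the one-line script it writes is a throwaway of my own, not one of the scripts from this column.

```shell
# A throwaway script; mktemp is assumed to be available.
script=$(mktemp)
echo 'PLACE=Hollywood' > "$script"

unset PLACE

# Run as a child shell: the assignment is lost when the child exits.
sh "$script"
after_child=${PLACE:-unset}

# Read with the dot command: the assignment happens in this shell.
. "$script"
after_dot=${PLACE:-unset}

rm -f "$script"
echo "after child run: $after_child"
echo "after dot command: $after_dot"
```

Only the dot command leaves $PLACE set in the shell you typed it in.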

A dot script is very useful for setting up a temporary environment that you don't want to set up in your .profile. Suppose for instance that you have a specialized task that you do only on certain days, and that you need to set up some special environment variables for it. Place these variables in a file named specvars.

# specvars contains special variables for the
# special task that I do sometimes
WORKDIR=/some/dir
SPECIALVAR="Bandy Legs"
REPETITIONS=52
export WORKDIR SPECIALVAR REPETITIONS

If you execute this script by simply typing in the name of the specvars file, you won't get the expected effect, because a subshell, shell two, is created to execute specvars and the export command exports to shell two and below. Shell one never sees these variables.

$ ./specvars
$ echo "WORKDIR IS " $WORKDIR
WORKDIR IS
$

Using the dot command causes the script to execute as part of shell one; the effect is now correct.

$ . ./specvars
$ echo "WORKDIR IS " $WORKDIR
WORKDIR IS /some/dir
$
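Because specvars exports its variables, anything you start after the dot command should see them too. Here is a rough sketch that checks this; it builds a cut-down specvars in a scratch directory (made with mktemp -d, which I'm assuming your system has) so that it doesn't disturb any real file of that name.

```shell
# Build a cut-down specvars in a scratch directory.
dir=$(mktemp -d)
cat > "$dir/specvars" <<'EOF'
WORKDIR=/some/dir
REPETITIONS=52
export WORKDIR REPETITIONS
EOF

unset WORKDIR REPETITIONS

# Read the file into the current shell with the dot command.
. "$dir/specvars"

# A child shell started afterward inherits the exported values.
child_view=$(sh -c 'echo "$WORKDIR $REPETITIONS"')

rm -rf "$dir"
echo "child sees: $child_view"
```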

So there you have some of the ins and outs of shell and environment variables, as well as some ways to get around some of their limitations. If you want to see your current environment variables, type the printenv command; a list of all environment variables, that is, the variables the current shell passes on to its child shells, is printed out.
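Given a variable name, printenv can also be used to test whether that one variable has been exported. The sketch below relies on printenv returning a failure status when the name isn't in the environment, which is how the versions I've used behave; LOCAL_ONLY and PUBLISHED are made-up names for the demonstration.

```shell
unset LOCAL_ONLY PUBLISHED

LOCAL_ONLY=here                    # shell variable only
PUBLISHED=there; export PUBLISHED  # environment variable

# printenv NAME succeeds only if NAME is in the environment.
if printenv LOCAL_ONLY > /dev/null; then local_listed=yes; else local_listed=no; fi
if printenv PUBLISHED > /dev/null; then published_listed=yes; else published_listed=no; fi

echo "LOCAL_ONLY listed by printenv: $local_listed"
echo "PUBLISHED listed by printenv: $published_listed"
```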
