LowEndBox - Cheap VPS, Hosting and Dedicated Server Deals

Here Today, Gone When You Exit: Proper Tempfiles in Shell Scripts

In the course of my career, I’ve periodically come across code like this in shell scripts:

TEMP_FILE=/tmp/tempfile

Or sometimes, slightly more elegantly:

TEMPFILE=/tmp/tempfile.$$

The problems with the first example are obvious, especially if it appears in many different scripts.  The second is better.  The “$$” means “my process ID”, so if the script had a process ID of 5309, the TEMPFILE variable would be set to /tmp/tempfile.5309.  This makes collisions between scripts unlikely, but is still suboptimal.  What if there is already a file called /tmp/tempfile.5309 and it’s owned by another user, or what if you don’t have permission to write to /tmp?  It’d be better to find out immediately than many lines later when you try to write something.

That’s a core consideration here.  In the above examples, we’re just assigning a value to a variable.  We’re not guaranteeing that we can use the tempfile.  What should happen is that we somehow (1) get a tempfile name, and (2) guarantee that we can use it (at least at the instant it’s created).  Fortunately we can do just that with mktemp!

root@crash:~# mktemp
/tmp/tmp.OSN8Yv7RUj
root@crash:~# ls -l /tmp/tmp.OSN8Yv7RUj
-rw------- 1 root root 0 Apr  9 10:33 /tmp/tmp.OSN8Yv7RUj
root@crash:~# 

The mktemp(1) command chooses a unique tempfile name for us, creates the file so that it is owned by the caller and readable and writable only by them, and then prints the name.  So the One True Way to choose a tempfile is:

TEMPFILE=$(mktemp)
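Since the whole point is to find out immediately when something is wrong, it may be worth checking mktemp's exit status as well.  A minimal sketch, where the error message and the scratch data are just placeholders:

```shell
#!/bin/bash

# Create the tempfile and bail out right away if mktemp fails,
# for example because the temp directory is not writable.
if ! TEMPFILE=$(mktemp); then
  printf 'could not create a tempfile\n' >&2
  exit 1
fi

# From here on the file exists and belongs to us.
printf 'scratch data\n' > "$TEMPFILE"
wc -l < "$TEMPFILE"

# Manual cleanup for now.
rm -f "$TEMPFILE"
```

The assignment takes on the exit status of the command substitution, which is why the "if !" test works here.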

mktemp comes with many useful options such as using a different directory, creating a temp directory instead of a file, specifying the naming pattern to use when using a file, and more.  But in true Unix fashion, when called “naked” (no arguments), it does its primary function with useful defaults.
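As a rough illustration of two of those options, the sketch below creates a temporary directory with -d and then a file inside it from a naming template.  The names WORKDIR and REPORT and the report.XXXXXX template are made up for this example.

```shell
#!/bin/bash

# A private temporary directory instead of a file.
WORKDIR=$(mktemp -d) || exit 1

# A tempfile inside that directory, named from a template.
# mktemp replaces the trailing X's with random characters.
REPORT=$(mktemp "$WORKDIR/report.XXXXXX") || exit 1

printf 'directory: %s\n' "$WORKDIR"
printf 'file:      %s\n' "$REPORT"

# Manual cleanup: removing the directory removes everything in it.
rm -rf "$WORKDIR"
```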

There’s still one problem though.  Script writers are legendary for creating tempfiles and forgetting to clean them up.  Go log into any large Unix system with many users and you’ll probably find hundreds or thousands of tempfiles in /tmp that someone created in a script and forgot to clean up.  Sometimes it’s just ignorance, and sometimes programmer laziness.  Sometimes it’s because a program exits through an unexpected branch of logic and the “clean up” at the end is never called.

Wouldn’t it be cool if there was a way to automatically clean up all created tempfiles?  I mean, regardless of how your program exits, the tempfiles will be cleaned up?

There is!

I’m using Bash 5.1, but as I recall, what I’m showing you goes back to ksh88 and probably earlier.

What we’re going to look at is the trap facility.  First, you need to know a little about signals.  When a process is sent a signal, such as by someone typing “kill <PID>” it normally does whatever the default for that signal is.  For example, if you type

kill 5309

Then process 5309 is sent a TERM signal, because TERM is what kill sends when you don’t name a signal.  If the process does nothing with the signal, the default is to terminate the process.  However, the process can install a handler for that signal and say “wait, before I quit, I want to do X and Y” or “sorry, I’m ignoring your TERM”.  Note that the KILL (9) signal cannot be handled or ignored.

Sometimes signaling is used to tell a program to do something.  For example, if you send the HUP (hangup) signal to named, it will reread its config, a common convention.  There are two signals (USR1 and USR2) with no predefined meaning – they’re for you to use for whatever you want in your program.
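To make the handle-or-ignore idea concrete, here is a small sketch in which a script sends signals to itself.  The handler name on_term and the GOT_TERM flag are invented for the example; a real TERM handler would usually clean up and then exit rather than carry on.

```shell
#!/bin/bash

GOT_TERM=0

# Handler: runs instead of the default action (terminating the script).
on_term() {
  GOT_TERM=1
  printf 'caught TERM\n'
}
trap on_term TERM

# An empty command string means "ignore this signal entirely".
trap '' HUP

kill -HUP $$     # ignored, nothing happens
kill -TERM $$    # on_term runs, the script keeps going

printf 'still running, GOT_TERM=%s\n' "$GOT_TERM"
```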

The Bash trap (a shell built-in) syntax is:

trap [what to do] [on what signal]

Here is a quick example of how a Bash script can handle the USR1 signal to reread its configuration:

#!/bin/bash

function reread_config() {
 printf "rereading config\n"
}

trap reread_config USR1

# infinite wait loop
while [ 1 ] ; do
 sleep 1
done 

And in action:

root@crash:~# ./trap_example.sh &
[1] 3285
root@crash:~# kill -USR1 3285
root@crash:~# rereading config

root@crash:~#

I should note that there was a pause of a second or so after I sent that kill signal.  Bash only fires signal handlers “between commands”.
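If that delay matters, one common workaround (not something the script above does) is to run the long command in the background and block in the wait builtin, which Bash interrupts as soon as a trapped signal arrives.  A sketch, with a background subshell standing in for someone typing kill -USR1, and with HANDLED and WORKER as made-up variable names:

```shell
#!/bin/bash

HANDLED=0
reread_config() {
  HANDLED=1
  printf 'rereading config\n'
}
trap reread_config USR1

# Stand-in for another terminal running "kill -USR1 <PID>" one second in.
( sleep 1; kill -USR1 $$ ) &

# The long-running work goes in the background...
sleep 5 &
WORKER=$!

# ...and wait returns early once the trapped signal has been received,
# so the handler runs after about one second instead of five.
wait "$WORKER"
printf 'handler ran: %s\n' "$HANDLED"

# Tidy up the background sleep, which is still running at this point.
kill "$WORKER" 2>/dev/null
```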

Now, for our purpose of cleaning up tempfiles, we can use Bash’s “fake signal” called EXIT.  There is no “EXIT” signal in Unix, but Bash runs any handler registered for EXIT when the script is exiting, just so you can add a hook if you want.  So whether it’s your logic calling the “exit” command or someone hitting control-C or sending SIGTERM, the EXIT handler will fire.  The main exception is if someone does a kill -9 (SIGKILL), which is untrappable and will immediately stop the process.

So to clean up TEMPFILE, all we need do is:

trap "rm -f $TEMPFILE" EXIT

Example:

#!/bin/bash

TEMPFILE=$(mktemp)
trap "rm $TEMPFILE" EXIT
printf "our tempfile is %s:\n" $TEMPFILE
ls -l $TEMPFILE

In action:

root@crash:~# ./trap_example2.sh 
our tempfile is /tmp/tmp.no5wPXRwfj:
-rw------- 1 root root 0 Apr  9 11:25 /tmp/tmp.no5wPXRwfj
root@crash:~# ls -l /tmp/tmp.no5wPXRwfj
ls: cannot access '/tmp/tmp.no5wPXRwfj': No such file or directory
root@crash:~#
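The same hook also covers the “unexpected branch of logic” case.  The sketch below simulates a failure halfway through; the demo function and the USED and STATUS variables are invented for the illustration, and the function body is a subshell only so that the early exit does not end the surrounding script.

```shell
#!/bin/bash

# The function body is a subshell, so "exit" only ends the demo,
# and the EXIT trap set inside fires when that subshell finishes.
demo() (
  TEMPFILE=$(mktemp) || exit 1
  # Same form of trap as in the example above.
  trap "rm -f $TEMPFILE" EXIT
  printf '%s\n' "$TEMPFILE"

  # Simulated failure halfway through: bail out early.
  false || exit 3

  printf 'never reached\n'
)

USED=$(demo)
STATUS=$?
printf 'demo exited with status %d\n' "$STATUS"
if [ ! -e "$USED" ]; then
  printf 'tempfile was removed anyway\n'
fi
```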

Note that because the trap command is in double quotes, $TEMPFILE is expanded at the time the trap statement runs, not when the handler fires.  This would not work:

#!/bin/bash

trap "rm $TEMPFILE" EXIT
TEMPFILE=$(mktemp)
printf "our tempfile is %s:\n" $TEMPFILE
ls -l $TEMPFILE

The problem is that when the trap statement is evaluated, TEMPFILE is blank, so the EXIT signal handler is “rm ” not “rm /tmp/tmp.xxxxx”.
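One way around the ordering problem, offered as a sketch rather than the only fix, is to put the trap command in single quotes, so that $TEMPFILE is not expanded until the handler actually runs.  In Bash, trap -p prints the stored command, which makes the difference visible.

```shell
#!/bin/bash

TEMPFILE=

# Double quotes: $TEMPFILE is expanded now, while it is still empty.
trap "rm -f $TEMPFILE" EXIT
trap -p EXIT

# Single quotes: the text is stored as-is and expanded when the handler runs.
trap 'rm -f "$TEMPFILE"' EXIT
trap -p EXIT

# Setting the variable after the trap is now fine.
TEMPFILE=$(mktemp) || exit 1
printf 'our tempfile is %s\n' "$TEMPFILE"
```

The second trap replaces the first, so only the single-quoted handler runs at exit.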

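But what if you have many tempfiles?  The solution there is to use a unique pattern for your tempfiles, so that you can issue an rm with a wildcard.  BUT BEWARE OF USING DOUBLE QUOTES HERE.  Variables inside them are expanded when the trap statement is set, not when the handler fires, so anything the pattern depends on must already have its final value at that point.  (The wildcard itself is only expanded when the handler runs, whichever quotes you use.)  So I’d recommend always using single quotes.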

Here are two solutions, and there are others.

First, you can use a UUID pattern to create the tempfiles.  Then when you go to remove them, you can just rm with a wildcard match for that UUID.  Barring a one-in-5.3×10^36 chance of someone else getting the same UUID (it’s not going to happen), you can remove them with a pattern.

Example:

#!/bin/bash

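# stop if no UUID was generated, otherwise the wildcard below would match all of /tmp
UUID=$(uuid -v 4) || exit 1

# set the trap first, so an early exit still cleans up
trap 'rm -f /tmp/"${UUID}"*' EXIT

TEMPFILE1=$(mktemp "/tmp/${UUID}-XXX")
printf "TEMPFILE1 is %s\n" "$TEMPFILE1"
TEMPFILE2=$(mktemp "/tmp/${UUID}-XXX")
printf "TEMPFILE2 is %s\n" "$TEMPFILE2"
TEMPFILE3=$(mktemp "/tmp/${UUID}-XXX")
printf "TEMPFILE3 is %s\n" "$TEMPFILE3"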

In action:

root@crash:~# ./trap_example3.sh 
TEMPFILE1 is /tmp/62d7075d-d837-4327-8ae6-2bed81e1ad64-SsR
TEMPFILE2 is /tmp/62d7075d-d837-4327-8ae6-2bed81e1ad64-fRv
TEMPFILE3 is /tmp/62d7075d-d837-4327-8ae6-2bed81e1ad64-bJc
root@crash:~# ls -l /tmp/62d7075d-d837-4327-8ae6-2bed81e1ad64*
ls: cannot access '/tmp/62d7075d-d837-4327-8ae6-2bed81e1ad64*': No such file or directory
root@crash:~#

Another approach would be to use an array for tempfiles, and then clean that up.  Example:

#!/bin/bash 

declare -a TEMPFILES

TEMPFILES[1]=$(mktemp)
TEMPFILES[2]=$(mktemp)
TEMPFILES[3]=$(mktemp)
TEMPFILES[4]=$(mktemp)
TEMPFILES[5]=$(mktemp)

for i in 1 2 3 4 5 ; do
 printf "tempfile %d is %s:\n" "$i" "${TEMPFILES[i]}"
 ls -l "${TEMPFILES[i]}"
done

trap 'for tempfile in "${TEMPFILES[@]}" ; do rm -f "$tempfile" ; done' EXIT

In action:

root@crash:~# ./trap_example4.sh 
tempfile 1 is /tmp/tmp.DwuPExLh3B:
-rw------- 1 root root 0 Apr  9 11:49 /tmp/tmp.DwuPExLh3B
tempfile 2 is /tmp/tmp.9qiJxzLxCC:
-rw------- 1 root root 0 Apr  9 11:49 /tmp/tmp.9qiJxzLxCC
tempfile 3 is /tmp/tmp.RzWwthPXx9:
-rw------- 1 root root 0 Apr  9 11:49 /tmp/tmp.RzWwthPXx9
tempfile 4 is /tmp/tmp.u4RFSQfORY:
-rw------- 1 root root 0 Apr  9 11:49 /tmp/tmp.u4RFSQfORY
tempfile 5 is /tmp/tmp.D4HbvM9egS:
-rw------- 1 root root 0 Apr  9 11:49 /tmp/tmp.D4HbvM9egS
root@crash:~# ls -l /tmp/tmp*
ls: cannot access '/tmp/tmp*': No such file or directory
root@crash:~#

There are other solutions, such as using a function to maintain a list of tempfiles and then calling a cleanup function which consults that list.  See this Linux Journal article for an implementation.
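For a rough idea of what that could look like, here is a minimal sketch.  It is not the implementation from that article; new_tempfile, cleanup, NEW_TEMPFILE, SORTED, and MERGED are all hypothetical names chosen for this example.

```shell
#!/bin/bash

TEMPFILES=()

# Remove every tempfile recorded so far.
cleanup() {
  if [ "${#TEMPFILES[@]}" -gt 0 ]; then
    rm -f "${TEMPFILES[@]}"
  fi
}
trap cleanup EXIT

# Create a tempfile, record it, and leave its name in NEW_TEMPFILE.
# Do not call this as $(new_tempfile): the command substitution runs
# in a subshell, so the update to TEMPFILES would be lost.
new_tempfile() {
  NEW_TEMPFILE=$(mktemp) || return 1
  TEMPFILES+=("$NEW_TEMPFILE")
}

new_tempfile || exit 1
SORTED=$NEW_TEMPFILE
new_tempfile || exit 1
MERGED=$NEW_TEMPFILE

printf 'tracking %d tempfiles\n' "${#TEMPFILES[@]}"
```

Because the trap is set before the first tempfile exists, a failure at any later point still cleans up whatever has been created so far.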

 

 

raindog308

1 Comment

  1. cochon:

    A few embellishments if I may:

    By declaring the trap handler as a function you can delay the evaluation of any variables until the trap happens, so you can set, or even change them much later in the script.

    If you need to create more than just one or two transient files you can also create a temporary folder using mktemp, create files with regular names within that, without fear of conflict, and clear the whole lot in the trap handler.

    BASH always runs the EXIT trap, even on Ctrl-C or ‘kill -TERM’, but other shells (e.g. Debian’s DASH) do not, so it is well worth trapping TERM and INT as well for portability.

    e.g.

    #!/bin/sh
    
    cleanup() {
      cd # out of the TEMPFOLDER folder
      [ -d "$TEMPFOLDER" ] && rm -rf "$TEMPFOLDER"
      trap - EXIT # to prevent double call if INT or TERM
      exit
    }
    
    trap "cleanup" EXIT INT TERM
    
    TEMPFOLDER=$(mktemp -t -d)
    cd "$TEMPFOLDER" || exit 1
    
    echo test > testfile
    
    exit
    
    April 9, 2022 @ 4:52 pm
