Here Today, Gone When You Exit: Proper Tempfiles in Shell Scripts

Apr 09, 2022 @ 2:53 pm

In the course of my career, I’ve periodically come across code like this in shell scripts:

TEMP_FILE=/tmp/tempfile

Or sometimes, slightly more elegantly:

TEMPFILE=/tmp/tempfile.$$

The problem with the first example is obvious, especially if it appears in many different scripts: every script, and every concurrent run of the same script, ends up writing to the same file. The second is better. The “$$” means “my process ID”, so if the script had a process ID of 5309, the TEMPFILE variable would be set to /tmp/tempfile.5309. This makes collisions between scripts extremely unlikely, but it is still suboptimal. What if there is already a file called /tmp/tempfile.5309 and it’s owned by another user, or what if you don’t have permission to write to /tmp? It’d be better to find out immediately than many lines later when you try to write something.

That’s a core consideration here. In the above examples, we’re just assigning a value to a variable. We’re not guaranteeing that we can use the tempfile. What should happen is that we somehow (1) get a tempfile name, and (2) guarantee that we can use it (at least at the instant it’s created). Fortunately we can do just that with mktemp!

root@crash:~# mktemp
/tmp/tmp.OSN8Yv7RUj
root@crash:~# ls -l /tmp/tmp.OSN8Yv7RUj
-rw------- 1 root root 0 Apr  9 10:33 /tmp/tmp.OSN8Yv7RUj
root@crash:~#

The mktemp(1) command chooses a unique tempfile name for us, creates the file so that it is owned by and accessible only to the caller, and then prints the name. So the One True Way to choose a tempfile is:

TEMPFILE=$(mktemp)
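Since the whole point is to find out immediately when a tempfile can't be created, it's worth checking mktemp's exit status too. Here is a minimal sketch of that idea; the wording of the error message and the exit code are just my choices for illustration.

```shell
#!/bin/bash

# mktemp exits non-zero if it cannot create the file
# (for example, if the temp directory is missing or not writable)
if ! TEMPFILE=$(mktemp) ; then
 printf "could not create a tempfile, giving up\n" >&2
 exit 1
fi

# from here on we know the file exists and is ours
printf "scratch data\n" > "$TEMPFILE"
printf "wrote to %s\n" "$TEMPFILE"

# tidy up by hand for now
rm -f "$TEMPFILE"
```

If mktemp fails, the script stops on the spot instead of stumbling over a missing file many lines later.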

mktemp comes with many useful options such as using a different directory, creating a temp directory instead of a file, specifying the naming pattern to use when using a file, and more. But in true Unix fashion, when called “naked” (no arguments), it does its primary function with useful defaults.
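To make a few of those options concrete, here is a short sketch. It sticks to the -d flag and to template arguments; the variable names and the "myscript" and "report" name patterns are made up for illustration.

```shell
#!/bin/bash

# -d creates a temp directory instead of a file; the trailing Xs in the
# template are replaced with random characters
WORKDIR=$(mktemp -d "${TMPDIR:-/tmp}/myscript.XXXXXX") || exit 1

# a template with a path puts the file in the directory of our choosing
# and controls the naming pattern at the same time
REPORT=$(mktemp "$WORKDIR/report.XXXXXX") || exit 1

ls -ld "$WORKDIR" "$REPORT"

# tidy up by hand for now
rm -rf "$WORKDIR"
```

Putting related tempfiles inside one temp directory also means there is only one thing to remove at the end.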

There’s still one problem though. Script writers are legendary for creating tempfiles and forgetting to clean them up. Go log into any large Unix system with many users and you’ll probably find hundreds or thousands of tempfiles in /tmp that someone created in a script and forgot to clean up. Sometimes it’s just ignorance, and sometimes programmer laziness. Sometimes it’s because a program exits through an unexpected branch of logic and the “clean up” at the end is never called.

Wouldn’t it be cool if there was a way to automatically clean up all created tempfiles? I mean, regardless of how your program exits, the tempfiles will be cleaned up?

There is!

I’m using Bash 5.1, but as I recall, what I’m showing you goes back to ksh88 and probably earlier.

What we’re going to look at is the trap facility. First, you need to know a little about signals. When a process is sent a signal, such as by someone typing “kill <PID>”, it normally does whatever the default for that signal is. For example, if you type

kill 5309

Then process 5309 is sent a TERM signal, which is what kill sends when you don’t name one. If the process does nothing with the signal, the default is to terminate the process. However, the process can install a handler for that signal and say “wait, before I quit, I want to do X and Y” or “sorry, I’m ignoring your TERM”. Note that the KILL (9) signal cannot be handled or ignored.

Sometimes signaling is used to tell a program to do something. For example, if you send the HUP (hangup) signal to named, it will reread its config, a common convention. There are two signals (USR1 and USR2) which have no predefined meaning – they’re for you to use for whatever you want in your program.

The Bash trap (a shell built-in) syntax is:

trap [what to do] [on what signal]

For example, here is a quick example of how a Bash script can handle the USR1 signal to reread its configuration:

#!/bin/bash

function reread_config() {
 printf "rereading config\n"
}

trap reread_config USR1

# infinite wait loop
while [ 1 ] ; do
 sleep 1
done

And in action:

root@crash:~# ./trap_example.sh &
[1] 3285
root@crash:~# kill -USR1 3285
root@crash:~# rereading config

root@crash:~#

I should note that there was a pause of a second or so after I sent that kill signal. Bash only fires signal handlers “between commands”.
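If that delay matters, a common workaround is to put the sleep in the background and wait for it with the wait builtin, which returns as soon as a trapped signal arrives. The sketch below is one way to do it, not the only one. So that it runs unattended, it sends USR1 to itself after a second and stops looping once the handler has run; GOT_SIGNAL, SLEEP_PID and ELAPSED are names I made up for the demo.

```shell
#!/bin/bash

GOT_SIGNAL=0

function reread_config() {
 printf "rereading config\n"
 GOT_SIGNAL=1
}

trap reread_config USR1

# for demonstration only: signal ourselves shortly after starting
( sleep 1 ; kill -USR1 $$ ) &

START=$SECONDS
while [ "$GOT_SIGNAL" -eq 0 ] ; do
 # a long sleep in the background; "wait" is a builtin and returns
 # as soon as a trapped signal arrives, so the handler is not held up
 sleep 30 &
 SLEEP_PID=$!
 wait "$SLEEP_PID"
done

# the interrupted sleep is still running, so stop it
kill "$SLEEP_PID" 2>/dev/null

ELAPSED=$((SECONDS - START))
printf "handler ran after about %d second(s), not 30\n" "$ELAPSED"
```

The handler does not have to wait for the 30-second sleep to finish, because Bash is sitting in wait rather than in a foreground command.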

Now, for our purpose of cleaning up tempfiles, we can use Bash’s “fake signal” called EXIT. There is no “EXIT” signal in Unix, but Bash runs the EXIT trap when the script is exiting, just so you can add a hook if you want. So whether it’s your logic calling the “exit” command or someone hitting control-C or sending SIGTERM, the EXIT handler will fire. The only exception is if someone does a kill -9 (SIGKILL), which is untrappable and will immediately stop the process.
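Here is a small experiment you can run to see this for yourself. It's a sketch under a couple of assumptions: each child is a separate bash process started with bash -c, and FILE_A, FILE_B, NAMEFILE and CHILD are names I picked for the demo. One child ends by calling exit, the other is killed with a plain kill (TERM) while it is looping.

```shell
#!/bin/bash

# 1. the child creates a tempfile, sets an EXIT handler, then calls exit
FILE_A=$(bash -c 'f=$(mktemp); trap "rm -f $f" EXIT; echo "$f"; exit 3')

# 2. the child does the same, but loops until we kill it with TERM
NAMEFILE=$(mktemp)
bash -c 'f=$(mktemp); trap "rm -f $f" EXIT; echo "$f"; while true; do sleep 1; done' > "$NAMEFILE" &
CHILD=$!
sleep 1        # give the child time to start and set its trap
kill "$CHILD"
wait "$CHILD"
FILE_B=$(cat "$NAMEFILE")
rm -f "$NAMEFILE"

# report whether each child's tempfile survived
for f in "$FILE_A" "$FILE_B" ; do
 if [ -e "$f" ] ; then
  printf "%s was left behind\n" "$f"
 else
  printf "%s was cleaned up\n" "$f"
 fi
done
```

Change the kill to kill -9 and the second file stays behind, since Bash never gets a chance to run the handler.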

So to clean up TEMPFILE, all we need to do is:

trap "rm -f $TEMPFILE" EXIT

Example:

#!/bin/bash

TEMPFILE=$(mktemp)
trap "rm $TEMPFILE" EXIT
printf "our tempfile is %s:\n" $TEMPFILE
ls -l $TEMPFILE

In action:

root@crash:~# ./trap_example2.sh 
our tempfile is /tmp/tmp.no5wPXRwfj:
-rw------- 1 root root 0 Apr  9 11:25 /tmp/tmp.no5wPXRwfj
root@crash:~# ls -l /tmp/tmp.no5wPXRwfj
ls: cannot access '/tmp/tmp.no5wPXRwfj': No such file or directory
root@crash:~#

Note that because the handler is in double quotes, $TEMPFILE is expanded at the time the trap statement is run, not when the handler fires. This would not work:

#!/bin/bash

trap "rm $TEMPFILE" EXIT
TEMPFILE=$(mktemp)
printf "our tempfile is %s:\n" $TEMPFILE
ls -l $TEMPFILE

The problem is that when the trap statement is evaluated, TEMPFILE is blank, so the EXIT signal handler is “rm ” not “rm /tmp/tmp.xxxxx”.
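You can check what Bash actually stored by asking it with trap -p. The sketch below runs each quoting style in its own subshell and changes the variable after the trap is set; the /tmp/example.* names are hypothetical and no files are created or removed.

```shell
#!/bin/bash

# double quotes: $TEMPFILE is expanded as the trap statement is read
DOUBLE=$(
 TEMPFILE=/tmp/example.one
 trap "rm -f $TEMPFILE" EXIT
 TEMPFILE=/tmp/example.two
 trap -p EXIT      # print the handler as Bash stored it
 trap - EXIT       # demo only: remove the handler so nothing is deleted
)

# single quotes: $TEMPFILE is left alone until the handler runs
SINGLE=$(
 TEMPFILE=/tmp/example.one
 trap 'rm -f "$TEMPFILE"' EXIT
 TEMPFILE=/tmp/example.two
 trap -p EXIT
 trap - EXIT
)

printf "double quotes stored: %s\n" "$DOUBLE"
printf "single quotes stored: %s\n" "$SINGLE"
```

The double-quoted handler has the first filename baked into it, while the single-quoted one still contains the text $TEMPFILE and so would pick up whatever the variable holds at exit time.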

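But what if you have many tempfiles? The solution there is to use a unique pattern for your tempfiles, so that you can issue an rm with a wildcard. BUT BEWARE OF DOUBLE QUOTES HERE. Variables inside double quotes are expanded when the trap statement is set, not when the handler runs (the wildcard itself is only expanded when the handler runs). That is fine as long as the variable already has its final value when trap is called; otherwise, use single quotes, which defer all expansion until the handler runs.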

Here are two solutions, and there are others.

First, you can use a UUID pattern to create the tempfiles. Then when you go to remove them, you can just rm with a wildcard match for that UUID. Barring a one-in-5.3×10³⁶ chance of someone else getting the same UUID (it’s not going to happen), you can remove them with a pattern.

Example:

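#!/bin/bash

# bail out if uuid is missing or fails: an empty UUID would turn
# the rm below into "rm -f /tmp/*"
UUID=$(uuid -v 4) || exit 1

# set the trap first so the files are removed even if we die partway through
trap "rm -f /tmp/${UUID}*" EXIT

TEMPFILE1=$(mktemp /tmp/${UUID}-XXX)
printf "TEMPFILE1 is %s\n" "$TEMPFILE1"
TEMPFILE2=$(mktemp /tmp/${UUID}-XXX)
printf "TEMPFILE2 is %s\n" "$TEMPFILE2"
TEMPFILE3=$(mktemp /tmp/${UUID}-XXX)
printf "TEMPFILE3 is %s\n" "$TEMPFILE3"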

In action:

root@crash:~# ./trap_example3.sh 
TEMPFILE1 is /tmp/62d7075d-d837-4327-8ae6-2bed81e1ad64-SsR
TEMPFILE2 is /tmp/62d7075d-d837-4327-8ae6-2bed81e1ad64-fRv
TEMPFILE3 is /tmp/62d7075d-d837-4327-8ae6-2bed81e1ad64-bJc
root@crash:~# ls -l /tmp/62d7075d-d837-4327-8ae6-2bed81e1ad64*
ls: cannot access '/tmp/62d7075d-d837-4327-8ae6-2bed81e1ad64*': No such file or directory
root@crash:~#

Another approach would be to use an array for tempfiles, and then clean that up. Example:

#!/bin/bash 

declare -a TEMPFILES

TEMPFILES[1]=$(mktemp)
TEMPFILES[2]=$(mktemp)
TEMPFILES[3]=$(mktemp)
TEMPFILES[4]=$(mktemp)
TEMPFILES[5]=$(mktemp)

for i in 1 2 3 4 5 ; do
 printf "tempfile %d is %s:\n" $i ${TEMPFILES[i]}
 ls -l ${TEMPFILES[i]}
done

trap 'for tempfile in "${TEMPFILES[@]}" ; do rm -f "$tempfile" ; done' EXIT

In action:

root@crash:~# ./trap_example4.sh 
tempfile 1 is /tmp/tmp.DwuPExLh3B:
-rw------- 1 root root 0 Apr  9 11:49 /tmp/tmp.DwuPExLh3B
tempfile 2 is /tmp/tmp.9qiJxzLxCC:
-rw------- 1 root root 0 Apr  9 11:49 /tmp/tmp.9qiJxzLxCC
tempfile 3 is /tmp/tmp.RzWwthPXx9:
-rw------- 1 root root 0 Apr  9 11:49 /tmp/tmp.RzWwthPXx9
tempfile 4 is /tmp/tmp.u4RFSQfORY:
-rw------- 1 root root 0 Apr  9 11:49 /tmp/tmp.u4RFSQfORY
tempfile 5 is /tmp/tmp.D4HbvM9egS:
-rw------- 1 root root 0 Apr  9 11:49 /tmp/tmp.D4HbvM9egS
root@crash:~# ls -l /tmp/tmp*
ls: cannot access '/tmp/tmp*': No such file or directory
root@crash:~#

There are other solutions, such as using a function to maintain a list of tempfiles and then calling a cleanup function which consults that list. See this Linux Journal article for an implementation.
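To give a feel for that approach, here is a rough sketch of my own. It is not the implementation from that article, and make_tempfile and cleanup_tempfiles are hypothetical helper names.

```shell
#!/bin/bash

declare -a TEMPFILES=()

# hypothetical helper: create a tempfile, remember it in the list, and
# hand the name back in the variable named by $1
function make_tempfile() {
 local newfile
 newfile=$(mktemp) || return 1
 TEMPFILES+=("$newfile")
 printf -v "$1" '%s' "$newfile"
}

# hypothetical helper: remove everything make_tempfile has created
function cleanup_tempfiles() {
 local tempfile
 for tempfile in "${TEMPFILES[@]}" ; do
  rm -f "$tempfile"
 done
 TEMPFILES=()
}

# the handler is a function, so it reads the list when it runs,
# not when the trap is set
trap cleanup_tempfiles EXIT

make_tempfile SORTED || exit 1
make_tempfile REPORT || exit 1
printf "tempfiles: %s %s\n" "$SORTED" "$REPORT"
```

The helper returns the name through a variable rather than through $(make_tempfile) on purpose: command substitution runs in a subshell, so the addition to TEMPFILES would be lost and the file would never be cleaned up.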

raindog308

Raindog308 is a longtime LowEndTalk community administrator, technical writer, and self-described techno polymath. With deep roots in the *nix world, he has a passion for systems both modern and vintage, ranging from Unix, Perl, Python, and Golang to shell scripting and mainframe-era operating systems like MVS. He’s equally comfortable with relational database systems, having spent years working with Oracle, PostgreSQL, and MySQL.

1 Comment

cochon:
A few embellishments if I may:
By declaring the trap handler as a function you can delay the evaluation of any variables until the trap happens, so you can set, or even change them much later in the script.
If you need to create more than just one or two transient files you can also create a temporary folder using mktemp, create files with regular names within that, without fear of conflict, and clear the whole lot in the trap handler.
BASH always runs the EXIT trap, even on Ctrl-C or ‘kill -TERM’, but other shells (e.g. Debian’s DASH) do not, so it is well worth trapping TERM and INT as well for portability.
e.g.
```
#!/bin/sh

cleanup() {
  cd # out of the TEMPFOLDER folder
  [ -d "$TEMPFOLDER" ] && rm -rf "$TEMPFOLDER"
  trap - EXIT # to prevent double call if INT or TERM
  exit
}

trap "cleanup" EXIT INT TERM

TEMPFOLDER=$(mktemp -t -d)
cd "$TEMPFOLDER" || exit 1

echo test > testfile

exit
```
April 9, 2022 @ 4:52 pm