In the course of my career, I’ve periodically come across code like this in shell scripts:
TEMP_FILE=/tmp/tempfile
Or sometimes, slightly more elegantly:
TEMPFILE=/tmp/tempfile.$$
The problems with the first example are obvious, especially if it appears in many different scripts. The second is better. The “$$” means “my process ID”, so if the script had a process ID of 5309, the TEMPFILE variable would be set to /tmp/tempfile.5309. This makes collisions between scripts extremely unlikely, but it is still suboptimal. What if a file called /tmp/tempfile.5309 already exists and is owned by another user, or what if you don’t have permission to write to /tmp? It’d be better to find out immediately than many lines later when you try to write something.
That’s a core consideration here. In the above examples, we’re just assigning a value to a variable. We’re not guaranteeing that we can use the tempfile. What should happen is that we somehow (1) get a tempfile name, and (2) guarantee that we can use it (at least at the instant it’s created). Fortunately we can do just that with mktemp!
root@crash:~# mktemp
/tmp/tmp.OSN8Yv7RUj
root@crash:~# ls -l /tmp/tmp.OSN8Yv7RUj
-rw------- 1 root root 0 Apr 9 10:33 /tmp/tmp.OSN8Yv7RUj
root@crash:~#
The mktemp(1) command chooses a unique tempfile name for us, creates the file with ownership and permissions restricted to the caller, and then prints the name. So the One True Way to choose a tempfile is:
TEMPFILE=$(mktemp)
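Since the whole point is to find out immediately whether the tempfile is usable, it may be worth checking mktemp's exit status on the spot. A minimal sketch (the error message and the "scratch data" payload are just illustrations):

```shell
#!/bin/bash
# Fail fast: if mktemp cannot create the file, stop here rather than many lines later.
TEMPFILE=$(mktemp) || {
    printf 'could not create a tempfile\n' >&2
    exit 1
}

# From here on the file exists and belongs to us.
printf 'scratch data\n' > "$TEMPFILE"
contents=$(cat "$TEMPFILE")
printf 'wrote "%s" to %s\n' "$contents" "$TEMPFILE"

# Manual cleanup for now.
rm -f "$TEMPFILE"
```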
mktemp comes with many useful options such as using a different directory, creating a temp directory instead of a file, specifying the naming pattern to use when using a file, and more. But in true Unix fashion, when called “naked” (no arguments), it does its primary function with useful defaults.
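As a rough sketch of two of those options: -d asks for a directory instead of a file, and a template argument controls both the location and the naming pattern. The "report" prefix below is made up for illustration; the trailing X's are what mktemp replaces with random characters.

```shell
#!/bin/bash
# A temporary directory instead of a file (-d).
WORKDIR=$(mktemp -d) || exit 1

# A tempfile inside that directory, using our own naming pattern.
REPORT=$(mktemp "$WORKDIR/report.XXXXXX") || exit 1

printf 'directory: %s\n' "$WORKDIR"
printf 'file:      %s\n' "$REPORT"

# Removing the directory takes the file with it.
rm -rf "$WORKDIR"
```

Passing a full path in the template is one way to put the file in a different directory without relying on options that vary between mktemp implementations.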
There’s still one problem though. Script writers are legendary for creating tempfiles and forgetting to clean them up. Go log into any large Unix system with many users and you’ll probably find hundreds or thousands of tempfiles in /tmp that someone created in a script and forgot to clean up. Sometimes it’s just ignorance, and sometimes programmer laziness. Sometimes it’s because a program exits through an unexpected branch of logic and the “clean up” at the end is never called.
Wouldn’t it be cool if there was a way to automatically clean up all created tempfiles? I mean, regardless of how your program exits, the tempfiles will be cleaned up?
There is!
I’m using Bash 5.1, but as I recall, what I’m showing you goes back to ksh88 and probably earlier.
What we’re going to look at is the trap facility. First, you need to know a little about signals. When a process is sent a signal, such as by someone typing “kill <PID>”, it normally does whatever the default for that signal is. For example, if you type
kill 5309
Then process 5309 is sent a TERM signal, which is what kill sends when you don’t name a signal. If the process does nothing with the signal, the default is to terminate the process. However, the process can install a handler for that signal and say “wait, before I quit, I want to do X and Y” or “sorry, I’m ignoring your TERM”. Note that the KILL (9) signal cannot be handled or ignored.
Sometimes signaling is used to tell a program to do something. For example, if you send the HUP (hangup) signal to named, it will reread its config, a common convention. There are two signals (USR1 and USR2) which have no predefined meaning – they’re for you to use for whatever you want in your program.
The Bash trap (a shell built-in) syntax is:
trap [what to do] [on what signal]
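To make that syntax a bit more concrete, here is a small sketch of the common forms: a handler function, an empty string to ignore a signal, and a dash to restore the default. The on_term function and term_count variable are hypothetical names, and the script signals itself only so that it is self-contained.

```shell
#!/bin/bash
term_count=0

# Handler: count and report each TERM we receive.
on_term() {
    term_count=$((term_count + 1))
    printf 'caught TERM (%d so far)\n' "$term_count"
}

trap on_term TERM   # run on_term instead of terminating
trap '' USR2        # empty string: ignore USR2 entirely

kill -TERM $$       # send ourselves TERM; the handler runs, we keep going
kill -USR2 $$       # ignored, nothing happens

trap                # with no arguments, trap lists the handlers currently set
trap - TERM         # a dash restores the default behavior for TERM
```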
For example, here is a quick example of how a Bash script can handle the USR1 signal to reread its configuration:
#!/bin/bash

function reread_config() {
    printf "rereading config\n"
}

trap reread_config USR1

# infinite wait loop
while [ 1 ] ; do
    sleep 1
done
And in action:
root@crash:~# ./trap_example.sh &
[1] 3285
root@crash:~# kill -USR1 3285
root@crash:~# rereading config
root@crash:~#
I should note that there was a pause of a second or so after I sent that kill signal. Bash only fires signal handlers “between commands”, so the handler had to wait for the current sleep to finish.
Now, for our purpose of cleaning up tempfiles, we can use Bash’s “fake signal” called EXIT. There is no “EXIT” signal in Unix, but Bash treats EXIT as a pseudo-signal that fires when the script is exiting, just so you can add a hook if you want. So whether it’s your logic calling the “exit” command, someone hitting control-C, or someone sending SIGTERM, the EXIT handler will fire. The only exception is if someone does a kill -9 (SIGKILL), which is untrappable and will immediately stop the process.
So to clean up TEMPFILE, all we need do is:
trap "rm -f $TEMPFILE" EXIT
Example:
#!/bin/bash

TEMPFILE=$(mktemp)
trap "rm $TEMPFILE" EXIT
printf "our tempfile is %s:\n" $TEMPFILE
ls -l $TEMPFILE
In action:
root@crash:~# ./trap_example2.sh
our tempfile is /tmp/tmp.no5wPXRwfj:
-rw------- 1 root root 0 Apr 9 11:25 /tmp/tmp.no5wPXRwfj
root@crash:~# ls -l /tmp/tmp.no5wPXRwfj
ls: cannot access '/tmp/tmp.no5wPXRwfj': No such file or directory
root@crash:~#
Note that because the command is in double quotes, $TEMPFILE is expanded at the moment the trap statement itself runs, not later when the handler fires. This would not work:
#!/bin/bash

trap "rm $TEMPFILE" EXIT
TEMPFILE=$(mktemp)
printf "our tempfile is %s:\n" $TEMPFILE
ls -l $TEMPFILE
The problem is that when the trap statement is evaluated, TEMPFILE is blank, so the EXIT signal handler is “rm ” not “rm /tmp/tmp.xxxxx”.
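The timing difference is easy to see side by side. In this sketch each $( ... ) is a subshell with its own EXIT handler, and TEMPFILE is assigned only after the trap is installed. The path /tmp/late is a made-up string; no file is created.

```shell
#!/bin/bash
unset TEMPFILE

# Double quotes: $TEMPFILE is expanded right now, while it is still empty.
early=$( trap "echo 'target=[$TEMPFILE]'" EXIT; TEMPFILE=/tmp/late )

# Single quotes: $TEMPFILE is expanded when the handler runs at exit.
late=$( trap 'echo "target=[$TEMPFILE]"' EXIT; TEMPFILE=/tmp/late )

printf 'double quotes: %s\n' "$early"
printf 'single quotes: %s\n' "$late"
```

The double-quoted handler reports an empty target, while the single-quoted one sees the value assigned after the trap was set.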
But what if you have many tempfiles? One solution is to use a unique pattern for your tempfiles, so that you can issue an rm with a wildcard. BUT BEWARE OF USING DOUBLE QUOTES HERE. Variables inside double quotes are expanded when the trap statement is created, not when the handler is executed (the wildcard itself is not expanded until the handler runs). So I’d recommend using single quotes unless you specifically want the value as it stands when the trap is set.
Here are two solutions, and there are others.
First, you can use a UUID pattern to create the tempfiles. Then when you go to remove them, you can just rm with a wildcard match for that UUID. Barring a one-in-5.3×10^36 chance of someone else getting the same UUID (it’s not going to happen), you can remove them with a pattern.
Example:
#!/bin/bash

UUID=$(uuid -v 4)

TEMPFILE1=$(mktemp /tmp/${UUID}-XXX)
printf "TEMPFILE1 is $TEMPFILE1\n"
TEMPFILE2=$(mktemp /tmp/${UUID}-XXX)
printf "TEMPFILE2 is $TEMPFILE2\n"
TEMPFILE3=$(mktemp /tmp/${UUID}-XXX)
printf "TEMPFILE3 is $TEMPFILE3\n"

trap "rm -f /tmp/${UUID}*" EXIT
In action:
root@crash:~# ./trap_example3.sh
TEMPFILE1 is /tmp/62d7075d-d837-4327-8ae6-2bed81e1ad64-SsR
TEMPFILE2 is /tmp/62d7075d-d837-4327-8ae6-2bed81e1ad64-fRv
TEMPFILE3 is /tmp/62d7075d-d837-4327-8ae6-2bed81e1ad64-bJc
root@crash:~# ls -l /tmp/62d7075d-d837-4327-8ae6-2bed81e1ad64*
ls: cannot access '/tmp/62d7075d-d837-4327-8ae6-2bed81e1ad64*': No such file or directory
root@crash:~#
Another approach would be to use an array for tempfiles, and then clean that up. Example:
#!/bin/bash

declare -a TEMPFILES

TEMPFILES[1]=$(mktemp)
TEMPFILES[2]=$(mktemp)
TEMPFILES[3]=$(mktemp)
TEMPFILES[4]=$(mktemp)
TEMPFILES[5]=$(mktemp)

for i in 1 2 3 4 5 ; do
    printf "tempfile %d is %s:\n" $i ${TEMPFILES[i]}
    ls -l ${TEMPFILES[i]}
done

trap 'for tempfile in "${TEMPFILES[@]}" ; do rm -f $tempfile ; done' EXIT
In action:
root@crash:~# ./trap_example4.sh
tempfile 1 is /tmp/tmp.DwuPExLh3B:
-rw------- 1 root root 0 Apr 9 11:49 /tmp/tmp.DwuPExLh3B
tempfile 2 is /tmp/tmp.9qiJxzLxCC:
-rw------- 1 root root 0 Apr 9 11:49 /tmp/tmp.9qiJxzLxCC
tempfile 3 is /tmp/tmp.RzWwthPXx9:
-rw------- 1 root root 0 Apr 9 11:49 /tmp/tmp.RzWwthPXx9
tempfile 4 is /tmp/tmp.u4RFSQfORY:
-rw------- 1 root root 0 Apr 9 11:49 /tmp/tmp.u4RFSQfORY
tempfile 5 is /tmp/tmp.D4HbvM9egS:
-rw------- 1 root root 0 Apr 9 11:49 /tmp/tmp.D4HbvM9egS
root@crash:~# ls -l /tmp/tmp*
ls: cannot access '/tmp/tmp*': No such file or directory
root@crash:~#
There are other solutions, such as using a function to maintain a list of tempfiles and then calling a cleanup function which consults that list. See this Linux Journal article for an implementation.
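For a rough idea of what such a function-based approach could look like, here is a minimal sketch of my own. It is not the Linux Journal implementation, and make_tempfile, cleanup, INPUT and OUTPUT are all hypothetical names.

```shell
#!/bin/bash
declare -a TEMPFILES=()

# Remove everything on the list; runs on exit via the trap below.
cleanup() {
    if [ "${#TEMPFILES[@]}" -gt 0 ]; then
        rm -f -- "${TEMPFILES[@]}"
    fi
}
trap cleanup EXIT

# make_tempfile VARNAME: create a tempfile, record it, store its name in VARNAME.
# It takes a variable name rather than printing the path, because calling it as
# $(make_tempfile) would run it in a subshell and the list update would be lost.
make_tempfile() {
    local path
    path=$(mktemp) || return 1
    TEMPFILES+=("$path")
    printf -v "$1" '%s' "$path"
}

make_tempfile INPUT  || exit 1
make_tempfile OUTPUT || exit 1
printf 'tracking %d tempfiles: %s %s\n' "${#TEMPFILES[@]}" "$INPUT" "$OUTPUT"
```

Because the trap is installed before the first tempfile is created, anything already on the list gets removed even if the script bails out halfway through.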
A few embellishments if I may:
By declaring the trap handler as a function you can delay the evaluation of any variables until the trap happens, so you can set, or even change them much later in the script.
If you need to create more than just one or two transient files you can also create a temporary folder using mktemp, create files with regular names within that, without fear of conflict, and clear the whole lot in the trap handler.
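Those two ideas combine nicely. The following is only a sketch under my own naming (cleanup, WORKDIR and the file names are illustrative): the handler is a function, so it reads WORKDIR when it runs, and everything lives in one temporary directory that is removed in a single step.

```shell
#!/bin/bash
WORKDIR=            # set later; the handler reads it only when it runs

cleanup() {
    # Evaluated at exit time, so it sees whatever WORKDIR holds by then.
    if [ -n "$WORKDIR" ] && [ -d "$WORKDIR" ]; then
        rm -rf "$WORKDIR"
    fi
}
trap cleanup EXIT   # installed before the directory even exists

WORKDIR=$(mktemp -d) || exit 1

# Ordinary file names are safe inside a private directory.
sort_input="$WORKDIR/input.txt"
sort_output="$WORKDIR/sorted.txt"
printf 'pear\napple\nfig\n' > "$sort_input"
sort "$sort_input" > "$sort_output"
cat "$sort_output"
```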
Bash always runs the EXIT trap, even on Ctrl-C or ‘kill -TERM’, but other shells (e.g. Debian’s dash) do not, so it is well worth trapping TERM and INT as well for portability.
e.g.