shell `trap` signal
What’s wrong with signal handling like this:
#!/bin/sh
trap 'echo Cleanup…' EXIT HUP INT TERM
...
Exit and signals
Before we begin:
Actually, exit codes and signal statuses are mutually exclusive:
A process may either exit normally using exit
or terminate via a signal.
If you read man:bash you will read this:
The return value of a simple command is its exit status, or 128+n if the command is terminated by signal n.
That might give you the idea that they are the same, but that is only a (broken) shell convention to map signal statuses to exit codes. Reading man:exit you see this:
The value status & 0xFF is returned to the parent process as the process’s exit status,
So there are 256 exit codes from 0 to 255, which a process can use to exit.
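The masking can be observed from a shell. This is only a minimal sketch, assuming a shell such as bash or dash, where values above 255 end up truncated to the low byte; POSIX leaves the exit status undefined for values outside 0 to 255.

```sh
#!/bin/sh
# 300 is 0x12C; only the low byte (0x2C = 44) reaches the parent process.
masked=0
sh -c 'exit 300' || masked=$?

echo "exit 300 was reported as $masked"
echo "300 & 0xFF = $((300 & 0xFF))"
```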
The parent process then uses `waitpid()` to wait for the child's state change: the child may have exited by calling `exit()` itself, or it may have received a signal, which might have killed the process or just suspended it.
You then first have to use `WIFEXITED()` or `WIFSIGNALED()` to check whether the child exited normally via `exit()` or was terminated by a signal.
Only after that should you use either `WEXITSTATUS()` to extract the byte containing the exit code or `WTERMSIG()` to extract the signal number.
In a shell script you do not have access to these low-level C functions, but only get the mangled exit status.
You cannot distinguish whether the called process did `exit(130)` itself or was terminated by the user pressing Ctrl-C to send `SIGINT` to it.
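The ambiguity is easy to reproduce. The following sketch uses `SIGTERM` (15, so 128+15 = 143) instead of `SIGINT`, only to keep the demo independent of how the calling shell reacts to interrupted children; `status_of` is a hypothetical helper, and the result assumes a shell that follows the 128+n convention.

```sh
#!/bin/sh
# status_of CMD: hypothetical helper, runs CMD in a child shell
# and prints the status the parent shell gets to see.
status_of () {
    status=0
    sh -c "$1" || status=$?
    echo "$status"
}

exited=$(status_of 'exit 143')        # child calls exit(143) itself
killed=$(status_of 'kill -TERM $$')   # child is terminated by SIGTERM (15)

echo "exit 143: $exited"
echo "SIGTERM:  $killed"
```

Both lines report the same number, so the script has no way to tell the two cases apart.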
Signals and EXIT trap
Here’s a short overview of commonly used signals and traps.
signal | number | trigger | when |
---|---|---|---|
EXIT | “0” | exit | shell process exits |
SIGHUP | 1 | login TTY closed | |
SIGINT | 2 | Ctrl-C | user aborts process |
SIGQUIT | 3 | Ctrl-\ | user aborts process |
SIGTERM | 15 | kill $PID | |
Please note that shells misuse signal `0` here:
By default there is no signal numbered `0`.
Sending it with `kill()` is actually a no-operation that can be used to check whether process A can send signals to process B, or whether process B is still alive.
`bash` and other shells reuse that number for their `EXIT` handler, which is supposed to be called on any exit from the shell.
But that behaviour is very implementation dependent, as you will see later on.
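The no-operation use of signal `0` might look like this in a script. It is a sketch, with `is_alive` as a hypothetical helper name and a background `sleep` standing in for process B.

```sh
#!/bin/sh
# is_alive PID: hypothetical helper, succeeds if a signal could be sent to PID.
is_alive () {
    kill -0 "$1" 2>/dev/null
}

sleep 30 &
child=$!
is_alive "$child" && echo "child $child is running"

kill "$child"                        # terminate the child ...
wait "$child" 2>/dev/null || true    # ... and reap it, so the PID is really gone
is_alive "$child" || echo "child $child is gone"
```

Note that `kill -0` also fails when the process exists but belongs to another user, so a failure does not strictly prove that the process is dead.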
Implementation specific handling of EXIT
Let’s try this with the more informative shell script `trap.sh`:
#!/bin/bash
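cleanup () {
    # $? still holds the status of the last command before the trap fired
    local rv=$? sig=${1:-0}
    echo "Process $$ received signal $sig after rv=$rv"
    case "$sig" in
        0|'') exit "$rv";;                     # EXIT trap: keep the original exit status
        *) trap - "$sig"; kill "-$sig" "$$";;  # signal: reset the handler, then re-raise
    esac
}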
trap 'cleanup 0' EXIT
trap 'cleanup 1' HUP
trap 'cleanup 2' INT
trap 'cleanup 3' QUIT
trap 'cleanup 15' TERM
[ -n "${1:-}" ] && kill "-$1" "$$"
bash
$ bash ./trap.sh 0 # EXIT
Process 499218 received signal 0 after rv=0
$ bash ./trap.sh 1 # SIGHUP
Process 499237 received signal 1 after rv=0
Process 499237 received signal 0 after rv=0
Hangup
$ bash ./trap.sh 2 # SIGINT
Process 499256 received signal 2 after rv=0
Process 499256 received signal 0 after rv=0
$ bash ./trap.sh 3 # SIGQUIT
Process 499275 received signal 3 after rv=0
Process 499275 received signal 0 after rv=0
$ bash ./trap.sh 15 # SIGTERM
Process 499294 received signal 15 after rv=0
Process 499294 received signal 0 after rv=0
Terminated
As you can see, `bash` always calls the trap handler for `EXIT`!
dash
Let’s repeat this with `dash`:
$ dash ./trap.sh 0 # EXIT
Process 502873 received signal 0 after rv=1
$ dash ./trap.sh 1 # SIGHUP
Process 501892 received signal 1 after rv=0
Hangup
$ dash ./trap.sh 2 # SIGINT
Process 501912 received signal 2 after rv=0
$ dash ./trap.sh 3 # SIGQUIT
Process 501929 received signal 3 after rv=0
Quit (core dumped)
$ dash ./trap.sh 15 # SIGTERM
Process 501971 received signal 15 after rv=0
Terminated
busybox
And once more with `busybox`:
$ busybox sh ./trap.sh 0 # EXIT
Process 502338 received signal 0 after rv=0
$ busybox sh ./trap.sh 1 # SIGHUP
Process 502366 received signal 1 after rv=0
Hangup
$ busybox sh ./trap.sh 2 # SIGINT
Process 502402 received signal 2 after rv=0
$ busybox sh ./trap.sh 3 # SIGQUIT
Process 502439 received signal 3 after rv=0
Process 502439 received signal 0 after rv=0
$ busybox sh ./trap.sh 15 # SIGTERM
Process 502269 received signal 15 after rv=0
Terminated
With `dash` and `busybox` the `EXIT` handler is almost never called after a signal; the only exception is `busybox` on `SIGQUIT`.
That is why portable shell scripts set up a `trap` not only for `EXIT`, but also for other signals.
But if you do that, please make sure to do it right:
- Reset the `trap` handler to its default.
- Afterwards kill the process by re-sending the received signal to the process again.
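The two steps can be sketched as follows. The names `cleanup` and `on_signal` and the embedded child script are hypothetical; the child sends `SIGTERM` to itself as a stand-in for an external `kill $PID`. The sketch also resets the `EXIT` trap in the signal path, assuming you want the cleanup to run only once even on shells that call `EXIT` after a signal.

```sh
#!/bin/sh
# Sketch only: cleanup, on_signal and the child script are hypothetical.
child='
cleanup () { echo "cleaning up"; }

on_signal () {
    cleanup
    trap - "$1" EXIT    # 1. reset the handlers to their defaults
    kill "-$1" "$$"     # 2. re-send the signal so the parent sees it
}

trap cleanup EXIT               # normal exit
trap "on_signal HUP" HUP        # one trap per signal, each passing its name
trap "on_signal INT" INT
trap "on_signal TERM" TERM

kill -TERM "$$"         # stand-in for an external kill
echo "not reached"
'

status=0
output=$(sh -c "$child") || status=$?
echo "child printed: $output"
echo "child status:  $status"
```

Because the child re-raises the signal instead of calling `exit`, the parent sees a real signal termination (reported as 128+15 by shells following that convention) rather than an ordinary exit code.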
Why proper trap handling is important
Viacheslav Biriukov wrote a great blog post about Process groups, jobs and sessions explaining why proper exiting is important.
A program might set up a signal handler for `SIGINT` to prevent the program from just terminating, which might lose important data.
It might ask the user if terminating is okay or if the data should be saved first before quitting.
A surrounding shell script must then decide whether this is an abnormal exit, after which it should terminate itself, or whether it should continue normally.
The UNIX convention is to transfer that detail via exit codes and signal statuses.
So be careful and do it right if your shell script starts using `trap`.
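On the receiving side, a surrounding script can only apply the 128+n convention as a heuristic. The sketch below assumes a `kill -l` that accepts an exit status and prints the signal name, as bash and dash do; `describe_status` is a hypothetical helper.

```sh
#!/bin/sh
# describe_status STATUS: hypothetical helper that interprets $? by convention.
describe_status () {
    if [ "$1" -gt 128 ] && name=$(kill -l "$1" 2>/dev/null); then
        echo "status $1: terminated by SIG$name (or exit $1)"
    else
        echo "status $1: normal exit"
    fi
}

status=0
sh -c 'kill -TERM $$' || status=$?   # child dies from SIGTERM
describe_status "$status"
describe_status 3
```

The "(or exit ...)" part of the message is deliberate: as shown earlier, the shell cannot rule out that the child simply called `exit` with that value.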
Conclusion
- Use `bash`, as it has consistent handling of `trap EXIT`.
- If you want to or must use other shells: do not use the same `cleanup` trap for `EXIT` and other signals.
- If you trap signals, make sure to reset the handler and to re-raise the signal to properly propagate it.