The Universe of Discourse
Fri, 26 Jan 2007

Environmental manipulations
Unix is full of little utility programs that run some other program in a slightly modified environment. For example, the nohup command:


nohup COMMAND [ARG]...


Run COMMAND, ignoring hangup signals.

The nohup basically does signal(NOHUP, SIG_IGN) before calling execvp(COMMAND, ARGV) to execute the command.

Similarly, there is a chroot command, run as chroot new-root-directory command args..., which runs the specified command with its default root inode set to somewhere else. And there is a nice command, run as nice nice-value-adjustment command args..., which runs the specified command with its "nice" value changed. And there is an env environment-settings command args... which runs the specified command with new variables installed into the environment. The standard sudo command could also be considered to be of this type.

I have also found it useful to write trivial commands called indir, which runs a command after chdir-ing to a new directory, and stopafter, which runs a command after setting the alarm timer to a specified amount, and, just today, with-umask, which runs a command after setting the umask to a particular value.

I could probably have avoided indir and with-umask. Instead of indir DIR COMMAND, I could use sh -c 'cd DIR; exec COMMAND', for example. But indir avoids an extra layer of horrible shell quotes, which can be convenient.

Today it occurred to me to wonder if this proliferation of commands was really the best way to solve the problem. The sh -c '...' method solves it partly, for those parts of the process user area to which correspond shell builtin commands. This includes the working directory, umask, and environment variables, but not the signal table, the alarm timer, or the root directory.

There is no standardized interface to all of these things at any level. At the system call level, the working directory is changed by the chdir system call, the root directory by chroot, the alarm timer by alarm, the signal table by a bunch of OS-dependent nonsense like signal or sigaction, the nice value by setpriority, environment variables by a potentially complex bunch of memory manipulation and pointer banging, and so on.

Since there's no single interface for controlling all these things, we might get a win by making an abstraction layer for dealing with them. One place to put this abstraction layer is at the system level, and might look something like this:

	/* declares USERAREA_* constants,
                     int  userarea_set(int, ...)
	        and void *userarea_get(int) 
	#include <sys/userarea.h>

	userarea_set(USERAREA_NICE, 12);
	userarea_set(USERAREA_CWD, "/tmp");
	userarea_set(USERAREA_UMASK, 0022);
This has several drawbacks. One is that it requires kernel hacking. A subitem of this is that it will never become widespread, and that if you can't (or don't want to) replace your kernel, it cannot be made to work for you. Another is that it does not work for the environment variables, which are not really administered by the kernel. Another is that it does not fully solve the original problem, which is to obviate the plethora of nice, nohup, sudo, and env commands. You would still have to write a command to replace them. I had thought of another drawback, but forgot it while I was writing the last two sentences.

You can also put the abstraction layer at the C library level. This has fewer drawbacks. It no longer requires kernel hacking, and can provide a method for modifying the environment. But you still need to write the command that uses the library.

We may as well put the abstraction layer at the Unix command level. This means writing a command in some language, like Perl or C, which offers a shell-level interface to manipulating the process environment, perhaps something like this:

	newenv nice=12 cwd=/tmp signal=HUP:IGNORE umask=0022 -- command args...
Then newenv has a giant dispatch table inside it to process the settings accordingly:

	nice => sub { setpriority(PRIO_PROCESS, $$, $_) },
	cwd  => sub { chdir($_) },
	signal => sub {
		    my ($name, $result) = split /:/;
		    $SIG{$name} = $result;
	umask => sub { umask(oct($_)) },
One question to ask is whether something like this already exists. Another is, if not, whether it's because there's some reason why it's a bad idea, or because there's a simpler solution, or just because nobody has done it yet.

[Other articles in category /Unix] permanent link