The Universe of Discourse
Wed, 16 Nov 2011

Arithmetic expressions in shell scripts
This spring will be the 25th anniversary of my involvement with Unix, and I have spent way too much of that time writing shell scripts. Back before we had Perl and the other 'P' languages (Python, PHP, Puby, and Pickle) you programmed in C or you programmed in shell. Bourne shell, to be specific. (It was named for its author, Steven Bourne. There was a time before there was a Bourne shell, when there was only "the shell", written by Ken Thompson, but that predates even my experience.) People did sometimes try to program the C shell, but only the very foolish tried it more than once. (Tom Christiansen once wrote a very detailed article explaining why, if you are interested.)

C is still used, but it is still C, and, as they say, C is a language that combines the power of raw assembly with the expressiveness of raw assembly. If you wanted to do systems programming, you wrote in C, because that was what there was, but if you wanted to do almost anything else, you wrote in Bourne shell, because otherwise you spent a lot of time counting bytes and groveling over core dumps. If you knew what you were doing, you wrote as much as possible in Bourne shell, and for the parts where your shell script needed to do something interesting, you had it invoke some small utility program that you or someone else had written in C.

"Interesting" in this case had an extremely low threshhold. You called out to a C utility to sort data. You called out to a C utility to remove or rename a file. You called out to a C utility to test for the existence of a file. You called out to a C utility to compare two strings. In early versions of the shell, you called out to a C utility to perform file globbing—that is, to expand something like dir?/*.c to a list of files—although this function had been absorbed into the shell itself by 1979, several years before I arrived. You called out to a C utility to print a string to the terminal. And you called out to a C utility if you wanted to do arithmetic.

Even including languages that nobody is expected to actually use, Bourne shell is probably the only programming language I have ever used that does not have any built-in operators for performing arithmetic. Instead, there is a C utility program called expr which interprets its command-line arguments as an arithmetic expression, evaluates the expression, and prints the result on the standard output. So for example, if your script has variables x and y and you want to add these and store the result into z, you write:

       z=`expr $x + $y`
This will fork a subprocess, which will execute the command expr 3 + 4 (or whatever). The command will emit the string 7 into a pipe, and the shell will read the string out of the pipe and store it into z. Astounding!

The expr program is a real piece of crap. The following reasonable-seeming invocations of expr all fail:

       z=`expr $x + 1.5`
       z=`expr $x+$y`
       z=`expr $x * $y`
The first fails because the craptastic yacc parser in expr has a value stack that is integer-only, so the program was not written to handle fractional values, and will instantly abort with the message non-numeric argument upon encountering the string 1.5 in the input. The second fails because the craptastrophic lexer (a whole 12 lines of C code) assumes that each command argument will be a single token, and makes no effort to actually do any, you know, lexing. The third fails because expr is a command run in a subshell, and since the * character is special in the shell it expands to a list of the files in the current directory, so although you thought you were going to run expr 3 * 4 you actually ran expr 3 hostid sys3 sys3.tar.gz v5root v5root.tar.gz v6doc v6doc.tar.gz v6root v6root.tar.gz v6src v6src.tar.gz v7 v7.tar.gz 4. The whole thing is a craptaclysm of craptitude.

A better way to do arithmetic in a shell script was to invoke a different utility program, bc, the "basic calculator". You sent your arithmetic expression to bc on the standard input (which avoided the craptysmal shell expansion of *) and got the answer on the standard output, typically something like this:

    z=`echo "$x + $y" | bc -l`
You needed the -l flag to enable floating-point calculations; it also enabled certain higher functions such as square roots and trigonometry.

I had assumed that bc was a later development than expr, but it appeared in Unix version 6, while expr did not appear until version 7. So then I thought perhaps expr had been thrown in as a demonstration of yacc, but no, yacc was already present in version 5, and anyway, bc was written with yacc. So I no longer have any workable theory about who perpetrated expr, or why. (I have emailed Brian Kernighan to ask, and if he says anything interesting I will post an addendum.)

Anyway, about ten years after all this, the GNU project was in full swing and was reimplementing all the standard Unix tools, including the shell. Since they wanted their implementations to displace the standard implementations, they added all sorts of bells and whistles to them. So their shell, bash, contained all sorts of stuff. Among other things, it had built-in arithmetic. In bash, if you want to add x and y and put the result into z you can write:

    z=$(( x + y ))
or even:
The nifty $(( punctuation was necessary because the syntax had to be backward compatible with the Bourne shell, and every clean syntax was already used for something else. The $((…)) feature was a great improvement over expr, and in some ways, it was even an improvement over bc. It is much faster, for one thing. And since it does not invoke a subshell, you don't have to worry about * doing something weird.

But in other ways it was a step backwards. It does not have any of bc's higher mathematical functions. It doesn't do radix conversion. And it does all its calculation in machine integers, so not only does it fall short of bc's arbitrary-precision arithmetic, it can't even handle fractions:

	x=3; y=4.5
	echo $((x+y))
	     bash: 4.5: syntax error: invalid arithmetic operator (error token is ".5")
Why? Why why why??? Who ordered that? I mean, I hate floating-point arithmetic as much as the next guy—probably more—but even I recognize that people need to do it sometimes.

Well, here we are, eleven hundred words into this article and I have still not come to the point. That is typical for me, but I think that contrary to my usual practice, I will cut the scroll here and get to the real point in a day or two.

[ Addendum 20120215: At last, I got to the real point. ]

[Other articles in category /prog] permanent link