The Universe of Disco


Fri, 09 Dec 2005

Elimination of "f" system calls

Michael C. Toren:
> 1) Open a file descriptor pointing to the current working directory.
>
> 2) Create a temporary directory within the jail, and chroot() to it.
>
> 3) Using fchdir(), change the working directory to the file descriptor
> saved from step 1.

Oho, I hadn't seen that before. The chroot() in step 2 is required to avoid the special case in the Kernel that checks to see if you are doing ".." in the current root directory. But because you chrooted() yourself somewhere else, the special case isn't exercised.

Older systems don't have fchdir(), which is a fairly recent addition.

With the proliferation of "f" calls in recent years (fchdir, fchmod, fchown, fstat, fsync, etc.) I wonder what would be the result if the Unix system interface were redesigned to eliminate the non-"f" versions of the calls entirely. Instead, there would be a generic function, which we might call "iname", which transforms a path name to an "inode" structure:

        struct inode * iname (const char *path);

Unix kernels already contain a function with this name that does this job.

The system calls that formerly accepted path names are changed to require an inode structure. So instead of

        fd = open("dir/file", ...)

one now has

        fd = open(iname("dir/file"), ...)

(There are some minor language and usability issues here: what if iname() returns NULL? Ignore those; I want to discuss OS issues, not language issues.)

There would be a function, analogous to iname(), that also returned an inode structure, but which took an open file descriptor instead of a path name:

        struct inode * inode(int fd);

This is essentially equivalent to the fstat() function we have now.

chown() and fchown() would merge to become a single call that accepted an inode structure; instead of:

        chown("dir/file", owner) 
        fchown(fd, owner)

one would have:

        chown(iname("dir/file"), owner)
        chown(inode(fd), owner)

Similarly, instead of:

        chdir(path);
        fchdir(fd);

one would have:

        chdir(iname(path));
        chdir(inode(fd));

stat() and fstat() would not only merge but would disappear entirely; the struct inode can do everything that the struct stat can do. This code:

        stat(&statbuf, "dir/file");
        fstat(&statbuf, fd);

turns into this:

        statbuf = iname("dir/file"));
        statbuf = inode(fd);

There are some security implications to this idea. There needs to be protection against counterfeiting an inode structure. For example, consider a world-readable file in a secret, nonsearchable directory. Suppose the file happens to have i-number 123456. If it's possible to do this, then security has failed:

        struct inode I;
        I.inumber = 123456;
        fd = open(I, O_RDWR);

It should be impossible for anyone to manufacture the struct inode that represents the secret file without actually using iname() somewhere along the line. A simple way to arrange this would be to have the kernel cryptographically sign each struct inode. This can be done inexpensively.

This still has some access implications. Consider a world-readable file in a world-searchable directory. Process A iname()s the file, obtaining its struct inode. The search permissions on the directory are then removed. Process A can still open the file. This is analogous to a similar situation in standard Unix in which process A opens the file before the permissions are changed, and can still read and write it afterwards. So that's not a big change. What might be a big change is that A can dump the struct inode to a file and the a different process can read it back again, evading the increased access protections on the directory. The cryptographic signature technique can fix this problem by restricting struct inodes to be used by a single process.

Whether this is worth doing I don't know. My main idea in thinking it up was to avoid the increasing duplication of system calls. Does Unix need an "fsymlink" call? Does it need three different ones?

        symlink(oldpath, newpath);
        fsymlink1(fd, newpath);
        fsymlink2(oldpath, fd);
        fsymlink3(oldfile_fd, newdir_fd);

Perhaps not this week, but who knows what the future holds? With the iname() / inode() style, these are all a single call:

        symlink(iname(oldpath), iname(newpath));
        symlink(inode(fd), iname(newpath));
        symlink(iname(oldpath), inode(fd));
        symlink(inode(oldfile_fd), inode(newdir_fd));

This also fixes some of the proliferation in the system call interface between calls that work on symlinks and calls that work through symlinks. For example, stat() and lstat(), and chown() and lchown(). On normal files, each pair is the same. But on a symlink, stat() stats the pointed-to file while lstat() stats the symlink itself; similarly chown() changes the owner of the pointed-to file while lchown() changes the owner of the symlink itself. But where's lchmod()? What about llink()? There's no way to make a hard link to a symbolic link! With the inode() / iname() technique above, you only need one extra call to handle all possible operations on a symbolic link:

        lstat(path);
        lchown(path, owner);
        llink(path, newpath);

becomes:

        stat(liname(path));
        chown(liname(path), owner);
        link(liname(path), iname(newpath));

where liname() is just like iname(), except that if the resulting file is a symbolic link, its inode is returned immediately; iname() would have read the target of the symbolic link and called itself recursively to resolve the target.

It also seems to me that this interface might make it easier to communicate open files from one process to another. Some unix systems offer a experimental features for passing file descriptors around; this system only requires that the struct inode be communicated directly to the receiving process.


[Other articles in category /Unix] permanent link

Tue, 08 Nov 2005

Fibonacci-like sequences

John T. Guthrie:

I can hand you two complex numbers, a and b. If I tell you that for the sequence G(k), G(m)=a, and G(n)=b, and G(k)=G(k-1)+G(k-2), then you can compute the rest of the sequence G(k) for all integers k.

let's call such sequences "Fibonacci-like".

Since the set of Fibonacci-like sequences is closed under addition and under multiplication of each element by a constant, the set of such sequences forms a vector space of dimension 2.

Abbreviate such a sequence by writing (G(0), G(1)), which as you pointed out is sufficient to determine the whole thing. With this abbreviation, the standard Fibonacci sequence is just (0, 1).

(0, 1) and (1, 0) are a basis for the vector space. But (1, 0) is nothing more than the Fibonacci sequence shifted over by one element; it's (F(-1), F(0)).

Thus any Fibonacci-like sequence G satisfies:

G(n) = G(1) F(n) + G(0) F(n-1) (*)

In particular, consider the shifted Fibonacci sequence Sk(n) = F(n+k). Then (*) reduces to:

F(n+k) = F(k+1) F(n) + F(k) F(n-1)

which I think is a simpler proof of this identity than the obvious inductive proof.

All sorts of other identities fall out of special cases of (*). For example, the Lucas sequence 1, 3, 4, 7, 11, ... is Fibonacci-like with has L(0) = 1, L(1) = 3. By (*):

L(n) = 3F(n) + F(n-1)

or if you prefer:

L(n) = 2F(n) + F(n+1)

and of course a change of basis for the vector space produces a version of (*) that allows one to write any Fibonacci-like sequence in terms of the Lucas sequence.

(Very easy exercise: The Lucas sequence contains no multiples of 5. Slightly harder: For which n does the Fibonacci sequence contain no multiples of n?)


[Other articles in category /math] permanent link

Fri, 28 Oct 2005

Diagonalization

I am now reading A Digit of the Moon, which is an 1898 translation of a Sanskrit manuscript of unknown age. The story concerns a king who falls madly in love with the beautiful and wise princess Anangaraga. The custom of the princess is to marry only the one who can ask her a question she cannot answer; if a suitor fails twenty-one times, he becomes her slave for life.

The king's friend propounds nineteen questions, all of which the wise and clever princess answers without difficulty. On the morning of the twentieth day, the king, after praying to the goddess Saraswarti for help, is inspired with the solution: he will ask the princess to solve his own difficulty, and to tell him what question he should ask her. "And if she tells me, then I will ask her tomorrow what she tells me to-day: and if she does not tell me, then she is mine according to the terms of the agreement, to-day: and so in either alternative, the bird is caged."

[Addendum: Despite the many notes in the text on translation details, untranslatable puns, and so forth, the author, F.W. Bain, probably wrote it himself, rather than translating it from an unknown manuscript as he claimed.]


[Other articles in category /CS] permanent link