The Universe of Disco

Mark Dominus (陶敏修)
mjd@pobox.com

12 recent entries

Loki the comedian
A triviality about numbers that look like abbc
My reply to the people who want to designate my neighborhood a "historic district"
A potpourri of cool-looking scripts
Horst Wessel and John Birch
ChatGPT opines on cruciferous vegetables, Decameron, and Scheherazade
It's an age of marvels
Hawat! Hawat! Hawat! A million deaths are not enough for Hawat!
Rod R. Blagojevich will you please go now?
Well, I guess I believe everything now!
R.I.P. Oddbins
Talking Dog > Stochastic Parrot

Archive:

2024: JF M A M J
J
2023: JF M A M J
J A S O N D
2022: J F M A M J
JAS O N D
2021: J F M AMJ
J A S O N D
2020: J F M A M J
J A S O N D
2019: JFM A M J
J A S O N D
2018: J F M A M J
J A S O N D
2017: J F M A M J
J A S O N D
2016: JF M A M J
JASON D
2015: JFM A M J
J A S O N D
2014: J F M AMJ
JASON D
2013: JFMAMJ
JAS OND
2012: J F MAMJ
JASOND
2011: JFMAM J
JASOND
2010: JFMAMJ
JA S O ND
2009: J F MAM J
JASOND
2008: J F M A M J
JAS O ND
2007: J F M A M J
J A S O N D
2006: J F M A M J
JAS O N D
2005: O N D

Subtopics:

Mathematics 240

Programming 99

Language 93

Miscellaneous 69

Book 50

Tech 49

Etymology 34

Haskell 33

Oops 30

Unix 27

Cosmic Call 25

Math SE 24

Physics 21

Law 21

Perl 17

Biology 15

Comments disabled

Sat, 08 Dec 2007

Corrections about sync(2)
I made some errors in today's post about sync and fsync.

Most important, I said that "the sync() system call marks all the kernel buffers as dirty". This is totally wrong, and doesn't even make sense. Dirty buffers are those with data that needs to be written out. Marking a non-dirty buffer as dirty is a waste of time, since nothing has changed in the buffer, but it will now be rewritten anyway. What sync() does is schedule all the dirty buffers to be written as soon as possible.

On some recent systems, sync() actually waits for all the dirty buffers to be written, and a bunch of people tried to correct me about this. But my original article was right: historically, it was not so, and even today it's not universally true. In former times, sync() would schedule the buffers for writing, and then return before the data was actually written.

I said that one of the duties of init was to call sync() every thirty seconds, but this was mistaken. That duty actually fell to a separate program, known as update. While discussing this with one of the readers who wrote to correct me, I looked up the source for Version 7 Unix, to make sure I was right, and it's so short I thought I might as well show it here:

        /*
         * Update the file system every 30 seconds.
         * For cache benefit, open certain system directories.
         */

        #include <signal.h>

        char *fillst[] = {
                "/bin",
                "/usr",
                "/usr/bin",
                0,
        };

        main()
        {
                char **f;

                if(fork())
                        exit(0);
                close(0);
                close(1);
                close(2);
                for(f = fillst; *f; f++)
                        open(*f, 0);
                dosync();
                for(;;)
                        pause();
        }

        dosync()
        {
                sync();
                signal(SIGALRM, dosync);
                alarm(30);
        }

The program is so simple I don't have much more to say about it. It initially invokes dosync(), which calls sync() and then schedules another call to dosync() in 30 seconds. Note that the 0 in the second argument to open had not yet been changed to O_RDONLY. The pause() call is equivalent to sleep(0): it causes the process to relinquish its time slice whenever it is active.

In various systems more recent than V7, the program was known by various names, but it was update for a very long time.

Several people wrote to correct me about the:

        # sync
        # sync
        # sync
        # halt

thing, some saying that I had the reason wrong, or that it did not make sense, or that only two syncs were used, rather than three. But I had it right. People did use three, and they did it for the reason I said, whether that makes sense or not. (Some of the people who miscorrected me were unaware that sync() would finish and exit before the data was actually written.) But for example, see this old Usenet thread for a discussion of the topic that confirms what I said.

Nobody disputed my contention that Linus was suffering from the promptings of the Evil One when he tried to change the semantics of fsync(), and nobody seems to know the proper name of the false god of false efficiency. I'll give this some thought and see what I can come up with.

Thanks to Tony Finch, Dmitry Kim, and Stefan O'Rear for discussion of these points.

[Other articles in category /Unix] permanent link