Fri, 30 May 2008
A missing feature in document viewers
Typically, the viewer numbers all the pages sequentially, starting with 1. But many documents have some front matter, such as a table of contents, that is outside the normal numbering. For example, there might be a front cover page, and then a table of contents labeled with page numbers i through xviii, and then the main content of the document follows on pages 1 through 263.
Computer programmers, I just realized, have a nice piece of jargon to describe this situation, which is very common. They speak of "logical" and "physical" pages. The "physical" page numbers are the real, honest-to-goodness numbers of the pages, what you get if you start at 1 and count up. The "logical" page numbers are the names by which the pages are referred. In the example document I described, physical page 1 is the front cover, physical page 2 is logical page i, physical page 19 is logical page xviii, physical page 20 is logical page 1, and so forth. The document has 282 physical pages, and the last one is logical page 263.
Let's denote physical pages with square brackets and logical pages with curvy brackets. So "(xviii)" and "" denote the same page in this document. Page (1) is page , and page (20) is page . Page  has no logical designation, or perhaps it is something like "(front cover sheet)".
Now the problem I want to discuss is as follows: Every viewer program has a little box where it displays the current page number, and the little box is usually editable. You scan the table of contents, find the topic you want to read about, and the table says that it's on page (165). Then you tell the document viewer to go to page 165, and it does, but it's not the page 165 you want, because the viewer gives you , which is actually (146). You actually wanted (165), which is page .
Then you curse, mentally subtract 146 (what you got) from 165 (what you wanted), add the result, 19, back to 165, getting 184, and then you ask for 184 to get 165. And if you're me you probably mess up one time in three and have to do it over, because subtraction is hard.
But it would be extremely easy for viewer programs to mostly fix this. They need to support an option where you can click on the box and tell it "your page number is wrong here". Maybe you would right-click the little page-number box, and the process would pop up a dialog:
Then you would type in 146 (which you can see at the bottom of the page you're viewing) and click "OK". From then on the process would know that the logical and physical page numbers differed by 19, and it would subtract 19 from the number in the little box until you told it something else. You could then type 165 into the little box, and the process would think "well, you asked for (165), and I know that (165) is really  because you told me earlier that  is really (146)" and then you would get , which is what you wanted. And when you scrolled down from (165) to (166), the program would think "ho, you just went from  to , so I will change the display in the little box and display -19 = (166) there".
But no, none of them do this.
The document itself should carry this information, and some of them do, sometimes. But not every document will, so viewers should support this feature, which is useful anyway.
Some document formats support internal links, but most documents do not use those features, and anyway they are useless when what you are trying to do is look up a reference from someone else's bibliography: "(See Ogul, pp. 662–664.)"
This is not a complete solution, but it's an almost complete solution, and it can be implemented unilaterally, by which I mean that the document author and the viewer program author need not agree on anything. It's really easy to do.
[ Addendum 20080521: Chung-chieh Shan informs me that current versions of xdvi have this feature. I was unaware of this, because the version installed on my machine was compiled in celebration of the 1926 Philadelphia Sesquicentennial Exhibition and so predates the addition of this feature. ]
[ Addendum 20080530: How I made the dialog box graphic. ]