Thu, 12 Feb 2026
Language models imply world models
In a recent article about John Haugeland's rejection of micro-worlds I claimed:
Nobody has objected to this remark, but I would like to expand on it. The claim may or may not be true — it is an empirical question. But as a theory it has been widely entertained since the very earliest days of digital computers. Yehoshua Bar-Hillel, the first person to seriously investigate machine translation, came to this conclusion in the 1950s. Here's an extract of Haugeland's discussion of his work:
(Artificial Intelligence: The Very Idea; John Haugeland; pp. 174–176.)
Bar-Hillel says, and I agree, that an accurate model of language requires an accurate model of the world. In 1960, this appeared “utterly chimerical”. Perhaps so, but here we are, and sixty-six years later we have what most agree is a language model capable of producing intelligible text complex enough to fool sophisticated readers. Even people who call the LLM a “stochastic word garbage spewer” and object when it is called “AI” seem to have no qualms about the term “large language model”.
The Winograd SHRDLU project was an attempt to construct a world whose model was small enough to not be utterly chimerical. This worked, but as Haugeland says, it wasn't actually interesting. Doug Lenat tried for forty years to construct an explicit world model. When he died in 2023 I think people still considered the project to have been utterly chimerical.
I may be misrepresenting Bar-Hillel's views. He didn't actually say “world model”, he said “universal encyclopedia”. Lenat's project's name, “Cyc”, acknowledges this: “Cyc” is short for “encyclopedia”. And until recently a universal encyclopedia did seem, to many, to be utterly chimerical. (Imperial China produced many interesting attempts.) But we do now in fact have a universal encyclopedia, and Claude has ingested that universal encyclopedia.
Let's try it and see
Bar-Hillel thought that the “box is in the pen” example proved that machine translation would be at the very least extremely difficult: it requires understanding of the usual relative sizes of boxes and pens, and that the possibility of such understanding in general “hardly deserves any further discussion”. I asked Claude:
Claude said:
Presumably Bar-Hillel's article and Haugeland's book were in Claude's training data, and perhaps it somehow absorbed the right thing to say here without actually knowing anything about boxes or pens. I tried the next similar example that came to mind:
Okay, that's just what I would have said.
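(Anyone who wants to rerun probes like these outside the chat window could do it with a few lines against the Anthropic API. The following is only a sketch; the model name and the prompt are stand-ins, not the exact ones I used.)

    # A sketch of reproducing these probes with the Anthropic Python SDK.
    # The model name is a stand-in; requires ANTHROPIC_API_KEY in the environment.
    import anthropic

    client = anthropic.Anthropic()

    def ask(question: str) -> str:
        reply = client.messages.create(
            model="claude-sonnet-4-20250514",   # stand-in model name
            max_tokens=500,
            messages=[{"role": "user", "content": question}],
        )
        return reply.content[0].text

    print(ask("In “the box is in the pen”, is the pen a writing instrument "
              "or an enclosure? How do you know?"))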
I think Claude is wrong about the grammar here. I'm not sure what “predicate noun” means and I suspect Claude is using the term incorrectly. I can't understand “what suit was the king?” as grammatical in any plausible construction, only in highly contrived situations such as an evil wizard transforming the king into a suit of armor and hiding him in the armory. But if one agrees to take it to mean “what suit did the king have” I agree that this is a perfectly plausible interpretation even though there isn't anything clearly clothing-related nearby, and if one assumes it meant “what suit was the king wearing” then the clothing interpretation is unavoidable.
Again I disagree that this makes grammatical sense. Without “in” I think a native listener would be at least puzzled. (“Do you mean ‘what suit was the king involved in?’?”)
In my opinion, it is not grammatically plausible. However, I think focusing on the grammatical errors would be missing the point. The “soot” interpretation is unlikely; I think the bad grammar rules it out, and Claude's response does not seem to recognize this. But Claude does express understanding that the interpretation is unlikely without some surrounding soot-related context, and a reasonable idea of what that context would need to be like.
I had been thinking of “color” as referring to the king's skin color, but I agree that the chess piece or game token interpretation is more plausible. Again Claude seems to understand something about how we talk about things. If the king were a playing card, we might be asking if it is a red king or a black king. But without that context it's now more likely that the king is a chess piece than a playing card.
That's just what I would have said.
I thought Claude might botch this. There is no grammatical or semantic reason why “what age was the king” doesn't work, so the only reason to reject this is orthographic. And Claude's grasp of spelling still seems quite weak. The other day it claimed that “‘Warner’ is likely either a typo for ‘Cham’ or …”, which couldn't be wronger. I told it at the time that it is impossible that “Cham” could be a typo for “Warner”, and if pressed to explain why, I would have said something like what Claude said here.
Tentative conclusions
Interesting as this all is, it is a digression. My main points, again:
Addendum 20260214
I disagreed with Claude that “what suit was the king” made grammatical sense. Rik Signes has pointed out that it is certainly grammatical, because the grammar is the same as “what person was the king” or “what visitor was the king”. My discomfort with it is not grammatical, it is pragmatic.
[Other articles in category /tech/gpt] permanent link
Thu, 05 Feb 2026
John Haugeland on the failure of micro-worlds
One of the better books I read in college was Artificial Intelligence: The Very Idea (1985) by philosopher John Haugeland. One of the sections I found most striking and memorable was about Terry Winograd's SHRDLU. SHRDLU, around 1970, could carry on a discussion in English in which it would manipulate imaginary colored blocks in a “blocks world” displayed on a computer screen. The operator could direct it to “pick up the pyramid and put it on the big red cube” or ask it questions like “what color is the biggest cylinder that isn't on the table?”. Haugeland was extremely unimpressed (p. 190, and more generally pp. 185–195):
He imagines this exchange between the operator and SHRDLU:
What does Haugeland say he would like to have seen?
On this standard, at least, an LLM is a smashing success. It does, in fact, have a model of trading, acts, property, and water pistols, or at least of how we talk about such things. We might criticize the model's accuracy, or usefulness, but it certainly exists. The large language model is a model of the semantics of trading, acts, property, water pistols, and so on. Curious to see how it would go, I asked Claude to pretend it had access to a SHRDLU-like blocks world:
I asked it a few SHRDLU-like questions about the blocks, then asked it to put a block on a pyramid. It clearly understood the point of the exercise:
SHRDLU could handle this too, although I think
its mechanism was different: it would interact with the separate
blocks world subsystem and actually try to put the block on the
pyramid; the simulated physics would simulate the block falling off
the pyramid, and SHRDLU would discover that its stacking attempt had
been unsuccessful. With Claude, something very different is
happening; there is no physics simulation separate from Claude. I
think the answer here demonstrates that Claude's own model includes
something about pyramids and something about physics.
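To make the contrast concrete, here is a toy sketch of the kind of rule such a blocks-world subsystem enforces. It is my own illustration, not Winograd's code; the names and the single rule shown are made up:

    # A toy illustration of a SHRDLU-style blocks-world check: the stacking
    # attempt fails mechanically, and the caller discovers the failure.
    class Thing:
        def __init__(self, name, shape):
            self.name = name
            self.shape = shape      # "cube" or "pyramid"
            self.on_top = []        # things currently resting on this one

    def put_on(thing, target):
        """Try to place `thing` on `target`; report whether it stayed put."""
        if target.shape == "pyramid":
            # Nothing can rest on a pyramid's apex, so the attempt fails.
            return False
        target.on_top.append(thing)
        return True

    pyramid = Thing("green pyramid", "pyramid")
    cube = Thing("big red cube", "cube")

    print(put_on(cube, pyramid))    # False: the block "falls off"
    print(put_on(pyramid, cube))    # True: a pyramid can sit on a cube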
Then I made the crucial offer:
Would Haugeland have been satisfied in 1985 if SHRDLU had said this? I think certainly. Haugeland wanted SHRDLU to respond to the offer directly, as the beginning of a negotiation. Claude's response is one level better than that: it not only recognizes that I was negotiating, it recognizes that actually negotiating for the squirt gun would not make sense, and offers a sensible workaround. I pushed it a little farther:
Perhaps I'm reading too much into this, but
it seems to me that, having recognized that the offer to
negotiate was itself silly, Claude is responding in the same mode with
its comments about threatening the pyramids.
Mostly I just tried this for fun. The Haugeland discussion of SHRDLU has been knocking around my head for forty years, but now it has knocked against something new, and I wanted to see what would actually happen. But I do have a larger point. Haugeland clearly recognized in 1985 that a model of the world was a requirement for intelligence:
and later:
Are there any people who are still saying “it's not artificial intelligence, it's just a Large Language Model”? I suppose probably. But as a “Large Language Model”, Claude necessarily includes a model of the world in general, something that has long been recognized as an essential but perhaps unattainable prerequisite for artificial intelligence. Five years ago a general world model was science fiction. Now we have something that can plausibly be considered an example.
And second: maybe this isn't “artificial intelligence” (whatever that means) and maybe it is. But it does the things I wanted artificial intelligence to do, and I think this example shows pretty clearly that it does at least one of the things that John Haugeland wanted it to do in 1985.
My complete conversation with Claude about this.
Addenda
20260207
I don't want to give the impression that Haugeland was scornful of Winograd's work. He considered it to have been a valuable experiment:
(p. 195)
20260212
More about my claim that
I was not just pulling this out of my ass; it has been widely theorized since at least 1960.
[Other articles in category /tech/gpt] permanent link
Wed, 28 Jan 2026
Crooked politicians love crab cakes!
I recently posted an article about the 2013 Philadelphia Traffic Court fiasco, in which most of the Traffic Court judges were convicted of accepting bribes:
(The Philadelphia Inquirer, Nine current and former Traffic Court judges charged; Martin, John P. and Craig R. McCoy; January 31, 2013) Then in 2024, John “Johnny Doc” Dougherty, an influential Philadelphia union boss, pled guilty to embezzlement and bribery, paid in part in, guess what?
(The Philadelphia Inquirer, For leader John Dougherty, union-paid generosity began at home; Fazollah, Mark, Dylan Purcell, Jeremy Roebuck, and Craig R. McCoy; Feb 5 2019) He called them out specifically in his guilty plea:
(The Philadelphia Inquirer, ‘I am guilty:’ John Dougherty’s stunning statements at sentencing delivered an about-face few had predicted; Roebuck, Jeremy and Oona Goodin-Smith; July 13, 2024.) And now, in today's New York Times, I find:
(The New York Times, Former Adams Aide Took Diamond Earrings as Bribe, Prosecutors Say; Meko, Hurubie; January 27, 2026.) Poor Fenchurch, usually a gentle soul, is speechless with indignation.
[Other articles in category /law] permanent link
A couple of years back I wrote an article about this bit of mathematical folklore:
I have a non-apocryphal update in this space! In episode 94 of the podcast “My Favorite Theorem”, Jeremy Alm of Lamar University reports:
(At 04:15) In the earlier article, I had said:
In the podcast, Alm introduces this as evidence that he “wasn't very good at algebra”. Fortunately, he added, it was after he had graduated. The episode title is “In Which Every Thing Happens or it Doesn't”. I started listening to it because I expected it to be about the ergodic theorem, and I'd like to understand the ergodic theorem. But it turned out to be about the Rado graph. This is fine with me, since I love the Rado graph. (Who doesn't?)
[Other articles in category /math] permanent link
Mon, 26 Jan 2026
An anecdote about backward compatibility
A long time ago I worked on a debugger program that our company used to debug software that it sold that ran on the IBM System/370. We had IBM 3270 CRT terminals that could display (I think) eight colors (if you count black), but the debugger display was only in black and white. I thought I might be able to make it a little more usable by highlighting important items in color. I knew that the debugger used a macro called ….
In those days, that office didn't have online manuals; instead we had shelf after shelf of yellow looseleaf binders. Finding the binder you wanted was an adventure. More than once I went to my boss to say I couldn't proceed without the REXX language reference or whatever. Sometimes he would just shrug. Other times he might say something like “Maybe Matthew knows where that is.” I would go ask Matthew about it. Probably he would just shrug. But if he didn't, he would look at me suspiciously, pull the manual from under a pile of papers on his desk, and wave it at me threateningly. “You're going to bring this back to me, right?” See, because if Matthew didn't hide it in his desk, he might become the person who couldn't find it when he needed it. Matthew could have photocopied it and stuck his copy in a new binder, but why do that when burying it on his desk was so much easier?
For years afterward I carried around my own photocopy of the REXX language reference, not because I still needed it, but because it had cost me so much trouble and toil to get it. To this day I remember its horrible IBM name: SC24-5239 Virtual Machine / System Product System Product Interpreter Reference. That's right, “System Product” was in there twice. It was the System Product Interpreter for the System Product, you see.
Anyway, I'm digressing. I did eventually find a copy of the IBM
Assembler Product Macro Reference Document or whatever it was called,
and looked up ….
My glee turned to puzzlement. If omitted, the default value for … was ….
Black? Not white? I read further. And I learned that the only other permitted value was ….
[Other articles in category /prog] permanent link