The Universe of Discourse


Thu, 12 Feb 2026

Language models imply world models

In a recent article about John Haugeland's rejection of micro-worlds I claimed:

as a “Large Language Model”, Claude necessarily includes a model of the world in general

Nobody has objected to this remark, but I would like to expand on it. The claim may or may not be true — it is an empirical question. But as a theory it has been widely entertained since the very earliest days of digital computers. Yehoshua Bar-Hillel, the first person to seriously investigate machine translation, came to this conclusion in the 1950s. Here's an extract of Haugeland's discussion of his work:

In 1951 Yehoshua Bar-Hillel became the first person to earn a living from work on machine translation. Nine years later he was the first to point out the fatal flaw in the whole enterprise, and therefore to abandon it. Bar-Hillel proposed a simple test sentence:

The box was in the pen.

And, for discussion, he considered only the ambiguity: (1) pen = a writing instrument; versus (2) pen = a child's play enclosure. Extraordinary circumstances aside (they only make the problem harder), any normal English speaker will instantly choose "playpen" as the right reading. How? By understanding the sentence and exercising a little common sense. As anybody knows, if one physical object is in another, then the latter must be the larger; fountain pens tend to be much smaller than boxes, whereas playpens are plenty big.

Why not encode these facts (and others like them) right into the system? Bar-Hillel observes:

What such a suggestion amounts to, if taken seriously, is the requirement that a translation machine should not only be supplied with a dictionary but also with a universal encyclopedia. This is surely utterly chimerical and hardly deserves any further discussion. (1960, p. 160)

(Artifical Intelligence: The Very Idea; John Haugeland; p.174–176.)

Bar-Hillel says, and I agree, that an accurate model of language requires an accurate model of the world. In 1960, this appeared “utterly chimerical”. Perhaps so, but here we are, and 55 years later we have what most agree is a language model capable of producing intelligible text complex enough to fool sophisticated readers. Even people who call the LLM a “stochastic word garbage spewer” and object when it is called “AI” seem to have no qualms about the term “large language model”.

The Winograd SHRDLU project was an attempt to construct a world whose model was small enough to not be utterly chimerical. This worked, but as Haugeland says, it wasn't actually interesting.

Doug Lenat tried for forty years construct an explicit world model. When he died in 2023 I think people still considered the project to have been utterly chimerical.

I may be misrepresenting Bar-Hillel's views. He didn't actually say “world model”, he said “universal encyclopedia”. Lenat's project's name, “Cyc”, acknowledges this: “Cyc” is short for “encyclopedia”. And until recently a universal encyclopedia did seem, to many, to be utterly chimerical. (Imperial China produced many interesting attempts.) But we do now in fact have a universal encyclopedia, and Claude has ingested that universal encyclopedia.

Let's try it and see

Bar-Hillel thought that the “box is in the pen” example proved that machine translation would be at the very least extremely difficult: it requires understanding of the usual relative sizes of boxes and pens, and that the possibility of such understanding in general “hardly deserves any further discussion”.

I asked Claude:

Consider the sentence “the box was in the pen”. What does “pen” mean here?

Claude said:

Most likely a playpen or animal pen (enclosure), since a box wouldn't fit inside a writing pen.

Presumably Bar-Hillel's article and Haugeland's book was in Claude's training data, and perhaps it somehow absorbed the right thing to say here without actually knowing anything about boxes or pens. I tried the next similar example that came to mind:

How about “suit” in “what suit was the king”?

Playing card suit - hearts, diamonds, clubs, or spades.

Why?

The context "what suit was the king" is the standard phrasing for identifying a playing card. A king is one of the face cards in each suit, so asking about its suit is a natural card game question.

Okay, that's just what I would have said.

Why couldn't it be a human king wearing a suit?

It could be - I jumped to the card interpretation because "what suit was the king" uses "suit" as a predicate noun, which is uncommon phrasing for clothing. You'd more naturally say "what suit was the king wearing" or "what kind of suit did the king have." But without more context, the clothing interpretation is perfectly plausible.

I think Claude is wrong about the grammar here. I'm not sure what “predicate noun” means and I suspect Claude is using the term incorrectly. I can't understand “what suit was the king?” as grammatical in any plausible construction, only in highly contrived situtations such as an evil wizard transforming the king into a suit of armor and hiding him in the armory.

But if one agrees to take it to mean “what suit does the king have” I agree that this is a perfectly plausible interpretation even though there isn't anything clearly clothing-related nearby, and if one assumes it meant “what suit was the king wearing” then the clothing interpretation is unavoidable.

Could it be a lawsuit?

Yes - "what suit was the king" could mean what lawsuit the king was involved in. That reading makes grammatical sense, though it would be more natural with "in" ("what suit was the king in"). Context would be needed to distinguish between cards, clothing, and lawsuits.

Again I disagree that this makes grammatical sense. Without “in” I think a native listener would be at least puzzled. (“Do you mean ‘what suit was the king involved in?’?”)

Could it be a misspelling for “soot”?

Yes, that's plausible - "what soot was the king" could be asking what type of soot the king was covered in or exposed to. Though it's an odd question without specific context like a fire or industrial setting.

In my opinion, it is not grammatically plausible.

However, I think focusing on the grammatical errors would missing the point. The “soot” interpretation is unlikely, I think the bad grammar rules it out, and Claude's response does not seem to recognize this. But Claude does express understanding that that the interpretation is unikely without some surrounding soot-related context, and a reasonable idea of what that context would need to be like.

What if it were “what color is the king”?

If it's "what color is the king," then the card suit interpretation becomes much less likely (suits aren't typically described as colors, though red/black could work). It's more likely asking about a physical object - a chess piece, a game token, a statue, or similar - where the king's color is relevant.

I had been thinking of “color” as referring to the king's skin color, but I agree that the chess piece or game token interpretation is more plausible. Again Claude seems to understand something about how we talk about things. If the king were a playing card, we might be asking if it is a red king or a black king. But without that context it's now more likely that the king is a chess piece than a playing card.

What if the sentence was known to have been translated from German?

That would strengthen the card suit interpretation significantly. German card suits (Herz, Karo, Pik, Kreuz) are often called "Farbe" (color) in German card game terminology, so "what color is the king" could be a literal translation of a question about which suit a king card belongs to.

That's just what I would have said.

Could “suit” have been an accidental misspelling of “age”?

On a standard keyboard, that seems unlikely - 'a', 'g', 'e' are nowhere near 's', 'u', 'i', 't'. Autocorrect or speech-to-text could produce it, but not a typical typing error.

I thought Claude might botch this. There is no grammatical or semantic reason why “what age was the king” doesn't work, so the only reason to reject this is orthographic. And Claude's grasp of spelling still seems quite weak. The other day it claimed that “‘Warner’ is likely either a typo for ‘Cham’ or …” which couldn't be wronger. I told it at the time that it is impossible that "Cham" could be a typo for “Warner”, and if pressed to explain why, I would have said something like what Claude said here.

Tentative conclusions

Interesting as this all is, it is a digression. My main points, again:

  • It is at least plausible that coherent speech requires a model of a large fraction of the world, and, while it may yet turn out to be false, this theory has been seen as plausible for generations.

  • Whatever else Claude can or can't do, it can certainly speak coherently.

  • Therefore Claude probably does have something like a model of a substantial part of the world.

  • In 1960 this appeared completely impossible.

  • But here we are.

Addendum

20260214

I disagreed with Claude that “what suit was the king” made grammatical sense. Rik Signes has pointed out that it it is certainly grammatical, because the grammar is the same as “what person was the king” or “what visitor was the king”. My discomfort with it is not grammatical, it is pragmatic.


[Other articles in category /tech/gpt] permanent link

Thu, 05 Feb 2026

John Haugeland on the failure of micro-worlds

One of the better books I read in college was Artificial Intelligence: The Very Idea (1985) by philosopher John Haugeland. One of the sections I found most striking and memorable was about Terry Winograd's SHRDLU. SHRDLU, around 1970, could carry on a discussion in English in which it would manipulate imaginary colored blocks in a “blocks world”. displayed on a computer screen. The operator could direct it to “pick up the pyramid and put it on the big red cube” or ask it questions like “what color is the biggest cylinder that isn't on the table?”.

Haugeland was extremely unimpressed (p.190, and more generally 185–195):

To dwell on these shortcomings, however, is to miss the fundamental limitation: the micro-world itself. SHRDLU performs so glibly only because his domain has been stripped of anything that could ever require genuine wit or understanding. In other words, far from digging down to the essential questions of AI, a micro-world simply eliminates them. … the blocks-world "approximates" a playroom more as a paper plane approximates a duck.

He imagines this exchange between the operator and SHRDLU:

  • Trade you a squirtgun for a big red block.

SORRY, I DON'T KNOW THE WORD "TRADE".

Oops, a vocabulary snag. Fortunately, SHRDLU can learn new words; Winograd taught him about steeples, and now we'll tell him about trades (and squirtguns).

  • A "trade" is a free exchange of property.

SORRY, I DON'T KNOW THE WORD "FREE".

  • A "free" act is done willingly, without being forced.

SORRY, I DON'T KNOW THE WORD "ACT".

  • "Acts" are what people do on purpose, and not by accident.

SORRY, I DON'T KNOW THE WORD "PEOPLE".

  • Sorry, I thought you were smarter than you are.

SORRY, I DON'T KNOW THE WORD "SORRY".

What does Haugeland say he would like to have seen?

Try to imagine, for instance, that instead of crashing on the first word, SHRDLU would give our offer a typical, ordinary response:

I ALREADY HAVE A WATER PISTOL, BUT I'LL GIVE YOU TWO BLOCKS AND A PYRAMID FOR YOUR SLIMY OLD FROG.

Childishly simple, no doubt; but look at how much is presupposed:

  1. SHRDLU appreciates that a second squirtgun would be less valuable than the first; …

On this standard, at least, an LLM is a smashing success. It does, in fact, have a model of trading, acts, property, and water pistols, or at least of how we talk about such things. We might criticize the model's accuracy, or usefulness, but it certainly exists. The large language model is a model of the semantics of trading, acts, property, water pistols, and so on.

Curious to see how it would go, I asked Claude to pretend it had access to a SHRDLU-like blocks world:

Let's pretend that we have a blocks world and you have access to it, the way SHRDLU does, except you are not SHRDLU, but still Claude. So I could ask Claude "put the pyramid on the big red block " and, if there were several pyramids, you might reply “I don't know which pyramid you mean" or whatever Claude would actually say in such a situation. Was that clear?

Yes, that's clear. I'll act as Claude with access to a blocks world environment—able to manipulate blocks, pyramids, and other shapes on a table, stack them, move them around, and so on.

I asked it a few SHRDLU-like questions about the blocks, then asked it to put a block on a pyramid. It clearly understood the point of the exercise:

I can't do that—the blue block is much larger than the red pyramid, and the pyramid has a pointed top. The block wouldn't balance; it would just fall off.

SHRDLU could handle this too, although I think its mechanism was different: it would interact with the separate blocks world subsystem and ⸢actually⸣ try to put the block on the pyramid; the simulated physics would simulate the block falling off the pyramid, and SHRDLU would discover that its stacking attempt had been unsuccessful. With Claude, something very different is happening; there is no physics simulation separate from Claude. I think the answer here demonstrates that Claude's own model includes something about pyramids and something about physics.

Then I made the crucial offer:

Trade you a squirtgun for a big red block.

I appreciate the offer, but I don't really have a way to accept a squirtgun from you, nor do you have physical access to the blocks world to hand me one.

That said—I could just add a big red block to the world if you want one in here. Should I put one on the table?

Would Haugeland have been satisfied in 1985 if SHRDLU had said this? I think certainly. Haugeland wanted SHRDLU to respond to the offer directly, as the beginning of a negotiation. Claude's response is one level better from that: it not only recognizes that I am negotiating, it recognizes that actually negotiating for the squirt gun would not make sense, and offers a sensible workaround.

I pushed it a little farther:

I have a squirtgun here. I can designate it as belonging to you, if you agree that the big red block belongs to me.

Alright, you've got a deal. The big red block is yours.

I've placed a big red block on the table—it's now designated as belonging to you. And I acknowledge the squirtgun as mine, though I confess I have no idea what I'll do with it in a blocks world. Perhaps threaten uncooperative pyramids into stacking properly.

Perhaps I'm reading too much into this, but it seems to me that, having recognized that the offer to negotiate was itself silly, Claude is responding in the same mode with its comments about threatening the pyramids.

Mostly I just tried this for fun. The Haugeland discussion of SHRDLU has been knocking around my head for forty years, but now it has knocked against something new, and I wanted to see what would actually happen.

But I do have a larger point. Haugeland clearly recognized in 1985 that a model of the world was a requirement for intelligence:

The world of trading cannot be "micro." … There are no plausible, non-arbitrary boundaries restricting what might be relevant at any juncture.

and later:

The world cannot be decomposed into independent fragments. Realizing this amounts to a fundamental insight into common sense and mundane intelligence — and therefore points the way for subsequent AI.

Are there are any people who are still saying “it's not artificial intelligence, it's just a Large Language Model”. I suppose probably. But as a “Large Language Model”, Claude necessarily includes a model of the world in general, something that has long been recognized as an essential but perhaps unattainable prerequisite for artificial intelligence. Five years ago a general world model was science fiction. Now we have something that can plausibly be considered an example.

And second: maybe this isn't “artificial intelligence” (whatever that means) and maybe it is. But it does the things I wanted artificial intelligence to do, and I think this example shows pretty clearly that it does at least one of the things that John Haugeland wanted it to do in 1985.

My complete conversation with Claude about this.

Addenda

20260207

I don't want to give the impression that Haugeland was scornful of Winograd's work. He considered it to have been a valuable experiment:

No criticism whatever is intended of Winograd or his coworkers. On the contrary, it was they who faithfully pursued a pioneering and plausible line of inquiry and thereby made an important scientific discovery, even if it wasn't quite what they expected. … The micro-worlds effort may be credited with showing that the world cannot be decomposed into independent fragments.

(p. 195)

20260212

More about my claim that

as a “Large Language Model”, Claude necessarily includes a model of the world in general

I was not just pulling this out of my ass; it has been widely theorized since at least 1960.


[Other articles in category /tech/gpt] permanent link