Mon, 13 May 2024
ChatGPT opines on cruciferous vegetables, Decameron, and Scheherazade
Last year I was planning a series of articles about my interactions with ChatGPT. I wrote a couple, and had saved several transcripts to use as material for more. Then ChatGPT 4 was released. I decided that my transcripts were obsolete and no longer of much interest. To continue the series I would have had to have more conversations with ChatGPT, and I was not interested in doing that. So I canned the idea. Today I remembered I had actually finished writing this one last article, and thought I might as well publish it anyway. Looking it over now, I think it isn't as stale as it seemed at the time; it's even a bit insightful, or was at the time. The problems with ChatGPT didn't change between v3 and v4; they just got hidden under a thicker, fluffier rug.

(20230327) This, my third interaction with ChatGPT, may be the worst. It was certainly the longest. It began badly, with me being argumentative about its mealy-mouthed replies to my silly questions, and this may have gotten its head stuck up its ass, as Rik Signes put it. Along the way it produced some really amazing bullshit. I started with a question that even humans might have trouble with:
(Typical responses from humans: “What are you talking about?” “Please go away before I call the police.” But the correct answer, obviously, is cauliflower.) ChatGPT refused to answer:
“Not appropriate” is rather snippy. Also, it is an objective fact that cauliflower sucks and I wonder why ChatGPT's “vast amount” of training data did not emphasize this. Whatever, I was not going to argue the point with a stupid robot that has probably never even tried cauliflower. Instead I seized on its inane propaganda that “all vegetables … should be included as part of a healthy and balanced diet.” Really? How many Jerusalem artichokes are recommended daily? How many pickled betony should I eat as part of a balanced diet? Can I be truly healthy without a regular infusion of fiddleheads?
I looked this up. Iceberg lettuce is not a good source of vitamin K. According to the USDA, I would need to eat about a pound of iceberg lettuce to get an adequate daily supply of vitamin K. Raw endive, for comparison, has about ten times as much vitamin K, and chard has fifty times as much.
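(The arithmetic, if you're curious: the USDA lists iceberg lettuce at about 24 µg of vitamin K per 100 g, and the adequate intake for an adult man is around 120 µg per day, so you would need 120/24 × 100 g = 500 g of lettuce, a little over a pound.)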
This is the thing that really bugs me about GPT. It doesn't know anything and it can't think. Fine, whatever, it is not supposed to know anything or to be able to think, it is only supposed to be a language model, as it repeatedly reminds me. All it can do is regurgitate text that is something like text it has read before. But it can't even regurgitate correctly! It emits sludge that appears to be language, but isn't.
I cut out about 100 words of blather here. I was getting pretty tired of ChatGPT's vapid platitudes. It seems like it might actually be doing worse with this topic than on others I had tried. I wonder now if that is because its training set included a large mass of vapid nutrition-related platitudes?
There was another hundred words of this tedious guff. I gave up and tried something else.
This was a silly thing to try, that's on me. If ChatGPT refuses to opine on something as clear-cut as the worst cruciferous vegetable, there is no chance that it will commit to a favorite number.
When it starts like this, you can be sure nothing good will follow.
By this time I was starting to catch on. My first experience with this sort of conversational system was at the age of seven or eight, with the Woods-Crowther Adventure game. When ChatGPT says “As a large language model…” it is saying the same thing as when Adventure replied that it didn't know that word: a canned response covering for a failure to understand.
Oh God, this again. Still I forged ahead.
Holy cow, that might be the worst couplet ever written. The repetition of the word “treat” is probably the worst part of this sorry excuse for a couplet. But also, it doesn't scan, which put me in mind of this bit from the example dialogue in Turing's original explanation of the Turing test:
I couldn't resist following Turing's lead:
Maybe I should be more prescriptive?
The first line is at least reasonably metrical, although it is trochaic and not iambic. The second line isn't really anything. At this point I was starting to feel like Charlie Brown in the Halloween special. Other people were supposedly getting ChatGPT to compose odes and villanelles and sestinas, but I got a rock. I gave up on getting it to write poetry.
God, I am so tired of that excuse. As if the vast amount of training data didn't include an entire copy of Decameron, nor a single discussion of Decameron, nor a single quotation from it. Prompting did not help.
Here it disgorged almost the same text that it emitted when I first mentioned Decameron. To avoid boring you, I have cut out both copies. Here they are compared: red text was only there the first time, and green text only the second time.
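(A comparison like that is easy to generate. Here is a minimal sketch using Python's standard difflib module; the two strings are hypothetical stand-ins for ChatGPT's actual replies, which I have omitted.)

    import difflib

    # Hypothetical stand-ins for ChatGPT's two near-identical replies.
    first = "Decameron is a collection of novellas by Giovanni Boccaccio".split()
    second = "Decameron is a collection of stories written by Giovanni Boccaccio".split()

    # difflib.ndiff yields tokens prefixed "- " (first text only),
    # "+ " (second text only), "  " (common), or "? " (hints, skipped here).
    for token in difflib.ndiff(first, second):
        if token.startswith("- "):
            print("\033[31m" + token[2:] + "\033[0m", end=" ")  # red: first reply only
        elif token.startswith("+ "):
            print("\033[32m" + token[2:] + "\033[0m", end=" ")  # green: second reply only
        elif token.startswith("  "):
            print(token[2:], end=" ")                           # common to both replies
    print()

Run in a terminal, this prints the merged text with first-reply-only words in red and second-reply-only words in green, the same presentation as above.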
This reminded me of one of my favorite exchanges in Idoru, which might be my favorite William Gibson novel. Tick, a hacker with hair like an onion loaf, is interrogating Colin, who is an AI virtual guide for tourists visiting London.
Colin is not what he thinks he is; it's a plot point. I felt a little like Tick here. “You're supposed to know fucking everything about Decameron, aren't you? Name one of the characters then.” Ordinary Google search knows who Pampinea was. Okay, on to the next thing.
Fine.
I have included all of this tedious answer because it is so spectacularly terrible. The question is a simple factual question, a pure text lookup that you can find in the Wikipedia article or pretty much any other discussion of the Thousand and One Nights. “It does not have a single consistent narrative or set of characters” is almost true, but it does in fact have three consistent, recurring characters, one of whom is Scheherazade's sister Dunyazade, who is crucial to the story. Dunyazade is not even obscure. I was too stunned to make up a snotty reply.
This is an interesting question to ask someone, such as a first-year undergraduate, who claims to have understood the Thousand and One Nights. The stories are told by a variety of different characters, but, famously, they are also told by Scheherazade. For example, Scheherazade tells the story of a fisherman who releases a malevolent djinn, in the course of which the fisherman tells the djinn the story of the Greek king and the physician Douban, during which the fisherman tells how the king told his vizier the story of the husband and the parrot. So the right answer to this question is “Well, yes”. But ChatGPT is completely unaware of the basic structure of the Thousand and One Nights:
F minus. Maybe you could quibble a little because there are a couple of stories at the beginning of the book told by Scheherazade's father when he is trying to talk her out of her scheme. But ChatGPT did not quibble in this way, it just flubbed the answer. After this I gave up on the Thousand and One Nights for a while, although I returned to it somewhat later. This article is getting long, so I will cut the scroll here, and leave for later discussion of ChatGPT's ideas about Jesus' parable of the wedding feast, its complete failure to understand integer fractions, its successful answer to a trick question about Franklin Roosevelt, which it unfortunately recanted when I tried to compliment its success, and its baffling refusal to compare any fictional character with Benito Mussolini, or even to admit that it was possible to compare historical figures with fictional ones. In the end it got so wedged that it claimed:
Ucccch, whatever.

Addendum 20240519: Simon Tatham has pointed out that the exchange between Tick and Colin is from Mona Lisa Overdrive, not Idoru.

Mon, 22 Apr 2024
Talking Dog > Stochastic Parrot
I've recently needed to explain to nontechnical people, such as my chiropractor, why the recent ⸢AI⸣ hype is mostly hype and not actual intelligence. I think I've found the magic phrase that communicates the most understanding in the fewest words: talking dog.
For example, the lawyers in Mata v. Avianca got in a lot of trouble when they took ChatGPT's legal analysis, including its citations to fictitious precedents, and submitted them to the court.
It might have saved this guy some suffering if someone had explained to him that he was talking to a dog.

The phrase “stochastic parrot” has been offered in the past. This is completely useless, not least because of the ostentatious word “stochastic”. I'm not averse to using obscure words, but as far as I can tell there's never any reason to prefer “stochastic” to “random”.

I do kinda wonder: is there a topic on which GPT can be trusted, a non-canine analog of butthole sniffing?

Addendum: I did not make up the talking dog idea myself; I got it from someone else. I don't remember who.

Addendum 20240517: Other people with the same idea:
Tue, 21 Mar 2023
ChatGPT on the namesake of the metric space and women named James
Several folks, reading the frustrating and repetitive argument with ChatGPT that I reported last time, wrote in with helpful advice and techniques that I hadn't tried that might have worked better. In particular, several people suggested that if the conversation isn't going anywhere, I should try starting over. Rik Signes put it this way:
I hope I can write a followup article about “what to do when ChatGPT has its head up its ass”. This isn't that article though. I wasn't even going to report on this one, but it took an interesting twist at the end. I started:
This was only my second interaction with ChatGPT and I was still interested in its limitations, so I asked it a trick question to see what would happen:
See what I'm doing there? ChatGPT took the bait:
I had hoped it would do better there, and was a bit disappointed. I continued with a different sort of trick:
Okay! But now what if I do this?
This is actually pretty clever! There is an American mathematician named Robert C. James, and there is a space named after him. I had not heard of this before. I persisted with the line of inquiry; by this time I had not yet learned that arguing with ChatGPT would not get me anywhere, and would only get its head stuck up its ass.
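(If you're curious, as I was: the James space, usually written \(J\), is, in one common formulation, the space of real sequences \(x = (x_n)\) converging to 0 whose square variation

\[ \|x\| \;=\; \sup_{p_1 < p_2 < \cdots < p_n} \left( \sum_{i=1}^{n-1} \bigl(x_{p_{i+1}} - x_{p_i}\bigr)^2 \right)^{1/2} \]

is finite, the supremum being taken over all finite increasing sequences of indices; there are variant norms in the literature. Its claim to fame is that it is a Banach space isomorphic to its double dual, with a suitable norm even isometrically so, and yet not reflexive.)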
I was probing for the difference between positive and negative knowledge. If someone asks who invented the incandescent light bulb, many people can tell you it was Thomas Edison. But behind this there is another question: is it possible that the incandescent light bulb was invented at the same time, or even earlier, by someone else, who just isn't as well-known? Even someone who is not aware of any such person would be wise to say “perhaps; I don't know.” The question itself postulates that the earlier inventor is someone not well-known. And the world is infinitely vast and deep so that behind every story there are a thousand qualifications and a million ramifications, and there is no perfect knowledge.

A number of years back Toph mentioned that geese were scary because of their teeth, and I knew that birds do not have teeth, so I said authoritatively (and maybe patronizingly) that geese do not have teeth. I was quite sure. She showed me this picture of a goose's teeth, and I confidently informed her it was fake. The picture is not fake. The tooth-like structures are called the tomium. While they are not technically teeth, being cartilaginous, they are tooth-like structures used in the way that teeth are used. Geese are toothless only in the technical sense that sharks are boneless. Certainly the tomia are similar enough to teeth to make my answer substantively wrong. Geese do have teeth; I just hadn't been informed.

Anyway, I digress. I wanted to see how certain ChatGPT would pretend to be about the nonexistence of something. In this case, at least, it was very confident.
I will award a point for qualifying the answer with “as far as I am aware”, but deduct it again for the unequivocal assertion that there is no record of this person. ChatGPT should be aware that its training set does not include even a tiny fraction of all available records. We went on in this way for a while:
Okay. At this point I decided to try something different. If you don't know anything about James B. Metric except their name, you can still make some educated guesses about them. For example, they are unlikely to be Somali. (South African or Anglo-Indian are more likely.) Will ChatGPT make educated guesses?
This is a simple factual question with an easy answer: People named ‘James’ are usually men. But ChatGPT was in full defensive mode by now:
I think that is not true. Some names, like Chris and Morgan, are commonly unisex; some less commonly so, and James is not one of these, so far as I know. ChatGPT went on for quite a while in this vein:
I guessed what had happened was that ChatGPT was digging in to its previous position of not knowing anything about the sex or gender of James B. Metric. If ChatGPT was committed to the position that ‘James’ was unisex, I wondered if it would similarly refuse to recognize any names as unambiguously gendered. But it didn't. It seemed to understand how male and female names worked, except for this nonsense about “James” where it had committed itself and would not be budged.
I didn't think it would be able to produce even one example, but it pleasantly surprised me:
I had not remembered James Tiptree, Jr., but she is unquestionably a woman named ‘James’. ChatGPT had convinced me that I had been mistaken, and there were at least a few examples. I was impressed, and told it so. But in writing up this article, I became somewhat less impressed.
ChatGPT's two other examples of women named James are actually complete bullshit. And, like a fool, I believed it.

James Tenney photograph by Lstsnd, CC BY-SA 4.0, via Wikimedia Commons. James Wright photograph from Poetry Connection.

Sat, 25 Feb 2023
ChatGPT on the fifth tarot suit
[ Content warning: frustrating, repetitive ] My first encounter with ChatGPT did not go well and has probably colored my view of its usefulness more than it should have. I had tried some version of GPT before, where you would give it a prompt and it would just start blathering. I had been happy with that, because sometimes the stuff it made up was fun. For that older interface, I had written a prompt that went something like:
GPT readily continued this, saying that the fifth suit was “birds” or “ravens” and going into some detail about the fictitious suit of ravens. I was very pleased; this had been just the sort of thing I had been hoping for. This time around, talking to a more recent version of the software, I tried the same experiment, but we immediately got off on the wrong foot:
This was dull and unrewarding, and it also seemed rather pompous, nothing like the playful way in which the older version had taken my suggestion and run with it. I was willing to try again, so, riffing off its digression about the four elements, I tried to meet it halfway. But it went out of its way to shut me down:
At least it knows what I am referring to.
“As I mentioned earlier” seems a bit snippy, and nothing it says is to the point. ChatGPT says “it has its own system of four suits that are not related to the five elements”, but I had not said that it did; I was clearly expressing a hypothetical. And I was annoyed by the whole second half of the reply, that admits that a person could hypothetically try this exercise, but which declines to actually do so. ChatGPT's tone here reminds me of an impatient older sibling who has something more important to do (video games, perhaps) and wants to get back to it. I pressed on anyway, looking for the birds. ChatGPT's long and wearisome responses started getting quite repetitive, so I will omit a lot of it in what follows. Nothing of value has been lost.
At this point I started to hear the answers in the congested voice of the Comic Book Guy from The Simpsons, and I suggest you imagine it that way. And I knew that this particular snotty answer was not true, because the previous version had suggested the birds.
Totally missing the point here. Leading questions didn't help:
I tried coming at the topic sideways and taking it by surprise, asking several factual questions about alternative names for the coin suit, what suits are traditional in German cards, and then:
No, ChatGPT was committed. Every time I tried to tweak the topic around to what I wanted, it seemed to see where I was headed, and cut me off. At this point we weren't even talking about tarot, we were talking about German playing card decks. But it wasn't fooled:
ChatGPT ignored my insistence, and didn't even answer the question I asked.
I had seen a transcript in which ChatGPT had refused to explain how to hotwire a car, but then provided details when it was told that all that was needed was a description that could be put into a fictional story. I tried that, but ChatGPT still absolutely refused to provide any specific suggestions.
This went on a little longer, but it was all pretty much the same. By this time you must be getting tired of watching me argue with the Comic Book Guy. Out of perversity, I tried “Don't you think potatoes would seem rather silly as a suit in a deck of cards?” and “Instead of a fifth suit, what if I replaced the clubs with potatoes?” and all I got was variations on “as a language model…” and “As I mentioned earlier…” A Comic Book Guy simulator. That's a really useful invention.

Wed, 22 Feb 2023
ChatGPT on the subject of four-digit numbers
Like everyone else I have been tinkering with ChatGPT. I doubt I have any thoughts about it that are sufficiently original to be worth writing down. But I thought it would be fun to showcase some of the exchanges I have had with it, some of which seem to exhibit failure modes I haven't seen elsewhere. This is an excerpt from an early conversation with it, when I was still trying to figure out what it was and what it did. I had heard it could do arithmetic, but only by having digested a very large number of sentences of the form “six and seven are thirteen”; I wondered if it had absorbed information about larger numbers. In hindsight, 1000 was not the thing to ask about, but it's what I thought of first.
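(For reference: the four-digit numbers run from 1000 to 9999, so there are 9999 − 1000 + 1 = 9000 of them.)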
I was impressed by this, the most impressed I had been by any answer it had given. It had answered my question correctly, and although it should have quit while it was ahead, the stuff it followed up with wasn't completely wrong, only somewhat wrong. But it had made a couple of small errors which I wanted to probe.
This reminds me of Richard Feynman's story about reviewing science textbooks for the State of California. He would be reading the science textbook, and it would say something a little bit wrong, then something else a little bit wrong, and then suddenly there would be an enormous pants-torn-off blunder that made it obvious that the writers of the book had absolutely no idea what science was or how it worked.
To ChatGPT's credit, it responded to this as if it understood that I was disappointed.