When to choose ChatGPT over Claude

ChatGPT does (at least) one thing much better

Welcome to Issue #10. In a base-10 society, that feels like a milestone. The ancient Babylonians wouldn’t have cared at all.* 🎉

On today’s quest:

— “What was that word?”
— Claude versus ChatGPT for working with long documents
— How to survive the AI job apocalypse

Tip: AI as a memory prompt

I’m turning to AI more and more when I can’t remember the word or name for something.

Here's an example. Every few months, I want to use the Google Ngram browser to compare usage between British and American English, but I can never remember the exact format of the tag you add to the search query. Usually, I spend at least five minutes searching Google to find it. (Yes, I should just write it down somewhere.)

But now, I can get the answer in seconds from any of the large language models (kind of).

PROMPT: I want to compare the British and American use of a word in the Google Ngram corpus. I know there's a tag I can add to the end of my search terms to limit it to searching in those books, but I can't remember what the tag is. Can you tell me?

I asked ChatGPT-4 first. It gave me a wrong answer, but it was close enough that it triggered my memory and let me do my search. But then I was curious, so I did a little “hallucination” experiment. Claude failed too.

Only Perplexity gave me the complete correct answer that would be useful to someone who didn't already know what they were looking for and just needed to be reminded.**

News

Claude just got zhuzhed up …

Anthropic says Claude 2.1 has “made significant gains in honesty” (in other words, it generates about half as many hallucinations).

Further, although Claude was always known for being able to handle a mountain of data, now it can do even more. They doubled the limit, which means you can enter approximately 150,000 words or a 500-page book. The company says, “By being able to talk to large bodies of content or data, Claude can summarize, perform Q&A, forecast trends, compare and contrast multiple documents, and much more.” — Anthropic

… But where your data is in the document matters

Greg Kamradt of Digits, an AI accounting company, posted an analysis of how Claude handles queries on long documents, and interestingly, where the data is in the document determines how accurate the response will be.

He buried a fact in different places in a 470-page document and then asked for that fact. He found that for Claude 2.1, “facts at the very top and very bottom of the document were recalled with nearly 100% accuracy,” but facts in the middle … not so much. Also, the shorter the document was, the better the accuracy. So just because you can enter a long document doesn’t mean you should if your goal is to get facts out of it.
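For readers who want to try this kind of "buried fact" test on their own documents, here is a minimal Python sketch of the idea. It is not Kamradt's code. The function names, the made-up fact, and the `forgetful_model` stand-in are all assumptions for illustration; to test a real chatbot you would swap in your own function that sends the document and the question to that chatbot and returns its reply.

```python
from typing import Callable, Dict, Sequence


def insert_needle(sentences: Sequence[str], needle: str, depth: float) -> str:
    """Place the needle sentence at a fractional depth (0.0 = top, 1.0 = bottom)."""
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be between 0.0 and 1.0")
    idx = round(len(sentences) * depth)
    parts = list(sentences[:idx]) + [needle] + list(sentences[idx:])
    return " ".join(parts)


def run_depth_test(
    sentences: Sequence[str],
    needle: str,
    question: str,
    expected: str,
    ask_model: Callable[[str, str], str],
    depths: Sequence[float],
) -> Dict[float, bool]:
    """For each depth, bury the fact, ask the question, and record whether
    the expected answer appears in the reply."""
    results = {}
    for depth in depths:
        document = insert_needle(sentences, needle, depth)
        answer = ask_model(document, question)
        results[depth] = expected.lower() in answer.lower()
    return results


def forgetful_model(document: str, question: str) -> str:
    """Hypothetical stand-in for a chatbot: it only 'reads' the first and
    last 200 characters of the document, so it misses facts in the middle."""
    visible = document[:200] + document[-200:]
    return "marmalade" if "marmalade" in visible else "I don't know."


# Filler text plus one made-up fact to hunt for.
filler = [f"This is filler sentence number {i}." for i in range(200)]
needle = "The secret password is marmalade."

results = run_depth_test(
    filler,
    needle,
    question="What is the secret password?",
    expected="marmalade",
    ask_model=forgetful_model,
    depths=[0.0, 0.5, 1.0],
)
print(results)
```

With the stand-in model, the fact is found at the top and bottom of the document and missed in the middle, which is the pattern described above. A real model's results will depend on the model and on how long the document is.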

Also, even though Claude is known for its long-document handling, Kamradt found that ChatGPT’s accuracy held up farther through a document than Claude’s did. It looks to me like ChatGPT was dramatically better. Its limit is about 300 pages, and it was 100% accurate if the fact was in the first 150 pages or so. It was also very accurate when the fact was in the last half of the document even when the document was long.

The image below shows GPT-4 doing quite well.

The same chart for Claude was filled with red errors.

If you want to work with a long document, it may be worth the effort to pare it down and enrich for meaningful content, for example by deleting chapters you know you don’t need. And right now, I’d use ChatGPT over Claude for this kind of analysis (and I’d still double-check all results).

Who are these ridiculous optimists?

I’ve been seeing stories lately that AI will lead to a 4-day or even 3-day work week. I’m not convinced. I think it’s much more likely that if we can do our jobs in 20% less time, employers will lay off 20% of their employees, not say, “Take Friday off forever. Have fun!”

In fact, as I was writing the newsletter, Christopher Penn shared an anecdote in his Almost Timely newsletter about a company that just laid off 80% of its content marketing team and replaced them with AI.

I agree with Chris that an 80% layoff is probably an edge case right now (and the company will probably be sorry), but I think this is where we’re heading. In fact …

A food metaphor

A writer recently told me about losing a client to AI: “Why pay me to write blogs when he can use a bot to do the same thing faster? I was disappointed that he didn't see the value I brought to his business.”

It’s definitely disappointing. There will be clients like this who don't value human contributions. They're like people who go to fast food restaurants because these restaurants are quick, cheap, and good enough. Fast food meets their needs, and it’s a huge market! The Taco Bell equivalent of writing jobs will likely go away.***

I’m more optimistic that a market will still exist for mid-to-high-end writers and editors. The Cheesecake Factory and Chez Panisse of wordsmiths, if you will. Try to move yourself as far up that spectrum as possible (or make sure you’re the person in your office who knows how to use AI).

By the way, the answer to the question “Who are these ridiculous optimists?” is Bill Gates.

What is AI Sidequest?

Using AI isn’t my main job, and it probably isn’t yours either. I’m Mignon Fogarty, and Grammar Girl is my main gig, but I haven’t seen a technology this transformative since the development of the internet, and I have thoughts. So many thoughts. I bet you do too.

So here we are! Sidequesting together.

If you like the newsletter, please send your friends to the web version of the newsletter where they can read it and sign up.

* Ancient Babylonians operated on a base-60 system.

** I feel like it’s possible that if I had written a better prompt, ChatGPT and Claude might have performed better, but for simple things like this, I tend to just do stream-of-consciousness questions.

*** I don’t mean to imply that this particular writer was doing “Taco Bell” work. Companies seem to be trying AI for work beyond its abilities right now. But in the near future, they will figure out what AI can do well, and that will be the Taco Bell of writing.

Written by a human.