A ridiculously useful tip
A simple way to grab and wrangle messy data
Issue 67
On today’s quest:
— Grabbing messy data
— The model matters, part II
— Energy: What uses more?
— AI’s erratic energy use problem
— Webinar: Storysnap
— Meta is developing chatbots that message you first
— Calling all Sallys
— We are still underreacting on AI
— The AI horse is out of the barn in publishing
— 🤣 A trickster steals an AI band’s identity 🤣
Tip: Grabbing messy data
This tip comes directly from the Almost Timely newsletter by Christopher Penn, and I am beside myself with how useful it is!
If you need to gather messy data from online, do a screen recording of it, feed the video into an LLM that can interpret video (e.g. Gemini), and ask it to turn the data into JSON.
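If you want to script this rather than use the Gemini app, the same idea looks roughly like the sketch below, using Google's google-generativeai Python SDK. The file name, model name, and API key are placeholders, and the SDK's interface may vary by version.

```python
import time

import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # placeholder key

# Upload the screen recording (file name is hypothetical).
video = genai.upload_file("facebook_comments.mov")

# Video uploads are processed asynchronously; poll until the file is ready.
while video.state.name == "PROCESSING":
    time.sleep(5)
    video = genai.get_file(video.name)

model = genai.GenerativeModel("gemini-1.5-pro")
response = model.generate_content([
    video,
    "Transcribe the Facebook comments in this video into JSON format. "
    "The required keys and values of the JSON format are: "
    "commenter name, comment verbatim text.",
])
print(response.text)  # the JSON transcript
```

That said, the whole tip works fine in the Gemini web app with no code at all.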
Christopher’s example was gathering and analyzing data from Google Business Reviews, but my problem has always been Facebook comments. My most popular posts are when I ask people to tell me something about how they use language. People love to comment, and then I create charts, which become popular follow-up posts.
For example, I ask whether people pronounce “coyote” with two or three syllables, and I can get hundreds — sometimes thousands — of comments.

In the past, it has taken me hours to collect, collate, and analyze the comments because they’re in such a messy format. So I rarely do these posts even though people love them.
I just tried Christopher’s suggestion on comments from an old post like the one above, and it worked! This was my prompt:
Transcribe the Facebook comments in this video into JSON format. The required keys and values of the JSON format are: commenter name, comment verbatim text.
In less than five minutes, I had structured data. I then asked Gemini to evaluate the answers and add a parameter tracking whether the commenter said they used 2 or 3 syllables. This was my prompt:
Each comment is attempting to answer the question of whether the commenter pronounces the word "coyote" with two or three syllables. Please attempt to interpret each comment and add a key and value for each comment called "syllables" with a value of 2 or 3 to indicate how many syllables the commenter uses.
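For illustration only (the commenters and comments below are invented), the enriched records might look something like this:

```json
[
  {
    "commenter name": "Pat R.",
    "comment verbatim text": "Kai-OH-tee. Three, the way I grew up saying it.",
    "syllables": 3
  },
  {
    "commenter name": "Sam L.",
    "comment verbatim text": "Two. Rhymes with note.",
    "syllables": 2
  }
]
```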
Interpreting the comments isn’t straightforward because people answer in all kinds of long and winding ways, but at first glance, Gemini didn’t seem to misinterpret a single one (it even coded a comment as “null” when the answer wasn’t clear, even though I didn’t include that in the instructions).
Of course, I would still read all the comments to make sure the interpretation is correct and, more important, because I enjoy reading them; that’s how I understand what people are saying so I can write a post about the data. But this “one simple trick” will save me the hours I used to spend on tedious data manipulation.
Technical Notes:
I used QuickTime on the Mac for screen recording. It’s built-in.
When I attempted the same thing with GPT-4o in ChatGPT, it did not work well at all, so the model matters here.
If you want to get the JSON into a spreadsheet, you have to convert it to a .csv file (a minimal conversion sketch appears after these notes). You may want to play around with having Gemini just create a .csv file in the first place, but Christopher says JSON is the best format if you want to do further manipulation of the data in an LLM.
Subscribe to the Almost Timely newsletter. It can be a bit technical at times, but you’ll be glad you did!
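If you’d rather script the JSON-to-CSV conversion than do it by hand, a minimal sketch using only Python’s standard library might look like this. The file names are placeholders, and it assumes Gemini returned a top-level JSON list using the keys from the prompts above.

```python
import csv
import json

# Load the JSON transcript (file name is a placeholder).
with open("coyote_comments.json", encoding="utf-8") as f:
    comments = json.load(f)  # assumes a top-level list of objects

# Write one CSV row per comment, using the keys from the prompts above.
with open("coyote_comments.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(
        f,
        fieldnames=["commenter name", "comment verbatim text", "syllables"],
        extrasaction="ignore",  # skip any extra keys the model added
    )
    writer.writeheader()
    writer.writerows(comments)
```

The resulting .csv file opens directly in Excel or Google Sheets.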
The model matters, part II
You may remember a viral story from back in June where ChatGPT lied (and apologized and lied and apologized over and over) to a woman who asked it for a review of her writing, insisting it could access online stories when it couldn’t.
Well, the Hybrid Horizons newsletter took the same prompts and fed them into a different model — OpenAI o3 — and got good results. No lies. No apologies.
None of the models are perfect — and it’s important to note that the original “experiment” was done when ChatGPT was having a huge problem with being too sycophantic, a behavior that has since been dialed back — but this is a good example of why it pays to understand how each model works and to choose the right one for the task.
Energy: What uses more?

Jon Ippolito made a tool that directly compares the energy use of different activities in different scenarios. It has a rudimentary interface and seems to be designed for classroom use, but it’s fun and interesting to plug in different scenarios — like watching an hour of Netflix compared to generating a 3-second AI video from text — and then changing the parameters, like the number of paragraphs of text or whether the data center is in a hot climate or a cool climate. You quickly realize there is no one simple answer.
AI’s erratic energy use problem
The amount of energy AI uses isn’t actually the only energy problem. The uneven nature of the demand is another. In an article in the Financial Times, Andreas Schierenbeck, chief executive of Switzerland-based Hitachi Energy, said AI training runs cause sudden spikes in energy demand at data centers that can be 10x the normal load. He says other industries aren’t allowed to use energy this way. For example, if you want to start a new smelting company — another highly energy-intensive business — you have to clear it with local utilities and give them time to adapt. He says AI companies should be regulated the same way and should schedule training for when renewable energy is plentiful.*
Webinar: Storysnap
The Editorial Freelance Association is hosting a free webinar about Storysnap, an AI tool for evaluating manuscripts, on July 22 at 5 p.m. Eastern. The presentation is put on by the company, but it looks like an interesting product to understand. If you aren’t a member of EFA, click “guest” on the registration form. h/t Katharine O’Moore-Klopf
Meta is developing chatbots that message you first
If you’ve used Meta’s AI Studio, chatbots may soon start messaging you on their own instead of waiting for you to send a message. Meta says the bots will only message someone who has sent at least five messages in the last 14 days and won’t send a second message if they don’t get a reply.
The goal is to keep you talking, and the ideal message they’re working on delivering will “reference something concrete from the user’s past conversations.” It reminds me a bit of the emails you get from companies when you abandon a shopping cart. — Business Insider
Ad: AI with Allie
Since I’m a LinkedIn Learning instructor myself, I’ve seen how well they vet their courses, and I was impressed to see that Allie Miller is an instructor there too.
Expand What AI Can Do For You
Tired of basic AI prompts that don't deliver? This free 5-day course shows you how to create tools that actually address your problems—from smart assistants to custom software.
Each day brings practical techniques straight to your inbox. No coding, no fluff. Just useful examples to automate and enhance your workflow.
Follow-ups
I seem to have a lot of stories today that continue something I mentioned in a past newsletter:
Calling all Sallys?
On June 26, I mentioned that LLMs tend to overuse the names Emily and Sarah. Here’s another anecdote: a teacher noticed that many student essays describing a time the writer witnessed discrimination involved a victim named Sally, which led the teacher to suspect the essays were written by AI.
We are still underreacting on AI
Also on June 26, I included a link to Pete Buttigieg’s newsletter post titled “We are still underreacting on AI,” and a reader asked about my thoughts on the essay. I didn’t have time to comment on it earlier, but I heartily agree. It’s why I spend hours a day reading and writing about AI when, clearly, I shouldn’t be because it’s not my main job. And I was heartened to see someone who might be in government again someday clearly articulate the situation. Yes, to all of this:
And when I say we’re “underprepared,” I don’t just mean for the physically dangerous or potentially nefarious effects of these technologies, which are obviously enormous and will take tremendous effort and wisdom to manage. But I want to draw more attention to a set of questions about what this will mean for wealth and poverty, work and unemployment, citizenship and power, isolation and belonging.
In short: the terms of what it is like to be a human are about to change in ways that rival the transformations of the Enlightenment or the Industrial Revolution, only much more quickly.
I recommend reading the whole thing.
The AI horse is out of the barn, off the farm, and over the river in publishing
On June 30, I posted a link to an open letter from hundreds of prominent authors demanding that publishers never use AI for anything. I’m not a fan of open letters or statements that include “never for all time” kinds of demands, so I didn’t write about it further.
But Jane Friedman’s response in her Bottom Line newsletter was well-reasoned, and I’d like to highlight it here. Among other things, she points out that AI is already being widely used in publishing, and publishers currently have no reliable way of detecting the AI writing the authors want them to reject.
I know I’m saying this a lot in this newsletter, but read the whole thing — and while you’re at it, subscribe to The Bottom Line.
🤣 A trickster stole an AI band’s identity 🤣
I love this story so much I almost made it the lead, but it’s only peripherally about AI, so I reined myself in. Here’s the deal:
On July 2, I included a link to a story about a band called Velvet Sundown that was knocking it out of the park on Spotify with what everyone was convinced was AI-generated music. At the time, nobody could reach the band, so a guy with a history of online stunts turned one of his old Twitter accounts into an account for the band and started filling the void. Thanks to both social engineering and expert-level social media instincts, he gained a huge following and did some high-profile media interviews.
It’s hilarious. I read it out loud to my husband, who was also cracking up, but I can’t get it out of my head. If I were still teaching social media to journalism students, this story would be required reading because it’s revealing about how our systems are being gamed every day.
How much do you trust AI Overviews?
I’m curious about something: If you use Google search, how much do you trust the AI Overviews that appear at the top of the search results?
Quick Hits
Philosophy
Artificial intelligence has entered the personal chat. What does that say about human relationships? — The Guardian
AI enables access (This line stood out to me: “I am doing what everyone else does, getting help. Mine just happens to be a chatbot, not a family friend in editorial.”) — The Bookseller
Legal
Trial court decides a case based on AI-hallucinated case law — FindLaw (via Peter Henderson on Bluesky)
Other
Problems AI will have to overcome before we have widespread agents — MIT Technology Review
What happens after AI destroys the college essay? — The New Yorker
How AI made me more human, not less (A woman’s story about using ChatGPT to cope and reengage with friends and family after an epilepsy diagnosis.) — New York Times
Embarrassing AI errors/Cringe
Crunchyroll Caught Using AI Subtitles After Viewers Notice “ChatGPT Said…” in New Anime Scene — Comic Basics
What is AI Sidequest?
Are you interested in the intersection of AI with language, writing, and culture? With maybe a little consumer business thrown in? Then you’re in the right place!
I’m Mignon Fogarty: I’ve been writing about language for almost 20 years and was the chair of media entrepreneurship in the School of Journalism at the University of Nevada, Reno. I became interested in AI back in 2022 when articles about large language models started flooding my Google alerts. AI Sidequest is where I write about stories I find interesting. I hope you find them interesting too.
* SMELTERS: As part of my research for this paragraph, I looked at how much energy smelters use, and wow — no wonder they want us to recycle aluminum cans!
Written by a human