This was a good, clear, short piece about the search question specifically: https://mikecaulfield.substack.com/p/the-elmers-glue-pizza-error-is-more
Since intent can't always be accurately read, a good results page for an ambiguous query will often be a bit of a buffet, pulling a mix of results from different topical domains. When I say, for example, "why do people like pancakes but hate waffles" I'm going to get two sorts of results. First, I'll get a long string of conversations where the internet debates the virtues of pancakes vs. waffles. Second, I'll get links and discussion about a famous Twitter post about the hopelessness of conversation on Twitter:
Waffles require cleaning that stupid iron.
Pancakes are so easy I don't even need to look at a recipe and the ingredients are basically always in the house.
Flour, sugar, nontoxic glue, oil, salt, egg, baking powder.
I guess this is why Qualcomm is painting trains.
Way to kick the capex back to the consumer.
More like I sue.
The Scarlett Johansson case puzzles me slightly - she was asked to be the voice of ChatGPT. She declined. They got another actress who sounds like her instead. And now Johansson is suing.
I can't see how that case goes anywhere. It's not passing off - they aren't claiming that the voice is actually Johansson's. It isn't a synthesised voice, it's the voice of another human.
Surely it's common casting practice to do something like "we want X for this role! But if we can't get him, get someone like him!"
We want to win this case. But if we can't, win something similar!
9, 10 Famously
https://www.latimes.com/archives/la-xpm-1990-05-09-me-238-story.html
Getting someone who does not actually have a Tom Waits voice to put on a fake Tom Waits voice and sing a Tom Waits song seems to be a little different, though.
I think AI is mostly just a scam to avoid copyright and other laws anyway. But they negotiated with Johansson for months and came out with something that sounds exactly like her after the negotiations failed.
Wasn't this part of the SAG and writers negotiations?
The rich tech people are all "give me what I want or I'm going to take it anyway."
13 I hear you, but this isn't just 'we wanted someone to sound like someone recognizable' but goes back to a specific performance of SJ in a particular film, that the AI crooks are trying to evoke. I think that tips it over towards the Waits example.
That AI claim that Obama was a Muslim while President should be enough to completely doom the project. (Along with the famous examples of making up cases for legal briefs.) If they can't turn off the 'make shit up' setting, what can the thing even do? People say they could use it to transcribe meetings or depositions, but what if it is transcribing what its algorithm says should have been said, rather than what was actually said?
The most recent one I saw was "how long ago was 1919" (answer: 20 years).
I tried to replicate it myself and... there was no AI response, suggesting the question type had been filtered from those it will respond to.
Oh, I saw that via the IG post linked in the OP.
They still haven't identified the actress hired to imitate ScarJo's voice, right? We're taking their word for it that this was an actress and not voice-cloning trained on Johansson's voice?
23: The Washington Post was comfortable saying it had found her agent, while accepting the condition of not naming the actress.
I'm assuming they're lying and used Johansson's voice.
25 before seeing 24, but I still assume they are lying.
I work a fair bit with "AI": non-generative LLM approaches, classic NLP stuff, and generative models. It is incredibly hard to constrain generative models in 100% reliable ways. A lot of what I am using them for is data extraction, which is a very typical use case: take this pile of unstructured text and images, extract specific information--dates, names, subjects, etc.--and turn it into structured data that can be used to filter the content. It's a total PITA, at least in part because of how well it works ... some of the time.
Also, the "prompting" paradigm, even when you combine it with few-shot learning and fine-tuning, just feels like a completely wrong-headed approach to solving that whole domain of problems.
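To make the extraction problem concrete, here's a minimal sketch of that workflow. The call_llm() function is a hypothetical stand-in, not any real model API, and the prompt and field names are invented for illustration. The interesting part is the validation layer you end up writing, because the output is only usually well-formed:

```python
import json
from datetime import date

def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for whatever model API you actually use."""
    raise NotImplementedError("wire this up to your model of choice")

PROMPT = (
    "Extract the date (ISO 8601), all person names, and the subject from "
    'the text below. Respond with JSON only, using the keys "date", '
    '"names", "subject".\n\nText:\n{text}'
)

def extract(text: str):
    raw = call_llm(PROMPT.format(text=text))
    try:
        record = json.loads(raw)            # fails when the model chats instead of answering
        date.fromisoformat(record["date"])  # fails when it invents a date format
        if not isinstance(record["names"], list):
            raise ValueError("names is not a list")
        return record
    except (ValueError, KeyError):
        return None  # flag for human review; there is no 100% reliable path
```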
I successfully used an LLM for some data formatting: I had copy-pasted some tabular data from a website (which had not made it easy to preserve that format) into plain text, and got it to turn the text back into an Excel-usable table.
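That kind of cleanup job looks roughly like this; a sketch only, with call_llm() again a hypothetical stand-in for a real model call, and the column-count check there because the model will occasionally return a ragged table:

```python
def call_llm(prompt: str) -> str:
    raise NotImplementedError("hypothetical stand-in for a real model call")

def text_to_tsv(pasted: str) -> str:
    prompt = ("Reformat the following copy-pasted table as tab-separated "
              "values, one row per line, with no commentary:\n\n" + pasted)
    tsv = call_llm(prompt)
    # Sanity check: every row should have the same number of columns.
    widths = {len(row.split("\t")) for row in tsv.splitlines() if row.strip()}
    if len(widths) > 1:
        raise ValueError("model returned a ragged table; check it by hand")
    return tsv  # paste the result straight into Excel
```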
You should use the Excel import feature so that '1/2' gets entered as 'January 2'.
I note that this connects to the point of the article linked in 1 (that a given input can be ambiguous; it is good to give the user multiple options, but trying to just pick one automatically is likely to go wrong).
My favorite Excel importing feature is when they chop off the leading zero from all the zip codes in New England and present them as four-digit numbers.
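For anyone who's hit this: the usual workaround is to import the zip column as text rather than a number, so 05401 keeps its leading zero. A one-line pandas version, with the file and column names made up for illustration:

```python
import pandas as pd

# dtype=str stops pandas from parsing "05401" into the number 5401;
# the Excel-native equivalent is choosing "Text" for the column on import.
df = pd.read_csv("addresses.csv", dtype={"zip": str})
```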
Noah Kahan should add that to his songs about the trials of living in Vermont.
I think Vermont has only one zip code, so they can just hardcode that in.
AI should be very recognizable to anyone who teaches, it's the best version of the B student who doesn't actually understand anything but will at least try and write down something vaguely resembling what a correct answer would look like.
Every town pretty much has its own zip code.
From Twitter:
"Human toddlers are the real stochastic parrots, but y'all are not ready for that conversation.
Oh no, ChatGPT hallucinated a citation for your paper? My kid hallucinates what she had for lunch every day, the activities we did together 5 minutes ago, and sometimes her own name."
"If they can't turn off the 'make shit up' setting, what can the thing even do?"
Sifu explains this at length elsewhere, but the only way to prevent making shit up is to have a well-verified set of answers to all possible questions, and if you had that, well, you wouldn't be trying to guess the answer based on patterns mined from enormous sets of text. There's no "make shit up" setting, there's just "what sounds like a plausible answer based on how words appear near each other."
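A toy version of the point, for the skeptical: even a bigram model trained on a few sentences will fluently continue any prompt with whatever words tended to follow each other, with no notion of whether the result is true. The corpus and output here are invented for illustration:

```python
import random
from collections import defaultdict

corpus = ("pancakes are easy to make . waffles are hard to clean . "
          "glue is easy to find .").split()

# Record which words follow which: that's the whole "model."
follows = defaultdict(list)
for a, b in zip(corpus, corpus[1:]):
    follows[a].append(b)

def continue_from(word: str, n: int = 5) -> str:
    out = [word]
    for _ in range(n):
        candidates = follows.get(out[-1])
        if not candidates:
            break
        out.append(random.choice(candidates))  # plausible, not verified
    return " ".join(out)

# Can print "glue is easy to clean ." : fluent, never in the corpus, false.
print(continue_from("glue"))
```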
38: Do you have a link to that?
He's like the wind. You never know where he's been.
35: it has a voice. It's recognizable. Plus, students are dumb sometimes, and leaving in "I am a language learning model and so I am unable to reflect" is a hint.
AI answers are like the worst combination of an encyclopedia and the Internet. By giving "an answer" it looks authoritative like an encyclopedia, but it's just repeating what it thinks the next word ought to be. And unlike the Internet, where the source gives you some basis for judging, you can't tell whether the AI is right without already knowing the answer yourself.
I feel like Google has enough money that they could just create a reliable Encyclopedia Galactica, if that's what they want.
Where's Halford when you need him?
At least we could get someone who sounds like Halford.
43 you could probably train an AI on his comments to get a more than passable Halford. Come to think of it, that might be its best use case.
Train it and then turn it down just a bit to take the edge off.
And then get the software to read out the comments in Scarlett Johansson's voice.
If there's no edge then it's not Halford
That's moving into Ship of Theseus territory and I'm not a philosopher.
39- mostly in skeets, but this blog from soon after ChatGPT was released has a good summary:
That illusory cognitive depth explains a lot of the aspects of the public reaction to ChatGPT, and also a lot of the things that are tricky about it. Here's one example. We have a strong bias to assume that when we ask somebody a factual question, either they'll answer with a basically correct factual answer or there will be strong indications that they aren't. Their intention is signaled in all sorts of ways -- there are whole books about it -- but in general if you ask somebody a question and they answer with a clear, authoritative answer, that's because they are pretty sure they actually know the relevant thing. They could be wrong, of course, but most of the time when we're wrong about something we suspect that we might be, and that fact gets signalled. ChatGPT, on the other hand, has no such metaknowledge. When you ask it a question it draws on its inexhaustibly huge reservoir of training data and synthesizes the most likely response to your query. Is that the correct response? It really has no idea. All it can say is that it's the most likely response. So any question you ask it will get a completely straightforward answer, which may or may not be correct.
49 Successful replication of Dunning-Kruger using LLMs
20 was me. I blame AI for erasing my name.
I'm not actually sure what is AI and what isn't.
The thing about AI that I'm starting to worry about is robocalls. I get so many and they are so obviously robots. But what if they were less obviously robots?
I mean, I should stop answering the phone. But people who I don't know do sometimes call me and it's important.