Re: More like A-Sigh, right?

1

This was a good, clear, short piece about the search question specifically: https://mikecaulfield.substack.com/p/the-elmers-glue-pizza-error-is-more

Since intent can't always be accurately read, a good search result for an ambiguous query will often be a bit of a buffet, pulling a mix of search results from different topical domains. When I say, for example, "why do people like pancakes but hate waffles" I'm going to get two sorts of results. First, I'll get a long string of conversations where the internet debates the virtues of pancakes vs. waffles. Second, I'll get links and discussion about a famous Twitter post about the hopelessness of conversation on Twitter.

Posted by: NickS | Link to this comment | 05-28-24 6:53 AM
2

Waffles require cleaning that stupid iron.


Posted by: Moby Hick | Link to this comment | 05-28-24 6:58 AM
3

Pancakes are so easy I don't even need to look at a recipe and the ingredients are basically always in the house.


Posted by: Moby Hick | Link to this comment | 05-28-24 7:11 AM
4

Flour, sugar, nontoxic glue, oil, salt, egg, baking powder.


Posted by: Moby Hick | Link to this comment | 05-28-24 7:16 AM
5

I guess this is why Qualcomm is painting trains.


Posted by: Mossy Character | Link to this comment | 05-28-24 7:20 AM
6

Way to kick the capex back to the consumer.


Posted by: Mossy Character | Link to this comment | 05-28-24 7:22 AM
7

More like I sue.


Posted by: Opinionated Scarlett Johansson | Link to this comment | 05-28-24 7:36 AM
8

1 is a good read on this!


Posted by: heebie | Link to this comment | 05-28-24 7:49 AM
9

The Scarlett Johansson case puzzles me slightly - she was asked to be the voice of ChatGPT. She declined. They got another actress who sounds like her instead. And now Johansson is suing.

I can't see how that case goes anywhere. It's not passing off - they aren't claiming that the voice is actually Johansson's. It isn't a synthesised voice, it's the voice of another human.

Surely it's common casting practice to do something like "we want X for this role! But if we can't get him, get someone like him!"


Posted by: ajay | Link to this comment | 05-28-24 8:10 AM
10

Actors have won similar cases.


Posted by: Moby Hick | Link to this comment | 05-28-24 8:12 AM
11

We want to win this case. But if we can't, win something similar!


Posted by: heebie | Link to this comment | 05-28-24 8:17 AM
12

9, 10 Famously
https://www.latimes.com/archives/la-xpm-1990-05-09-me-238-story.html


Posted by: Barry Freed | Link to this comment | 05-28-24 8:30 AM
13

Getting someone who does not actually have a Tom Waits voice to put on a fake Tom Waits voice and sing a Tom Waits song seems to be a little different, though.


Posted by: ajay | Link to this comment | 05-28-24 8:44 AM
14

I think AI is mostly just a scam to avoid copyright and other laws anyway. But they negotiated with Johansson for months and came out with something that sounds exactly like her after the negotiations failed.


Posted by: Moby Hick | Link to this comment | 05-28-24 8:54 AM
15

Wasn't this part of the SAG and writers negotiations?


Posted by: heebie | Link to this comment | 05-28-24 9:00 AM
16

The rich tech people are all "give me what I want or I'm going to take it anyway."


Posted by: Moby Hick | Link to this comment | 05-28-24 9:00 AM
17

Where Halford when you need him?


Posted by: Barry Freed | Link to this comment | 05-28-24 9:15 AM
18

Where's


Posted by: Barry Freed | Link to this comment | 05-28-24 9:15 AM
19

Right behind you.


Posted by: Moby Hick | Link to this comment | 05-28-24 9:19 AM
20

13 I hear you, but this isn't just 'we wanted someone who sounds like someone recognizable'; it goes back to a specific performance by SJ in a particular film, which the AI crooks are trying to evoke. I think that tips it over towards the Waits example.

That AI claim that Obama was a Muslim while President should be enough to completely doom the project. (Along with the famous examples of briefs citing made-up cases.) If they can't turn off the 'make shit up' setting, what can the thing even do? People say they could use it to transcribe meetings or depositions, but what if it is transcribing what its algorithm says should have been said, rather than what was actually said?


Posted by: | Link to this comment | 05-28-24 9:24 AM
21

The most recent one I saw was "how long ago was 1919" (answer: 20 years).

I tried to replicate it myself and... there was no AI response, suggesting the question type had been filtered from those it would respond to.


Posted by: Minivet | Link to this comment | 05-28-24 9:30 AM
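For the record, the arithmetic behind 21 is a single subtraction; here's a minimal check in Python (the 105 assumes you ask in 2024, when the thread took place):

    from datetime import date

    # "How long ago was 1919?" is one subtraction; as of 2024 the
    # answer is 105 years, not the 20 the AI overview reportedly gave.
    print(date.today().year - 1919)
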
22

Oh, I saw that via the IG post linked in the OP.


Posted by: Minivet | Link to this comment | 05-28-24 9:31 AM
23

They still haven't identified the actress hired to imitate ScarJo's voice, right? We're taking their word for it that this was an actress and not voice-cloning trained on Johansson's voice?


Posted by: chill | Link to this comment | 05-28-24 9:47 AM
24

23: The Washington Post was comfortable saying it had found her agent, while accepting the condition of not naming the actress.


Posted by: Minivet | Link to this comment | 05-28-24 9:55 AM
25

I'm assuming they're lying and used Johansson's voice.


Posted by: Moby Hick | Link to this comment | 05-28-24 9:56 AM
26

25 before seeing 24, but I still assume they are lying.


Posted by: Moby Hick | Link to this comment | 05-28-24 9:57 AM
27

I work a fair bit with "AI": non-generative LLM approaches, classic NLP stuff, and generative models. It is incredibly hard to constrain generative models in 100% reliable ways. A lot of what I am using them for is data extraction, which is a very typical use case: take this pile of unstructured text and images, extract specific information--dates, names, subjects, etc.--and turn them into structured data that can be used to filter the content. It's a total PITA, at least in part because of how well it works ... some of the time.

Also, the "prompting" paradigm, even when you combine it with few-shot learning and fine-tuning, just feels like a completely wrong-headed approach to solving that whole domain of problems.


Posted by: nattarGcM ttaM | Link to this comment | 05-28-24 10:00 AM
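A minimal sketch of the extract-then-validate pattern 27 describes, in Python; call_llm is a hypothetical placeholder (the comment names no model or library), and the field names are invented for illustration:

    import json

    def call_llm(prompt: str) -> str:
        # Hypothetical stand-in for a real model client; nothing
        # here is an actual API.
        raise NotImplementedError("wire up a model client here")

    PROMPT = """Return ONLY a JSON object with keys "date", "names",
    and "subject", extracted from the text below. Use null for anything
    absent.

    Text:
    {text}
    """

    def extract_fields(text: str, retries: int = 2) -> dict:
        # The retry loop is 27's point: generative output can't be
        # constrained with 100% reliability, so validate and re-ask
        # instead of trusting that the format held.
        for _ in range(retries + 1):
            raw = call_llm(PROMPT.format(text=text))
            try:
                data = json.loads(raw)
            except json.JSONDecodeError:
                continue  # got prose instead of JSON; ask again
            if isinstance(data, dict) and {"date", "names", "subject"}.issubset(data):
                return data
        raise ValueError("no valid structured output after retries")
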
28

I successfully used an LLM for some data formatting: I had copy-pasted some tabular data from a website (which had not made it easy to preserve that format) into plain text, and got it to turn the text back into an Excel-usable table.


Posted by: Minivet | Link to this comment | 05-28-24 10:17 AM
29

You should use the Excel import feature so that '1/2' gets entered as 'January 2'.


Posted by: Moby Hick | Link to this comment | 05-28-24 10:44 AM
30

You should use the Excel import feature so that '1/2' gets entered as 'January 2'.

I note that this connects to the point of the article linked in 1 (that a given input can be ambiguous; it is good to give the user multiple options, but trying to just pick one automatically is likely to go wrong).


Posted by: NickS | Link to this comment | 05-28-24 11:02 AM
31

You must not work at Microsoft.


Posted by: Moby Hick | Link to this comment | 05-28-24 11:04 AM
32

My favorite Excel import feature is when it chops the leading zero off all the zip codes in New England and presents them as four-digit numbers.


Posted by: Spike | Link to this comment | 05-28-24 12:19 PM
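The leading-zero bug in 32 (and the date mangling in 29) is type inference choosing for you; a minimal illustration of the fix in pandas, with a made-up file and column name:

    import pandas as pd

    # Left to infer types, the zip column becomes integers and
    # 02139 silently turns into 2139. Forcing it to string keeps
    # the leading zero. "addresses.csv" and "zip" are hypothetical.
    df = pd.read_csv("addresses.csv", dtype={"zip": str})
    print(df["zip"].head())
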
33

Noah Kahan should add that to his songs about the trials of living in Vermont.


Posted by: Moby Hick | Link to this comment | 05-28-24 12:50 PM
34

I think Vermont has only one zip code, so they can just hardcode that in.


Posted by: Minivet | Link to this comment | 05-28-24 3:06 PM
35

AI should be very recognizable to anyone who teaches: it's the best version of the B student who doesn't actually understand anything but will at least try to write down something vaguely resembling what a correct answer would look like.


Posted by: Unfoggetarian: "Pause endlessly, then go in" (9) | Link to this comment | 05-28-24 3:17 PM
36

Every town pretty much has its own zip code.


Posted by: Moby Hick | Link to this comment | 05-28-24 3:32 PM
37

From Twitter:

"Human toddlers are the real stochastic parrots, but y'all are not ready for that conversation.

Oh no, ChatGPT hallucinated a citation for your paper? My kid hallucinates what she had for lunch every day, the activities we did together 5 minutes ago, and sometimes her own name."


Posted by: Unfoggetarian: “Pause endlessly, then go in” (9) | Link to this comment | 05-28-24 5:08 PM
38

"If they can't turn off the 'make shit up' setting, what can the thing even do?"
Sifu explains this at length elsewhere, but the only way to prevent making shit up is to have a well-verified set of answers to all possible questions -- and if you had that, well, you wouldn't be trying to guess the answer based on patterns mined from enormous sets of text. There's no "make shit up" setting; there's just "what sounds like a plausible answer based on how words appear near each other."


Posted by: SP | Link to this comment | 05-28-24 5:11 PM
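A toy version of 38's point, with made-up co-occurrence counts: the only machinery is "which words tend to follow which," so nothing in the procedure can know whether its output is true:

    import random

    # Tiny "model": counts of which word follows which, as if tallied
    # from text. The numbers are invented for the example.
    counts = {
        "pancakes": {"are": 8, "taste": 3, "contain": 1},
        "are": {"easy": 5, "delicious": 4, "glue": 1},
    }

    def next_word(prev: str) -> str:
        # Sample the next word in proportion to how often it appeared
        # after `prev` -- plausibility, not truth.
        candidates = counts[prev]
        r = random.uniform(0, sum(candidates.values()))
        for word, c in candidates.items():
            r -= c
            if r <= 0:
                return word
        return word  # floating-point edge case fallback

    random.seed(1)
    print(next_word("pancakes"), next_word("are"))
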
39

38: Do you have a link to that?


Posted by: Bostoniangirl | Link to this comment | 05-29-24 4:47 AM
40

He's like the wind. You never know where he's been.


Posted by: Moby Hick | Link to this comment | 05-29-24 5:09 AM
41

35: it has a voice. It's recognizable. Plus, students are dumb sometimes, and leaving in "I am a large language model and so I am unable to reflect" is a hint.

AI answers are like the worst combination of an encyclopedia and the Internet. By giving "an answer" it looks authoritative like an encyclopedia, but it's just repeating what it thinks the next word ought to be. And unlike the Internet, there are no contextual cues, so one can't judge whether the AI is right without already knowing the answer.


Posted by: Cala | Link to this comment | 05-29-24 6:21 AM
42

I feel like Google has enough money that they could just create a reliable Encyclopedia Galactica, if that's what they want.


Posted by: Cala | Link to this comment | 05-29-24 6:22 AM
43

Where Halford when you need him?

At least we could get someone who sounds like Halford.


Posted by: ajay | Link to this comment | 05-29-24 6:40 AM
44

43 you could probably train an AI on his comments to get a more than passable Halford. Come to think of it, that might be its best use case.


Posted by: Barry Freed | Link to this comment | 05-29-24 6:45 AM
45

Train it and then turn it down just a bit to take the edge off.


Posted by: Moby Hick | Link to this comment | 05-29-24 6:53 AM
46

And then get the software to read out the comments in Scarlett Johansson's voice.


Posted by: ajay | Link to this comment | 05-29-24 6:58 AM
47

If there's no edge, then it's not Halford.


Posted by: Barry Freed | Link to this comment | 05-29-24 7:05 AM
48

That's moving into Ship of Theseus territory and I'm not a philosopher.


Posted by: Moby Hick | Link to this comment | 05-29-24 7:09 AM
49

39: mostly in skeets, but this blog post from soon after ChatGPT was released has a good summary:


That illusory cognitive depth explains a lot of the aspects of the public reaction to ChatGPT, and also a lot of the things that are tricky about it. Here's one example. We have a strong bias to assume that when we ask somebody a factual question, either they'll answer with a basically correct factual answer or there will be strong indications that they aren't. Their intention is signaled in all sorts of ways -- there are whole books about it -- but in general if you ask somebody a question and they answer with a clear, authoritative answer, that's because they are pretty sure they actually know the relevant thing. They could be wrong, of course, but most of the time when we're wrong about something we suspect that we might be, and that fact gets signalled. ChatGPT, on the other hand, has no such metaknowledge. When you ask it a question it draws on its inexhaustibly huge reservoir of training data and synthesizes the most likely response to your query. Is that the correct response? It really has no idea. All it can say is that it's the most likely response. So any question you ask it will get a completely straightforward answer, which may or may not be correct.


Posted by: SP | Link to this comment | 05-29-24 7:37 AM
50

49 Successful replication of Dunning-Kruger using LLMs


Posted by: Barry Freed | Link to this comment | 05-29-24 7:40 AM
51

20 was me. I blame AI for erasing my name.

I'm not actually sure what is AI and what isn't.


Posted by: CharleyCarp | Link to this comment | 05-29-24 8:34 AM
52

You're not.


Posted by: heebie | Link to this comment | 05-29-24 12:47 PM
53

The thing about AI that I'm starting to worry about is robocalls. I get so many and they are so obviously robots. But what if they were less obviously robots?


Posted by: Moby Hick | Link to this comment | 05-30-24 8:40 AM
54

I mean, I should stop answering the phone. But people who I don't know do sometimes call me and it's important.


Posted by: Moby Hick | Link to this comment | 05-30-24 8:43 AM