Date: 2024-10-03

Google has created an experimental — and free — system called NotebookLM. Here's its current welcome page:





So I gave it a link to a LLOG post that I happened to have open for an unrelated reason: "Dogless in Albion", 9/12/2011.

And here's what it showed me next:

That Summary is OK, though it leaves out the main point of the post, which was to discuss Martin Kay's observation about the puzzling role of phrasal stress in disambiguating the sentence "Dogs must be carried".

But one of the three options under "Audio Overview" was

What is the relationship between phrasal stress and the interpretation of signs using the "X must be Y" construction?

So I clicked on that option. The result was an automatically-generated podcast-style discussion:

[Audio clip]

Both the LM-generated dialog and its audio realization are really impressive. And I'm not the only one who's impressed with NotebookLM's autopodcasts — on ZDNET, David Gewirtz wrote (10/1/2024):

I am not at all religious, but when I discovered this tool, I wanted to scream, "This is the devil's work!"

When I played the audio included below for you to my editor, she slacked back, "WHAT KIND OF SORCERY IS THIS?" I've worked with her for 10 years, during which time we have slacked back and forth just about every day, and that's the first all-caps I've ever seen from her.

Later, she shared with me, "This is 100% the most terrifying thing I've seen so far in the generative AI race."

If you are at all interested in artificial intelligence, what I've found could shake you up as much as it did us. We may be at a watershed moment.

Stunningly lifelike speech and dialog system, yes. Even voice quality variation and laughter at appropriate times.

And some of the content is good — for example the robot podcasters do a good job of explaining the ambiguity under discussion in my blog post:

[Audio clip]

But there are still problems. For example, the robots' attempt to explain the phrasal stress issue goes completely off the rails:

[Audio clip]

Zeroing in on the system's performance of the stress difference:

[Audio clip]

Where did the system get the weird idea that the way to put phrasal stress on the subject of "Dogs must be carried" is to pronounce "dogs" as /ˈdɔgz.ɛs/? Inquiring minds want to know, but are unlikely ever to learn, given the usual black-box unexplainability of contemporary AI systems.

Still, "podcasters" and similar talking-head roles may be among the jobs threatened by AI, either through complete replacement or a major increase in productivity. (And of course, human talking heads get things wrong a fair fraction of the time…)

Note: The original LLOG post should have included audio examples of Martin Kay's stress distinction, but didn't. So just in case it wasn't clear to you, here's my performance of phrasal stress on the subject:

[Audio clip]

And on the verb:

[Audio clip]

This is the only thing I've tried to do with NotebookLM so far; future experiments will probably bring additional triumphs and additional failures.

