AI triumphs… and also fails.
Google has created an experimental — and free — system called NotebookLM. Here's its current welcome page:
So I gave it a link to a LLOG post that I happened to have open for an unrelated reason: "Dogless in Albion", 9/12/2011.
And here's what it showed me next:
That Summary is OK, though it leaves out the main point of the post, which was to discuss Martin Kay's point about the puzzling role of phrasal stress in disambiguating the sentence "Dogs must be carried".
But one of the three options under "Audio Overview" was
What is the relationship between phrasal stress and the interpretation of signs using the "X must be Y" construction?
So I clicked on that option. The result was an automatically-generated podcast-style discussion:
Both the LM-generated dialog and its audio realization are really impressive. And I'm not the only one who's impressed with NotebookLM's autopodcasts — on ZDNET, David Gewirtz wrote (10/1/2024):
I am not at all religious, but when I discovered this tool, I wanted to scream, "This is the devil's work!"
When I played the audio included below for you to my editor, she slacked back, "WHAT KIND OF SORCERY IS THIS?" I've worked with her for 10 years, during which time we have slacked back and forth just about every day, and that's the first all-caps I've ever seen from her.
Later, she shared with me, "This is 100% the most terrifying thing I've seen so far in the generative AI race."
If you are at all interested in artificial intelligence, what I've found could shake you up as much as it did us. We may be at a watershed moment.
Stunningly lifelike speech and dialog system, yes. Even voice quality variation and laughter at appropriate times.
And some of the content is good — for example the robot podcasters do a good job of explaining the ambiguity under discussion in my blog post:
But there are still problems. For example, the robots' attempt to explain the phrasal stress issue goes completely off the rails:
Zeroing in on the system's performance of the stress difference:
Where did the system get the weird idea that the way to put phrasal stress on the subject of "Dogs must be carried" is to pronounce "dogs" as /ˈdɔgz.ɛs/? Inquiring minds want to know, but are unlikely ever to learn, given the usual black-box unexplainability of contemporary AI systems.
Still, "podcasters" and similar talking-head roles may be among the jobs threatened by AI, either through complete replacement or a major increase in productivity. (And of course, human talking heads get things wrong a fair fraction of the time…)
Note: The original LLOG post should have included audio examples of Martin Kay's stress distinction, but didn't. So just in case it wasn't clear to you, here's my performance of phrasal stress on the subject:
And on the verb:
This is the only thing I've tried to do with NotebookLM so far; future experiments will probably bring additional triumphs and additional failures.
Jon W said,
October 3, 2024 @ 2:22 pm
Folks might be interested in Henry Farrell's impression of the same tech at https://www.programmablemutter.com/p/after-software-eats-the-world-what (the whole thing is worth reading):
David McAlister said,
October 3, 2024 @ 3:06 pm
On the issue of dogs being pronounced /ˈdɔgz.ɛs/, it seems to me that the word being pronounced is dachshund: the first speaker uses a particular dog breed to make their point, and the second speaker clearly picks up on that.