"Our digital god is a CSV file?"
Barry Collins, "The 5 Weirdest Things Elon Musk Told Britain’s Prime Minister About AI", Forbes 11/3/2023:
5. Our New Digital Gods Are Giant Spreadsheets
Musk and Sunak spent some time discussing the difficulties of regulating AI and how it differs from other branches of technology. And that led to a rather strange discussion about the nature of large language models and what they actually are.
Musk described AI models as a “gigantic data file” with “billions of weights and parameters.”
“You can’t just read it and see what it’s going to do. It’s a gigantic file of inscrutable numbers,” he said.
“It sort of ends up being a giant comma-separated value file,” Musk added, describing the kind of file you might open with Microsoft Excel. “Our digital god is a CSV file? Really? OK.”
Or maybe "Not really"?
No doubt you could stuff the parameters for various deep-learning model architectures into a (set of?) spreadsheets. But though I'm no Excel code-monkey, I'm somewhat skeptical that you could program Excel to run (even simple variants of) such models, and even more skeptical that you could program Excel to train them. At least, I'll be interested to learn if anyone has ever done either of these things.
Perhaps Musk just wanted to describe tensors in a way that his audience would understand — but other phrases, starting with something like "a bunch of tables of numbers", would have been just as accessible but less misleading. So I'm led to wonder whether his own understanding is limited.
And there's at least one other odd idea in this section of the Sunak/Musk conversation. The relevant audio is below, followed by a transcript.
Sunak: Look it's- it's a trick- there's probably no perfect answer
and there's a
tricky balance
I d- what are your thought on how we should approach this open source question or
you know where should we be targeting
whatever regulatory or monitoring that we're going to do.
Musk: well- the open source
um algorithms and data
tend to lag the closed source
by six to twelve months
um
but so that- so that-
th- b-
given the rate of improvement that there's actually
therefore
quite a big difference between the c- the closed source and the-
and the open
um
if things are improving by a factor of let's say five
or more
um then being a year behind is- you're five times worse
so it's- it's ((a)) pretty big difference
and that might actually be an ok situation
um
but it- it certainly would- is more- get to the point where you've got open source
um AI that can do-
that- that will start to approach human level intelligence ((or perhaps)) exceed it-
um
I don't know quite what to do about it
I- I- I think it's somewhat inevitable-
inevitable that they'll be some amount of open source and I-
I-
I guess I would have a slight bias towards
open source
uh cause at least you can see what's going on
uh whereas closed source you don't know what's going on. Now
it should be said with AI that even if it's open source
do you actually know what's going on because
if you've got a gigantic data file
and um
you know it-
sort of
billions of da- of- of data points or-
weights and parameters
uh
you can't just read it and see what it's going to do
uh it's a gigantic file of inscrutable numbers
um
you- you can test it when you ((it-)) when you run it you can test it ag-
you- you can run a bunch of tests to see what it's going to do but
it- it's probabilistic it-
as opposed to
um
deterministic it's not-
it's not like traditional programming where you've got a-
((?))
you've got
((ver-)) discrete logic
and- and- and the outcome is very predictable and you can look-
read each line and see what each line's gonna do
um
uh
a- a neural net is
uh just a whole bunch of probabilities
um
I mean it- it sort of ends up being a giant Comma Separated Value file.
it's like
our digital god is a CSV file?
really?
{laughter}
OK
um
but that- that is kind of what it is.
Sunak: Yeah.
The second questionable idea comes up just before the CSV thing:
um
uh
a- a neural net is
uh just a whole bunch of probabilities
um
I mean it- it sort of ends up being a giant Comma Separated Value file.
it's like
our digital god is a CSV file?
really?
Though "probabilistic neural nets" have existed, the (billions of) numbers in current "neural net" architectures are actually abstract weights, not probabilities. (Where appropriate, a few sets of outputs, typically at the output edge of the network, may be turned into pseudo-probabilities via "softmax" or similar activation functions. But that's not what's happening throughout the system.)
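To make the distinction concrete, here is a minimal sketch in plain Python. The weight values and inputs are invented for illustration and come from no real model. It shows that weights are arbitrary real numbers (some negative, not summing to 1), and that only the final softmax step yields values that look like probabilities.

```python
import math

def softmax(logits):
    """Map arbitrary real scores to positive values that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical weights (3 output units x 2 inputs): any real numbers are allowed.
weights = [[0.5, -1.2], [2.0, 0.3], [-0.7, 0.9]]
inputs = [1.0, 2.0]

# Raw output scores ("logits"): one weighted sum per output unit.
logits = [sum(w * x for w, x in zip(row, inputs)) for row in weights]

# Only at this last step do the numbers become pseudo-probabilities.
probs = softmax(logits)
```

Nothing about `weights` is constrained to the range 0 to 1; that constraint holds only for `probs`.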
Note that Musk echoed the whole "CSV god" idea in a Xeet on 3/5/2024.
Video for the whole Sunak/Musk interview is here.
Scott Robinson said,
March 7, 2024 @ 3:40 pm
Excel isn’t a good way to host an ML model; but, the CSV joke is funny because it’s true:
https://www.deepexcel.net/
Seth said,
March 7, 2024 @ 3:58 pm
While Musk could have explained it better, I wouldn't infer much about his own understanding from his struggling to explain it to someone he knows is not a technical person. It's well known that it can be very difficult to come up with a good nontechnical way of conveying the gist of an issue to a nontechnical person. In context, I'd say he got the broad point across. He was trying to explain that it's not like there's a high-level algorithm that can easily be examined for decisions – "you can look – read each line and see what each line's gonna do". Then again, people who have never seen a real search algorithm often have a very over-optimistic idea about how feasible it is to "read each line and see what each line's gonna do".
Mark Liberman said,
March 7, 2024 @ 6:02 pm
@Scott Robinson:
Amazing! How did I miss ExcelNet in 2016?
Still, the deep net numbers are not probabilities, and a csv file is not a very good model for their organization…
Mark Liberman said,
March 7, 2024 @ 6:16 pm
@Seth:
Rishi Sunak "was educated at Winchester College, studied philosophy, politics and economics at Lincoln College, Oxford, and earned an MBA from Stanford University in California as a Fulbright Scholar" […] "After graduating, Sunak worked for Goldman Sachs and later as a partner at the hedge fund firms the Children's Investment Fund Management and Theleme Partners", so I'm guessing that he knows what matrix multiplication is, what stochastic process theory is all about, and probably various other math concepts useful in grasping what contemporary AI models do.
KWillets said,
March 7, 2024 @ 6:18 pm
CSV is the lowest-common-denominator loading format for virtually every data tool, even at very large scale. So it's common to see terabytes of data in CSV format that could never be loaded into a spreadsheet. And the newer a tool is, the more likely it is to need a kludge like this, so ML and AI models use it a lot.
It's data organized into its least useful form, kind of a running joke.
Seth said,
March 7, 2024 @ 7:05 pm
@ Mark Liberman It's not clear to me that's true. I can well imagine someone with that list being more on the "humanities" side, the sort who takes "Math For Poets" classes. And if Musk had gone the other way in terms of assuming technical level, wouldn't he then be laying himself open to attack as an autistic nerd who can't communicate with non-nerds? Who makes people's eyes glaze over with techbrobabble? (pun intended).
Anyway, speaking of "read each line and see what each line's gonna do", that brings up something I've wanted to ask more linguistically knowledgeable people – Is there a formal term for deliberately overlapping many distinct classes to get a unique member? (in the context of not wanting to explicitly declare that unique case).
I mean the following situation: Suppose that a Twitter/X spokesperson says "Our algorithm does not have any explicit ban of links to the site substack-com. You will find nothing in our code like if (link.site.equals("substack-com")) { ban(link); }." However … if one audits the code … there's a long set of conditions such as if (link.site.endsWith("-com") && (count_links(link.site) > BAN_SCORE) && …) { ban(link); }.
That is, if you go through all the scoring (and there can be a very extensive set of adjustments), mirabile dictu, it turns out "substack-com" is the only one which seems to qualify for a ban. The spokesperson says "My statement was not a lie. The algorithm works in mysterious ways, some of which are beyond human ken. But you see there's nothing explicit to ban substack, so the audit has vindicated my honesty".
I've long wondered if something along these lines is going on in certain cases.
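A toy sketch of that pattern, in Python rather than the Java-like pseudocode above. Every site name, statistic, and threshold here is invented for illustration; none of it describes any real platform's code. No site is named in the rule, yet the conditions jointly pick out exactly one.

```python
# Hypothetical sites and statistics, invented purely for illustration.
SITES = {
    "substack-com": {"links": 120, "age_days": 2000},
    "example-com": {"links": 40, "age_days": 9000},
    "blog-org": {"links": 300, "age_days": 1500},
    "news-com": {"links": 500, "age_days": 8000},
}

BAN_SCORE = 100  # hypothetical threshold


def is_banned(name, stats):
    """No site is named here, but the conditions together single one out."""
    return (
        name.endswith("-com")
        and stats["links"] > BAN_SCORE
        and stats["age_days"] < 5000
    )


banned = [name for name, stats in SITES.items() if is_banned(name, stats)]
```

Each condition on its own matches several sites; only their intersection is a single member.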
Alison said,
March 7, 2024 @ 7:12 pm
One of the best things I have seen recently is a presentation and short lecture series that shows exactly this – a working GPT-2 model implemented in Excel: https://spreadsheets-are-all-you-need.ai/
To me this is a much less mystifying way to understand what is actually happening than trying to follow the jargon-filled esoteric tomes attached to other open source efforts.
bks said,
March 7, 2024 @ 7:28 pm
There's no evidence that Musk understands LLMs any better than Sunak.
AntC said,
March 7, 2024 @ 8:07 pm
@KWillets And the newer a tool is, the more likely it is to need a kludge like this, so ML and AI models use it a lot.
Do they? I rather thought JSON is the mode juste these days. (Why merely terabytes of data when you can go a whole order of magnitude more bandwidth-hungry?)
There's no evidence that Musk understands …
There's plenty of evidence Musk is generally pretty clueless about much of the modern world outside his narrow interest — how to run a social media company, for example. I guess you must be really clever to be so extraordinarily inept (applies for both Musk and Sunak).
KWillets said,
March 8, 2024 @ 12:24 am
@AntC JSON certainly fits the need to overcomplicate things, but I would bet on CSV being more common due to its unambiguous format.
Since this is a linguistics blog, I'll note that JSON is a context-free language, while CSV is (almost) a regular one — in practical terms you can parse CSV data all day with only a fixed amount of memory, while JSON can expand to arbitrary amounts of stack space and fail.
There are some efforts to limit JSON to smaller parseable units, but it's not universal.
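A rough sketch of the difference, assuming a simplified CSV dialect (comma delimiter, double-quote quoting) and ignoring many real-world details. The function names are made up for this example. The CSV scanner keeps only one flag and one counter per record, while the JSON scanner has to track a nesting depth that the input can make arbitrarily large (a real parser would need a stack of that depth).

```python
def count_csv_fields(text, delimiter=",", quote='"'):
    """Yield the field count of each record using a two-state scanner."""
    fields = 1
    in_quotes = False
    saw_any = False
    for ch in text:
        if ch == quote:
            in_quotes = not in_quotes  # a doubled quote toggles twice: no net change
            saw_any = True
        elif ch == delimiter and not in_quotes:
            fields += 1
            saw_any = True
        elif ch == "\n" and not in_quotes:
            yield fields
            fields = 1
            saw_any = False
        else:
            saw_any = True
    if saw_any:  # final record without a trailing newline
        yield fields


def max_json_depth(text):
    """Return the deepest bracket nesting found in JSON text."""
    depth = max_depth = 0
    in_string = False
    escaped = False
    for ch in text:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch in "[{":
            depth += 1
            max_depth = max(max_depth, depth)
        elif ch in "]}":
            depth -= 1
    return max_depth


csv_counts = list(count_csv_fields('a,b,c\n"x,y",z\n'))
json_depth = max_json_depth('{"a": [1, [2, {"b": "]"}]]}')
```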
Lars said,
March 8, 2024 @ 2:06 am
@KWillets: Unfortunately, CSV is not unambiguous. Why do you think the import options are so numerous?
Scott Mauldin said,
March 8, 2024 @ 2:56 pm
LLMs are one thing, but you definitely could do machine learning to some extent in Excel. At its simplest, a machine learning model is trying to construct an equation to minimize error terms, a "least squares" model if you will (trying to minimize the sum of the squares of the differences between the predicted value and the absolute value). You can create a spreadsheet that has a bunch of different permutations of Xs, Ys, and Zs and have it select the X, Y, and Z that minimize the error terms. Now, most models have more than three variables, and with each new variable you're essentially multiplying everything by a new dimension, but a simple model can absolutely work in Excel.
Scott Mauldin said,
March 8, 2024 @ 2:56 pm
*sorry, that should be "actual value" not "absolute value"
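The spreadsheet approach described above amounts to a brute-force grid search. Here is a minimal sketch in Python with two parameters instead of three, using made-up data generated from y = 2x + 1; the candidate grid and variable names are assumptions for illustration only.

```python
import itertools

# Hypothetical data generated from y = 2*x + 1 with no noise.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]


def sum_squared_error(slope, intercept):
    """Sum of squared differences between predicted and actual values."""
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys))


# Candidate parameter values, like a grid of spreadsheet cells: -5.0, -4.5, ..., 5.0
candidates = [i / 2 for i in range(-10, 11)]

# Try every (slope, intercept) combination and keep the one with least error.
best_slope, best_intercept = min(
    itertools.product(candidates, candidates),
    key=lambda pair: sum_squared_error(*pair),
)
```

With 21 candidates per parameter this checks 441 combinations; each extra parameter multiplies that count by another 21, which is the dimensionality problem mentioned above.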
Chester Draws said,
March 8, 2024 @ 3:26 pm
There's plenty of evidence Musk is generally pretty clueless about much of the modern world outside his narrow interest
This is true of everyone. Many are clueless even inside their areas of interest.
But I'd be hugely surprised if Musk hasn't had a very serious look at AI, and he is clever enough to understand it.
how to run a social media company,
The previous owners of Twitter couldn't run one either, which was why it was for sale.
Bill Benzon said,
March 8, 2024 @ 5:08 pm
"I'm led to wonder whether his own understanding is limited."
It's my impression that while Musk may think he understands all things tech, once he gets beyond batteries, cars, and rockets, he tends to hallucinate (in the LLM sense of the word).
davep said,
March 9, 2024 @ 9:13 am
Chester Draws: “The previous owners of Twitter couldn't run one either, which was why it was for sale.”
It wasn’t for sale. Musk just offered so much for it (much more than it was worth) that selling it made the most sense.
Kenny Easwaran said,
March 9, 2024 @ 3:21 pm
A version of Excel with sufficient memory to compute trillions of multiplications certainly should be able to implement a Large Language Model like ChatGPT. It's just a whole lot of matrix multiplications, with a bit of an interesting structure to them.
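As a toy illustration of the "weights in a table, then matrix multiplication" picture, here is a sketch in plain Python. The weight values are invented, and the whole thing is one tiny layer, nowhere near an actual language model. It reads a weight matrix from CSV text and applies it to an input vector, followed by a simple nonlinearity.

```python
import csv
import io

# Hypothetical 2x3 weight matrix stored as CSV text (values are made up).
WEIGHTS_CSV = "0.5,-1.0,2.0\n-1.5,0.25,-0.5\n"


def load_matrix(csv_text):
    """Parse CSV text into a list of rows of floats."""
    reader = csv.reader(io.StringIO(csv_text))
    return [[float(cell) for cell in row] for row in reader]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def relu(values):
    """Replace negative values with zero."""
    return [max(0.0, v) for v in values]


weights = load_matrix(WEIGHTS_CSV)
hidden = relu(matvec(weights, [1.0, 2.0, 3.0]))
```

The CSV holds only the numbers; the structure (which matrix feeds which operation) lives in the surrounding code, not in the file.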
Seth said,
March 9, 2024 @ 5:09 pm
Excel is now actually a "real" programming language (for better or worse …):
https://arxiv.org/abs/2309.00115
"Excel as a Turing-complete Functional Programming Environment"
"Since the calculation engine of Excel was the subject of a major upgrade to accommodate Dynamic Arrays in 2018 there has been a series of seismic changes to the art of building spreadsheet solutions. This paper will show the ad-hoc end user practices of traditional spreadsheets can be replaced by radically different approaches that have far more in common with formal programming. It is too early to guess the extent to which the new functionality will be adopted by the business and engineering communities and the impact that may have upon risk. Nevertheless, some trends are emerging from pioneering work within the Excel community which we will discuss here."
Alyssa said,
March 11, 2024 @ 1:52 pm
At least in that quote, Musk never says anything about Excel – it's the journalist writing up the article who draws up the analogy with spreadsheets. Presumably because that's the only context where he's encountered CSV files. But a CSV file is used for a lot more than Excel spreadsheets – it's used basically any time an engineer has a bunch of numbers they want to dump to a file and they don't particularly care about the structure or readability of the file contents. That's the analogy Musk is making here – AI models are, in a sense, just a bunch of numbers without any clear internal structure and not readable for a human. (That's not 100% true, we do have *some* understanding of what the numbers mean and how these models function, but it's true enough for the point he's making).
Joe said,
March 11, 2024 @ 4:05 pm
Reminds me of John Searle's Chinese Room thought experiment, except with Excel spreadsheets.