Language Log

The mind of artificial intelligence

March 22, 2023 @ 9:39 pm · Filed by Victor Mair under Artificial intelligence, Computational linguistics

Sean Carroll's Preposterous Universe Podcast #230

Raphaël Millière on How Artificial Intelligence Thinks, March 20, 2023 / Philosophy, Technology, Thinking / Comments

Includes a transcript of the two-hour podcast.

Welcome to another episode of Sean Carroll's Mindscape. Today, we're joined by Raphaël Millière, a philosopher and cognitive scientist at Columbia University. We'll be exploring the fascinating topic of how artificial intelligence thinks and processes information. As AI becomes increasingly prevalent in our daily lives, it's important to understand the mechanisms behind its decision-making processes. What are the algorithms and models that underpin AI, and how do they differ from human thought processes? How do machines learn from data, and what are the limitations of this learning? These are just some of the questions we'll be exploring in this episode. Raphaël will be sharing insights from his work in cognitive science, and discussing the latest developments in this rapidly evolving field. So join us as we dive into the mind of artificial intelligence and explore how it thinks.

[The above introduction was artificially generated by ChatGPT.]

Comments

Maria Comninou (March 20, 2023 at 2:41 pm)

I am always surprised at the ease with which humans (mostly male in these fields) are willing to attribute consciousness to algorithms (AI) but deny it to non-human animals!

Jim Wade (March 22, 2023 at 4:13 am)

The question about whether an AI machine will ever be able to think is, to me, the most important question to be addressed. This question is the hard problem of consciousness. The inner life of humans is a reality that is unexplainable. Self-aware consciousness is what leads to understanding the meaning of experience. Computers do not understand the meaning of anything. It is the human minds that interpret the findings of the algorithms that give them meaning. Computers do not have AHA moments. Computers are very valuable tools that can vastly expand the capabilities and achievements of human beings, but understanding is the purview of self-aware consciousness.

Selected readings

"ChatGPT-4: threat or boon to the Great Firewall?" (3/21/23) — with a bibliography of previous posts on this subject
"Heart-mind" (9/29/14)

[h.t. Bill Benzon]

6 Comments

Gregory Kusnick said,

March 23, 2023 @ 12:04 am

Carroll's podcast is hands-down my favorite. I highly recommend it (though I have not actually listened to this episode yet).

Bill Benzon said,

March 23, 2023 @ 8:10 am

Here's a passage late in the dialog that relates to a point Syd Lamb made decades ago:

1:14:42.0 RM: Here I'm indebted to, among other people, the work of Diego Marconi who distinguishes between referential and inferential competence. So referential competence is the ability that relates to this idea of relating word meaning to their worldly reference, to whatever they are referencing out there in the world, and this is exhibited by things like recognitional capacities. So if I ask you to point to a dog, you will be able to do that. Or if I ask you to name that thing, and not point to a dog, you'll be able to do that. Or it's also displayed in our ability to parse instructions and translate them into actions in the world such as go fetch the fork in the drawer or in the kitchen, you will be able to do that in the world. So we are able to relate lexical expressions where they're referential with a reference in the world. But that's not the only aspect of meaning; that's the aspect of meaning that the people talking about this stochastic parrot analogy are focusing on. But our ability to understand word meaning also hinges on relationships between words themselves, intra-linguistic relationship.

1:16:07.4 RM: And these are the kinds of relationships that are at display in definitions, such as the ones you find in a dictionary, as well as various other relationships of synonymy and homonymy that would also underlie our capacity to perform certain inferences in language. And so to illustrate that point, you can consider someone who's perhaps let's say a eucalyptus expert who knows all there is to know scientifically about eucalyptus trees from reading books, back in the city in New York say, going to university and so on, but has never actually been in a eucalyptus forest, versus someone who actually has grown up surrounded by eucalyptus trees and might know very little about the biology of eucalyptus trees or various information about them but has grown around them. So the eucalyptus specialist might have a very high degree of inferential competence when it comes to the use of the word eucalyptus, being able to use it in definitions and to know exactly how different other words, including biological terms, would relate to the word eucalyptus and so on. But perhaps if you put that specialist in a forest that had eucalyptus trees and a very similar tree… My knowledge of this [1:17:49.1] ____.

1:17:49.9 SC: [chuckle] Mine too. Don't worry.

1:17:50.6 RM: It shows that my kind of knowledge of eucalyptus is really, really low. But perhaps there's a tree that looks very much like eucalyptus trees but that expert might not be able to actually recognize which are the eucalyptus trees, which aren't, even though he has all this knowledge. So his actual referential competence when it comes to the use of that word might not be that great. Whereas the person who has grown up around eucalyptus trees might be excellent at pointing to eucalyptus trees, even if it has very, very little inferential competence when it comes to using that term in definitions, for example, or knowing this more fine relations between the word eucalyptus and various other words. So that's just a very toy example. But clearly there are aspects of meaning that are very important in the way we understand and use words that are not just exhausted by this referential relation to the world.

1:18:50.7 RM: And this second aspect, this inferential aspect of meaning, is something that language models are well placed to induce just because they're trained on this big large corpus of text to learn statistical relationships between words. And you might think that insofar as there are this complex intra-linguistic relationship between words, that a model that learns to model the patterns of co-occurrence between words at a very very sophisticated fine-grained level might learn to represent this intra-linguistic relationship.

1:19:31.4 SC: Jacques Derrida famously said, "There is nothing outside the text." Maybe he was standing up for the rights of large language models and their ability to understand things before they ever came along. But it makes sense to me. Look, these corpuses of text, corpi, I'm not sure, of text that the models are trained on are constructed mostly by people who have experience with the world. It would be weird if the large language model could not correctly infer some things about the world. So we're gonna count that on the side of the ledger for a kind of understanding that these AI systems do have, right?

1:20:09.5 RM: Exactly. Yes. And in fact, I think you could even say something about some very limited and weak form of referential competence in these models but maybe that would take us too far afloat. But indeed, I think, insofar as the statistics of language reflects to some extent, at least in some domains, the structure of the world, you can absolutely think that you can latch onto something there about the structure of the world just by learning from statistics of language. One example would be there's this wonderful paper by Ellie Pavlick from Brown University that showed that you can use color terms, color words like orange, red, blue and so on, and you can look at how language models are able to represent these color terms. And I'm simplifying a little bit from the study to not get into too many details, but you can map the representational geometry of the way in which the model represent these terms in a vector space to the geometry of the color space, that is the actual relationship, say the RGB color space, which is a way to just represent relationships between colors. So there is something about the structure of the encodings for word terms in these models that encodes information or is somehow isomorphic to the structure of colors out there in the world, if you think of the RGB color space as one way to represent that. Again, I'm making some simplifying assumption to discuss that kind of research but it's still a very intriguing finding.
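The comparison Millière describes, between the geometry of a model's color-term representations and the geometry of RGB space, can be sketched as a correlation between two sets of pairwise distances. This is only an illustration of the kind of measurement involved, not the method of the study he mentions. The helper names (`pairwise_distances`, `geometry_alignment`, `make_toy_embeddings`) are hypothetical, and the "embeddings" are synthetic stand-ins built from RGB itself by a random linear map plus noise, so the resulting score says nothing about any real language model.

```python
import math
import random

# RGB coordinates for a few color terms (standard web-color values).
RGB = {
    "red": (255, 0, 0),
    "orange": (255, 165, 0),
    "yellow": (255, 255, 0),
    "green": (0, 128, 0),
    "blue": (0, 0, 255),
    "purple": (128, 0, 128),
}


def pairwise_distances(space, words):
    """Euclidean distance for every unordered pair of words, in a fixed order."""
    return [
        math.dist(space[a], space[b])
        for i, a in enumerate(words)
        for b in words[i + 1:]
    ]


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def geometry_alignment(space_a, space_b):
    """Correlate the pairwise-distance structure of two spaces over shared words."""
    words = sorted(set(space_a) & set(space_b))
    return pearson(pairwise_distances(space_a, words),
                   pairwise_distances(space_b, words))


def make_toy_embeddings(rgb, dim=8, noise=0.05, seed=0):
    """ASSUMPTION: synthetic stand-in for model embeddings, not real model output.

    Each word vector is a random linear projection of its RGB value plus noise.
    """
    rng = random.Random(seed)
    projection = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(dim)]
    embeddings = {}
    for word, color in rgb.items():
        scaled = [c / 255 for c in color]
        embeddings[word] = tuple(
            sum(w * c for w, c in zip(row, scaled)) + rng.gauss(0, noise)
            for row in projection
        )
    return embeddings


if __name__ == "__main__":
    toy = make_toy_embeddings(RGB)
    print(f"alignment with RGB geometry: {geometry_alignment(RGB, toy):.3f}")
```

With real data, `toy` would be replaced by vectors read out of a language model for the same color words. A score near 1 would then mean that words whose colors are close in RGB space are also close in the model's space.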

This inferential competence follows from Lamb's point, that the meaning of a word is a function of its relationship with other words. Stated that way it seems pretty much the same as Firth's distributional semantics. And maybe it is, but maybe it explains distributional semantics.
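The distributional idea can be made concrete with a toy sketch: represent each word by counts of the words it shares a sentence with, then compare words by the cosine of those count vectors. The corpus below is invented for illustration, and the function names are hypothetical. Real language models learn far richer representations than raw co-occurrence counts, so this shows only the basic principle.

```python
import math
from collections import Counter, defaultdict

# A made-up toy corpus; real distributional models use vastly more text.
CORPUS = [
    "the cat chased the mouse",
    "the dog chased the ball",
    "the cat ate the food",
    "the dog ate the food",
    "the car needs fuel",
    "the truck needs fuel",
]


def cooccurrence_vectors(sentences):
    """Map each word to counts of the other words sharing a sentence with it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()
        for i, word in enumerate(tokens):
            for j, neighbor in enumerate(tokens):
                if i != j:  # skip the word's own position
                    vectors[word][neighbor] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * v[key] for key, count in u.items())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    vecs = cooccurrence_vectors(CORPUS)
    for a, b in [("cat", "dog"), ("cat", "car"), ("car", "truck")]:
        print(f"{a} ~ {b}: {cosine(vecs[a], vecs[b]):.2f}")
```

In this corpus "cat" comes out closer to "dog" than to "car" purely because of the company each word keeps. No word is ever connected to anything outside the text.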

As many of you know, Lamb is a first generation computational linguist, from the old old days when it was called machine translation. He favored a linguistic approach derived from Hjelmslev's stratification. He also favored a notation in the form of a relational network – he was one of the first to do so, and has told me he got the idea from work Halliday was doing in 1963-64. We see Lamb's notation in more or less full form in Outline of Stratificational Grammar, 1966.

So, Lamb's point exists in the context of an explicitly drawn relational network. I don't know offhand whether the point was there in the 1966 book. It was told to me in the mid-late 1970s by Dave Hays, with whom I was studying. It's there explicitly in Lamb's 1999 Pathways of the Brain.

In any event, by the time GPT-3 first appeared in 2020 I'd been thinking about the success of neural networks in machine translation and had decided it was time I tried to convince myself that these results were not the result of some mystical machine voodoo but an intelligible manifestation of however it is that language works. I made Lamb's point the center of my thinking in GPT-3: Waterloo or Rubicon? Here be Dragons, pp. 15-19. That account substitutes tap-dancing for technical detail, but it has served its purpose. The technical detail will have to be supplied by those who have mathematical skills that I lack.

Gene Hill said,

March 23, 2023 @ 11:26 am

Brings to mind one of the earliest adages of computer science: "Garbage in, garbage out."

Grant Castillou said,

March 23, 2023 @ 2:39 pm

It's becoming clear that with all the brain and consciousness theories out there, the proof will be in the pudding. By this I mean, can any particular theory be used to create a human adult level conscious machine? My bet is on the late Gerald Edelman's Extended Theory of Neuronal Group Selection. The lead group in robotics based on this theory is the Neurorobotics Lab at UC Irvine. Dr. Edelman distinguished between primary consciousness, which came first in evolution and which humans share with other conscious animals, and higher order consciousness, which came only to humans with the acquisition of language. A machine with primary consciousness will probably have to come first.

What I find special about the TNGS is the Darwin series of automata created at the Neurosciences Institute by Dr. Edelman and his colleagues in the 1990s and 2000s. These machines perform in the real world, not in a restricted simulated world, and display convincing physical behavior indicative of higher psychological functions necessary for consciousness, such as perceptual categorization, memory, and learning. They are based on realistic models of the parts of the biological brain that the theory claims subserve these functions. The extended TNGS allows for the emergence of consciousness based only on further evolutionary development of the brain areas responsible for these functions, in a parsimonious way. No other research I've encountered is anywhere near as convincing.

I post because on almost every video and article about the brain and consciousness that I encounter, the attitude seems to be that we still know next to nothing about how the brain and consciousness work; that there's lots of data but no unifying theory. I believe the extended TNGS is that theory. My motivation is to keep that theory in front of the public. And obviously, I consider it the route to a truly conscious machine, primary and higher-order.

My advice to people who want to create a conscious machine is to seriously ground themselves in the extended TNGS and the Darwin automata first, and proceed from there, by applying to Jeff Krichmar's lab at UC Irvine, possibly. Dr. Edelman's roadmap to a conscious machine is at https://arxiv.org/abs/2105.10461

john said,

March 24, 2023 @ 8:53 am

I enjoyed the point that saying that LLMs “just” predict the next word is like saying that humans “just” maximize their reproductive fitness. Yes, sure, but to get good at that requires some astonishing capabilities.

Vampyricon said,

April 25, 2023 @ 2:38 pm

Maria's comment is weird to me since I'm fairly certain Sean does believe in animal consciousness.
