Thanks, I enjoyed this! And agree with all the other commenters - your father has a fine sonorous voice and doesn't need a menagerie to talk over.
I don't think I buy what I take to be the big point though, that LLMs somehow vindicate a Skinnerian/empiricist "blank slate" view of learning. From my understanding (a lot of which comes from reading this very detailed explainer - https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-doing-and-why-does-it-work/ ) it is wrong to characterise LLMs (and other breakthrough "gen-AI" networks) as following purely general-purpose learning strategies. The *primitives* they use (gradient descent etc.) are indeed general, but they are organised into quite specific architectures, with structures that reflect our own understanding of the problem domain they work in.
So:
"OK, so we’re finally ready to discuss what’s inside ChatGPT. And, yes, ultimately, it’s a giant neural net—currently a version of the so-called GPT-3 network with 175 billion weights. In many ways this is a neural net very much like the other ones we’ve discussed. But it’s a neural net that’s **particularly set up for dealing with language**. And its most notable feature is a piece of neural net architecture called a “transformer”.
In the first neural nets we discussed above, every neuron at any given layer was basically connected (at least with some weight) to every neuron on the layer before. But this kind of fully connected network is (presumably) overkill if one’s working with data that has particular, known structure. And thus, for example, in the early stages of dealing with images, it’s typical to use so-called **convolutional neural nets (“convnets”) in which neurons are effectively laid out on a grid analogous to the pixels in the image—and connected only to neurons nearby on the grid**.
The idea of transformers is to do something at least somewhat similar for sequences of tokens that make up a piece of text. But instead of just defining a fixed region in the sequence over which there can be connections, transformers instead introduce the notion of “attention”—and the idea of “paying attention” more to some parts of the sequence than others. **Maybe one day it’ll make sense to just start a generic neural net and do all customization through training. But at least as of now it seems to be critical in practice to “modularize” things**—as transformers do, and probably as our brains also do." - "Inside ChatGPT" from Wolfram link above, my emphasis ** added.
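To make the contrast concrete, here's a rough sketch in PyTorch (my own toy illustration, not anything from the podcast or from Wolfram's article). Both layers would be trained by exactly the same general-purpose gradient descent; the difference is that the convolutional one has the 2D grid structure of images wired into it before any learning happens:

```python
import torch
import torch.nn as nn

image = torch.randn(1, 1, 28, 28)  # one 28x28 greyscale image

# "Blank slate" option: every pixel connected to every output unit.
fully_connected = nn.Linear(28 * 28, 128)
out_fc = fully_connected(image.flatten(start_dim=1))

# Convnet option: each output unit only "sees" a 3x3 neighbourhood on the grid.
convolutional = nn.Conv2d(in_channels=1, out_channels=8, kernel_size=3, padding=1)
out_conv = convolutional(image)

print(sum(p.numel() for p in fully_connected.parameters()))  # ~100,000 weights
print(sum(p.numel() for p in convolutional.parameters()))    # just 80 weights
```

The wildly different weight counts are the point: the convnet can be small because a lot of "knowledge" about images is built into its shape rather than learned.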
If a 2D "grid" of neurons doesn't sound more akin to a Kantian intuition of Euclidean space than an empiricist blank slate, then I don't know what does! Similarly, the "transformer" architecture (the T in GPT) is oriented to the sequential structure of sentence inputs.
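Stripped to its core, the "attention" step in a transformer is a learned weighting over positions in a token sequence: each word decides how much to "listen to" every other word. A minimal sketch (again my own illustration, with made-up dimensions, not the real GPT code):

```python
import torch
import torch.nn.functional as F

seq_len, d_model = 6, 16               # a toy "sentence" of 6 token embeddings
tokens = torch.randn(seq_len, d_model)

# Learned projections (random here) map each token to a query, key and value.
W_q, W_k, W_v = (torch.randn(d_model, d_model) for _ in range(3))
Q, K, V = tokens @ W_q, tokens @ W_k, tokens @ W_v

# Every position scores every other position; no fixed neighbourhood is
# imposed in advance, unlike the convnet's grid.
scores = Q @ K.T / (d_model ** 0.5)
weights = F.softmax(scores, dim=-1)    # each row sums to 1: how much each token attends to the others
output = weights @ V                   # each token becomes a weighted mix of the others
```

So the sequential structure of language is still baked in - in what gets compared with what - even though the particular attention pattern is learned.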
More than that, the successful LLMs I'm aware of all need coaching with human feedback on top of their automated training on huge data sets - so-called "reinforcement learning from human feedback", or RLHF. So although LLMs really are astonishing and unexpected from the point of view of pre-2000 linguistic and AI thinking, I think it's just not true to say that we are witnessing pure empirical learning unmediated by any analogue of innate structure or knowledge. It's pretty much the opposite; though as Wolfram notes, we may still hope to generalise further beyond the current special-purpose architectures.
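For what it's worth, the human-feedback stage usually works by training a separate "reward model" on human preference comparisons, and then nudging the LLM towards outputs that model scores highly. A very crude sketch of the first half of that idea (my own toy version in PyTorch, nothing like the real pipelines):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy reward model: in reality this sits on top of the language model itself
# and scores whole responses, not random 16-dimensional vectors.
reward_model = nn.Linear(16, 1)
optimizer = torch.optim.Adam(reward_model.parameters(), lr=1e-3)

# Pretend embeddings of two candidate responses to the same prompt,
# where a human labeller preferred the first.
preferred = torch.randn(1, 16)
rejected = torch.randn(1, 16)

# Push the preferred response's score above the rejected one's
# (a Bradley-Terry style preference loss).
loss = -F.logsigmoid(reward_model(preferred) - reward_model(rejected)).mean()
loss.backward()
optimizer.step()
```

The relevance to the Chomsky/Skinner argument is simply that human judgement is injected directly into the training signal; it isn't raw text statistics all the way down.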
There are definitely others who take a line closer to your father's - see e.g. https://slator.com/how-large-language-models-prove-chomsky-wrong-with-steven-piantadosi/ - but a reliable source (my older brother, a linguistics professor) tells me this is very much a minority view... perhaps not surprising in a field that doesn't want to surrender its own relevance?
Yeah, it's not surprising. Let's give it a few years and see how the discipline changes.
Thank you, thank you, thank you for having your father on again! He is a global treasure. He could talk on any subject, for any length of time, and it would be fascinating, educational, enlightening, and a pleasure to listen to. I wish he'd record such discourses more often; it would add immeasurably to the sum of human understanding. Is there any way to make him understand that?
I loved his exposition on Chomsky and Skinner and how large language models (LLMs) upended the whole field. Knowing Chomsky only from his ridiculous political commentary, I was always at a loss as to how he managed to build the academic reputation he seemed to enjoy. Having your father assess him as a supremely smart person and his arguments as overpowering the field puts him in a whole different light - even though he may have ended up being wrong after all!
I do have a question about your father's conclusion about the validity of the Chomsky model. Since human beings do not have ready access to the massive data sets the LLMs are trained on, and do not possess the computational capability of modern computers, we are unlikely to be unconsciously using the techniques the LLMs rely on. We must therefore be using a different model, so why not Chomsky's?
I loved Dr. Berlinski's speculation about what the reality of LLMs might mean for the theoretical sciences. It is indeed a provocative idea: why construct complex rules-based models when correlating a massive number of observations may yield a more accurate (and certainly speedier) result?
I would suggest, one, that human nature craves patterns and rules and thus may not be able to give up theorizing, and two, that blind acceptance of phenomena we don't understand and can't fully explain smacks to me of paganism. Next thing, we'll start praying to the LLM (or even more readily to the eventual GAI) just as we used to pray to the gods of thunder and the seas. Like a "cargo cult." I don't accept that!
And speaking of which, I wish you could ask your father what effect he sees LLMs and GAI having on religion in general, and on specific systems of belief in particular.
Please, please convince your father to come onto TCG more often! Hey, how about having a regular "Elephant's Cage" with him?
Finally, a note about sound effects. I started listening to the podcast in the car, without the benefit of reading your introduction, so I didn't understand the experimental nature of the sounds I was hearing. I will confess that I came close to throwing my phone out the window in frustration multiple times when the effects overpowered and obscured your voices. I was hanging on every word, and then the planes would start taking off, or a playground full of toddlers would start screaming, or the like. Even mild effects like birds flapping their wings and computers beeping were distracting and, to me at least, did not add anything to the experience. I think having a distinctive "sound signature" at the beginning or the end of the podcast is entirely appropriate and good for building your brand. Anything else, especially when the guest is so valuable and his discourse is so dense with information, is worse than unnecessary.
Sorry, but the sound effects are not additive, they are subtractive. I listen to lots of podcasts. They tend to fall into two groups: conversational and narrative. The former don't usually have sound effects except for intros, commercial breaks, or outros. Narrative podcasts - those that tell a story - sometimes have sound effects, but they also have sound designers. Mastering the tools won't get you very far without knowing how and when to use them. In the case of your wonderful discussion with your father they are totally unnecessary. If you're able, I would suggest reposting sans sound effects.
Thanks for the assessment. I've posted a version without the sound effects, above.
I'd also suggest podcasting a monologue without an interlocutor.
Enjoyable. No more sound effects, please. Must’ve been interesting growing up with a father who thinks like this.
I've posted a version without the sound effects, above. Yes, of course it was interesting, but I wouldn't know any other way of growing up to compare it to, would I?
Sound effects so distracting and unnecessary, uh, I'm giving up now, sorry!
I've posted a version without the sound effects, above.
excellent, thanks Claire!
Anybody willing to give a highly podcast-averse subscriber one solid reason to listen to something offering to reveal the "philosophical ramifications" of AI *instead* of the anthropological ones?
Just tried to listen but was startled to find I'd wandered into somebody's hushed, private Chopin recital or some such!
1) Don't waste time trying for perfection; even mere better is the enemy of good enough. Freeze the design and go to production. Changes will evolve for the better. It's what organic lifeforms do. Disclaimer: I don't do podcasts very often at all; I much prefer the printed word.
2) I'm not aware that we've built any machines that can think; perhaps you can identify one or two. LLMs are mimics and nothing more. Also, please supply your evidence that your one or two examples actually think. And I don't mean anything like the intrinsically subjective Turing test--hard, objective evidence.
Eric Hines
Very interesting discussion. I would leave out the special sound effects though.
I've posted a version without the sound effects, if you prefer.