Since you've framed strong verbal parity as being stronger than the Turing test, I think I can make this argument: verbal parity in this sense is actually AGI. Yeah, I know this is not an uncommon opinion; it's what Turing thought, and it has remained popular ever since. But I think it usually comes from a mindset where people assume that an AI trained only on static data could never achieve human-level intelligence, that this would require having a body, or at least interacting with and experimenting on its environment, learning over time as a child does. That's not really my perspective. Either way, though, an AI with verbal parity could actually do this type of stuff automatically. It could consider descriptions of real-world problems, direct a person to carry out tasks, devise experiments, and change its behavior over time. It couldn't produce the signals needed to control a robot in real time, but it might be able to design something that could, or at least do the physics calculations required to move the robot across a given room.

If we encountered an alien species that could do everything we can, just as well, but could also control swarms of insects with their minds, we wouldn't throw up our hands and declare ourselves inferior, swarming instinct now being an obviously necessary requisite for true general intelligence. We would wave our hands uneasily and say: well, we can design things for controlling swarms of insects, sort of, and failing that we could at least technically do the laborious calculations required to create a nice, well-behaved swarm in a given space. We don't decide to call one-dimensional Turing machines not Turing complete because they're polynomially slower than higher-dimensional ones; that would be gross. We smooth everything out, lump them all into the same class, and reassure ourselves that this is the way to go because the math becomes beautiful, interesting, and useful when you do. (Weird aside: extremely reasonable people, Scott Aaronson being an example I believe, do try to claim that Turing machines which are exponentially slower than regular ones ought not to be called universal; see this and the resulting fallout: https://blog.wolfram.com/2007/10/24/the-prize-is-won-the-simplest-universal-turing-machine-is-proved/)

Anyway, suppose we encounter another naturally evolved intelligent species which is fixed in its environment, with no sense other than decent but not great hearing, and which gets by singing special songs to the dumb but physically capable animals around it, compelling them to do things and getting them to respond and give up information about what they can sense. As soon as we try communicating by straightforwardly translating words into alien stump song, they begin to pick up English easily, and after some time we gauge them to be about as capable as humans in the domain of language. We likewise wouldn't, I would hope, decide these are inferior and not generally intelligent beings.
I'm not sure the idea of general intelligence can be made sound, but if it can, then I think the above is going to hold. I could see trying to define strong verbal parity so that it's not quite strong enough to do the tasks described above (but still stronger than the Turing test!), but I'd consider such a definition implausible. Maybe, say, the context window is large enough to fit a novel, to engage in hours of conversation, but not long enough to fit a lifespan of interaction. Well, then I'd build something that periodically re-summarizes its life in novel form and uses the last third of the context window for current interactions; that would be a better long-term memory than I have, at least, even if not top of the line. The intuition is that once you master what a human can do in X minutes, the rest is comparatively easy. I think X = 5 has been used before, but here we're of course talking much longer.
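For concreteness, here's a minimal sketch of that re-summarization scheme. Everything in it is hypothetical: `generate()` stands in for whatever LM call you have, and the token budgets are made-up numbers, not anything from the post.

```python
CONTEXT_TOKENS = 8192                      # assumed context window size (hypothetical)
SUMMARY_BUDGET = 2 * CONTEXT_TOKENS // 3   # first 2/3 holds the running "life summary"
LIVE_BUDGET = CONTEXT_TOKENS // 3          # last 1/3 holds the current interaction

def count_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer.
    return len(text.split())

def generate(prompt: str) -> str:
    # Placeholder for an actual LM call; this stub just returns a dummy reply
    # so the plumbing below can be exercised.
    return f"[model output for a {count_tokens(prompt)}-token prompt]"

class RollingMemoryAgent:
    def __init__(self):
        self.summary = ""   # long-term memory, kept in condensed "novel" form
        self.recent = []    # raw turns of the current interaction

    def respond(self, user_message: str) -> str:
        self.recent.append(f"User: {user_message}")
        # If the live buffer overflows its 1/3 share, fold it into the summary.
        if count_tokens("\n".join(self.recent)) > LIVE_BUDGET:
            self.summary = generate(
                "Rewrite the following life summary plus new events as one "
                f"condensed narrative under {SUMMARY_BUDGET} tokens:\n"
                f"{self.summary}\n" + "\n".join(self.recent)
            )
            self.recent = []
        reply = generate(self.summary + "\n" + "\n".join(self.recent) + "\nAssistant:")
        self.recent.append(f"Assistant: {reply}")
        return reply
```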
In the definitions of verbal parity, what does it mean to have "the ability to respond [in some manner]"?
When we generate text with an LM, we don't directly observe abilities-to-respond, we just observe specific responses. So all we can do is gather a finite, if possibly large, sample of these responses. How do we get from that to a judgment about whether the LM "has verbal parity"?
It seems like we could make this call in many different ways, depending on how we construe "ability" here.
Suppose we find that the LM responds like a human the vast majority of the time, but every once in a while, it spits out inhuman gibberish instead. Does that count?
Suppose we find that the LM "gets it right" (correct answers, high-quality prose, etc.) about as often as our reference human, but when it does "get it wrong," its mistakes don't look like human mistakes. Does that count?
Suppose we find that the LM displays as much total factual knowledge as a typical human, but it isn't distributed in a human way. For example, it might spend a lot of its total "fact budget" on a spotty, surface-level knowledge of an extremely large number of domain areas (too many for any human polymath to cover). Does that count?
In my opinion, LMs are making fewer mistakes on average as they scale up, but the mistakes they do make are not growing more humanlike at the same rate. So, as LMs get better, there will be a larger and larger gap between their average/typical output and their worst output, and whether you judge them as humanlike will come down more and more to how (or whether) you care about their worst output, and in what way.
I discuss this in more detail in this post: https://www.lesswrong.com/posts/pv7Qpu8WSge8NRbpB/larger-language-models-may-disappoint-you-or-an-eternally
Other stuff:
- On the Turing Test comparison, note that "passing the Turing Test" and "displaying the ability to pass the Turing Test" are not the same thing (the latter is not even clearly well-defined). A system might pass some instances of the Test and fail others.
- I worked in NLP during the transition period you mention (and still do), and it really was remarkable to watch.
- The Chinese model you link to is a MoE (Mixture of Experts) model, so the parameter count is not directly comparable to GPT (nor to most other modern NNs). MoEs tend to have enormous parameter counts relative to what is possible with dense models at any given time, but they also do far worse at any given parameter count, so overall it's kind of a wash.
If you aren't familiar with MoEs, you can imagine this 173.9T parameter model as (roughly) containing 96000 independent versions of a single small model, each about the size of GPT-2, though much wider and shallower (3 layers). And, when each layer runs, it dynamically decides which version of the next layer will run after it.
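If it helps, here's a toy illustration of top-1 MoE routing, showing why only a small slice of the total parameters touches any given token. This is not the linked model's actual code; the expert count and layer sizes are made up.

```python
import numpy as np

d_model, n_experts, d_ff = 64, 8, 128
rng = np.random.default_rng(0)

# Router: one weight vector per expert, scores each token.
router_w = rng.normal(size=(d_model, n_experts))
# Each expert is an independent feed-forward block with its own parameters.
experts = [
    (rng.normal(size=(d_model, d_ff)), rng.normal(size=(d_ff, d_model)))
    for _ in range(n_experts)
]

def moe_layer(x: np.ndarray) -> np.ndarray:
    """x: (n_tokens, d_model). Each token is routed to exactly one expert (top-1)."""
    scores = x @ router_w                      # (n_tokens, n_experts)
    chosen = scores.argmax(axis=-1)            # which expert handles each token
    out = np.empty_like(x)
    for e, (w1, w2) in enumerate(experts):
        mask = chosen == e
        if mask.any():
            h = np.maximum(x[mask] @ w1, 0.0)  # ReLU feed-forward for this expert
            out[mask] = h @ w2
    return out

tokens = rng.normal(size=(16, d_model))
print(moe_layer(tokens).shape)  # (16, 64); only ~1/8 of the parameters used per token
```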
These are very valid questions. My sense is that a real operationalisation of verbal parity would take a lot more work than I have space or background for, but there is value in partial operationalisation through informal description, especially when our purpose is futurology rather than a real science.
Thanks for the response.
To be more explicit: my immediate reaction when I read the definitions of verbal parity was "this doesn't seem like a coherent concept."
It's not that I understand what you mean informally, and want to operationalize it for practical use.
Rather, the informal definition actually doesn't make sense to me. And I was hoping that, if we tried to operationalize it, that would tease out the ways the concept is incoherent (or help me understand how it actually does make sense).
(I'm happy to continue discussing or to leave things here, just wanted to clarify my previous comment.)
I agree. I think when you really break it down, you're going to run into ambiguity. What do we mean by "as well as"?
Perhaps well enough to fool a human (the Turing option). In that case, human-like errors are going to be very important, or else errors must be minimised far enough that the issue doesn't arise. So the point you raise, that error rates are falling faster than errors are becoming human-like, matters here. Also, how reliably do you have to be able to fool the human?
Maybe we mean well enough to fulfil the economic function of a human (the capitalist option). Here making human-like errors may not be so important, but this criterion is going to be very fragmentary, and vary from job to job.
Maybe we mean as well as a human in the context of specified tasks with quantified scoring (the Metaculus option)? If so, we're going to have to see the specific operationalisation to comment further.
That's what I mean by a partial operationalisation. What I present as a univocal thing on closer inspection turns out to be a matter of definition. However, I do think there is value here in sketching roughly what we mean, even if any specific implementation will be subject to counterexamples, or at least this is what my time in philosophy has taught me. Verbal parity is either not fully coherent or not fully explicit, but the same thing can be said of AGI, and that concept is still useful in some contexts.
This is very interesting and frightening.
The one thing I'm struggling to understand and wish you could discuss in more detail is how verbal parity could escalate to verbal superintelligence.
By that, do you mean the superhuman level of knowledge brought about by having immediate access to virtually everything humanity has written, as well as the ability to draw new connections between disparate areas of knowledge?
Would this alone be enough for the verbally-superior AI to create an artificial general intelligence?
(It *does* seem like it would be enough to automate virtually all humanities scholarship, modulo archaeology!)
The relatively easy bit is understanding how verbal parity could reach verbal superintelligence: just double the speed, and technically that's verbal superintelligence. At any rate, there won't be many humans more intelligent than it at that point.
But I think the real core of your question, the intuition motivating it, is maybe this: would verbal superintelligence necessarily trigger a singularity of ever more rapidly increasing superintelligence, at least straight away, or is it instead possible that moderately superintelligent computers couldn't trigger a singularity?
It's quite possible, for example, that an AGI ten times smarter than a human would be a great contributor to research projects but wouldn't revolutionise anything unless it got lucky. Even if it did, its revolutions might be like those of a Nobel prize winner: significant, but not the stuff of exponentially increasing technological growth.
This is one of the reasons I'm relatively optimistic about the control problem. I think there could be a longish intermediate period of computers smarter than us, but not so much smarter as to be able to overthrow us. This could give us time to get to grips with the problem experimentally. This is accompanied by the thought that maybe the biggest bottleneck isn't intelligence, but experimental resources, and AGI might also be constrained by this. But this is just a private hope of mine. I wouldn't take it too seriously, or interpret it as an argument against urgency!
Sorry if I've misunderstood you; I think it's a great question, and if my interpretation of what you're really getting at is off, let me know.
Excellent post!
--What you call "strong verbal parity" is what I've been thinking of as "AGI." (Just yesterday I was telling someone "I think AGI could arrive before self-driving cars.") However I like your term better because it's more precise & points to a specific set of capabilities that is plausibly sufficient for crazy things to start happening, rather than the more vague/expansive "AGI" that I was using. Also, your term is better because it's new and sciency-sounding instead of that old weird low-status "AGI" stuff that weirdos like Yudkowsky talk about, and unfortunately a lot of people (including smart people) are a lot more biased against weirdness etc. than they'd like to admit.
--Re brain size: It's true that much of the brain is for motor control and boring stuff like that. But probably not 90%, right? Probably at least 50% of the brain is for stuff plausibly important to e.g. being able to do scientific reasoning, have conversations with people, etc.?
I think I'll try out using your terminology and arguments. I feel like the "Master Argument" goes something like:
1. Strong verbal parity is 50% likely by 2030 (or something like that)
2. When we have strong verbal parity, it's 80%+ likely that we'll be able to build APS-AI if we try. (I'm referring to Joe Carlsmith's term here https://docs.google.com/document/d/1smaI1lagHHcrhoi6ohdq3TYIZv0eNWWZMPEy8C8byYg/edit# )
3. Therefore with 40%+ probability we have until 2030ish to solve AI alignment and/or governance problems. If we want to have e.g. more than 80% confidence that our solutions will arrive in time, we have even less time.
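(Quick sanity check on the arithmetic, treating the two estimates as roughly independent, which is an assumption on my part: P(strong verbal parity by 2030) × P(APS-AI buildable | parity) ≈ 0.5 × 0.8 = 0.4, hence the 40%+ in step 3.)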
I don't have my longform thoughts sorted out yet, but I have been making art with gpt-2/gpt-3 for a few years now, and I endorse almost all of this.
I think Gwern would agree with you. IIRC he wrote an essay on why problem-solving AIs were sufficient to cause AI safety problems without being goal-directed agents, but I don’t have a link handy.
Fascinating but won’t pretend to understand a lot of the concepts and reasonings because I’m not a philosopher. I do have a question though - will AI ever achieve the subtle level of intuition or wisdom we can see yet not define in older wiser people for instance or in extreme experts or extreme talent? Also I wondered if you’ve read ‘AI does not hate you’ by Tom Chivers?
Weakest Verbal Parity, the ability to text or tweet as well as the average human, has long been achieved.
Weaker Verbal Parity, the ability to append jokey comments to serious blog posts as well as the average human, is no doubt being brought closer by my very act of writing this ;)