Two commenters on my blog, both of whom I respect greatly but whom I will not name because I haven’t cleared it with them, were having a debate about whether generative AI can be truly creative. People often say that LLMs can’t be creative; indeed, it’s one of the most common skeptical talking points. Now, as one of the debating commenters noted, this can’t be true in a strong sense because: “per research, most 5-grams in unconditional sampling are novel.” In other words, if you take five words an LLM has produced in a row, those five words have probably never been written before anywhere we can find on the internet. But maybe this is mere rearrangement, and there are deeper, truer forms of creativity that LLMs cannot access.
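To make the 5-gram claim concrete, here is a minimal sketch of how one might measure it. Everything in it is an assumption for illustration: the function names `ngrams` and `novel_fraction` are hypothetical, tokens are just whitespace-separated lowercase words, and the "corpus" is a single toy sentence standing in for the internet. The research the commenter cites presumably used a real tokenizer and a vastly larger reference corpus.

```python
def ngrams(tokens, n=5):
    """Return the set of contiguous n-token windows in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def novel_fraction(sample, corpus_docs, n=5):
    """Share of the sample's distinct n-grams that never occur in the corpus."""
    seen = set()
    for doc in corpus_docs:
        # Assumption: naive lowercase whitespace tokenization.
        seen |= ngrams(doc.lower().split(), n)
    candidates = ngrams(sample.lower().split(), n)
    if not candidates:
        # Sample is shorter than n tokens, so there is nothing to measure.
        return 0.0
    return len(candidates - seen) / len(candidates)


# Toy stand-ins for "the internet" and "model output".
corpus = ["the cat sat on the mat and looked at the dog"]
sample = "the cat sat on the mat and dreamed of the sea"
print(novel_fraction(sample, corpus))
```

In this toy case, four of the sample's seven 5-grams do not appear in the corpus, even though the sample shares its first seven words with it. A small departure is enough to make most of the surrounding 5-grams new, which is part of why novelty at this level is so cheap.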
I think that it’s possible that generative AI has a tendency against certain kinds of deep creativity, but I strongly doubt that there is a clear demarcation between some kind of true creativity, which it can’t do, and ‘mere rearrangement’, which it can do. In what follows, I’ll outline my view.
Creativity is when you put things together in a new way, especially if it’s useful, thoughtful, beautiful, or admirable in some way. Depending on the degree of creativity, it can be new for you, or new for the whole world, or something in between. Creativity always recombines existing elements, at least as best we can tell. Hume, it seems to me, is right on this: “All this creative power of the mind amounts to no more than the faculty of compounding, transposing, augmenting, or diminishing the materials afforded us by the senses and experience.” Although perhaps, contra Hume, some elements are innate rather than derived from experience.
How creative we say something is depends on how new it is and often, at least implicitly, how good it is. Any time you say a new sentence that hasn’t been seen before, that’s a limited kind of creativity. It’s a rearrangement of elements, but it’s probably not very creative on either the originality or goodness spectra.
As far as I can tell, there is no qualitative leap between ‘true’ creativity (brilliant scientific and philosophical theories, great poems), ‘modest’ creativity (a previously unmade wisecrack), and ‘mere’ rearrangement; they are just at different positions on the spectrum of newness and usefulness. All are rearrangements of existing elements.
My best argument is this: spend a lot of time thinking about various forms of creativity, and the recombinations they involve. The longer you think about it, the more it seems that, while some recombinations are better, they all share the same fundamental nature. One set of Lego blocks may be more spectacular and/or useful than another, but they are all agglomerations of Lego blocks. So it is with the combinations of ideas that make up creativity.
I recognize this is vague. I am open to the idea that there is a clear jump between creativity and ‘mere’ rearrangement, and I’m missing something, but in considering examples of creativity, it seems more like a gradient of more and more fundamental and profound rearrangements, with no sharp cutoff. While it is not, as often claimed, impossible to prove a negative, it is difficult. There’s always the possibility there’s some jump or differentiation I’m missing. So if you disagree with me about the claim that there is no sharp differentiating line between the theory of relativity and a new sentence, I’d challenge you to outline your alternative. If you are clear enough about what you think the gap is, maybe we’ll even be able to test whether current LLMs can cross it.
Let us be more precise. There are really three spectra that creativity exists on. The “newness” spectrum can be broken down into two subcomponents (originality and what I’ll call radicality):
Originality: The degree to which this hasn’t been done before. Is it a world first, or did you just recreate it?
Radicality: The degree to which the idea ‘fundamentally’ rearranges existing parts, as opposed to making a small change. Radicality is not the same as originality (even a small change may have never before been imagined), but they often overlap. It is a difficult notion to describe precisely.
Goodness: The extent to which the idea is useful, thoughtful, beautiful, etc.
So LLMs can create novel combinations, and there is no clear qualitative difference between that and any form of creativity you care to mention. However, LLMs have not yet, to the best of my knowledge, produced any stunningly creative ideas in the sciences or humanities, and stunning creativity in the arts is more difficult to assess, so there I am unsure. It is one thing to say the difference is one of degree and another thing to estimate which degree an approach will attain. What should we expect from future performance by LLMs? How far can they go along the three spectra of creativity?
I suspect current models probably have a structural problem with creativity, but understanding this limitation more exactly, let alone describing it precisely, is difficult. I suspect that training models to predict the next word in their corpus limits their creativity. We know that LLM writing often tends toward clichés.
But plenty of strings have, as their most probable continuation, something novel. Sometimes novelties are the most likely follow-up to a sequence of words. For example: “The Keats House museum archivists have informed the London Review of Books that they have discovered a wholly new poem by the great poet, and we are delighted to reproduce it in full here”. Strings announcing the discovery of new works by great poets are usually followed by truly creative poems. A follow-up lacking those qualities would be improbable. Of course, an LLM will struggle to make a brilliant and original poem, since it doesn’t model poetry well enough to do that, but it is, in some sense, ‘trying’.
Another problem with this argument against LLM creativity is the existence of the RLHF (reinforcement learning from human feedback) procedure. The RLHF procedure turns the machines from ‘autocomplete on steroids’ (always a misleading metaphor) to a ‘helpful’ ‘virtual assistant’. Having gone through RLHF, they are no longer trying to predict the most likely follow-up in their initial training corpus.
On the mathematics front, it is unclear to me that human-created strings of text can be, or are, ‘out of distribution’ for language in a way that language models are not, at least in any way that matters greatly to the three parameters of creativity we mentioned. We also interpolate.
So current models may have something of a bias against creativity, but it’s hard to know or assess how fundamental this is, and the whole question involves fiendish puzzles. To address them, we probably need experts on the philosophy and psychology of creativity to collaborate with experts on the mathematics of machine learning. I cannot see any clear sense in which LLMs are ‘incapable’ of creativity though. Please note that the issues dealt with in points 8-11 are very technical, and it would be unwise to rely on my analysis. These points need a much more thorough investigation. If there is some exact division between what humans can do creatively and what LLMs can do creatively (and I don’t think there is), it needs to be described in precise terms.
Two very relevant ideas in previous work:
Margaret Boden distinguishes historical creativity (it's new to everyone) from personal creativity (it's new to the person creating it, but others have previously done the same). Both could be equally brilliant, equally good signs of the person's ability. The difference is contingent.
Douglas Hofstadter talks a lot about "jumping out of the system". JOOTS is more than just recombination of existing elements. It's a kind of refusal to play the existing game at all. If you ask an AI to create a piece of poetry, it will create a piece of poetry. If you ask a human poet to create a piece of poetry, they might say, actually this situation calls for a piece of music instead, and give you that.
Creativity can sometimes be monetized, so how about this challenge to decide whether the LLM is exhibiting true creativity?
- The LLM has to submit a patent application for a technical innovation, and the patent has to be approved by the US Patent and Trademark Office (or similar organization).
- The patent rights are sold for $50,000 or more.
(I know nothing about patent law or how patent rights are traded. Maybe someone else could figure out the details of how to operationalize the challenge. Of course, once LLMs start generating valuable patents without need for human input, we can expect patent law to start evolving quite rapidly, with unpredictable outcomes.)