This is not a particularly complicated post. I just wanted to illustrate how far AI has come since I started this Substack just before the launch of ChatGPT in November 2022. I spend a lot of time in these pages trying to guess at what the future holds, but the future is uncertain. The past is clear. So, let’s see how far we have come in 21 months.
Images and Video
This is the image you got when you prompted “otter on a plane using wifi” into the best image generator (Midjourney) on the day I started this Substack:
This is what you get today, 21 months later (image generated by Flux, an open weights model, and animated with Runway Gen 3):
The time to generate the images is the same. The clip took an additional 75 seconds. And it isn’t just animals. Here is “nursing school leader” then and now (although note that bias issues in image generation remain: I almost always get a female nursing school leader).
Sound
A couple of months after ChatGPT’s release, Google caused a huge stir with MusicLM, which could generate songs from text descriptions. This was jaw-dropping to many. Here is MusicLM’s song for this caption: “This is an r&b/hip-hop music piece. There is a male vocal rapping and a female vocal singing in a rap-like manner. The beat is comprised of a piano playing the chords of the tune with an electronic drum backing. The atmosphere of the piece is playful and energetic. This piece could be used in the soundtrack of a high school drama movie/TV show. It could also be played at birthday parties or beach parties.”
Here is the same prompt put into Suno, eighteen months later (it isn’t the full prompt; I couldn’t fit in the last couple of sentences, starting with “The atmosphere”). This is the very first result.
Eighteen months.
Language Models
I have written a lot about advances in Large Language Models, so I won’t belabor things too much here, except to say that there have been three distinct levels of LLMs since 2022: GPT-3 class, GPT-3.5 class, and GPT-4 class. I name these after the versions given to the models by the first company to reach these levels, OpenAI. When I started this Substack, only GPT-3 was available (it had been for a couple of years), and it was pretty impressive for a start. But soon afterwards GPT-3.5, and then GPT-4, arrived, and they represented huge leaps.
A good illustration is what I call the Lem Test, based on “The Cyberiad” by Stanisław Lem, published in 1965. The book is a satirical science fiction collection about rival robot makers. In one tale, Trurl, a robotic constructor, builds an electronic bard. His rival Klapaucius challenges the machine to write a verse that seems impossible: “Have it compose a poem- a poem about a haircut! But lofty, noble, tragic, timeless, full of love, treachery, retribution, quiet heroism in the face of certain doom! Six lines, cleverly rhymed, and every word beginning with the letter S!”
Lem’s translator, Michael Kandel (who actually came up with the S challenge; the original poem in Polish is different), famously pulls it off:
Seduced, shaggy Samson snored.
She scissored short. Sorely shorn,
Soon shackled slave, Samson sighed,
Silently scheming,
Sightlessly seeking
Some savage, spectacular suicide.
Anonymous internet personality Gwern gave GPT-3 the same challenge, and got results like: “Shearsman swift, sure & sculptor, Scissorman swindler, sophister, Shearsman smart, smirking & satanic, Shearsman sobbing & sleeping in the attic Squire Sprat at Sprink’s barber-shop.” These neither stuck to the rules nor made any sense.
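The mechanical half of the Lem Test (six lines, every word starting with S) is easy to check by machine, even if "lofty, noble, tragic" is not. Here is a minimal sketch in Python; the function name `check_lem_constraints` and its parameters are my own invention for illustration, and it treats any run of letters as a word, so "barber-shop" counts as two words. It does not attempt to judge rhyme or meaning.

```python
import re


def check_lem_constraints(poem: str, letter: str = "s", expected_lines: int = 6) -> dict:
    """Check only the mechanical rules: line count and each word's first letter."""
    # Ignore blank lines so spacing between stanzas does not affect the count.
    lines = [ln for ln in poem.splitlines() if ln.strip()]
    # A "word" here is any run of letters; punctuation, digits, and "&" are skipped.
    words = re.findall(r"[^\W\d_]+", poem)
    offenders = [w for w in words if not w.lower().startswith(letter.lower())]
    line_count_ok = len(lines) == expected_lines
    return {
        "line_count_ok": line_count_ok,
        "offending_words": offenders,
        "passes": line_count_ok and not offenders,
    }


kandel = """Seduced, shaggy Samson snored.
She scissored short. Sorely shorn,
Soon shackled slave, Samson sighed,
Silently scheming,
Sightlessly seeking
Some savage, spectacular suicide."""

# The GPT-3 sample as quoted above; its original line breaks are not given,
# so only the first-letter rule is meaningful for it.
gpt3 = (
    "Shearsman swift, sure & sculptor, Scissorman swindler, sophister, "
    "Shearsman smart, smirking & satanic, Shearsman sobbing & sleeping "
    "in the attic Squire Sprat at Sprink's barber-shop."
)

print(check_lem_constraints(kandel)["passes"])         # True
print(check_lem_constraints(gpt3)["offending_words"])  # ['in', 'the', 'attic', 'at', 'barber']
```

Kandel's version passes both rules, while the GPT-3 sample trips on five words. Passing this check is necessary but nowhere near sufficient: a list of six random S-words would pass too.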
Yet Claude 3.5 actually succeeds (no other model does), even if not quite as cleverly as the human writers:
It is worth noting that, compared to the other areas of AI development, LLMs have seemed a bit stuck at GPT-4 level since 2023. Even though GPT-4 has been exceeded by other models, including GPT-4o and Claude 3.5, there have been no giant leaps in ability since GPT-4. The AI companies have been hinting that this will change in the future, so we should learn more soon.
Adoption… and impact
When I started this blog there were no AI chatbot assistants. Now, all indications are that they are likely the fastest-adopted technology in recent history. A survey of 100,000 knowledge workers in Denmark that concluded in January 2024 found really high adoption rates, as well as high rates of actual use (and 15% of journalists and marketers had paid Plus subscriptions!).
Similarly, research from The Walton Family Foundation finds teachers, parents and students have adopted AI remarkably quickly. Some of this use by students is cheating, of course, a topic I have discussed before, but students, parents and teachers are finding all kinds of other applications as well.
As for the impact on the wider economy, that is impossible to tell so early, for reasons that I discussed a few weeks back. Right now, it is individuals who are benefiting from AI, as systems and organizations are much slower to adapt and change to new technologies. But individuals really are benefiting. For example, I happen to like this account by Nicholas Carlini, who outlines the many ways he uses AI in his work, as well as this story on how AI let Erik Schluntz work despite a broken hand. I also discuss my own uses for AI in my book.
I don’t think anyone is completely certain about where AI is going, but we do know that things have changed very quickly, as the examples in this post have hopefully demonstrated. If this rate of change continues, the world will look very different in another 21 months. The only way to know is to live through it.
That Lem test is sweet. (And kudos for crediting Michael Kandel.)
We would do well to also remember a more cautionary tale in Lem's Cyberiad.
When Trurl invents The Machine that Can Make Anything Starting with N, it passes a few simple tests: making noodles, nymphs, etc. But Klapaucius challenges it to make Nothing. And it does - slowly winking out all the elements of existence - until Klapaucius begs it to stop. The Machine stops, but cannot undo most of this destruction (it can restore only things starting with N). The two inventors look at this new world - shorn of its beautiful plusters and worches - now riddled everywhere with vast nothingness.
With no little shame, the two realize that this is the mostly hollowed out world they are leaving to future generations. "Maybe," Klapaucius groans, "they won't notice."
I'll take a stab at it (without AI help)
Silky strands, swiftly snipped
silently sorrowful
suboptimal slips
Sundered surface,
sallowed skin
spherical savannah--
salon sculpting sin