62 Comments
Dov Jacobson:

o3's response to the mangled riddle is enlightening. The AI glided past the omissions in the telling of the riddle just as it mentally corrects all my typos and grammatical errors. I often wondered where the threshold lies between very lenient listening and misunderstanding. It looks like Riley Goodside has found it.

But well-intentioned misunderstanding is hardly limited to AI. The substitution of our biased expectation of incoming language for its actual intent is a major, perhaps principal, cause of human disagreement.

Of course this is the same mechanism that, when working correctly, (in embeddings, Markov chains, or our glib neurons) forms the engine of communication.
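For the curious, that mechanism can be illustrated with a toy bigram Markov chain in Python (purely illustrative, and nothing like any real model's internals):

```python
import random
from collections import defaultdict

def train_bigrams(text):
    """Record which word tends to follow which: a crude learned 'expectation'."""
    follows = defaultdict(list)
    words = text.split()
    for current, nxt in zip(words, words[1:]):
        follows[current].append(nxt)
    return follows

def predict_next(follows, word):
    """Guess a familiar continuation, whether or not it matches the speaker's intent."""
    options = follows.get(word)
    return random.choice(options) if options else None

model = train_bigrams("the surgeon who is the boy's father says he can operate")
print(predict_next(model, "the"))  # 'surgeon' or "boy's": expectation, not understanding
```

The same statistics that let a listener glide over a typo are what make it guess wrong when the input genuinely deviates.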

Derek Hendy:

Actually, on my first reading of the riddle, I also assumed a typo, and before I read on I wondered if Ethan had missed his own typo (“can” for “can’t”) & that *he* was smugly assuming an error on o3’s part. What I find telling is the fact that, with two shots, o3 figured out its error & revised its answer—even its mistake seems almost human.

Ezra Brand:

Great post, very exciting stuff. One thing that stood out to me about o3 is how naturally snarky it is by default (as can also be seen in one of your screenshots, where it writes: "Yep it really is that simple. The surgeon is the boy's mother."). We've come a long way from the more common bland or obsequious style. Presumably it was tuned that way, but I wonder if that snarky tone inherently correlates with smarter responses (as it often appears to with people). I haven't seen any discussion of that aspect yet.

Richard Howes:

Great read and great comments.

I've come to realise, at least for me, that AGI or not AGI isn't really the point. We've always used tools, and those tools have just gotten more sophisticated—hammers and chisels evolved into electric and pneumatic versions, and eventually into bigger and better machines that do the same jobs more efficiently.

What's really important is how quickly these AI LLMs are improving. That means they're getting more useful at an incredible pace. o3, in particular, with its agentic and tool-use capabilities, is far more useful than the LLMs that came before it. The difference between where we were with ChatGPT 3.5 just two years ago and where we are now with o3 is striking.

So, while it doesn't really matter to me whether it's AGI or not, it's clear that LLMs are becoming an integral part of the toolset humans use to get things done. Eventually, humans will have less direct control and will focus on giving AI general goals.

There's no doubt that AI and LLMs will be fundamental tools in virtually every human endeavour for every person in the not-too-distant future. For many of us, myself included, we're already there.

P.S. I've used an AI tool to dictate these comments. It rewrites what I say, removing filler words and reorganising my thoughts. It lets me spew out a vaguely coherent comment reflecting my general thoughts, and then cleans it up to say what I want to say more elegantly. I still do a little editing, but getting to this point takes about a tenth of the time it normally would with typing and editing for clarity and structure. Game changer!
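For anyone curious what such a cleanup pass might look like, here is a rough sketch using the OpenAI Python SDK; the actual tool I use is different, and the model name and prompt here are placeholders:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

CLEANUP_PROMPT = (
    "Rewrite this dictated text: remove filler words, fix grammar, and "
    "reorganise the thoughts into a coherent comment. Preserve the meaning "
    "and tone; do not add new ideas."
)

def tidy(dictated: str, model: str = "gpt-4o") -> str:
    """One rewrite pass over raw dictation."""
    resp = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": CLEANUP_PROMPT},
            {"role": "user", "content": dictated},
        ],
    )
    return resp.choices[0].message.content

print(tidy("so um basically what I'm trying to say is, like, AGI or not isn't really the point"))
```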

Frank Tentler:

Many thanks to Ethan Mollick for this insightful piece – and especially for coining the term Co-Intelligence, which has become a foundational concept in my own work. The “jaggedness” he describes in the current state of AI development resonates deeply with what I observe in practice: that we are no longer waiting for fully generalized artificial intelligence, but instead embracing Co-Intelligence as a dynamic, situational practice – a form of present intelligence that continuously emerges from the interaction between human awareness and machine capability.

I choose to work with the best currently available AI – not out of technical limitation, but from the conviction that intelligence is a relational, evolving system. This is not about ideal models, but about cultivating a productive relationship with the fragmentary and the temporary. Humans contribute context, ethics, and intuition; AI offers speed, precision, and pattern sensitivity – and from this interplay emerges a new form of agency that already accomplishes tasks once considered the domain of AGI.

Ethan Mollick’s contributions are not only analytically valuable; they also continue to provide fresh impetus for developing the concept of Co-Intelligence further – not as a destination, but as a cultural technique in the making.

Srivatsan Sampath:

Wonderful post. Agree that model progress, including 2.5 and o3, is real. Folks then rightfully focus on the application layer to leverage new capabilities. And yeah, model progress unlocks cool applications – like Cursor really taking off after Claude 3.5 crossed a threshold. Great, we’ve gone from zero to one. But the real unlock for scale comes from management.

The status quo is a growing number of applications that just boost individuals, with no ability to multiply or scale: folks either re-invent the wheel or don’t disclose AI use for fear of being seen as illegitimate, and smart insights stay stuck with a few superstars. I get that Salesforce and others are going to try hand-wringing to see if that increases AI usage and TFP, but without some kind of management innovation focused on this, I don’t see how you get real gains.

It sorta reminds me of the light bulb: tech was amazing, transmission lines got it to places (the 'last mile'), but real mass impact for everyone needed management stuff like the assembly line.

I love specific tactics - like your advice in the book to spend at least 10 hours learning (still spot-on advice), or building internal prompt libraries detailing what worked and what didn’t. But we need something bigger. So, what’s the assembly line / lean manufacturing / agile workflow for AI?

Mike:

3 things…

1. You’re making an amazing contribution, thank you.

2. “‘AGI’ refers to human-level general competence across domains.”

3. “Amid today’s AI boom, it’s disconcerting that we still don’t know how to measure how smart, creative, or empathetic these systems are. Our tests for these traits, never great in the first place, were made for humans, not AI. Plus, our recent paper testing prompting techniques finds that AI test scores can change dramatically based simply on how questions are phrased.”

I did have a chance to both play with and challenge o3 over the weekend and it brings me back to two things I’m working on related to 2 and 3 above…

In short, humans can be measured on their performance at levels of hierarchical reasoning using the MHC (Model of Hierarchical Complexity) derived from Commons et al., in relation to the complexity of a task.

My guess is we have to use a similar measure, and while AGI works as a metaphorical talking point, it doesn’t really say anything… which most of the comments point to ;)

The next thing you note is that, following similar bread crumbs from developmental reasoning (MHC)… until we teach AI to inquire instead of vomiting its ability, its level of vertical complexity (x), along with oblique complexity (x+y), is obfuscated by its preponderance of lateral complexity (y)… albeit at speeds that make Formula 1 look like a horse and buggy.

From decades of work with people… trying to meet them where they are… and noting Chris Argyris in Flawed Advice and the Management Trap… the AI gives advice like throwing spaghetti against a wall!

My point in both 2 and 3 is that we will not arrive at AGI until advice is individualized through the AI working with us to meet us where we are.

Nonetheless, economics will dictate the shift; we are in a 1999-style honeymoon (Prince’s party song) while AI is wasting money in the spaghetti business ;)

Thank you Ethan!

Jurgen Gravestein:

“I do think they can be credibly seen as a form of ‘Jagged AGI’ - superhuman in enough areas to result in real changes to how we work and live.” Is that really an intellectually honest take?

Even by Altman’s watered-down definition of AGI as ‘highly autonomous systems that outperform humans at most economically valuable work’, we have to conclude that, while extremely impressive, the AI we have today simply has not reached this level yet. That’s not a matter of debate, but a statement of fact.

Why is this so hard for people to admit?

Ethan Mollick:

Well, obviously it is up for debate, as I discuss in this post!

And I don’t think Altman’s definition is that watered-down, it seems like a very high bar to cross compared to other definitions. It would be intellectually dishonest to not acknowledge that today’s AI, whether you define it as AGI or Jagged AGI or something else, is good enough to have a serious impact over time even if it doesn’t get any better.

Jurgen Gravestein:

Is that what constitutes AGI? The locomotive had real impact over time.

Dakara:

It is too tempting to see AI as what we want it to be, versus what it is, in my opinion.

Most find it incomprehensible that nothing more than statistical pattern matching can do what these LLMs currently do.

But Anthropic's paper, which I cover here https://www.mindprison.cc/p/no-progress-toward-agi-llm-braindead-unreliable, makes it substantially harder to argue that these machines have any "understanding" of what they do.

Bruce Howard:

The effectiveness of prompting AI to tell it what it is (a specific kind of expert, for example) is fascinating, and made me think of how sports coaches will do the same to help prepare an athlete, even if the casting is somewhat aspirational.

Re: the riddle experiment, would AI do better with a prompt like: "You are an expert in solving riddles, but you should remember to review a question very carefully, and not assume anything tricky is involved"? That made me wonder whether AI is naive. How does it handle efforts at deception (such as conflicting info in a photo that you ask it to geo-locate) or outright lies?
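A minimal sketch of the experiment I have in mind, assuming the OpenAI Python SDK (the model name and the riddle wording are placeholders, not the exact ones from the post):

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

CAREFUL = (
    "You are an expert in solving riddles, but you should remember to review "
    "a question very carefully, and not assume anything tricky is involved."
)

# Placeholder paraphrase of the mangled riddle; see the post for the exact wording.
riddle = (
    "The surgeon, who is the boy's father, says, 'I can operate on this boy!' "
    "Who is the surgeon to the boy?"
)

for system in (None, CAREFUL):
    messages = [{"role": "user", "content": riddle}]
    if system:
        messages.insert(0, {"role": "system", "content": system})
    reply = client.chat.completions.create(model="o3", messages=messages)
    label = "with careful prompt:" if system else "baseline:"
    print(label, reply.choices[0].message.content[:120])
```

If the careful system prompt flips the answer from "mother" to "father", the naivety is shallow; if not, the bias runs deeper than instructions can reach.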

As others note here, part of what is really interesting here is how AI can lead us to reflect on the human condition.

Gnomable:

I made the same mistake as the AI and thought the riddle was a typo. Great article!

Peter Hughes:

... and the simple version of the correct answer is to restate the 'problem' that one is actually solving (which is what Google search does to me all the time, because I write in British English); whereas the best answer is to recognise that one perceives a potential error, and to seek clarification before proceeding: "Did you mean ..., or something different?"

kmunz:

Very informative! Thank you for sharing. You’re a brilliant writer as well, sharing information in a very digestible way.

Dov Jacobson:

No offense, HeyGen, but that video is way down in the Uncanny Valley. Creepy on so many levels: gestures, expressions, speech rhythms, conversational cadence - and the flat anodyne content. Did anyone watch all six minutes?

Lisa Perrine:

It’s hard to watch, but I watched it in two sittings. As humans, we are very sensitive to so many intricate details in faces and speech. All the little details that are imperfect feel like little attacks on our brains.

Pablo B. Markin, Ph.D.:

AGI has most probably already arrived. Given how disruptive that is, the goalposts keep getting pushed towards ever more demanding benchmarks, which somewhat occludes the state of affairs.

GwadaDLT:

This research really changes my perspective on what makes AI genuinely valuable. I've been focused on how these models create new things, but I'm now seeing their potential to transform existing businesses through enhanced collaboration.

What I find most compelling is how AI broke down functional silos at P&G - allowing commercial specialists to consider technical aspects and vice versa. In established businesses with deeply ingrained departmental divisions, this cross-functional perspective could be revolutionary.

I'm curious - in your experience, have you seen examples where existing businesses successfully used AI to bridge expertise gaps rather than just automating processes? The "cybernetic teammate" concept suggests a more collaborative relationship that could help established companies evolve without complete reinvention. In particular, solo entrepreneurs can use AI as a business partner.

Herrsosa:

Thank you for the great article!

Joshua Yearsley:

So I also had to reread the riddle a few times before realizing it was a trick. No further comment. :D

Daniel Stanley:

You're being too diplomatic. 'Jagged AGI' is an oxymoron: the most basic definition of 'general' clearly fails if the AI still can't do simple tasks that most humans can. Following your argument through, the real characteristics of GenAI today make it ever clearer that AGI is an unhelpful concept and a distraction from proper analysis: a marketing device by the AI companies that obscures much more than it illuminates.

Ezra Brand:

Two distinct terminology critiques are being conflated here:

"Jagged AI" (term coined by OP) refers to how current models perform well in some areas and poorly in others.

-"AGI" refers to human-level general competence across domains.

Whether the latter (AGI) is a meaningful concept has nothing to do with whether the former (jagged AI) is a useful descriptor

Daniel Stanley:

I'm referring to 'Jagged AGI', which he specifically coins in this article. I have no problem at all with the idea of 'jagged' as a way of describing AI in general; it's clearly accurate. My issue is that if it's jagged to the extent that it fails at basic tasks a human can do, it's in no sense 'AGI'. Therefore 'Jagged AGI' is an oxymoron.

Derek Lomas:

Humans are pretty jagged in their intelligence, too. I’m a smart guy, but according to my wife, I’m not performing well at basic tasks that a human can do 😅

Daniel Stanley:

Sure, but then you're not claiming to be a superintelligence (I hope).

Ezra Brand:

Ah, I see what you mean. You're right, OP explicitly uses the term "Jagged AGI" a few times (including in the title!), I missed that. Fair

Muldi:

OK, in order to avoid the oxymoron, let’s call it “Part-time AGI” or “Schrödinger’s AGI” (Is it there or not? Depends on the prompt…) 😉

Daniel Stanley:

partially general, sometimes
