"On some tasks AI is immensely powerful, and on others it fails completely or subtly. And, unless you use AI a lot, you won’t know which is which."
Words to live by. This may be Ethan's best post yet.
That statement is bang on! I believe the subtle mistakes would require more centaurism/cyborgism than the utter failures. And that’s where one might decide to turn off the autopilot.
I agree, that was a key sentence for me.
Ethan rocked this one!
This is the single best AI-measuring test I've read all year, quantifying what so many of us have seen. The impact of AI in leveling up non-experts in a discipline is especially critical in that it can reshape how entire organizations are structured. No longer should organizations be hiring for administrative abilities (as those will be more automated); instead they should focus on hiring for critical thinking, since those entry-level folks can move the ball down the field with the help of generative AI. That in turn will impact middle management. Amazing work, Ethan and team!
Again, Conor, I completely agree with your assessment here! That being said, and I’ll caveat this by saying I haven’t read the paper yet, these results are not entirely unexpected. They largely track with the earlier productivity gains seen in studies at the beginning of '23, which Ethan has frequently discussed. I don’t think the world needs many more of these types of papers. I think it’s pretty clear that generative AI augments the way we work. I’m hoping, though, that we can now start discussing the impact of this transformation on the actual people whose work, and lives, will soon change.
Matt, agree. As big tech shoves AI into our teams and companies, how do we get our teams and orgs ready? I am looking for someone to help us create that workshop!
Gregory, bingo about providing workshops to teams and orgs. I've given generative AI workshops to realtors, loan officers, nonprofits, and K12 teachers. Workshops help everyone get on the same page as to what generative AI is and how it can be harnessed in the workplace. I just started a Substack called AI at Work. You could look me up there if you want to get in touch about workshops. Cheers
I am smelling wishful thinking here. Management at all levels is based on responsibility and ownership, which are not LLMs' strongest suit.
The key takeaway for me is that I need to spend time every day engaged with ChatGPT4 in some way, no matter how small/big the issue is -- there is a LOT of 'learning by doing' to using generative AI.
Absolutely love the concept of the Jagged Frontier. Thanks for sharing these tangible, real-world insights on the ability of genAI to augment and transform the way we think and go about our work.
Outstanding research and convincing if not unexpected findings.
I think many of the hurdles to actively using AI for more are psychological. Even for someone like me, who writes about AI on a regular basis, it's hard to know what task to try and use AI for.
For instance, it was only because you mentioned that your "Jagged Frontier" graph was drawn using the Code Interpreter (it's actually been rebranded as "Advanced Data Analysis" now) that I decided to try it for myself. Simply copy-pasting your short paragraph describing the concept of the Jagged Frontier into ChatGPT + Advanced Data Analysis and asking it to create a diagram that illustrates the concept produced a graph very similar to yours.
Except one of the red tasks (labelled as "Tasks AI cannot do") was placed inside the jagged frontier. Then I pointed this out and got an accurate and helpful visualisation.
But if you hadn't pointed out that your diagram was done by ChatGPT, I wouldn't have immediately thought of asking for something similar. So, as you say, it's about spending more time with AI and feeding it all sorts of tasks until you get a good feel for where it is helpful.
Thanks for sharing!
Super interesting experiment and great write-up!
Q: In the paper when detailing the task outside the frontier it says: "Please find attached interviews from company insiders on this issue. In addition, the attached Excel sheet provides financial data broken down by distribution channels."
Is there anywhere we can find the interviews and the Excel sheet as well?
Also, re: "The consultants who scored the worst when we assessed them at the start of the experiment had the biggest jump in their performance"
- this looks like "regression to the mean", not necessarily a real effect.
One way to cross-check would be to look at *control group only*, and see if splitting it into bottom 50% vs top 50% on the assessment task later shows some "improvement" in the bottom group in the second round of tasks.
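To make the cross-check above concrete, here is a minimal simulation sketch. It is not the paper's data or method: the sample size, the noise level, and the function names are all assumptions chosen for illustration. It models a control group with no treatment at all, where each person has a stable skill and both rounds are noisy measurements of it, then splits on the first-round median as suggested.

```python
import random
import statistics


def simulate_control_group(n=1000, noise_sd=1.0, seed=0):
    """Two noisy measurements of the same stable skill, with no treatment applied.

    All parameters are illustrative assumptions, not values from the study.
    """
    rng = random.Random(seed)
    skill = [rng.gauss(0, 1) for _ in range(n)]
    baseline = [s + rng.gauss(0, noise_sd) for s in skill]
    followup = [s + rng.gauss(0, noise_sd) for s in skill]
    return baseline, followup


def mean_change_by_half(baseline, followup):
    """Split on the baseline median; return mean (followup - baseline) per half."""
    cutoff = statistics.median(baseline)
    bottom = [f - b for b, f in zip(baseline, followup) if b < cutoff]
    top = [f - b for b, f in zip(baseline, followup) if b >= cutoff]
    return statistics.mean(bottom), statistics.mean(top)


baseline, followup = simulate_control_group()
bottom_change, top_change = mean_change_by_half(baseline, followup)
print(f"bottom half mean change: {bottom_change:+.2f}")
print(f"top half mean change:    {top_change:+.2f}")
```

In this toy setup the bottom half "improves" and the top half "declines" even though nothing changed between rounds, which is the pattern the control-group split would reveal if regression to the mean were driving part of the reported jump.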
You guys are already anticipating the reviewers’ questions when it goes to review 😀
Great paper and post indeed, congratulations to all involved.
Similarly, I'd be interested in understanding how the experiment controlled for that regression to the mean effect one tends to find on many synthetic measures of performance.
Thanks Ethan and Team for these helpful stats about AI's #implications and #applications. And thanks for highlighting the 'Jagged Frontier' in AI capabilities. Yes... 'Centaur' and 'Cyborg' to harmonize AI with your expertise. It's not just about AI reshaping work; it's about how we shape AI's role in work.
#AIinBusiness #FutureOfWork
Great work!
Can someone explain "Density" to me?
Thanks
Very nice post.
Perhaps it should have mentioned that in chess the idea of cyborgs and centaurs goes back 25 years: https://en.wikipedia.org/wiki/Advanced_chess
Great work. Confirmation of the MIT study. Substantially raised productivity: time taken decreases by 0.8 SDs, output quality rises by 0.4 SDs
Noy, S. and Zhang, W., 2023. Experimental evidence on the productivity effects of generative AI.
Terrific study and reflects exactly how I've been using it myself lately even if it's just a toe in the water at this point. More Cyborg than Centaur, though I could easily see both.
I've been in software technology for 25 years as a senior BA / product manager and was recently laid off. In putting together statements to summarize the work I've done, and all the correspondence to hiring managers, head hunters and such, I would struggle for way too long trying to get that sentence just right. I don't even hesitate anymore: I go straight to ChatGPT using the simplest "make more concise", and in a blink it pops out a sentence, paragraph or bullet points that are staggeringly good, and that I know I'd never get to if left to my own devices, ever. That becomes the starting point for me to tweak/correct the points and put them in my voice, with a combined result that is perfect, in a fraction of the time. I can't wait to start using it in my next position for the activities I do daily, some of which were outlined in the study.
Could I suggest you ask the AI to adjust Fig 5 so the two y-axes have the same max & min, if they are to be placed side by side? This presentation is misleading at first glance.
Nice, well done! The MIT working paper by Noy/Zhang in March 2023 (https://economics.mit.edu/sites/default/files/inline-files/Noy_Zhang_1.pdf) also reported increased job satisfaction and self-efficacy, which was very intriguing. It seems you haven't probed for these aspects in this study? Any reason for that?
Similarly interested in this
Excellent study - and a call to action for everyone to understand and use AI.
>> If lower-skilled consultants with AI outperform higher-skilled consultants without AI, this might have strong implications for all knowledge workers
What an amazing study. Keep up the stellar work in the field.
Such a great paper--it is packed with insights and seeds that prompt further research and exploration. One caveat is that the experiment, built on a pre-registered setup with selected tasks, lacks real-world chaos and unpredictability, which raises questions about the generalizability of the results.