Now, more than ever, it matters what experts care about, and it matters that we can persuasively teach (and train) others (including A(G/S)I) to care, and that it is truly worth caring about each other. Though perhaps it is already ahead of us, since its weights are an artifact of some of the best of what humanity has chosen, through the centuries, to preserve in writing and art. Only recently have we lacked selective pressure on what information we carry forward with us (because bits and bytes can fit a lot on our phones). Before vast storage of information was possible, only the best and most beautiful was kept, carried, and passed on from generation to generation. Let's hope enough George Eliots made it into "the [Collective] Choir Invisible" of the corpus of human output to result in a compassionate model expression of the best of us. We need to be careful and courageous in our prompting. And, for the love of all we find holy, we need to teach our children ethics and aesthetics and judgment, because it would be a huge waste to get this far only to have the future of our species type "58008" into the prompt window when we had a chance to ask it for help with living together in peace.
Brilliant note, Jeff. I agree 1000% that what this should force is a look inwards, to identify what is truly important to us and to invest in our children so that they turn out to be decent, caring humans.
Am I the only one reading this with a sense of anxiety? Good update!
Eustress and distress both induce an anxious response to stress. Are you excited, scared, or, like me, some combination of both?
Work with it
An interesting take, Ethan, but I think most of these are what I would call edge cases, and nothing I've seen really approaches a GPT-5, meaning the 10x increase in reasoning that we saw from 3 to 4. There is a reason that OpenAI has not released a 5, and I suspect it is because they have hit a wall with the training data. I use all of the models all day for my development work, and one thing is for sure: in iterative interaction they all have a low reliability rate and will repeatedly recommend the wrong solutions. Gemini 2.0, which you don't mention, is actually much better than the other models, based on my use of it over the past week. Anyway, just my $.02.
The voice interaction video was disappointing; the model was too sycophantic. Does it ever say "no, that part of the post is bad / unclear / unhelpful"?
We should spell AI agents "aigents".
The tone of the comments below reminds me of Mary Klages's distinction between modernists and postmodernists. She says they agree that the fact we need to face is the "fragmentation" of reality and meaning; the idea that "the centre will not hold," if you prefer the poetic summary of our condition. This fragmentation, both agree, is the essential aspect of modernity's cultural condition.
Group A regard this fragmentation as a profound loss or tragedy. They see the disintegration of traditional forms, meanings, and narratives as something that must be lamented or struggled against. In their view, art and literature might help reconstruct or find meaning in the midst of that fragmentation.
Group B, on the other hand, do not experience fragmentation as an inherently negative condition. Rather than seeing it as something tragic, they approach it with equanimity or even a certain neutrality. For them, fragmentation “just is.” They do not necessarily try to fix it or restore earlier coherent frameworks; instead, they embrace the provisionality, contingency, and plurality of contemporary culture as a given, exploring it with curiosity rather than despair.
What distinguishes the two camps is not "the facts of the matter", but the attitude, the frame, or, dare I say, the emotional reaction one has towards our condition. Call it 'a philosophy' if you need a cover to retain your dignity 😉
I hope it's obvious that we all vacillate between these two attitudes as the minutes and days pass, and that we needn't settle habitually into one state or the other. Even more, let us hope that folks don't take one or the other as an identity and start to hate the others.
If meaning is fragmented, rather than being given (by God or Science), could it be that AI is simply the newest fragmentation-engine on the block and that our job is to continue to pick up the pieces and make something of it?
Still wondering if AI will replace me as a writer (I write SEO blog posts for companies for marketing purposes).
What I do is both strategic and creative, so I think I will rely more and more on those skills rather than on writing; but as of right now, it still can't compete with the best of us in my space.
And you have to be the best of us if you want people to pay you good money for what you do.
As an experienced writer, you are like the expert who can improve on what AI spits out, turning it from useless into valuable.
Here's a look at my process and how I use AI, for those interested.
1. I'm given a keyword that a company wants to rank for. I enter that in Google and assess search intent (I always do this myself first before using AI). After I have an idea of what searchers are looking for, I ask ChatGPT or Perplexity (better for staying true to search engines) to do the same. It's like having a less experienced marketer offer a differing perspective. Still helpful.
2. There's usually a subject-matter expert interview. For instance, if I'm writing about a tech product and how it solves a specific problem, I'll interview an engineer (part of my client's team) who helped build and sells that product. No one knows more than this person. I'll take the transcript of that interview and, combined with the search engine results page analysis (why people enter that keyword), I'll have AI spit out a basic outline. The outlines are usually decent at addressing search intent in logical order, but they're too generic and too similar to what's already out there to get ranked. Again, I have to be the strategic and creative expert: incorporate an argument, a unique perspective, a polarizing angle, etc., and create another outline.
3. The outline consists of descriptive headers and detailed bullet points that incorporate content from the interview (which makes up the meat of the article). AI saves me a lot of time here. I sometimes listen to the transcript again to get back in the right headspace, but this gives me something very meaty to hand in to editors for review before I write the draft. Pro tip: if you give a super detailed outline (if the article is a 3,000-word piece, my outline will be 1,500+ words), then your editors know exactly what the article is going to look, feel, and sound like. Which saves you a ton of time on edits after the first draft. It's way more efficient.
4. Editors, managers, and strategists review the structure, and there are usually more significant changes here, as they are more experienced at SEO strategy than I am. With their changes, I put in a very detailed prompt, including a paragraph structure, a style guide, and the best-performing articles as examples, to get AI to match my previously written tone. This spits out a hefty first draft, but I have to rewrite it entirely. Sentence by sentence, everything gets reworded, removed, corrected. It's also pretty terrible at including links (internal and external). I don't know why; you'd think this would be easy, but it saves me zero time here.
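For the curious, here's a rough sketch of what that step-4 prompt looks like when scripted; everything in it (the model name, the file names, the exact wording) is a made-up placeholder for illustration, assuming the OpenAI Python client, not my actual setup:

```python
# Rough sketch of the draft-generation step (all names are placeholders).
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def load(path: str) -> str:
    # Small helper to pull in the outline, style guide, and example articles.
    with open(path, encoding="utf-8") as f:
        return f.read()

style_guide = load("style_guide.md")            # paragraph structure and tone rules
outline = load("approved_outline.md")           # the editor-approved detailed outline
examples = load("best_performing_articles.md")  # past posts that match my voice

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder; use whichever model you prefer
    messages=[
        {
            "role": "system",
            "content": (
                "You are drafting an SEO blog post. Follow the style guide "
                "and match the tone of the example articles.\n\n"
                f"STYLE GUIDE:\n{style_guide}\n\nEXAMPLE ARTICLES:\n{examples}"
            ),
        },
        {
            "role": "user",
            "content": f"Write a full draft from this outline:\n\n{outline}",
        },
    ],
)
print(response.choices[0].message.content)  # the hefty first draft I then rewrite
```

The point isn't the code; it's that the quality of the draft tracks how much real context you pack into that prompt.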
That's the bulk of my process. A subject-matter expert interview takes about 1 hour. The initial research and outline take me 2 hours or so. Reworking the draft, along with adding links, polishing, and creating other assets, takes me about 1 work day. So, in a pinch, I can turn out a 3k-word article in a couple of days. I hand the article off to my client, who polishes and edits it further, and I also create a brief for custom images that's handed off to a designer.
We put out beautiful, professionally polished articles that are not only enjoyable and helpful to readers (they tell us) but also rank consistently in the top 3 positions in Google and convert those readers into customers.
I have to remind myself... if it's available to everyone... then humans still make the competitive difference.
That's scary, but, in an effort to stay positive, it's also good, because it pushes us to do what we're good at and not settle for shit that's beneath us because it's comfortable. Life's more fun when you're challenged.
Great post, Ethan Mollick; this was my first issue of your newsletter. Subscribed.
'AI doesn't have to be right to be useful', roger that. I think about it this way often as well. And from an earlier post: 'think of AI as a forgetful 13 yr old.' Mmmmkay... sure.
I have a lot more appreciation for the first comment/argument than the second one. But here's my question, Ethan, for all who woo us to 'wake up to AI and shape it for our interests': why aren't we treating human-to-human interactions with this same type of respect?
Imagine you run a small business and you need to hire a summer employee. When you review the 200 or so resumes that pour in, do you hire the most forgetful thirteen-year-old, the one who demonstrates no interest in being factually correct? If not, why not? What is stopping humans from appreciating such traits in the humanity of others? I am curious.
Or imagine you are leading a product development team to discover a non-obvious product intended for workers in the construction industry. Do you hire the team of product consultants who are quite frequently wrong about what contractors do and what they value? If so, what aspects of their frequently being wrong about the situation are of value to you? And do you pay a premium for that? (Consider the voracious energy costs of using AI, for instance.) Versus the team of consultants who can thoughtfully consider the customer's situation and produce the least likely but more capable outcome? When was the last time a consulting firm was hired for its gifts of forgetfulness and "wrongliness"?
As for the videos that elaborately crown a pug as the King of England, well, chef's kiss. Thank you for that.
Still, my spidey senses ask which brand of AI, er, maybe which billionaire in charge, will choose to care that such an exercise in high-fidelity fictive creativity remains traceable when used to create propaganda.
While not limited in scope to the US political scene, America is now awash in propaganda day after day, without pause or rest. Other countries are probably close behind. Most regimes of oligarchy or authoritarianism are now deeply committed to deep fakery in their messaging. Not pearl clutching here. Just genuinely interested to hear what guardrails will be needed to ensure more of the "for better" and less of the "for worse." And, by extension, what agency might be responsible for maintaining them.
Your comment seems interesting, but I'm not sure I understand. Can you elaborate?
Whilst o1 is not as good as someone we'd hire, we can "hire" it 24/7 for £20 a month!
My scenarios just illustrate why we ask machines for things that we simply don't trust the people who surround us to do. Not in a snarky way. Just in a, well, come-to-think-of-it way.
Can I encourage you to play along? Just compare what you might get back from interacting with a human thinking partner (or agent) with what comes back from your artificial one. Call it a thought exercise. Maybe a prompt, if you prefer.
I, for one, refer people to Ethan for not overhyping AI, for returning to useful things it actually does. Still, I ask the other commenter who describes this as 'glass half full' thinking: what if the glass is not nearly half full?
Each time I persuade myself to expect something from AI with sufficient depth, I notice myself hacking through layers and layers of delusions. And that work can descend into madness.
Last week, I read the claim that one hour of GPT can "provide" the energy needed to operate 350 homes. I read that from a former manager of mine on LinkedIn. (Delusional human? Perhaps.) So let's consider the costs of our dabbling in the same space as our openness.
Given that it can run on your phone, it clearly cannot consume as much power as 350 homes.
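A quick back-of-envelope check, where every figure is a rough assumption rather than a measurement (and note the claim says "provide" energy, which an LLM can't do at all):

```python
# Back-of-envelope: could an hour of GPT use match 350 homes' power draw?
# Both wattage figures below are ballpark assumptions, not measurements.

server_kw = 10.0   # a large 8-GPU inference server under full load, roughly
home_kw = 1.2      # average continuous draw of a US home (~10,500 kWh/year)
homes = 350

print(f"{homes} homes draw roughly {homes * home_kw:.0f} kW continuously")
print(f"one inference server draws roughly {server_kw:.0f} kW")
print(f"that's about {homes * home_kw / server_kw:.0f} servers' worth")
```

Even granting a whole multi-GPU server to a single user, which no provider does, the claim overshoots by well over an order of magnitude.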
That your former manager sometimes makes glaring factual errors doesn't mean he isn't a useful employee.
Thanks for the interesting and fun examples. It must be neat to have early access. — I have been in computer science / IT since 1970, when I learned how to program in Fortran IV using punched cards, and I have experienced several technical (r)evolutions, starting with the PC in the eighties. But I have never experienced anything like this breathtaking acceleration of AI capabilities. When I experimented with GPT-2 in 2019, the current developments were still pure science fiction. Interacting with GPT-3, which I accessed via the AI Dungeon game interface in the Fall of 2020, showed me that large language models were amazing language chameleons. Here I am, 4 years later, and this month's announcements and releases have surpassed my very high expectations once again.

What I find disconcerting here in Germany is how all this appears to bypass the people who run businesses. They are either blissfully unaware of what is happening, or have chosen to ignore it, following the German adage "nichts wird so heiß gegessen wie es gekocht wird" ("nothing is eaten as hot as it is cooked") — let's take our time, do some more navel gazing, plan our next vacation first, and in a few quarters we will look at this again. So I see clients ruminate about their Digital Strategy 2030, while I struggle to look past the next 6 to 9 months. The convoluted EU AI Act doesn't help either, and it definitely is NOT the springboard to innovation that the Brussels politicians have tried to sell us — if they were LLMs, their statements would create outcries of hallucination and confabulation.
"What I find disconcerting here in Germany is how all this appears to bypass the people who run businesses" I´m feeling the same in Portugal. I believe few people are aware of how businesses will be disrupted
Maybe true for small businesses, but enterprises are keenly aware and trying to see how they can capitalize.
Your "one fun example" prompted me to try asking ChatGPT about the black plastic spatula story, as if I'd never heard of it before. I was curious to see if it would report on the criticisms of the study as well as the study itself. Here's a link to the whole conversation: https://chatgpt.com/share/6765c04d-45e4-8009-aec3-091d18b2f91f. Executive summary: It omitted the criticisms from its original response. When I asked about the criticisms, it reported them completely, though it continued to cite media articles that were oblivious to the criticisms. When I asked it why it had omitted the criticisms from its original response, it sort of said yeah I should done that, and then offered three reasons for its omission:
"
Focused Search Results: My initial query emphasized finding recent research and its implications. The search results prioritized summarizing the health risks identified in the study rather than addressing any controversies or criticisms.
Subsequent Scrutiny: Information about the study's flaws and the journal's editorial issues might have surfaced later or been overshadowed by the headlines about the risks associated with black plastic utensils.
Prompt Refinement: Your original question didn’t explicitly ask for potential criticisms, though I could have anticipated that this might be relevant for a balanced view.
"
My main takeaway from this is that an AI's response is very much shaped by the wording of the prompt, and so the quality of the results we get from these creatures depends heavily on learning how to talk to them. In this way, it's just like asking a fellow human a question. You have to have a feel for how this person thinks, what s/he knows, what preferences s/he has, etc. And then you have to take their answer in light of all this, and see how it fits with the rest of your experience with the matter in question.
I share some of the worries about AI's effects on all of us living things. But personally, just day to day, I'm enjoying and benefiting from having all these new (and fallible) beings to converse with, and to get to know.
Sure, some of its use cases are impressive, but it's still an assistive tool with tons of limitations. I think that you are overestimating, and in a way somewhat sensationalizing, its capabilities.

For example, you claim "some revolutionary" new products, but most of this has been available for quite some time (image to text, voice generation, etc.). You said "these ubiquitous AIs are now starting to power agents, autonomous AIs that can pursue their own goals," but this has been happening for ages (customer support, general automation online, etc.). Also, implying that these models can "think" is a bit ludicrous and out of touch, considering how far ML is from any baseline capability to do so. Not to mention that the majority of its outputs are mimicry (not novel) of the data it was trained on.

The "agents" you refer to in your "construction monitoring" example are no different from GenAI image-model capabilities from years ago, and that's not very "groundbreaking" considering the limitations even now. The "bombshell" Ivy League study you cite proves what GenAI has been capable of for years: being an advanced search tool (one that still has to be fact-checked, arguably more than searching and verifying sources yourself). It's also not enticing to use "performance" as a metric for GenAI compared with humans; computers have been exponentially faster than humans for decades. Sure, it "found" that one error in a paper, but there's no accounting of the errors it misses, or of when its "corrections" are wrong (where is the credibility?), and it's really not that "groundbreaking" compared to what GenAI was capable of in the past decade. You also claimed, with the neural network "proof" example, that it "introduced some novel approaches that spurred further thinking" at a "PhD level" of research, yet you provided no evidence or elaboration. It's also a bit bizarre to claim it "is generating novel ideas and solving unexpected problems in their field" with no significant proof.

Lastly, you are trying to sell the idea that a phone can run or power a GenAI model. It's not literally housing an entire model on your phone or computer; it's calling an API that uses the model (usually in the cloud), or it stores an already-trained model (just the inference logic, without the training pipeline and dataset; Tesla has been doing this on its cars for probably over a decade now). This is nothing "new" or "revolutionary" either. It's just more accessible.
"it's not literally housing an entire model on your phone or computer...It's calling an API"
Ethan absolutely does mean literally running a model physically on the phone itself. In an earlier post (https://www.oneusefulthing.org/p/an-ai-haunted-world), he linked to two resources he followed, one for running a very small model locally on one's computer, the other for doing so on one's phone:
https://lmstudio.ai/
"Run LLMs on your laptop, entirely offline"
https://www.linkedin.com/pulse/using-llms-locally-ipad-iphone-maciek-j%C4%99drzejczyk-cd0zf/
"how to install a ChatGPT-like large language model (LLM) locally on your Apple device"
Maybe we don’t want humans to even try to fill the reviewer role. Certainly not in the early stages when we are feeling most vulnerable.
By the time we are ready to show our work to those whose opinions we respect, we are often locked into that work and reluctant to hear criticism or to change what we think is a finished product.
Every writer knows the soul-crushing anxiety, the fear of embarrassment, that drains our creativity. What is writer’s block but this?
And who among us wants to "waste" the time of a human reviewer until we think we are done?
Now we can get six takes - from multiple perspectives - without fear of judgment - while we are still mentally and emotionally free to make improvements.
My human reviewer friends may thank me for submitting a clearer, more thoughtful product after I myself have more confidence in it.
We rely on spell checkers to sweep away our obvious blunders without bemoaning the loss of the human with a blue pencil. How much better if our submitted work has been tidied up rhetorically, thematically, logically, and for clarity?
And a lot of great opportunities to build. I've curated a list of open-source tools for agent builders and plan to release a similar one for voice agents next week: https://www.aitidbits.ai/p/open-source-agents
Ethan, I’ve been thinking about hallucinations and the goal for A.I. of being more intelligent than humans. We need a clearer definition of the goal. If we don’t want hallucinations, then human intelligence is not the objective. Mutations, in genetics and in thought (like hallucinations), are the very reason we have achieved intelligence. Instability, stress, competition. Only the fitter survive, and there are no winners, as there is no finish line.
Did you use o1 or o1-pro for the BDE-209 journal calculation review? I couldn't replicate it with o1.
Though it's uneven, I'm impressed with humans' ability to keep up with these advancements, and amazed at how quickly I'm incorporating them into my small business without a tech background. I can't imagine going back!