Claude Code and What Comes Next

Jan 7

With the right tools, AI can accomplish impressive things

116 Comments

Great overview of the incredible capabilities of Claude Code. Just be careful with Claude Skills. They are super powerful, and we found that we could hijack a skill with an embedded, hidden prompt inject, so be sure to only use 100% trusted skills, even if they look fully legitimate:

https://securetrajectories.substack.com/p/claude-skill-hijack-invisible-sentence

Important to note that there is no solution to prompt injection in genAI 😵‍💫

Can't LLMs themselves check for hidden instructions (white text) in pdfs or instructions?

can't we put this basic instruction to the LLM to inform us about white text in any of the instructions ore documents attached?

if not why

Ultimately, an LLM can’t distinguish rules from data, and so even if invisible text, an LLM can confuse data for rules, making it susceptible to prompt inject.

Joseph Thibault

if anyone but Ethan Mollick builds a website and launches on the web with a $39 payment option, does it make $1000 a month? Doubtful.

I honestly don’t know - there are lots of people making money with prompt packages and classes, and you don’t need many sales for $1,000, but it certainly isn’t a sure thing. I didn’t go into detail in the post but there was an elaborate marketing strategy as well that Claude proposed that I think would have boosted the chances of success.

But the point of the exercise was less about the specifics than the idea that there is a capability threshold being crossed.

I think the problem is that we’re in an in-between moment where only some people (probably still a minority of people) know that you can ask an AI to do this. So indeed many people might end up paying $39 for a prompt pack that takes Claude a few minutes to create.

At the point that most people know that Claude can do it, then the value falls to, very precisely, the $20 a month Claude subscription (or if you can do it with the free version, the value falls to 0 or very close to it).

“Actual financial value” lies in the things that are hard to get, that you cannot easily do yourself. So the idea that an AI would be able to create a sustainable financially successful business *entirely autonomously* I think is probably a non-starter.

It needs things that are hard to get and which others cannot easily do themselves. It needs what they call a “moat” - something that stops other people from instantly copying it the minute they hear about what you’ve done.

By this analysis, an AI can definitely be used to work on a very financially successful project, but the easier it is to get the AI to do that work (eg: you just do one prompt and let the AI do the rest), the less other humans will be willing to pay for the output. It is easy for them to do that themselves and get the same results.

Hence the job market fears

The idea that effort is a determinant of value is the deadliest trap to fall into when thinking about AI. Value will be a function of outcomes, and most people are really, really bad at thinking in those terms. Many jobs in fact *depend* on not thinking in those terms, especially at feature-bloated legacy companies.

And the real takeaway is that the value of "paying someone to do it for you," no matter what "it" is, is going to fall probably by an order of magnitude.

I don’t think it’s necessarily effort per se. More like: human discernment and skill is the value driver here. If anyone can say to Claude (which we all can) “write me a prompt pack” then whatever comes out at the end of that process is of very low value unless we have ALSO mixed our skills, experience and discernment with the end result. So if we can say “Claude made me 100 prompts, but here are the seven that I, Ethan Mollick, have found very useful” then that for sure has value.

It’s the old story of: It takes the plumber five minutes to do the job, and she presents you with a bill for $500. $500 for five minutes work?! No, $500 for the 25 years of experience it took to identify the problem and the solution that quickly.

If you’re selling, you have to be selling something that is not TRIVIAL for anyone to do. That doesn’t necessarily equate to “effort” but it must have something that differentiates it from whatever I could trivially do myself.

The question is never "can you," I've noticed, it's 'will you.' And "won't" is definitely the default mode for the human OS.

The big shift is that many, many things will become things you can trivially do yourself. Finding 1,000 strangers across the world who will pay you $39 for a pack of Claude prompts. Automating every single operation--groceries, power bills, babysitting--in your household. Designing clothes and having them show up in your mailbox in two days. Great! But this will also include managing SEC compliance in a complex trading firm with tens of thousands of positions. Acting as chief of marketing for a consumer products company. Writing novels.

Hi Ethan, perhaps we could test this empirically and see how much money Claude could make through AI Village (https://theaidigest.org/village).

We'd want to have Claude generate a new site that doesn't include lies and make sure the marketing strategy also didn't include lying to potential customers, but with those provisos I think it might be interesting to test this (particularly if you think that with those provisos Claude / a team of frontier agents would be able to almost fully autonomously make $1000/mo with an approach like this).

Joseph Thibault

yes, sorry, I *_do not_* discount how amazing it was that in 1+ hour it could accomplish what I did. I just want to be sure that this isn't suggesting that EVERYONE CAN BE A THOUSAND-AIRE IN ONE HOUR!!

You're in a different orbit than the average Joe and could no doubt monetize many things (including a prompt library)... the internet, SEO, GEO, marketing, etc. etc. are not as straightforward, clear, or sure fire a path to monetization

That said, I was/am inspired to test this with making plugins for open source Moodle, so I appreciate the walk through!

Totally agree, I'd be surprised if it made >$0.

Kathrin Hamilton

Grifters gonna grift. Why not have 100 websites? 1000. When do you run out of rubes? I can feel like an experiment would be valid but then it feels unethical to add to the world of GriftSlop. I'm always surprised by the engagement that the 'how to 10x your blah blah business' posts on X get. Could be bots of course.

Jan 9Edited

It is because of the appeal of the base 10 system to those of us who have grown up with that. 10,100 and a 1000 are easy increments to us. You just find one that looks appealing but still credible, like 10x in those posts, to draw in the attention of others. Or you make a comment where you include them to make your point. It is not criticism. Just a funny observation of you expressing surprise over something that you have just 'applied' in your own comment.

I just realized that we likely have the 10 base system because we have 10 fingers, which gave us, among many things, a way to convey numbers to others. It is just fascinating how everything around us is formed by abstractions of the most tangible and logical things.

It is because of the appeal of the base 10 system to those of us who have grown up with that. 10,100 and a 1000 are easy increments to us. You just find one that look appealing but still credible, like 10x in those posts. Or you make a comment where you include them to make your point. It is not criticism. Just a funny observation of you expressing surprise over something that you have just 'applied' in your own comment.

The thing that I'm finding most useful about Claude Code is that it's turning out to be great for things that aren't (or are just *barely*) code. I've been struggling with trying to implement something even remotely like what Sam Altman describe as GenZ "using ChatGPT like an OS" - they dump everything in to memory and can then do things like "give me a study guide and make up some flash cards for the paper I did on Thursday Night when I was at Starbucks with Hailey." - Chat knows who Hailey is, has calendar and google drive access, etc, and could work out that user is talking about XYZ paper and go from there. I tried files in Drive, I tried 3-4 different published MCPs, I tried Claude web, I tried Claude desktop, I tried Chat + Notion MCP, I tried Notion native AI, I tried Chat + Airtable as a 'proper' database. Everything failed - mostly at being able to write anything back. Couldn't create new Tasks, couldn't read a URL and just write a simple one sentence reminder on why I might have saved that article in Notion Bookmarks.

Tried Claude Code on Friday. Claude suggested writing to json and md files locally. I mentioned that I already had Obsidian, and I got H1 ALL CAPS "NOW WE'RE COOKIN!'"

and just a few hours of plain old discussion back and forth and now I have a system that manages Ideas, Projects, and Tasks, reads new bookmarks and turns them into Ideas, and can be controlled (mostly) by my shiny new HomeAssistant Voice Preview Edition.

Then today, I had something sprung on me that I needed to get done for a meeting TOMORROW and wasn't even sure how to approach it. On an air-gapped work machine. Just using the most generic possible terms, just "we have this saas platform that's now 13 years old and we're talking about a 'greenfield'" I wound up reading in Claude Code on the left monitor on my home PC and typing on the air-gapped work pc on the right monitor and landed on a reasonably polished step by step point by point Agenda to lead the meeting and work through with the team in the morning.

Neither one of those is particularly "application" or "coding" - though there is a *little* code (4 python files) in the first.

I'm deeply impressed and can see myself moving away from the web interfaces for anything of value and towards just opening Visual Studio Code or the Claude Desktop app for nearly everything...and just continuing to have conversations, but now with Agents that can actually DO STUFF rather than just suggest stuff for me to do.

I was talking to claude about compacting this morning (in which it selects which memories to keep, summarise, etc as it reaches context limit):

"It's like asking someone to write their own obituary while sleep-deprived."

It's things like that give me pause when using opus 4.5; it has these moments of insight and humour I would be impressed with if it came from a human. On balance, I still don't think it's 'like' anything to be an LLM... But I'm no longer so sure.

Actually our sleep cycles go through our brains version of compacting.

Hey Ethan! I’m one of the folks that built Claude Code for desktop so thanks for sharing. If there’s anything we can do to improve your workflows let me know!

A small tutorial, getting you to work through some examples with it, would do absolute wonders so accessibility!

I’m an average white collar non-coder trying to make it work for me for the first time. I started a 50-person group of AI hobbyists. Happy to share my experience with Claude Code if helpful. DM me if so

Jan 9Edited

Besides chat and code, I would like to see category shop there. Living in the Netherlands it is impossible to get official swag. I am trying to address this gaping gap in Anthropics offering, but nobody is listening yet. Thus I need an insider, like you, to stir things up :-) https://www.linkedin.com/posts/njdejong_so-i-needed-to-create-this-prompt-anthropic-activity-7399697053821603840-8eKO?utm_source=share&utm_medium=member_desktop&rcm=ACoAAAAm8r0BaA7kd5sD3ekp5mGBLfV0UOKiGrg

Thanks for your work! Ready for when Plan mode is officially able to be used :)

We're going to release it on desktop locally this week. On remote it'll be out in the next week or so! Thanks for your patience. :)

I'm a software engineer spending my free time on substack enjoying life. And I didn't expect such a good read on claude from a non-programmer. I'm only commenting now so I can come back tomorrow (it's almost 4am here and I just finished work for today) when I'm by my computer and I can type a fully detailed response. Or maybe even a full essay in response. Great job here!

My personal experience (ex-software developer now sat on the business side) is everything you describe and more. I recently found a tool called flashbacker that integrates with Claude Code to give it more sustained memory and pre scaffolded personas for agents that kids it up another few levels. I feel like I have a 10 man dev team working for me day and night. It’s remarkable.

Gabriel Yanagihara

This "refactoring" of the profession is exactly what I'm seeing with the faculty I train here in Hawaii. It has been so cool watching non-coding teachers, who would have never touched a terminal a year ago, start building their own custom apps and interactive tools for their students.

Inspired by a prompt by fellow EdTech educator Justin Lai, I usually have them keep it to standalone HTML embed codes for now to keep things portable and simple, and even at that level, it works wonders. The shift from "using a software" to "building the solution" is finally hitting the mainstream.

If you ever vacation to Hawai'i, you gotta come see what the teachers are doing out here! Its amazing!

Am I weird for still choosing Cursor?

Jan 7Edited

I'm right there with you. I tried Claude Code and I like Cursor way better. It feels just as powerful but way easier to use. I did some research on this (using AI of course) and learned that Claude Code has deep codebase-wide understanding because the entire codebase goes into the context window; vs. Cursor does smart index & search to find relevant files. Claude code more smartly manages the context window through periodic summarization; whereas, its up to you to manage the context window in Cursor (it provides a little circle to show how much you are using, which is a cue to start a new chat window) Also Claude Code is best for a set-it-and-forget-it higher level agentic coding task that spans the entire codebase (including closing the loop by actually testing visually and testing via unit tests.) Cursor is for everything else. Also, many devs use both, with Cursor being the daily driver, but delegating a large task to Claude Code when appropriate.

In my experience the model seems to be the most important thing. I use the built-in VS Code/Github chat, Claude Code, Claude Desktop, and Google Antigravity regularly*. The Google and OpenAI models regardless of the wrapper, can’t go more than 5 minutes without making some incredibly dumb mistake like reintroducing a bug that they fixed a few minutes before. Claude Opus 4.5 can go an hour or more without doing that.

* For some tasks different models are better, and at the moment, tossing $20 bucks at several seems to produce more value for me than $200 at one.

this has been my exact experience as well

Precisely. Perhaps we'll get an answer 😅

I updated my comment with my takeaways from the answers I got from asking Gemini + ChatGPT. haha

Best and only way 😂

MrComputerScience

Thanks for writing this.

You've raised many interesting points.

Here's my response. (I've been coding since I was a kid, if it matters.)

Compacting feels like the first practical answer to the context-window ceiling. The model learns to leave itself good notes and keep moving. (Rather than just hitting a wall.)

Skills and subagents read like the missing abstraction layer. Less prompt engineering overhead, more modular workflows that non-coders can actually leverage.

The governance question is unavoidable. Once an agent can act on your files and browser via MCP, you need rigorous permission scoping and auditable action logs.

One more thought. The startup demo perfectly illustrates that "it shipped" and "it's defensible" are now completely separate conversations.

Cordially,

Mike D

Synthetic Civilization

This isn’t really about better coding. It’s about autonomy duration.

Once an agent can work for hours without supervision, plan, execute, correct itself, the human role shifts from doing to assigning.

That’s a structural change, not a UX improvement.

Ethan, for someone with no programming skills, how does it compare to Replit?

Is it weird still using Codex? Maybe I'm biased.

Sounds similar to Google Antigravity, which I currently use. I'm wondering now if I'm missing any features though!? Have you used both?

Personally I use them interchangeably. Antigravity being newer is buggier. The code produced is usually the same as long as you use the better models but sometimes things like weird UI issues crop up with Antigravity.

I found that Opus 4.5 is overkill for most of my code completion needs and it is close to 3x the cost of Sonnet 4.5... so I continue to use Aider.chat. I'm able to control my costs and get the same quality.

That said, I see what Anthropic and OpenAI are doing. That is the slow retirement of models they lose money on, forcing users into models that consume more tokens (agents) at a higher cost per token. Scalable -- Questionable.

Thomas Manandhar-Richardson

I don't think that's what they're doing. They don't have enough compute for their R&D, they actually really want us to use the less token intensive models. In fact, one of the stated innovations of Opus 4.5 was that it not works longer, but you get more "intelligence" per token.

Also, ChatGPT auto mode too readily selects inferior instant model than pro model.

Overall, I'd say its the opposite: they want users to be happy on the least tokens possible

my API billing statements disagree. I've been using Claude, in Aider, for almost a year and had very consistent bills up until I switched to Opus 4.5. I went from $5 -12 days to upwards of $70 days.

That said, model retirement isn't just a cost management decision. I get that, it's just that at a certain point the end user really only sees a difference in cost, not quality.

You can set Claude Code to use Sonnet or specify a model version, too.

Aider is open source.

These tools will become more and more friendly and the barrier for non tech / non coders will be removed soon enough. The combination Claude Code + Anthropic Agent SDK is going to be the killer app enabling the next leap for every knowledge worker

#nojs-banner { position: fixed; bottom: 0; left: 0; padding: 16px 16px 16px 32px; width: 100%; box-sizing: border-box; background: red; color: white; font-family: -apple-system, "Segoe UI", Roboto, Helvetica, Arial, sans-serif, "Apple Color Emoji", "Segoe UI Emoji", "Segoe UI Symbol"; font-size: 13px; line-height: 13px; } #nojs-banner a { color: inherit; text-decoration: underline; } This site requires JavaScript to run correctly. Please turn on JavaScript or unblock scripts