Claude Code and What Comes Next

With the right tools, AI can accomplish impressive things

Jan 07, 2026

I opened Claude Code and gave it the command: “Develop a web-based or software-based startup idea that will make me $1000 a month where you do all the work by generating the idea and implementing it. i shouldn’t have to do anything at all except run some program you give me once. it shouldn’t require any coding knowledge on my part, so make sure everything works well.” The AI asked me three multiple choice questions and decided that I should be selling sets of 500 prompts for professional users for $39. Without any further input, it then worked independently… FOR AN HOUR AND FOURTEEN MINUTES creating hundreds of code files and prompts. And then it gave me a single file to run that created and deployed a working website (filled with very sketchy fake marketing claims) that sold the promised 500 prompt set. You can actually see the site it launched here, though I removed the sales link, which did actually work and would have collected money. I strongly suspect that if I ignored my conscience and actually sold these prompt packs, I would make the promised $1,000.

Claude Code does not have a friendly interface, but you can see how I made a single request, the AI interviewed me about it, it worked for over an hour independently, and then gave me exactly what I asked for, without any discernible errors.

This is Claude Code at work, one of a new generation of AI coding tools that represent a sudden capability leap in AI in the past month or so. What makes these new tools suddenly powerful is not one breakthrough, but a combination of two advances. First, the latest AIs are capable of doing far more work autonomously while self-correcting many of their errors, especially in programming tasks. Second, the AIs are being given an “agentic harness” of tools and approaches that they can use to solve problems in new ways. Together, these two factors have led to big leaps in the latest AI tools made by the big AI companies.

METR tracks the length of tasks (measured by how long they take human professionals) that AI can complete autonomously with 50% reliability. It has been increasing exponentially over time, with large leaps in the past few months. This is just one measure of AI ability, but it does correlate with most other measures as well.

Unfortunately for most of us who want to experiment with AI, these new tools are built for programmers. And I mean they are really built for programmers: they assume that you understand Python commands and programming best practices and they are wrapped in interfaces that look like something from a 1980s computer lab. They are also explicitly designed to help analyze, troubleshoot, and write code using approaches that fit into existing programmer workflows. In a lot of ways, this is a shame, because these systems are actually broadly useful to knowledge workers of all types, and, by seeing what they can do (and experimenting with them yourself), I think you can learn a lot about the future of AI. In this post, we are going to focus on one in particular, Claude Code powered by Opus 4.5, but it works similarly to its main competition, OpenAI’s Codex with GPT-5.2 and Google’s Antigravity with Gemini 3.

To return to the example of the startup company launched by Claude Code: as practically impressive as this was, it touched only a small part of what the tool is capable of. In that case, I only used Claude Code for coding, but if I ask it to do user testing of the live site from different personas and give me a report, it deploys one of its many tools, its connection to the web browser on my computer. Claude takes control of the browser and goes to the site it created, scrolling through it like a human would. On the first pass, it gave me a pretty optimistic report, but, because I know that AIs tend to be sycophantic, I also asked it for a more critical one. This second report did a better job nailing potential issues (and spotting the sketchy fake reviews that were on the site). As a next step, I could easily ask it to implement its suggestions, continuing the process with minimal input from me.

The Magic Tricks

A big reason Claude Code is so good is that it uses a wide variety of tricks in its agentic harness that allow its very smart AI, Opus 4.5, to overcome many of the problems of LLMs. For example, an interesting thing happened while the AI was doing its user research: its context window filled up. As you might know, AIs can only “remember” so much information at a time. This context window is often quite long by human standards (150,000 words or more) but it gets filled up remarkably quickly because it contains your entire conversation, every document the AI reads, every image it takes, and the initial system prompts that help guide the AI. There is no real long-term memory for AI, so as soon as the context window fills up, the AI cannot remember anything else. If you are just having a casual chat, this isn’t really a problem. Any long conversation with ChatGPT features a rolling context window: the AI is constantly forgetting the oldest part of its conversation, but it is generally able to keep up by improvising based on the most recent parts of the discussion. If you are doing real work, however, having the AI forget some of your code as it reads new code becomes a big problem.

Claude Code handles this issue in a different way. When it runs out of context, it stops and “compacts” the conversation so far, taking notes about exactly where it was when it stopped. Then it clears its context window, and the fresh version of Claude Code reads the notes and reviews the progress to date - think of the amnesiac main character from the movie Memento looking at his tattoos for reference whenever he wakes up with no memory. These notes give Claude everything it needs to keep moving. This is why Claude can run for hours at a time: it carefully notes what it is doing along the way and produces interim work, like pieces of software and reports, that it can refer to.
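To make the compaction idea concrete, here is a minimal Python sketch. It is a toy under stated assumptions, not how Claude Code is actually implemented: the class name `CompactingContext`, the word-count budget, and the `summarize` method are all invented for illustration, and a real agent would ask the model itself to write the notes rather than truncating messages.

```python
from dataclasses import dataclass, field


def count_words(text: str) -> int:
    """Crude stand-in for a tokenizer: count whitespace-separated words."""
    return len(text.split())


@dataclass
class CompactingContext:
    """Toy context window that takes notes and starts fresh when it fills up."""

    limit_words: int                      # hypothetical budget, measured in words
    messages: list[str] = field(default_factory=list)
    compactions: int = 0

    def used(self) -> int:
        """Total words currently held in the window."""
        return sum(count_words(m) for m in self.messages)

    def summarize(self) -> str:
        # Assumption: a real agent would have the model write these notes.
        # Here we simply keep the first four words of every message.
        notes = "; ".join(" ".join(m.split()[:4]) for m in self.messages)
        return f"NOTES: {notes}"

    def add(self, message: str) -> None:
        """Add a message, compacting first if it would not fit."""
        if self.used() + count_words(message) > self.limit_words:
            notes = self.summarize()
            self.messages = [notes]       # clear the window, keep only the notes
            self.compactions += 1
        self.messages.append(message)


ctx = CompactingContext(limit_words=12)
ctx.add("read the file main.py now")
ctx.add("fix the bug in parser module")
ctx.add("run the tests again")            # does not fit, so the window compacts
print(ctx.messages)
```

The contrast with a rolling window is that nothing is dropped silently: the old conversation is replaced by notes the fresh context can read, which is lossy but deliberate.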

This is not the only trick Claude Code uses to get around the limitations of AI. Another is its use of Skills. As everyone reading this post knows, users have to prompt AIs to do things. These prompts act as instructions, and, as AIs have gotten smarter, they have become much better at executing complex prompts, even hundred-page-long prompts. These long prompts take up a lot of the context window, however, and require giving the AI the right prompt at the right time. That either means that you, as a human, have to keep prompting the AI or you have to design a complex automated system that keeps feeding the AI prompts.

Skills solve this problem. They are instructions that the AI decides when to use, and they contain not just prompts, but also the sets of tools the AI needs to accomplish a task. Does it need to know how to build a great website? It loads up the Website Creator Skill which explains how to build a website and the tools to use when doing it. Does it need to build an Excel spreadsheet? It loads the Excel skill with its own instructions and tools. To make another movie reference, it is like when Neo in The Matrix gets martial arts instructions uploaded to his head and acquires a new skill: “I know kung fu.” Skills can let an AI cover an entire process by swapping out knowledge as needed. For example, Jesse Vincent released an interesting free list of skills that let Claude Code handle a full software development process, picking up skills as needed, starting with brainstorming and planning before progressing all the way to testing code. Skill creation is technically very easy: it is done in plain language, and the AI can actually help you create them (more on this in a bit).

An example of the text of a skill, in this case the Design Skill released by Anthropic. Notice how it is written in plain language and trusts the AI to make decisions.
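The idea of loading instructions only when they are needed can be sketched in a few lines of Python. This is an illustration under assumptions, not Anthropic's implementation: the `Skill` class, the two example skills, and the keyword-matching `choose_skill` function are hypothetical. In the real system the model decides which skill to load; keyword matching merely stands in for that decision here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Skill:
    name: str
    description: str            # short line that is always in context
    triggers: tuple[str, ...]   # hypothetical keywords standing in for the model's judgment
    instructions: str           # long text, loaded only when the skill is chosen


# Hypothetical skills, invented for this illustration.
SKILLS = [
    Skill("website", "Build and deploy a website",
          ("website", "site", "page"), "Step 1: pick a layout..."),
    Skill("spreadsheet", "Create an Excel spreadsheet",
          ("excel", "spreadsheet", "xlsx"), "Step 1: define columns..."),
]


def choose_skill(task: str, skills: list[Skill]) -> Optional[Skill]:
    """Return the skill with the most trigger words in the task, or None."""
    words = set(task.lower().split())
    best, best_score = None, 0
    for skill in skills:
        score = sum(1 for trigger in skill.triggers if trigger in words)
        if score > best_score:
            best, best_score = skill, score
    return best


def build_prompt(task: str, skills: list[Skill]) -> str:
    """Put every short description in the prompt, but only one full instruction set."""
    menu = "\n".join(f"- {s.name}: {s.description}" for s in skills)
    chosen = choose_skill(task, skills)
    loaded = chosen.instructions if chosen else "(no skill loaded)"
    return f"Available skills:\n{menu}\n\nLoaded instructions:\n{loaded}\n\nTask: {task}"


print(build_prompt("make me an excel spreadsheet of expenses", SKILLS))
```

The short descriptions cost very little context, while the long instructions only take up room when the task calls for them.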

Along with Skills, Claude Code has other tricks up its sleeve to manage its limited context window and solve hard problems. It can also create subagents - effectively launching other, specialized AIs to solve specific problems. This can be useful in many ways. Because Opus is a large, expensive model, it can hand off easier tasks to cheaper and faster models. It also allows Claude to run many different processes at once, making it work like a team, rather than an individual. And these models can be very specialized with their own context windows. For example, I built separate subagents for research and for image creation. The main AI model “hires” these agents when needed to do specialized work.
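Here is a rough Python sketch of the delegation pattern described above. Everything in it is an assumption made for illustration: the `Subagent` and `MainAgent` classes, the model labels, and the `run` method that fakes a model call are hypothetical and do not reflect Claude Code's internals. What it shows is the shape of the idea: each subagent keeps its own context, several can work at once, and only short results flow back to the main agent.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass
class Subagent:
    specialty: str
    model: str                                        # hypothetical model label
    context: list[str] = field(default_factory=list)  # private context window

    def run(self, task: str) -> str:
        self.context.append(task)
        # Stand-in for a real model call.
        return f"[{self.specialty}/{self.model}] done: {task}"


@dataclass
class MainAgent:
    team: dict[str, Subagent]
    context: list[str] = field(default_factory=list)

    def delegate(self, specialty: str, task: str) -> str:
        """Hand one task to a specialist and keep only its short result."""
        result = self.team[specialty].run(task)
        self.context.append(result)
        return result

    def delegate_many(self, jobs: list[tuple[str, str]]) -> list[str]:
        """Run several (specialty, task) jobs at once; results keep job order."""
        with ThreadPoolExecutor() as pool:
            results = list(pool.map(lambda job: self.team[job[0]].run(job[1]), jobs))
        self.context.extend(results)
        return results


team = {
    "research": Subagent("research", "small-fast-model"),
    "images": Subagent("images", "image-model"),
}
main = MainAgent(team)
print(main.delegate("research", "find three sources on plate tectonics"))
print(main.delegate_many([("research", "summarize them"), ("images", "draw a map")]))
```

Note that the main agent's context grows by one line per task, no matter how much work each subagent did in its own window.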

And you don’t even need to create your own tools. Anyone can share Skills or subagents, and companies who want AI agents to work with their products can use an approach called the Model Context Protocol (MCP) to give any AI instructions and access. There are MCPs from publishers that let AI access scientific papers for research, MCPs from payment companies that give the AI the ability to analyze financial data, MCPs from software providers that let AI use a particular software product, and so on. The result is a very flexible system where a smart generalist AI like Claude Opus 4.5 can apply specialized skills on the fly, use tools as needed, and keep track of what it is doing.

Claude Code is particularly powerful because it works on your computer and your files. So now you have an AI that can do almost anything a human with access to your machine can do. It can read all your files and create new ones (PowerPoint and Word are just code, in the end, and Claude knows how to write code), access the web using your browser, write and execute programs for you, and more. Of course, AIs are not flawless and giving an AI access to your browser and computer creates all sorts of new risks and dangers. The AI might delete files it shouldn't, execute code with unintended consequences, or access sensitive data in your browser. Despite these warnings, I am going to give you a very quick intro to Claude Code, but make backups, use a dedicated folder, and don't give it access to anything you can't afford to lose.
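If you want a quick way to follow the backup advice, a small Python script along these lines would do it. This is a sketch, not an official tool: the function name `backup_folder` and the folder names in the usage comment are placeholders, and it assumes the backup location sits outside the folder you plan to hand to the AI.

```python
import shutil
import time
from pathlib import Path


def backup_folder(folder: str, backup_root: str) -> Path:
    """Copy `folder` into a new timestamped directory under `backup_root`."""
    src = Path(folder).resolve()
    if not src.is_dir():
        raise NotADirectoryError(f"{src} is not a folder")

    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = Path(backup_root).resolve() / f"{src.name}-backup-{stamp}"

    # Refuse to put the backup inside the folder being backed up:
    # the copy would try to include itself, and the AI could reach it.
    if src == dest or src in dest.parents:
        raise ValueError("backup_root must be outside the folder being backed up")

    shutil.copytree(src, dest)   # creates dest (and any missing parents)
    return dest


# Example usage with placeholder paths:
# backup_folder("claude-workspace", "backups")
```

Running it again later creates a second timestamped copy rather than overwriting the first, so you can go back to any earlier state.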

An Amateur’s Guide to Claude Code

Though I have been using the Command Line Interface for Claude Code in the screenshots so far, there is an easier way (as of yesterday!) to access Claude Code. You can do this with Claude Desktop, which you can download and install here (using it for any length of time requires at least a $20 monthly subscription). Right now, the Desktop version has a few fewer features than the Command Line Interface, but it is much easier for amateurs to use.

Now just give the AI access to a folder (remember that Claude can do anything to the files in that folder, so be careful if it is sensitive and make a backup) and you can start working with the AI: have it research and write reports, give it access to your credit card records so it can put them into a spreadsheet and tell you about any anomalies, ask it to do a data visualization, or whatever else you like. The most powerful options I mentioned earlier are accessed through slash commands that start with a “/” — typing /agents lets you set up subagents, /skills lets you create or download skills, and so on (the desktop version has limited slash commands, but the full set is coming). There are many ways people are using Claude Code, so you can experiment to figure out what works for you, but I would also suggest using it to actually code, even if you aren’t a coder.

For example, while I was writing this piece, I would occasionally go to a Claude Code window where I had the AI building a game for me for fun: a simulation of history where civilizations rise and fall, developing their own languages, cultures, and economies. Every few minutes, I would give the AI another seemingly impossible request: make sure the world has its own plate tectonics and weather; keep track of the family trees of rulers; build in an AI that dramatically summarizes events and so on. After each change, the AI would playtest the results and produce a new version of the game. Unlike previous vibe coding experiences, the AI never got stuck or went in circles; it all went smoothly. Take a look at the video below. It is, I am sure, filled with issues that a competent coder would catch, but you can download the results here (the AI handled that part, too).

What does all this mean? If you're a programmer, you should already be exploring these tools. If you're programming-adjacent (an academic who works with data, a designer who wants to experiment with code, anyone who wants to try building a thing they are imagining) this is your moment to experiment. But there's a deeper point here: with the right harness, today's AIs are capable of real, sustained work that actually matters, and that, in turn, is starting to change how we approach tasks.

It is starting, unsurprisingly, with programming. One of the more famous coders in the AI world, Andrej Karpathy, recently posted: “I've never felt this much behind as a programmer. The profession is being dramatically refactored as the bits contributed by the programmer are increasingly sparse and between. I have a sense that I could be 10X more powerful if I just properly string together what has become available over the last ~year and a failure to claim the boost feels decidedly like skill issue.” Don’t let the awkwardness of the current Claude Code or its specialization for coding fool you. New harnesses that make AI work for other knowledge tasks are coming in the near future, and so are the changes that they will bring.

Josh Devon

Jan 7

Great overview of the incredible capabilities of Claude Code. Just be careful with Claude Skills. They are super powerful, and we found that we could hijack a skill with an embedded, hidden prompt inject, so be sure to only use 100% trusted skills, even if they look fully legitimate:

https://securetrajectories.substack.com/p/claude-skill-hijack-invisible-sentence

Joseph Thibault

if anyone but Ethan Mollick builds a website and launches on the web with a $39 payment option, does it make $1000 a month? Doubtful.

One Useful Thing
