29 Comments
Marshall Kirkpatrick's avatar

Not the paper clips, Ethan!! You *started* with the paperclips?? haha noooo

Kuhan's avatar

I wonder whether, if this is done on a large scale, there will be any impact on online advertising.

A million agents Googling things means that a lot of ads would be served to agents, who don't share the same top-of-mind recall, thoughts and feelings about brands and products. Even if ads are clicked on by AI agents, the agents don't represent the complexity of a human's emotion or thoughts which might send different signals back to the ad serving platforms.

I imagine it's unlikely to hit a relevant scale, but I can see a shift in the performance of ads as companies are now spending (even a little) money to be seen by some AI agents cruising around the internet using our desktops

Bailey W's avatar

Oh, man… I anticipate the ad campaign hacking

Abhishek Sharma's avatar

nice perspective

Matt's avatar

… it’s going to want a glass of data. And if you give a glass of data, it’ll probably ask for a neural network to wash it all down!

Cornelia's avatar

The next level of agency will be the philosophical explanation of why a glass that is half full with data is better than a can of NN that is half empty

Clay Farris Naff's avatar

LLMs have become like tropical fish in a tank -- alluring, even fascinating, but after many hours of watching them swim around you start to get bored. Then this. It's like a fish jumped out of the tank and started preparing dinner in the kitchen. Sure, it has trouble finding the grater, and yeah, maybe it messes up the recipe a bit, but ... wow! Just wow!

Howard Aldrich's avatar

Your experience is one more indication that naive users -- those who just type in simple prompts [zero-shot prompting] -- are going to be left in the dust. The industry needs to find ways of teaching new users how to interact with their AI partners, rather than just trying to "use them." In the end, the agents are only as smart as their prompters, right? I say this b/c industries in which agents play a big role -- commercial real estate, the acting business, the high-end commercial art market -- have VERY knowledgeable agents who ARE smarter than their principals.

mendo's avatar

Where can I learn prompting / how to interact with AI?

Howard Aldrich's avatar

Read Ethan Mollick's book!

rictic's avatar

What's the pricing model? I haven't seen anything explicit, so I infer that it's via the main API and you're charged per token of input and output?

How much did an hour of paperclipping cost?

Aidan Dunphy's avatar

Ethan, have you tried getting an agent to coach another agent? For example, when the one performing your task gets stubborn or makes mistakes, maybe its 'coach' agent could pick up on this and intervene.

Grant Hillebrand's avatar

Is there any indication of the energy cost of these transactions? When big data-centre supported tools run slow, it's possibly a sign of some serious computation (to be expected), with a proportional energy consumption?

Shamit Bagchi's avatar

This is dangerous territory!

Claire Broadley's avatar

This is the best demonstration of ‘computer use’ I’ve seen. Thank you for sharing it!

XY's avatar

I’m excited and terrified at the same time of AI’s progress 👀

Paul A. Jones's avatar

I wonder if, as agents grow, a new HTML standard will be developed that creates a separation of UI from the presentation layer, explicitly for agents. By creating a separate interface for agents, web-based tools and interfaces could differentiate agents from human users. That differentiation could be useful to speed up UI processes. It could also be a premium feature that requires a subscription to get the efficiency benefits.

I can imagine something like an API interface that never reaches the presentation layer and contains useful meta information that guides the agents to desired functions and avoids displaying anything more than is essential for the current task.

Agents should probably have their own browser that allows them to read the HTML as well as view graphical output. I imagine AI developers will develop their own custom browsers that can accomplish much of the same, but having a formalized standard would be better for both agent developers and platform developers.

mendo's avatar

Could you please provide the results of that, or at least some more info / illustration of the output?

"I had it research stocks and it did a good job of assembling a spreadsheet of financial data and giving recommendations, but they were fairly surface level indicators, like PE ratios. It was technically capable of helping, and did better than many human interns would, but it was not insightful enough that I would delegate these sorts of tasks"

Luis Silva's avatar

Interesting test! When you called the AI stubborn or persistent, aren’t we anthropomorphizing it? I think avoiding that might be key to analyzing its behaviors from a more technically grounded perspective. LLMs aren’t actually persistent or stubborn; rather, their predictions can become erratic or repetitive, much like a program stuck in a for loop.
