29 Comments
Marshall Kirkpatrick's avatar

Not the paper clips, Ethan!! You *started* with the paperclips?? haha noooo

Kuhan's avatar

I wonder whether, if this is done on a large scale, there will be any impact on online advertising.

A million agents Googling things means that a lot of ads would be served to agents, who don't share the same top-of-mind recall, thoughts and feelings about brands and products. Even if ads are clicked on by AI agents, the agents don't represent the complexity of a human's emotion or thoughts which might send different signals back to the ad serving platforms.

I imagine it's unlikely to hit a relevant scale, but I can see a shift in the performance of ads as companies are now spending (even a little) money to be seen by some AI agents cruising around the internet using our desktops

Bailey W's avatar

Oh, man… I anticipate the ad campaign hacking

Matt's avatar

… it’s going to want a glass of data. And if you give a glass of data, it’ll probably ask for a neural network to wash it all down!

Cornelia's avatar

The next level of agency will be the philosophical explanation of why a glass that is half full with data is better than a can of NN that is half empty

Clay Farris Naff's avatar

LLMs have become like tropical fish in a tank -- alluring, even fascinating, but after many hours of watching them swim around you start to get bored. Then this. It's like a fish jumped out of the tank and started preparing dinner in the kitchen. Sure, it has trouble finding the grater, and yeah, maybe it messes up the recipe a bit, but ... wow! Just wow!

Howard Aldrich's avatar

Your experience is one more indication that naive users -- those who just type in simple prompts [zero shot prompting] -- are going to be left in the dust. The industry needs to find ways of teaching new users how to interact with their AI partners, rather than just try to "use them." In the end, the agents are only as smart as their prompters, right? I say this b/c industries in which agents play a big role -- commercial real estate, the acting business, the high end commercial art market -- have VERY knowledgeable agents who ARE smarter than their principals.

mendo's avatar

where can I learn prompting / how to interact with AI?

Howard Aldrich's avatar

Read Ethan Mollick's book!

rictic's avatar

What's the pricing model? I haven't seen anything explicit, so I infer that it's via the main API and you're charged per token of input and output?

How much did an hour of paperclipping cost?

Aidan Dunphy's avatar

Ethan, have you tried getting an agent to coach another agent? For example, when the one performing your task gets stubborn or makes mistakes, maybe its 'coach' agent could pick up on this and intervene.

Grant Hillebrand's avatar

Is there any indication of the energy cost of these transactions? When big data-centre supported tools run slow, it's possibly a sign of some serious computation (to be expected), with a proportional energy consumption?

Shamit Bagchi's avatar

This is dangerous territory!

Claire Broadley's avatar

This is the best demonstration of ‘computer use’ I’ve seen. Thank you for sharing it!

XY's avatar

I’m excited and terrified at the same time of AI’s progress 👀

Paul A. Jones's avatar

I wonder if, as agents grow, a new HTML standard will be developed that creates a separation of UI from the presentation layer, explicitly for agents. By creating a separate interface for agents, web-based tools and interfaces could differentiate agents from human users. That differentiation could be useful to speed up UI processes. It could also be a premium feature that requires a subscription to get the efficiency benefits.

I can imagine something like an API interface that never reaches the presentation layer and contains useful meta information that guides the agents to desired functions and avoids displaying anything more than is essential for the current task.

Agents should probably have their own browser that allows them to read the HTML as well as view graphical output. I imagine AI developers will develop their own custom browsers that can accomplish much of the same, but having a formalized standard would be better for both agent developers and platform developers.

mendo's avatar

Could you please provide the results of that, or at least some more info / illustration of the output?

"I had it research stocks and it did a good job of assembling a spreadsheet of financial data and giving recommendations, but they were fairly surface level indicators, like PE ratios. It was technically capable of helping, and did better than many human interns would, but it was not insightful enough that I would delegate these sorts of tasks"

Luis Silva's avatar

Interesting test! When you called the AI stubborn or persistent, aren’t we anthropomorphizing it? I think avoiding that might be key to analyzing its behaviors from a more technically grounded perspective. LLMs aren’t actually persistent or stubborn; rather, their predictions can become erratic or repetitive, much like a program stuck in a for loop