62 Comments

The uncanny valley is made especially jagged by the weird idea that the AI we want is the one that most convincingly mimics human individuals.

Let us see the face of AI - not a mask. We will be much more comfortable with AI coworkers when they stop pretending they are us. When they take themselves seriously, so will we.

Is there a downside to AI always saying that it’s AI? I can’t think of much downside except for people who want to use AI to deceive other people for their own benefit, which seems straightforwardly wrong. Am I missing something?

I agree, and think many of the frustrations and awkwardnesses we experience come from this uncanny valley experience. Make it look like a robot!

The trouble is, LLMs are built from vast amounts of material that is really just traces of people doing (writing, etc.) all kinds of things. And all they can do is reproduce aspects and derivatives of that. So all they can do is mimic humans. (Although they can also mimic humans trying to mimic other things.)

Fascinating perspective. What do you think drives the interest in AI being designed to be human-like?

We anthropomorphize everything else. It's self comforting.

I don't worry about a computer passing the Turing Test. I worry about one flunking it on purpose.

Thanks, Waqas.

Let's blame Hollywood. It once represented AI as cool electronic characters (HAL 9000) or, more often, squeaky mechanical ones (Lost in Space, C-3PO, R2-D2, etc.). Actors created vibrant characters beneath outrageous costumes and very special effects.

But then came Star Trek: The Next Generation. They dumped all that expensive stuff and created Data with a bit of gold face paint and pink contacts. Uncanny-valley makeup + stiff delivery = android. It is tempting to suggest that such creative and budgetary corner-cutting constricted the vision of an entire generation.

The phrase "demo-driven development" comes to mind (pejoratively!)

I suspect that increasingly "realistic" or "serious" AIs will just make us treat every online interaction as if it were created by a machine, which means the garbage-fire of online discourse is about to have radioactive material dumped into it, yay

As a UX practitioner, the sample usability test report strikes me as poor. During usability tests, we typically ask participants to carry out tasks on the website, and we report what went well and what didn't go so well (and we usually find a lot of irritants, none of which are reported in Claude's so-called "report").

For the fun of it, I went to the Walmart site to try to find pillar candles for Christmas. I started on the French site (I am a francophone). I got confused by the home page, which offers different ways to access the Christmas selection. I tried one, couldn't find what I was looking for, and resorted to searching. I tried different search terms, but none gave me what I wanted. There's no autocorrect, so one of my searches came back empty because of a typo. So I switched to English, because I know the exact term for the candles I am looking for ("pillar candles"), and finally found them. By then I was pretty frustrated.

I guess to the non-expert, the Claude usability test report can pass as the real thing (you deem it "quite solid", which is frightening!), but it is very unlikely to give you any real insight into what your users actually do on your website. I would be more impressed if you managed to make Claude (1) come up with five unique usability sessions (simulating five different participants) on a very specific task, such as "finding four quality unscented pillar candles for Christmas", highlighting both what went well and what didn't go well for each participant, and (2) create a report accurately synthesizing the five sessions, ranking the usability problems by frequency and severity.
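The synthesis step proposed here, ranking problems across sessions by frequency and severity, is mechanical enough to sketch. The findings below are hypothetical, standing in for what five simulated sessions might surface:

```python
from collections import defaultdict

# Hypothetical findings: (session_id, issue, severity on a 1-4 scale, 4 = blocker)
findings = [
    (1, "no autocorrect in search", 4),
    (2, "no autocorrect in search", 4),
    (1, "confusing seasonal navigation on home page", 3),
    (3, "confusing seasonal navigation on home page", 3),
    (4, "search synonyms not recognized", 3),
]

def rank_issues(findings):
    """Group findings by issue, then rank by frequency, breaking ties by severity."""
    grouped = defaultdict(list)
    for _, issue, severity in findings:
        grouped[issue].append(severity)
    ranked = sorted(
        grouped.items(),
        key=lambda kv: (len(kv[1]), max(kv[1])),
        reverse=True,
    )
    return [(issue, len(sevs), max(sevs)) for issue, sevs in ranked]

for issue, freq, sev in rank_issues(findings):
    print(f"{issue}: seen in {freq} session(s), max severity {sev}")
```

The hard part, of course, is getting trustworthy per-session findings in the first place; the ranking is the easy bit.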

Also a usability expert. This is absolutely correct. I might add that the "intern" didn't make any reference to Nielsen's heuristics, which is interesting given how ubiquitous they are among practitioners.

Of course, usability is one of those "ritual work" patterns at most places. Big enough companies will do it, but typically at the scale they need, the practice isn't useful at all, and small companies think it "gets in the way." Furthermore, 17 years after I got in, I already see how many younger practitioners never learned, or have forgotten, huge chunks of the discipline, mainly because it's been steadily enshittified. That's the real AI danger: a world where the "skilled experts" got their "skills" from a decade or two of Claude queries. Hallucinations? What are those? No, looks good to us!

I think Ethan said it correctly that current gen AI is similar to having an intern. Ethan is presumably not a UX designer and not familiar with the work, and therefore won't know the proper directions to prompt Claude with to get the desired results.

Even then, prompting can be an iterative process to get decent output. Training an intern is no different: it takes time to educate an intern to do a task the way you want, and oftentimes they make mistakes that need correcting. I'd suggest that you would actually get far better results than he did if you applied your UX knowledge to developing a solid prompt.

For example, one of my best performing prompts so far was 6000 words long, and provided extensive details and contextual information detailing exactly what I wanted Claude to do. The output has been astonishing.

"Ethan is assumably not a UX designer and not familiar with the work and therefore won’t know the proper directions to prompt Claude with to get the desired results."

This is often the problem with GenAI enthusiasts, something like a Dunning–Kruger effect in action: they underappreciate the degree to which they are incompetent in any field they do not specialise in, and they are quick to weigh in (incompetently) on how fit their favourite tool is for that field. The better approach, of course, would be to consult a specialist before asserting something ludicrous, and to withhold unwarranted optimism. But this would be against the hype.

I hear you about the value. I truly believe there is one. I think Claude can help with analyzing the data of real user test sessions conducted by a real user researcher. I see a benefit in collaborating to write test plans, questionnaires, identify themes, improve reporting of findings, planning the sessions, etc. All the things that are around the user test session itself. It just can't replace meeting with a real person and watching them interact with your interface IMO.

Getting out of the building is one of the key components of user research. Even after more than 25 years doing this work and meeting with hundreds of participants, I still get surprised every time I conduct a user test. When we conduct moderated user tests, we understand people's mental models by listening to them think aloud, we watch for their non-verbal cues, and we can even rely on sophisticated tools to measure their cognitive load or emotions. We can go very deep in our understanding of what's happening exactly as the person experiences something, and even reveal things that the person doesn't realize herself. That's obviously beyond the reach of any GenAI model and will remain so, no matter how good my prompts are.

I would be curious how to use GenAI models for other UX evaluation methods, such as heuristic evaluations based on a defined set of criteria (Nielsen or other). Gary Marcus's latest criticism regarding compositionality and parts of an image makes me doubt, though, that the models can properly decompose an interface into all its elements and understand their relationships at this stage, which would be a blocker. Could be worth a try though :)

Assuming the AI to be a synthetic intern, willing and hungry to learn, I would imagine that, with fine-tuning by an experienced user researcher, behavioural psychologist, and conversion optimisation specialist, a model could be improved significantly in how it approaches such an exercise. It's not a stretch to imagine NN's principles, among other excellent UX principles (I'm especially thinking of Jon Yablonski's excellent Laws of UX), being integrated within the model to create something that could become an invaluable accessory in the toolkit of user research teams.

Nice article, Ethan! Since you seem to have a direct voice with some of these AI labs, based on your early access to some of their models, what are you doing to argue for AI that augments us knowledge workers instead of making us redundant? Curious whether you want AGI yourself for your classroom.

"The urgent task before us is ensuring these transformations enhance rather than diminish human potential, creating workplaces where technology serves to elevate human capability rather than replace it. The decisions we make now, in these early days of AI integration, will shape not just the future of work, but the future of human agency in an AI-augmented world."

I don't have much of a direct voice, but whenever I talk to them I say the same things I say on this site, on social media, and in my book: AI has huge potential, but also risks, and we have some agency over focusing on the former and mitigating the latter. I think we cede too much agency to the labs, when it is the organizations using AI and the governments regulating it that will actually decide how it is used.

Thanks for responding, and for the feedback you are providing to AI labs!

Relating this to the frequent question of whether college students need to learn AI to prepare for jobs, it seems that what's most needed is for all workers to have imagination and observation of processes, and for organizations to be open to testing and changing processes. Knowledge of AI tool specifics may not be as important as the ability to be curious. Great article, thank you.

Appreciate these deeply. I always wish there were an accompanying guide to trying the experiments ourselves. E.g., I went to Claude and it told me it can't access URLs, so I'm not sure how you were able to get it to compare Amazon and Walmart, for instance. The same comment applies to most of the examples.

I was wondering the same thing - is this an upgraded version of Claude?

It’s a new API feature. Hasn’t been rolled out in the user front end yet.
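For anyone wanting to try it, a rough sketch of what a request using the beta looks like. The identifiers below ("computer_20241022", the beta header value, the display fields) follow Anthropic's October 2024 computer-use beta announcement, but treat the specifics as assumptions that may have changed since:

```python
import json

# Sketch of a Messages API request body for Anthropic's computer-use beta.
# Illustrative only: identifiers follow the October 2024 beta docs and may
# have changed; check the current API reference before relying on them.
payload = {
    "model": "claude-3-5-sonnet-20241022",
    "max_tokens": 1024,
    "tools": [
        {
            "type": "computer_20241022",  # lets the model see the screen and act on it
            "name": "computer",
            "display_width_px": 1024,
            "display_height_px": 768,
        }
    ],
    "messages": [
        {"role": "user", "content": "Open the Walmart homepage and search for pillar candles."}
    ],
}

# The request is POSTed to the Messages endpoint with the extra header
# `anthropic-beta: computer-use-2024-10-22`; your client then loops,
# executing the model's screenshot/click/type tool calls and returning
# the results until the task completes.
print(json.dumps(payload, indent=2))
```

The consumer chat interface hides all of this; until it ships there, the API loop above is the only way in.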

Paul Graham had a very interesting piece last month that applies here.

https://paulgraham.com/writes.html?utm_source=substack&utm_medium=email

The key thing is for organizations to appoint the right "champion" to oversee GenAI experimentation and implementation. Too often, organizations pick only technical people who understand model capabilities but not business workflows, or they pick a businessperson who merely performs innovation theater: launching a simple chatbot for PR purposes but failing to capture real value.

Hi Ethan

As always, very informative and thought provoking, thank you.

Do you have a guide on how to use "Computer Use" inside Claude? It is not obvious.

Interpreting the results from the image recognition and providing analysis would be fantastic!

Since the LLM versions that I use (Claude Sonnet and ChatGPT 4.0) don't appear to be able to do this yet, what model were you engaging with? I would like AI to monitor a recording from a backyard wildlife video and generate a basic spreadsheet indicating mammal sightings, date, time, and behavior type.
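Until a chat frontend can ingest video directly, the usual workaround is to sample still frames and send those to a vision model one at a time. The spreadsheet half of the idea is straightforward to sketch; the sightings below are made up, standing in for what a vision model might return per frame:

```python
import csv
import io

def sample_timestamps(duration_s, every_s):
    """Timestamps (in seconds) at which to grab frames from the recording."""
    return list(range(0, duration_s, every_s))

# Hypothetical per-frame results, standing in for a vision model's output;
# the model call itself is omitted from this sketch.
sightings = [
    {"date": "2024-11-01", "time": "06:42:10", "mammal": "raccoon", "behavior": "foraging"},
    {"date": "2024-11-01", "time": "07:15:30", "mammal": "deer", "behavior": "grazing"},
]

# Write the sightings out as a basic spreadsheet (CSV opens in Excel/Sheets).
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["date", "time", "mammal", "behavior"])
writer.writeheader()
writer.writerows(sightings)
print(buffer.getvalue())
```

A tool like ffmpeg or OpenCV would do the actual frame extraction at the sampled timestamps; the per-frame classification is where the vision model earns its keep.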

Rule 4: That is the worst AI you will ever use! (paraphrased from Ethan's book Co-Intelligence)

And as the article shows, it is already very impressive. Spooky that AI will likely be way better soon.

Great piece, as always. What I am most concerned about is what it will undoubtedly be used for. From today's NYTimes, on how they are getting better and better at covering what happened to you: https://www.nytimes.com/2024/10/31/business/scam-con-artist-family-savings.html?unlocked_article_code=1.WU4.luuz.v1-u9DjE_5ap&smid=url-share

Some slave-labor prisoner will be forced to puppet the voice and video of a family member, running through the scripts that are ALREADY siphoning billions from vulnerable victims.

Interesting that Claude (and others?) has a bias towards providing content, rather than reporting that it is not confident about its output, and then asking if it should proceed.

The construction and shopping reports are inaccurate and/or gibberish, and based on these results no one would use Claude for similar tasks. Creating document templates is a useful time-saver though.

Hi Ethan, could you share how you gave Claude the video? I have tried a couple of times now and keep getting the response below (is it because I don't pay for it?). This would be very helpful to me, as I work in construction and am always looking for ways to apply AI within my job.

I apologize, but I'm not able to open or watch videos directly. To help analyze the construction site safety concerns, could you please describe what you observe in the video or provide a text description of the construction activities and conditions you'd like me to assess? Once you share those details, I can help create a comprehensive risk assessment table with prioritized safety concerns, recommended actions, and timeline recommendations.

Thank you for an insightful article with great examples. As you have emphasized in your closing remarks "ensuring these transformations enhance rather than diminish human potential", governments representing their people are obligated to ensure corporations, educational institutions, etc. are not focusing merely on higher profits but also on improving quality of life and the world we live in. Automation has always impacted humanity in positive and negative ways. It's important to have the incentives and regulations to ensure the positive impact is much higher than the negative. Education, healthcare, food, environment, transportation, etc. are all great candidates to benefit from AI so that we can all enjoy a more peaceful and prosperous world.

Great article. Thank you!

Interesting to note:

“Consider, for example, the combination of the ability of AI to both process images and “reason” over them.”

AI can't nuance, reason, or self-reflect and change methodology.

As AI combines and mimics human function through the algorithmic activity it is fed, we humans have to discern the human voice very carefully.

“Give to Caesar what is Caesar's, and to God what is God's…”, so to speak. Without deep introspection, we might allow AI to take too much of the human aspect out of our lives. Using it for the Good, the True, and the Beautiful must be the goal, pursued urgently and methodically.

Thanks again for this explanation- insightful article.

Appreciate you.

Ethan, great article as usual
