We can start to see, dimly, what the near future of AI looks like.
When Siegel and Shuster first conceived of Superman, he couldn’t fly at all. He could leap but not fly. However, when high-speed trains got really fast in the late 1930s, going above 120 MPH, the ‘Man of Steel’ needed more powers to compete. It made me wonder what new intelligences WE will have to develop to compete with AI. I suppose that’s what we’re going to find out, like it or not.
The majority of the population has steadfastly ignored the advances in AI in this first phase of the generative AI era. I think this is interesting. I agree with Ethan that the future impact of AI depends on 'us', not the technology, which makes it difficult to predict, even for us techno-sociologists.
I get amazed every time I read one of your posts. Your publication has become my main source about multiple topics related to AI.
When I first interacted with ChatGPT 3.5 and 4, it was amazing. The coolest thing in technology in my life, rivaled only by when I started using the internet as a university student in the late 90s. Now that I’ve played around a lot more, the hype, for me at least, has cooled tremendously. These systems are still very fragile, for lack of a better word. They are good at their original task, but in all the efforts I’ve seen to extend them beyond just chatting based on their pretrained data, they falter. Bing seems a lot less “smart” to me than ChatGPT, prone to misunderstanding and getting things wrong, and I always use the Bing creative setting. I think that’s because these models were not built to search the internet on the fly, or to access Wolfram on the fly, etc. It will take quite a bit more tinkering, and probably new models, for them to be able to do these things well and not function as tech demos, which is what they basically are now. That is not to say the original vanilla ChatGPT 4 model isn’t still amazing and mind-blowing. But the extensions that I thought would come easily seem like they won’t. It will still be some time before Microsoft Copilot, etc., revolutionize the workplace in a user-friendly manner.
Great post, and I agree that the implications of GPT-4 alone will take a decade or more to unfold. If anyone is intrigued by the Catalan mummy manuscript, I wrote more about my experiments with AI translation of historical texts here: https://resobscura.substack.com/p/translating-latin-demonology-manuals
Also, Richard Sugg’s “Mummies, Cannibals and Vampires: The History of Corpse Medicine from the Renaissance to the Victorians” (2011) is a great introduction to the strange but true history of medicinal mummies.
The advent of voice-enabled AI that can understand accents, mixed languages, and noisy environments marks a transformative moment. While this leap in conversational AI offers the promise of more natural, effective human-machine interactions, it also ushers in ethical complexities. The personal touch of voice could make AI companionship more emotionally engaging, but that raises questions: are we ready for AI that not only understands what we say but also how we feel when we say it? And at what point does this "conversational intimacy" risk becoming manipulation or emotional dependency?
What online dating needs, from the reports of my friends who use it, is an automated tool that reduces the pipeline to a more manageable candidate set. I hear that otherwise it can be a full-time job finding a prospect!
It seems there was one thing about vision that could have been more emphasized. Namely, vision in relation to autonomous vehicles. Could that be an elephant in the room? Of course, there would be the ability to read any kind of sign. Not just ordinary traffic signs or location signs, but also billboards, business signs, etc. And that's just the beginning.
A properly equipped vehicle could recognize landmarks, comment on unusual circumstances (e.g. a highway patrol car parked beside the road ahead), and notice unusual hazards (such as road flooding or an animal about to cross the road). Or it could be instructed to follow directional information the driver had been given to find a friend's house, or use photos it had been shown to look for important features. ("Turn right 100 yards beyond the large oak tree across from the brown split-level house.") Basically almost anything a human driver could (or should) see.
Vision-equipped AVs would be a lot safer.
Ethan, this is a very good piece. I don't have a ton to add other than to add my voice to what you're saying: a new paradigm is here, and we're not sure where this will lead us. Thanks for inspiring some thought today!
Asking Bing to improve an image it generated... Honestly, I never thought about that.
I mean.. it's insane! And the result looks better! Gonna try on some images I generated with Midjourney...
Thanks for this!
I watched/listened to a pod on YT several months ago called “Diary of a CEO”. The man interviewed was Mo Gawdat. I don’t remember the interviewer’s name, but all his pods have the same title.
Mo, an Egyptian man, was an executive at Google for years, most recently chief business officer at Google X, and has written a book on AI called “Scary Smart”.
He’s easy to understand, definitely Scary, and worth a look.
Thanks Ethan, excellent summary/update. I find your posts very helpful and insightful.
Does anyone else find themselves fluctuating between diametrically opposed positions on A.I.? I’m kinda the expert on this in my workplace. I’m using it loads and it’s making my life easier. But at the same time some of the implications of A.I. make me want to disconnect completely and pretty much go off grid.
I suppose that’s just a concentrated version of how I feel about a lot of tech...
Fascinating. What is Pi? You suggested downloading it to try chatting with it. I was terrified and fascinated at the same time. I googled what it was and came up with a cryptocurrency option and a gaming option. I don’t think that was what you were talking about...
Excellent article. I totally agree with you.
We work with lots of people who prefer filling out paper forms. I'm wondering whether, with the visual capabilities, we will be able to offer that as an option, along with the option to voice their answers.
Excellent as always but GAAAAA! Recognizing Hershey Park is not amazing if you're broadcasting your IP address. You say "GPT-4 with vision was able to do this despite not being connected to the internet", which is impossible. Do you mean you submitted the photo when back at Penn?