A compendium of things I didn't think AI should be able to do
As someone who used to deliver innovation and entrepreneurship programs in higher education institutions, I became increasingly disillusioned with the apparent lack of imagination and creativity in the startup ideas being proposed by students. In a way, this possibly reflects the outcomes of our sclerotic education system (see Sir Ken Robinson's "Do Schools Kill Creativity?"). I totally "get" the potential of LLMs as a tool for associative thinking, cobbling together and connecting apparently unrelated concepts. AI can serve to create the conditions for the flourishing of imagination if used wisely. Exciting times for innovation ... maybe.
I'd love to try to tease apart how LLMs make these connections. Where are they in the latent layers? What, if any, reasoning is used at all?
On the use of Kant, for example, Bing doesn't seem to use his essay on Perpetual Peace. I actually read that in Berlin as a student of International Relations. Kant would have approached MAD using his ideas on nation states, not ethics.
Bing can seem to make a cogent argument here, but doesn't have or use context with which to measure which of a thinker's ideas applies best, because it lacks categories. (Bing is not a Kantian!)
It's using a kind of free-association based on word-relationships more so than conceptual categories. In fact I believe its facility with concepts likely comes out of linguistic tropes more than logical distinctions.
I don't know what your prompting was, but I'm assuming it finds Kant from ethics, and within ethics, linguistic relationships close to (proximate to) terms, phrases, statements found in texts on MAD? (We could ask Bing about MAD and Kant's views on the nation state; I think we'd likely get a very different argument.)
What's interesting here is the manner in which logical and conceptual reasoning appear as effects of language. Bing's reasoning is still hallucinatory, imaginative, inventive. But I don't think it is conceptual. It will appear to be intelligent when it's not. It'll appear to be educated when it's not. It'll test our intelligence, insofar as it forces us to measure and judge whether its reasoning is simply fanciful or indeed insightful.
I'd be curious to know what students of philosophy/humanities are learning about it. Have you seen anything? I haven't run into anything similar to what you're doing here.
I found this posting inspiring and helpful. I have been using ChatGPT since the first release in November but posts like this help uncover fresh insights to explore further. Very supportive.
That LLMs "are basically word prediction engines" is true, but reductive. There's a lot more going on that people haven't been investigating. I've been looking into how ChatGPT tells stories. Here's the abstract of an article I've posted online:
I examine a set of stories that are organized on three levels: 1) the entire story trajectory, 2) segments within the trajectory, and 3) sentences within individual segments. I conjecture that the probability distribution from which ChatGPT draws next tokens follows a hierarchy nested according to those three levels and that is encoded in the weights of ChatGPT's parameters. I arrived at this conjecture to account for the results of experiments in which ChatGPT is given a prompt containing a story along with instructions to create a new story based on that story but changing a key character: the protagonist or the antagonist. That one change then ripples through the rest of the story. The pattern of differences between the old and the new story indicates how ChatGPT maintains story coherence. The nature and extent of the differences between the original story and the new one depend roughly on the degree of difference between the key character and the one substituted for it. I conclude with a methodological coda: ChatGPT's behavior must be described and analyzed on three levels: 1) The experiments exhibit surface level behavior. 2) The conjecture is about a middle level that contains the nested hierarchy of probability distributions. 3) The transformer virtual machine is the bottom level.
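For anyone who wants to try this kind of character-swap experiment, here is a minimal sketch of the bookkeeping side. It is only an illustration: the prompt wording, the function names, and the sentence-level difference measure are all assumptions made for this example, not the article's actual procedure, and no model is called (you would paste the prompt into ChatGPT yourself).

```python
import difflib
import re


def build_swap_prompt(story: str, old_character: str, new_character: str) -> str:
    """Build a character-swap prompt. The wording is hypothetical, not the article's."""
    return (
        "Here is a story. Write a new story based on it, "
        f"but replace {old_character} with {new_character}.\n\n"
        f"{story}"
    )


def split_sentences(text: str) -> list[str]:
    # Naive splitter: break after '.', '!' or '?' followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p.strip() for p in parts if p.strip()]


def changed_fraction(original: str, rewritten: str) -> float:
    """Rough difference score between two stories, compared sentence by sentence.

    0.0 means the sentence sequences are identical; 1.0 means no sentence
    is shared. This is an assumed stand-in for "degree of difference".
    """
    a = split_sentences(original)
    b = split_sentences(rewritten)
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    return 1.0 - matcher.ratio()


if __name__ == "__main__":
    old_story = "A princess fought a dragon. She won."
    new_story = "A robot fought a dragon. She won."
    print(build_swap_prompt(old_story, "the princess", "a robot"))
    print(changed_fraction(old_story, new_story))
```

Comparing whole sentences is crude, since a single changed word counts as a changed sentence, but it gives a first rough number for how far one substitution ripples through a story.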
I’m struck by what an interesting place we are in with consumer-facing AI right now. Its capabilities are being discovered through user creativity.
Titanic, emojis? lol Ethan, I liked your creativity.
You write, regarding its answers on four technologies to save the Roman empire: "I am sure there are mistakes, but this is pretty impressive."
I think it's actually *riddled* with mistakes and misconceptions. If you go to the Marginal Revolution blog, you'll see some excellent comments about the mistakes it made.
Here is what I wrote on the Marginal Revolution blog. The first sentence is a quote from a previous commenter:
"It’s interesting to me how much the GPT answers read like a bright, but credulous and untutored student in their senior year of high school trying to summarize a lot of information in an “authoritative” voice but lacking the context or critical thinking skills to demonstrate real expertise."
Yes, it's fascinating to me as a retired mechanical/environmental engineer to read the GPT answers. I asked ChatGPT questions about electrostatic precipitators in which its answers showed that it fundamentally didn't grasp the physical situation.
It would be fascinating (to me!) to have an engineer ask the GPT further questions on each of its four technology scenarios, to see if it could "recognize" its mistakes, as ChatGPT was able to "recognize" its mistakes in its responses to me on electrostatic precipitator questions.
For example, I'm pretty sure that a primary reason that steam engines were first developed in Britain (rather than Italy) is that Britain had lots of coal. If wood is used to fuel these hypothetical piston engines the GPT wants Rome to build to "save the Empire", then deforestation is massive:
Further, as has already been pointed out, a key with any piston engine is the incredibly small gap between the piston and the cylinder. That makes a huge difference in the efficiency of the steam engine. Per wonderful Wikipedia:
"Watt worked on the design over a period of several years, introducing the condenser, and introducing improvements to practically every part of the design. Notably, Watt performed a lengthy series of trials on ways to seal the piston in the cylinder, which considerably reduced leakage during the power stroke, preventing power loss. All of these changes produced a more reliable design which used half as much coal to produce the same amount of power."
So it would be extremely interesting (to me!!!) to ask the GPT what fuel the Romans would burn to power these hypothetical piston engines, and to also ask whether Romans had metallurgical and manufacturing technologies capable of actually building the hypothetical piston engines.
P.S. I could go on and on. ;-) I'm not an expert, but I think the idea of adding vinegar to water to kill microbial contamination is probably ridiculous. Full strength vinegar doesn't even work very well as a disinfectant:
I too was surprised at how human-like the AI was. But I was pretty bummed out when it didn’t know what the square root of 64 was. When I corrected the AI and said the answer was 8, it said, “If you knew the answer, you should not have asked.” 🤔
Great stuff, nice work!
This is a very useful post, and I've learned a lot from it.
I am puzzled by this statement: "the stories were great." The stories do not seem great at all to me. That excerpt is "great"? It seems bare adventure, emotionally and morally meaningless.
Interesting read. I like your exploration into this technology. I saw this the other day, might be a fun topic for you to explore:
How to use the 'JAILBREAK' version of ChatGPT: Simple trick lets you access an unfiltered alter-ego of the AI chatbot
A site for ChatGPT's prompt words
Are we using different Bings? Every attempt I've made so far has either been lame or a dead end.
I want to emphasize that this post and your others are fascinating, and bring out important aspects and benefits of using Bing AI. However, I think it's equally important and interesting to understand that answers to technical questions that appear to be "brilliant" can actually be complete BS to someone who understands the underlying physics/chemistry/biology involved. Here's one of several comments I made at the "Marginal Revolution" blog regarding Bing AI's suggestions to save the Roman Empire:
The first sentence, in quotation marks, is from a previous commenter on Marginal Revolution:
"I understand that some commenters may have reservations about LLMs, but I encourage everyone to focus on the topic at hand - how to use LLMs effectively."
In this case, Ethan Mollick asked Bing a question. He was impressed by the answer, but only because Ethan Mollick doesn't know enough to see the massive number of inaccuracies and misconceptions in the answer.
As Gary Marcus might say, Ethan Mollick got pure BS as an answer, but counted it as gold or silver, because on this matter Ethan Mollick can't tell BS from gold or silver.