A FAQ of sorts
I realized revealing the answer now will ruin the survey, so I'll add it here in a few hours.
I love that closing sentence: "The only thing I know for sure is that the AI you are using today is the worst AI you are ever going to use."
Also, I wanted to address an earlier point about the internet being a finite source of content for AI training, and the idea of using AI-generated content to get around that. There's a phenomenon called model collapse that may occur if LLM output becomes the primary source of the data that subsequent generations of models are trained on. Paper here: https://arxiv.org/abs/2305.17493
But the TL;DR version: the probable gets overrepresented, and the improbable (but real) slowly gets erased. Given the probabilistic way these large models work, this makes a lot of sense, but a probable reality and an actual reality are two extremely different things.
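That "probable gets overrepresented" mechanism can be seen in a toy simulation (this is my own illustrative sketch, not the paper's actual experiment): treat a "model" as just a categorical word distribution, and train each generation only on a finite sample drawn from the previous generation. Any rare word that happens to draw zero samples gets probability 0 and can never come back:

```python
import random

def next_generation(probs, n_samples, rng):
    """Refit a categorical "model" on a finite sample of its own output.

    A word that draws zero samples gets empirical frequency 0, which is
    absorbing: it can never be sampled again in later generations.
    """
    words = list(probs)
    sample = rng.choices(words, weights=[probs[w] for w in words], k=n_samples)
    return {w: sample.count(w) / n_samples for w in words}

rng = random.Random(42)
# Generation 0: the "real" distribution, with one improbable-but-real word.
probs = {"the": 0.50, "cat": 0.30, "sat": 0.19, "axolotl": 0.01}
for gen in range(100):
    probs = next_generation(probs, n_samples=50, rng=rng)

# After many generations of training on its own output, the common words
# dominate and the rare word has almost certainly been erased.
print({w: round(p, 3) for w, p in probs.items()})
```

The word list and parameters are invented for illustration; the real paper studies this effect in actual language and diffusion models, but the erasure-of-the-tails dynamic is the same.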
LLMs and LMMs (large multimodal models) are likely to keep improving for quite a while yet, but the trajectory may well not be linearly or even exponentially upward. There will probably be hidden valleys of performance loss that we won't notice until we solve them with novel architectures (if we ever notice them at all!)
So I'll close with a sentiment that echoes yours: "The only thing I know for sure is that the AI you are using today is the worst AI you are ever going to use - but the same thing might not be true in the future."
People always overestimate their own perceptive abilities. I remember, when preparing for Kabul, we had to study still frames from video footage taken just before a suicide attack, to learn that no matter how perceptive you think you are, you are not going to spot the suicide bomber except through sheer dumb pin-the-tail-on-the-donkey luck.
Hey Teachers! Want a foolproof way to detect AI cheating students? Sit them down with a pencil and paper, write an open question on the chalkboard, and have them compose an answer. WATCH ‘EM SQUIRM.
The 1st and 4th were the easiest to eliminate, because they look like classic genAI images. I chose the 2nd because one of the students is sitting on her coat - genAI is unlikely to add such a detail :D But without the other pictures for reference, I could not have detected that the 3rd is generated.
Regarding LLMs, the one I reach for most has been Llama 2 70B, which is available for free at https://labs.perplexity.ai/. A quick search shows it is comparable to GPT-4 in many respects, and I find it responds very quickly. Bard is slow to the point that it breaks my flow; Bing has a bit more friction and is slightly slow. I have just signed up for Claude, so it will be interesting to see how it compares. Thanks.
I've been meaning to post a 'thank you' here since I started reading your comments back in the spring. Since then, I read your posts as soon as they hit my inbox. Along with your articles, they have been a major influence on how I redesigned my teaching for this academic year (I am an academic at a British 'Russell Group' university, where I teach mathematics and quantum mechanics to undergraduates in chemistry and other natural sciences). I'm finding AI excellent for these subjects. Without expertly prepared prompts, current AIs are poor at doing the work I ask the students to do. They are excellent, however, at tutoring students on how to approach the material, how to engage with it, and how to assess their own work.
Declaration (as opposed to 'disclaimer'): every word on this post is 100% genuinely mine. AIs did not contribute to it in any way! ;D
I recently asked ChatGPT-4 to help me think of a question a human has never asked. It could not do it... it just couldn't... just variations on the same grand themes of our short time...
Not much to do with the post, but I just wanted to share about a strange bug I have found on ChatGPT. Whenever I type "Frank Van Dyke," it gives me the error message, "This prompt may violate our content policy." It was just meant to be a fictional name that I made up for my teaching material, but ChatGPT stubbornly refuses to accept any prompts with it. I searched the internet but did not find any controversial figures associated with the name. It's been a mystery to me.
GPT-4 did not get worse; it is just different, so you might need to change the prompt. In my case it actually got better: more or less the same prompt seems to return better results over time. How should you manage this as a coder? In traditional development, (more or less) once code is tested and works, it works. What is the right testing approach when you use an LLM API?
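One common answer to that question is to test *properties* of the response rather than exact strings, since the exact wording can change between model versions. A minimal sketch (the `call_llm` stub and the field names are invented for illustration; a real test would call your actual LLM API):

```python
import json

def call_llm(prompt: str) -> str:
    # Stub standing in for a real LLM API call, so this example is
    # self-contained. In practice this would hit your provider's API.
    return '{"sentiment": "positive", "confidence": 0.92}'

def check_sentiment_response(raw: str) -> dict:
    """Assert structural invariants of the model's output, not exact text."""
    data = json.loads(raw)                                 # must be valid JSON
    assert set(data) == {"sentiment", "confidence"}        # expected keys only
    assert data["sentiment"] in {"positive", "negative", "neutral"}
    assert 0.0 <= data["confidence"] <= 1.0                # value in range
    return data

result = check_sentiment_response(call_llm("Classify: 'I love this!'"))
print(result["sentiment"])  # -> positive
```

Tests like this stay green across prompt tweaks and model updates as long as the output still satisfies the contract; teams often combine them with a small evaluation set scored periodically, since non-determinism means a single passing run proves less than it does in traditional development.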
Great insights … humility is one of your absolute best virtues …in helping us travelers experience the AI journey 👏🏿👏🏿👏🏿
Without zooming and searching for details, I moved relatively quickly to #2 (from left). It would look pretty much like the photo I would take myself (with a relatively cheap, 1 year old phone).
I immediately rejected #1 and #4: they are perfectly lit, and the blurred background looks more like a studio shot than a classroom snapshot. And as for #3 (the two girls), I quickly realised it was AI-generated: the window behind the girls (I think it should be a window) made me think it couldn't be a real photo.
BTW: Because of my relatively bad English, I had "DeepL Write" look over my comment - except this very last sentence (did you notice it?)
"training AIs on data that the AI makes up." If AI makes up the data, what good is it?
Open to collaborate Ethan? -Quentin
Such questions go to show that we do need to take AI and its future implications seriously, and to be more aware of its developments and the concerns that come with them. It is eventually, if not already, going to be a crucial part of our modern lives.