What AI can do with a toolbox... Getting started with Code Interpreter [Now called Advanced Data Analytics]
Democratizing data analysis with AI
Everyone is about to get access to the single most useful, interesting mode of AI I have used: ChatGPT with Code Interpreter, now called Advanced Data Analytics (the name has been updated, but I am not going to change the post beyond this first instance of the old name). I have had the alpha version for a couple of months (I was given access as a researcher off the waitlist), and I wanted to give you a little guidance on why I think this is a really big deal, as well as how to start using it.
Code Interpreter continues OpenAI’s long tradition of giving terrible names to things, because it might be most useful for those who do not code at all. It essentially allows the most advanced AI available, GPT-4, to upload and download information, and to write and execute programs for you in a persistent workspace. That allows the AI to do all sorts of things it couldn’t do before, and be useful in ways that were impossible with ChatGPT.
Specifically, it gives the AI a general-purpose toolbox to solve problems (by writing code in Python), a large memory to work with (you can upload files up to 100MB, and those can be in compressed form) and integrates that toolbox into the AI in ways that play to the strengths of Large Language Models. This helps address a number of problems that previous versions of ChatGPT had:
It allows the AI to do math (very complex math) and do more accurate work with words (like actually counting words in a paragraph), since it can write Python code to address the natural weaknesses of Large Language Models in math and language. And it is really good at using this tool appropriately, as you can see below.
It lowers hallucination and confabulation rates. When the AI works directly with Python code, the code helps keep it “honest,” since Python generates errors if the code is not correct. And because the code manipulates the data, rather than the LLM itself, the AI does not insert errors into the data. This isn’t perfect. The AI still hallucinates (it often seems to think it can see the graphs it generates, which this mode of ChatGPT cannot), but these errors are less common, and less likely to affect the code or data itself.
It makes the AI much more versatile. A remarkable number of problems can be solved with code, and GPT-4 is very good at figuring out when to use Code Interpreter in novel and interesting ways. For example, I asked it to prove to a doubter that the Earth is round with code, and it provided multiple arguments, integrating the text with code and images.
You don’t have to code, because it does all the work for you. All the major LLMs write code, but you have to run and debug it yourself, even though the AI helps. For people who had never really used Python before (like me), this was annoying, and involved going back and forth with the AI to correct errors. Now, the AI corrects its own errors and gives you the output.
It gives you more of those AI Moments. Anyone who has worked with GPT-4 has probably encountered at least a few Moments where it felt like there was, indeed, a ghost in the machine. I know it is an illusion, and that LLMs are in no way sentient or thinking, but those Moments are thrilling, and sometimes unnerving, glimpses of possible futures with smarter AIs. Code Interpreter provides the most “that’s weird” Moments per use of any AI system I have played with. I have been collecting a number of examples, such as when I asked the AI to invoke various emotion states with code or “show me something impossible to do with code and demonstrate it.” Below, you can see what happened when I asked the AI “Using the tools available to you to draw, create an entirely new meme by creating an image. Make it relevant to your experience as an AI working with humans.”
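To make the first point in the list above concrete, here is a minimal sketch of the kind of Python the AI might write to count the words in a paragraph instead of estimating. The function name, the sample sentence, and the definition of a "word" (a run of letters, digits, or apostrophes) are my assumptions for illustration, not code that Code Interpreter actually produced.

```python
import re

def count_words(text: str) -> int:
    """Count words, treating each run of letters, digits, or apostrophes as one word."""
    return len(re.findall(r"[A-Za-z0-9']+", text))

# Made-up sample sentence for illustration.
paragraph = "Code Interpreter lets the AI write and run Python instead of guessing."
word_total = count_words(paragraph)
print(word_total)
```

Because the count comes from running the code rather than from the language model's guess, the answer is deterministic for a given definition of a word.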
So, take that as an invitation to play with this new tool. As one entry point, here is a getting started guide to using Code Interpreter with data.
How to use Code Interpreter with Data
Code Interpreter is an impressive data scientist. I have been using it extensively over the past months, and it operates at a very advanced level, automating a lot of the complexity of quantitative analysis and handling very sophisticated approaches to data. As one way of illustrating this, I started with a fun dataset, a public domain list of superheroes and their powers. You can download it if you want to try these steps with me.
It is easy to upload data, even compressed data like a ZIP file, by hitting the plus button. You should include an initial prompt with the data, but it can be pretty minimal. I literally used “Here is some data on superhero powers, look through it and tell me what you find” and got good results. If you have a data dictionary, you can just paste that in, too. The AI is good at figuring out the meaning and structure of the data from context alone.
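For a sense of what "looking through" an uploaded ZIP file involves, here is a rough sketch using only Python's standard library. It lists each CSV in an archive with its column names and row count. The file name `powers.csv`, the column names, and the rows are hypothetical stand-ins, and the real tool would more likely reach for a data library; this is just the general idea.

```python
import csv
import io
import zipfile

def summarize_zip(zip_bytes: bytes) -> dict:
    """Return {csv_name: {"rows": n, "columns": [...]}} for each CSV in a ZIP archive."""
    summary = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".csv"):
                continue  # skip anything that is not a CSV file
            with archive.open(name) as raw:
                text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                reader = csv.reader(text)
                header = next(reader, [])           # first row holds column names
                row_count = sum(1 for _ in reader)  # remaining rows are data
            summary[name] = {"rows": row_count, "columns": header}
    return summary

# Build a tiny stand-in archive in memory (hypothetical file name, made-up rows).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
    archive.writestr(
        "powers.csv",
        "hero,flight,super_strength\nHero A,True,False\nHero B,False,True\n",
    )

overview = summarize_zip(buffer.getvalue())
print(overview)
```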
You will note that Code Interpreter is much less about prompt crafting than about having a conversation with the AI. Treat it like an analyst, and talk with it. In fact, there are only two real exceptions where prompt-crafting seems to matter. First, the AI sometimes forgets it can do things (like make GIFs or 3D plots) and you may need to encourage it (“you are able to make a GIF, please try”). Second, you will want the AI to improve on its own work. Just asking it to “run further tests on that result” or “make this graph even nicer” will often work.
Now that we have the data loaded, we can have GPT do the worst part of any data analysis job: data merging and cleaning. It will handle this all automatically in a quite sophisticated way, but I find it usually helps to ask directly, as if I was directing a human data analyst. You will also note something really important about the way the system works - it is relentless, usually correcting its own errors when it spots them. It notices, for example, that columns are misnamed and fixes that issue. Impressive as this is, I would still recommend double-checking the results and process, rather than blindly trusting the AI.
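As a hedged illustration of the cleaning and merging described above, the sketch below normalizes messy column headers, renames one misnamed key column, and joins two small tables. Every table, column name, and value here is made up, and the helper names (`normalize_column`, `clean_rows`, `merge_on`) are mine, not anything the AI produced.

```python
import re

def normalize_column(name: str) -> str:
    """Trim and lowercase a header, replacing runs of other characters with '_'."""
    return re.sub(r"[^a-z0-9]+", "_", name.strip().lower()).strip("_")

def clean_rows(rows):
    """Normalize every header and strip whitespace from every (string) value."""
    return [{normalize_column(k): v.strip() for k, v in row.items()} for row in rows]

def merge_on(left, right, key):
    """Inner-join two lists of dicts on `key` (assumes `key` is unique in `right`)."""
    lookup = {row[key]: row for row in right}
    return [{**row, **lookup[row[key]]} for row in left if row[key] in lookup]

# Made-up tables: one with untidy headers, one whose key column is misnamed.
info = clean_rows([
    {" Hero Name ": "Hero A ", "Publisher": "Pub X"},
    {" Hero Name ": "Hero B", "Publisher": "Pub Y"},
])
powers = clean_rows([
    {"hero_names": "Hero A", "Flight": "True"},
    {"hero_names": "Hero C", "Flight": "False"},
])

# Rename the misnamed key so both tables share the same column name.
powers = [
    {("hero_name" if k == "hero_names" else k): v for k, v in row.items()}
    for row in powers
]

merged = merge_on(info, powers, "hero_name")
print(merged)
```

An inner join silently drops heroes that appear in only one table (Hero B and Hero C here), which is exactly the sort of choice worth double-checking rather than trusting blindly.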
Now, on to an analysis. The AI seems knowledgeable about analytical approaches - it is worth reading the exchange below to see what I mean. I prompted “I am interested in doing some predictive modelling, where we can predict what powers a hero might have based on other factors. how should we approach this?” and it built a Random Forest classifier - cool! But you can also see why it is important to have expert human oversight, since I would disagree with its decision to impute missing numerical data using the means. I would have dropped the incomplete records instead, but I could ask the AI to change its approach, or discuss alternate options.
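The disagreement above, mean imputation versus dropping missing values, is easy to see in a small example. This is only a sketch with made-up numbers and hypothetical function names, using the standard library rather than whatever the AI actually ran, but it shows one reason the choice matters: filling gaps with the mean keeps every row while shrinking the apparent spread of the data.

```python
from statistics import mean, stdev

# Made-up heights; None marks a missing value.
heights = [180.0, None, 200.0, None, 190.0]

def drop_missing(values):
    """Keep only the observed values."""
    return [v for v in values if v is not None]

def impute_mean(values):
    """Replace each missing value with the mean of the observed values."""
    fill = mean(v for v in values if v is not None)
    return [fill if v is None else v for v in values]

dropped = drop_missing(heights)
imputed = impute_mean(heights)

print(len(dropped), stdev(dropped))  # fewer rows, original spread
print(len(imputed), stdev(imputed))  # all rows, but a smaller spread
```

Neither option is automatically right. Dropping loses data, while imputing understates variability, so this is a reasonable thing to discuss with the AI before modelling.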
The AI is capable of many other analyses (it is “just” writing Python code, after all) but I was often impressed by its ability to select analytical approaches that made sense. For example, here is a network analysis of superpowers that came from me just prompting “Could you conduct another really sophisticated and interesting analysis”:
But some of what makes Code Interpreter most impressive is that it “reasons” about data in ways that seem very human. When asked about the results of the network analysis, it came to interesting conclusions: the set of powers that heroes commonly had were visual in nature (because they were from comic books), fit certain archetypes, and were best suited to building continuing adventures. A neat way to integrate data and story together!
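For readers curious what a network analysis of superpowers might look like underneath, here is a minimal sketch under my own assumptions. It treats each power as a node and links two powers whenever one hero has both, then counts how strongly each power is connected. The heroes and powers are invented, and the AI's actual analysis was likely more elaborate.

```python
from collections import Counter
from itertools import combinations

# Made-up heroes and powers for illustration.
hero_powers = {
    "Hero A": {"flight", "super strength"},
    "Hero B": {"flight", "super strength", "stamina"},
    "Hero C": {"stealth", "stamina"},
}

def cooccurrence_edges(power_sets):
    """Count how many heroes share each unordered pair of powers."""
    edges = Counter()
    for powers in power_sets:
        # Sorting makes each pair appear in one consistent order.
        edges.update(combinations(sorted(powers), 2))
    return edges

edges = cooccurrence_edges(hero_powers.values())

# Weighted degree: total co-occurrence weight attached to each power.
degree = Counter()
for (a, b), weight in edges.items():
    degree[a] += weight
    degree[b] += weight

print(edges.most_common(1))
print(sorted(degree.items()))
```

Powers with a high weighted degree are the ones that most often appear alongside others, which is the kind of pattern behind conclusions about common archetypes.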
The level of interactivity continues for visualizations: you can go back and forth with the AI, asking for improvements and changes. For example, I prompted “Create an interactive dashboard with at least 6 insightful charts, including one in 3D. Make the dashboard beautiful.” It produced a dashboard, but not exactly what I wanted. So I was able to just ask for changes in English: “make this better,” “include more names,” etc. You will also notice that it gave me a downloadable file for the interactive dashboard (you can try it at the link), which I just opened in a web browser and it worked - downloadable outputs are another neat trick of Code Interpreter.
And here are a few more experiments I have done over the past months: visualizing the song of the summer with a 3D interactive plot, building interactive maps, interpreting the Iliad, causal analysis, making animated GIFs from data, analyzing Magic: The Gathering, racing bar charts, and a lot more besides.
A sign of things to come
This is just scratching the surface of Code Interpreter, which I think is the strongest case yet for a future where AI is a valuable companion for sophisticated knowledge work. Things that took me weeks to master in my PhD were completed in seconds by the AI, and there were generally fewer errors than I would expect from a human analyst. Human supervision is still vital, but I would not do a data project without Code Interpreter at this point.
But it is just as clear to me that humans are not going to be replaced by Code Interpreter. Instead, the AI does what we always hope automation will do - free us from the most annoying, repetitive parts of our job so we can focus on the good stuff. By simplifying the process of analysis, I can do more and deeper and more satisfying work. My time becomes more valuable, not less, as I can concentrate on what is important, rather than the rote. Code Interpreter represents the clearest positive vision so far of what AIs can mean for work: disruption, yes, but disruption that leads to better, more meaningful work. I think it is important for all of us to think about how we can take this same approach to other jobs that will be impacted by AI.
If you subscribe to ChatGPT Plus, it should be available to you in the next week.
Thanks for this interesting post. As always you give us lots to think about.
I was most struck by this comment at the end: "But it is just as clear to me that humans are not going to be replaced by Code Interpreter. Instead, the AI does what we always hope automation will do - free us from the most annoying, repetitive parts of our job so we can focus on the good stuff. By simplifying the process of analysis, I can do more and deeper and more satisfying work. My time becomes more valuable, not less, as I can concentrate on what is important, rather than the rote. Code Interpreter represents the clearest positive vision so far of what AIs can mean for work: disruption, yes, but disruption that leads to better, more meaningful work. I think it is important for all of us to think about how we can take this same approach to other jobs that will be impacted by AI."
Given that you seem quite impressed by the software in this (its most basic) level, and knowing that the goal of OpenAI is to create a meta-human intelligence with tools like ChatGPT and Code Interpreter as the means to that end, why are you assuming the AI will not replace the more meaningful work as well?
Ethan, thanks for the encouragement and cautions. You have been an inspiration to our work in the UW-Madison community and my work with our industry consortium members. As a matter of policy, I suspect that sharing non-public information with ChatGPT and ChatGPT Plus remains a valid concern. I understand that one can “switch off training in ChatGPT settings (under Data Controls) to turn off training for any conversations created while training is disabled” but I still worry about uploading data sets that would otherwise be private or sensitive. How should we think about using these new tools for data manipulation in light of concerns about privacy or IP? (Source on OpenAI policy: https://help.openai.com/en/articles/5722486-how-your-data-is-used-to-improve-model-performance)