Recently Tom Scott released a video entitled I tried using AI. It scared me. The gist is that he tried using ChatGPT and realized it could write code better than he could. Perhaps that's not impressive in itself; he's a teacher and celebrity by trade and education. But the video had one way of looking at technology that I thought was interesting: the sigmoid curve.
Briefly, you can take any kind of technology and chart its innovation with an S-shaped curve (a sigmoid curve). At the beginning very little happens; then the curve reaches an inflection point where a huge amount of progress is made very quickly; finally progress tapers off, few improvements are made, and the technology stagnates.
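For the mathematically inclined, that S-curve is just the logistic function. A quick sketch (the parameter names here are my own labels, not anything from the video):

```python
import math

def sigmoid(t, midpoint=0.0, rate=1.0, ceiling=1.0):
    """Logistic curve: slow start, rapid middle, plateau at the ceiling."""
    return ceiling / (1 + math.exp(-rate * (t - midpoint)))

# Early on, almost no visible progress...
early = sigmoid(-6)    # ~0.0025 of the ceiling
# ...explosive growth around the inflection point...
middle = sigmoid(0)    # exactly half the ceiling
# ...and stagnation as the curve flattens out.
late = sigmoid(6)      # ~0.9975 of the ceiling
```

The key property is that the curve never tells you where you are on it while you're climbing; the middle feels like it will go on forever.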
Pretty much everything has followed this curve - 3D printers took off in the early 2010s, but a printer from 2015 is about the same as a printer today. Smartphones took off around the same time, but very little has changed in the last 5-6 years. Firearms had a longer curve, but ultimately everyone today is still using the same basic designs, made somewhere between 1911 and 1960. You can look anywhere and see that "progress" is not a linear path, but a set of separate technologies that each advance in bursts.
The transformer was invented in 2017, and it took 2-3 years before anyone applied it in an interesting way. OpenAI built the two flagship applications: DALL-E and GPT (yes, the model behind ChatGPT and Codex). Write a prompt and get an image, or write a prompt and have the rest of the text filled in. These are the two things you've seen everywhere recently, because imitators have sprung up en masse. They're firmly democratized tools now, and they're improving by leaps and bounds over the course of a few short months.
It's hard to argue these aren't going to be major tools for programmers and artists (seemingly musicians too) from here on out. There have been some luddite attempts, popular on social media, to insist that the output is "not art", and a few art sites have pandered to them. But this might as well be hand animators railing against Photoshop or mocap: the tool is here to stay, it's only going to get better, and your purism does nobody any favors.
I want to highlight this brief exchange between two redditors, a decade ago, in response to Pixar's odd brag about not using motion capture:
I’ve learned over time, even with my own artwork, that people do not care about the novelty of the process unless it has a large ‘WOW’ factor to it. […] all the viewer cares about is the final product (in this case: a movie.)
Every industry has some (sometimes arbitrary) rules about purity of craft
Well said. Get with the times, "art twitter".
But all of these are built on that one new technique, the transformer. If we ask ourselves "where on the transformer's sigmoid curve are we?", I think we're nearer to the end than the beginning.
(Note: I fully own that this is pure speculation, but I have a hard time envisioning it going any other way.)
I think we’re seeing the end of the "new things" a transformer unlocks. Prompt-based asset creation is what it does. Sure, GPT can do a ton of things, and do them all extremely well, but there is a limit to what it can do.
My guess is that a couple of other fields will pick up the transformer and have their own surges of progress with it - self-driving cars seem like a natural fit. Perhaps some board- or video-game bots can be made to understand the chaotic recent state of a game and infer how they should proceed. Music/foley/voiceover is due for a DALL-E or GPT equivalent, which is going to be a whole stir fry of controversy from several angles. But beyond that, nothing as impactful as GPT or Stable Diffusion.
Instead, I'd guess that we're done breaking new ground, and the next year or two will see rapid improvements in integration. Copilot is a huge step forward (despite not being nearly as good at code as ChatGPT), simply because it's smoothly integrated into an existing workflow. With more speed, more control over prompting, better visibility into the wider project, and more data, it's hard to imagine a world where LLMs aren't used in-editor in a radical way. The same is probably true of image synthesis, likely to solve "blank canvas" syndrome and set up compositions quickly.
The tools will become more integrated, quicker, and workflows will be built around them so that they’re easier to use. But what we’ve seen is what they do.
My concrete prediction is that in 2-3 years' time, we'll be firmly used to these tools, they won't be exciting in the slightest anymore, and the progress will have largely stopped. I also predict that censorship and centralization of these tools will become a heavy issue, with projects like Stable Diffusion attacked from all angles due to their offline nature. These are powerful things, with an immense potential for data harvesting, media manipulation, narrative promotion, etc. Institutional powers will be extremely interested in cutting off more and more avenues for these things to be used against them. But ultimately, narrower projects that aim to replicate Codex will probably succeed - just not GPT itself, as a general-purpose thing.
I have to linger on this; I'm a developer by trade. It's hard to overstate how powerful Copilot is, simply because it writes correct code very quickly, precisely when it is supposed to. After seeing how much better ChatGPT is than Copilot, I expect that prompt engineering will quickly become a natural part of the job. Not just for code, but for clarifying requirements into a design (doing a "design review" with ChatGPT is a lot like doing one with a coworker), creating schemas, whipping up API specs, and writing the first implementation (including build and packaging tools) - something I recently did for a new service in the course of about 30 minutes.
We'd be remiss if we didn't cover how this impacts new coders. I learned to code by "trial, error, and Google", having not gone to college. Learners today, instead of crafting Google searches, can craft GPT prompts to get what they want right in their editor. This handily beats copy-pasting something from Stack Overflow, since it writes code for your situation rather than a general one. New coders won't be learning the complexities of any particular algorithm (thank god), or even any particular library. They'll be focused on whether the code matches the intention. That will probably deeply impact SDETs, since a lot of their job becomes almost trivial (which admittedly it already was), as developers shift toward being speedy reviewers rather than speedy typists.