Be Better Than an LLM

To deconstruct the existential and intellectual angst I've accumulated in this age of genAI, I find it useful to imagine what it means to compete with LLMs on their own benchmarks.

Take coding, for example. Machines compete well at LeetCode-style competitions, which are narrowly bounded, a bit like chess, so the superiority of machines here is neither terribly surprising nor upsetting. But what about more realistic competitions, where you build out certain features in an existing codebase? Humans still win, although genAI has improved more than one might've expected.

To really compete well at such tasks, you would need a great IDE with go-to-definition and inline error displays. You'd do well to have a library of trivial snippets; it might be effective to leverage modern deep learning to search through publicly licensed code that does what you want it to. Then you can use the IDE to rename variables and make changes as needed, but you'd have actually done a more precise (and more ethical vis-a-vis licenses) version of what coding agents are doing.

I find it quite useful to think of current genAI coding tools as a bad version of this more worthy goal. When I go to write a shell script, I have to ask the LLM to regurgitate some hallucination-prone version, rather than call up the functionally equivalent version that was already published and MIT licensed. The worthy goal is just new and improved code snippets.
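The snippet-library idea can be sketched in a few lines of Python. This is only a rough illustration under loud assumptions: the catalog, its entries, the `example.org` sources, and the `PERMISSIVE` allow-list are all hypothetical, and plain word overlap stands in for the deep-learning search imagined above. The point it shows is that the license check happens before anything is returned, and that the source travels with the snippet for attribution.

```python
from dataclasses import dataclass

# Hypothetical allow-list of permissive licenses (SPDX identifiers).
PERMISSIVE = {"MIT", "BSD-3-Clause", "Apache-2.0"}


@dataclass(frozen=True)
class Snippet:
    description: str  # what the snippet does, in plain words
    code: str         # the published code itself, unmodified
    license: str      # SPDX identifier of the license it was published under
    source: str       # where it was published, kept for attribution


def tokens(text):
    """Lowercased set of whitespace-separated words."""
    return set(text.lower().split())


def search(query, catalog, allowed=PERMISSIVE):
    """Return allowed-license snippets, best word overlap first.

    Word overlap is a stand-in for a learned (embedding-based) ranker.
    """
    wanted = tokens(query)
    scored = []
    for snippet in catalog:
        if snippet.license not in allowed:
            continue  # never surface code we are not licensed to reuse
        overlap = len(wanted & tokens(snippet.description))
        if overlap:
            scored.append((overlap, snippet))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [snippet for _, snippet in scored]


# A made-up catalog; real entries would come from published, licensed code.
CATALOG = [
    Snippet("count lines in every file in a directory",
            "find . -type f -exec wc -l {} +",
            "MIT", "example.org/snippets/1"),
    Snippet("count words in a directory of files",
            "cat * | wc -w",
            "GPL-3.0-only", "example.org/snippets/2"),
    Snippet("print the largest files in a directory",
            "du -ah . | sort -rh | head -n 10",
            "Apache-2.0", "example.org/snippets/3"),
]

results = search("count lines in a directory", CATALOG)
for snippet in results:
    print(snippet.license, snippet.source, snippet.code, sep=" | ")
```

The GPL entry matches the query well but is filtered out, which is the "more ethical vis-a-vis licenses" part; swapping `tokens` and the overlap score for an embedding model would change the ranking but not that guarantee.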

What does it look like to be better than an LLM at writing? This is the easiest, really. Have a connection between your voice and your message beyond simple conformance of superficial "style" to message. Use voice as more than an "ice cream flavor" of your writing. Write with more than the techniques of corporate blogs crossed with SEO recipe writing. Have some expressive innovation, some uniqueness, and use your world model to ground your assertions and build durable trust with your audience. It's wild to think industrial society has invented, yet again, the Roundup Ready, infertile tomato: so superficially crafted to maximize some large-scale efficiency that all subtle qualities are ghosts.

It's interesting that in some domains it really doesn't make a difference. For a Photoshop request like "remove the balloon from this photo", multimodal foundation models completely satisfy the request. Likewise for lazy blog post covers, basic questions about well-known and oft-repeated facts, gimmicky style or tone translations, and other things that are more or less banal from the start.