What are LLMs good for today?
Introduction
Unless you've been living under a rock, you've no doubt heard at this point of large language models, a remarkably polarizing technology. Its most evangelical boosters claim it will revolutionize productivity (and possibly the world order), while its most severe detractors claim it is good for essentially nothing and a waste of time, energy, and resources (both human and material). The point of this post is to detail my own experience using LLMs as a professional Site Reliability Engineer. When I do use LLMs, at the time of writing, I generally use either Claude Code (from the CLI or via the VS Code integration) or the chat interfaces available from various providers.
Background
In my current role, the problems I primarily work on include developer productivity tooling (DevEx), system architecture, and platform infrastructure. I will be writing from this perspective.
Project Scaffolding
Generating scaffolding for greenfield projects is a use case for which LLMs have demonstrated some value. Note my use of the word greenfield: if there's an established pattern for project structure, or strict requirements, you are likely better off ensuring you have high-quality, curated project templates and components available for your developers.
Fuzzy Search
LLMs work well for fuzzy search queries where you would know the answer when you saw it, especially in cases where your query is incomplete and may contain irrelevant or incorrect details.
There have been several cases where I was unable to find an answer to a question via Google, but found it by conversing with an LLM. I'm reasonably proud of my Google-fu and am generally better at surfacing information than many of my peers, but I will occasionally hit a dead end.
This isn't an example related to programming, but I think it is especially illustrative of one of the strengths of LLMs, so I will include it anyway. Recently I had a vague memory pop into my head of a particular scene or sub-plot from some piece of media. This is essentially all the information I had in my head:
I am pretty sure it's from a book.
I believe I read it in middle school or high school.
The scene I am remembering was a sub-plot, not the main point of the book; it was basically unrelated to the main storyline.
The scene as I can recall it: There was an old woman who was trying to wean herself off her pain medicine because she wanted to die beholden to nothing and no one. She would have a child come over and read to her, and each day she would wait longer and longer to take her pain medicine.
I would challenge anyone to take that little bit of rather vague information and determine what the book was using Google or conventional searches alone. It is certainly possible, but I doubt it could be done in the ten seconds it took Claude Sonnet 4, given only the context above, to correctly identify the book as To Kill a Mockingbird and the relevant character as Mrs. Henry Lafayette Dubose.
Code Translation
I have found that translating small pieces of code between two languages, when those languages are well represented in the training data for that LLM, is generally very successful.
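To make this concrete, here is the flavor of translation I mean: a small shell pipeline and the Python equivalent an LLM will typically produce for it. The log filename and field layout are hypothetical, chosen only for illustration.

```python
# Original shell one-liner (hypothetical access.log, space-delimited,
# first field is the client IP):
#   cut -d' ' -f1 access.log | sort | uniq -c | sort -rn | head -5
# i.e. print the 5 most frequent client IPs in the log.
from collections import Counter

def top_clients(lines, n=5):
    """Return the n most common first fields, mirroring the pipeline above."""
    counts = Counter(line.split(" ", 1)[0] for line in lines if line.strip())
    return counts.most_common(n)
```

The translation is mechanical enough that a well-trained model rarely gets it wrong, which matches my experience with small snippets between popular languages.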
Throwaway Scripts
This is self-explanatory, but generating small one-off shell or Python scripts, especially for interactions involving third-party APIs, is a frequent reason I would reach for an LLM. The more narrowly focused the better.
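As a sketch of the kind of throwaway script I mean, here is one that tallies open GitHub issues by label. The repository name is a placeholder, the GitHub REST endpoint is real, and I've kept the parsing separate from the network call.

```python
import json
import urllib.request
from collections import Counter

def fetch_open_issues(repo):
    """Fetch open issues for a repo (e.g. "owner/name") via the GitHub REST API."""
    url = f"https://api.github.com/repos/{repo}/issues?state=open"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

def count_labels(issues):
    """Tally label names across a list of issue dicts (GitHub API shape)."""
    counts = Counter()
    for issue in issues:
        for label in issue.get("labels", []):
            counts[label["name"]] += 1
    return counts

# Example usage against a placeholder repository:
# for name, n in count_labels(fetch_open_issues("OWNER/REPO")).most_common():
#     print(f"{n:4d}  {name}")
```

For a script like this I would happily accept LLM output wholesale; it is short enough to review in a minute and disposable by design.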
Classification
Traditional machine-learning classifiers are always an option, and are likely more cost-effective operationally, but training and deploying them requires specialized knowledge your average developer may not have, not to mention data pipelines and processing. LLMs can serve as general-purpose text and image classifiers for a variety of use cases with surprising accuracy.
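A minimal sketch of the pattern, with the model call left as an injected function since every provider's client API differs; the prompt wording and labels here are illustrative, not any vendor's recommended approach.

```python
def classify(text, labels, complete):
    """Classify `text` into one of `labels` using an LLM.

    `complete` is any callable taking a prompt string and returning the
    model's text response; provider-specific client code lives there.
    """
    prompt = (
        "Classify the following text into exactly one of these labels: "
        + ", ".join(labels)
        + ". Respond with only the label.\n\nText: "
        + text
    )
    response = complete(prompt).strip().lower()
    # Match leniently, since models sometimes add extra words around the label.
    for label in labels:
        if label.lower() in response:
            return label
    return None  # model answered with something outside the label set
```

The lenient matching and the None fallback matter in practice: unlike a traditional classifier, an LLM can always return text outside your label set, so the caller has to decide what to do with unparseable answers.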
When are they not useful?
To conclude, I'd like to note some situations where I find LLMs not especially useful. Part of my work involves supporting software engineers and helping them work through issues in the SDLC. If they are bringing a problem to my attention, it usually means they have already exhausted Google and LLM queries. In my experience, the problems most often escalated involve the interconnections between systems rather than the systems themselves. The more edges there are to consider, the more difficulty the LLM has and the more opportunities there are for failure. In a complex system, even with queries enriched with architectural context, the LLM may generate many possible sources of an issue, but I very often see those suggestions act as red herrings, leading my developers astray.