Pull requests are for humans

This is a post I originally wrote internally at Human Made, but I think it is worth sharing publicly.

Both in private repositories and in open source repositories, there’s been a huge rise in the use of AI as LLMs have become more powerful. While many of us are using AI constantly to write code (myself included), there’s also been a large rise in pull requests generated partially or entirely with AI – that is, the descriptions, comments, and reviews, rather than the code.

When we’re writing pull requests and comments, we are humans interacting with each other. Although we might be working on automating chunks of what we do, our engineering work still comes back to being human.

In quite a few pull requests recently, I’ve felt like I’m just interacting with someone else’s agent indirectly, rather than with a person. Not only is this a waste of time for everyone involved, it also feels terrible to be on the receiving end of – why bother investing in a proper review if no one’s really listening?

Reviews have always been a multi-pronged tool, with a few related but separate goals. The obvious one is to ensure the quality of the code we’re shipping. However, an important secondary benefit is that reviews are one of the primary ways we build and share our engineering knowledge across the team. Feedback loops are vital for any sort of improvement, and pull requests are a way in which we directly integrate feedback into our work. As a reviewer, I don’t mind investing a lot of time into writing code reviews, because it’s a way for us to grow as a team. When I’m just interacting with someone else’s agent though, that’s clearly not worth it.

This creeping pervasiveness of delegating to AI manifests in a few pernicious ways:

AI-generated pull request descriptions – These descriptions regurgitate and summarise what’s already in the code tab of the pull request. If I need a summary of the changes, I can already ask my own agent to do that for me – what I want to see in a pull request is the decisions made in it and other context that the code doesn’t show. Why did you pick a certain approach? What does it look like in the broader project context? How does this affect user behaviour?
The unfiltered AI-to-PR pipeline – The most egregious case of this is the unfiltered pipeline where features go directly from a prompt into a pull request. In almost every case, these have had no human involvement, and serve only to create a burden on the reviewer or project. (This is much more the case in the open source world than our internal projects.)
The back-and-forth of a review – Aside from the most obvious cases, much of review comes down to engineering discussion – checking trade-off assumptions, questioning approaches, and probing the edges. I expect a bunch of feedback to be explained or dismissed rather than necessarily causing a change, as we collaborate towards a shared understanding. Feeding those comments directly into an agent introduces a false certainty that isn’t intended in the feedback.

These aspects all contribute to a degradation of the review process and drastically diminish the point of doing pull request reviews at all. They also undermine the actual engineering that we’re doing, moving us closer towards being code factories – which is exactly the part that AI can do well.

It’s tempting to abdicate our review process to AI to free up our time, but doing so ultimately harms the overall process and the value we’re bringing as engineers to our projects.

In the open source space in particular, when someone is volunteering their time to review your pull requests, using AI to create pull requests and handle this back and forth goes beyond degrading this process – it’s outright rude.
