How to commit refactoring changes
In an earlier post, we discussed how to find time for refactoring. But we didn’t cover what to do with the refactored code.
Assuming that we don’t want to throw the changes away, we should commit the refactored code. We also said, that ideally, refactoring should not be done alone, it should be part of another change, like introducing a new feature.
Does that mean that we should commit all the modified code together and raise a big pull request?
Definitely not.
Let’s see the different options we have.
One commit, one PR
Still, let’s look at what consequences it has if we commit all our changes together.
I am having a hard time finding a good reason to do so! But if I must name one is that if people who don’t want us to spend our time on refactoring look at such a pull request they won’t immediately see what we did. But there is little chance that our product or engineering management would check our PRs for that.
On the other hand, if we put all our changes in one PR, it will be unnecessarily difficult to review it. As a reviewer, I expect separate things from a refactoring and a functional change.
From refactoring changes, I expect that it doesn’t change the behaviour and I might not even look for new tests, because I might assume that the code is already tested. Otherwise, how would you dare refactor it and it would have merged in the first place?
On the other hand, for a functional change, I do expect that you add new tests and also that you explain in the commit message what is the new feature about.
If something goes bad, with separate commits, it’s easier to go back step by step and check where the bug was introduced.
Pros:
Hardly any
Cons:
Too big commit
Mixed types of changes
Difficult to debug and roll back
Two or more commits, one PR
What if we create at least two commits? At least one for the refactorings and at least one for the functional changes?
I say at least one because maybe even all the refactorings should not be committed altogether. Why would we commit together the simplification of some loops with the modernization of some unrelated classes? We should commit them separately.
The same is true for feature code! We might want to break down a new feature into several commits.
But what is really important is that each commit is meaningful and leaves the repository in a buildable state. Otherwise, we still cannot rollback and debug easily!
This is not only about logical separation. With several small commits, we give our reviewers better chances to perform a thorough review.
You probably heard the old adage that if you submit a 100-line PR, you might get a few comments, but if you share a 10-line PR, you’ll get a dozen. You might feel else, but getting more code review comments is useful. With several smaller commits, we give the reviewers the chance to review the changes separately.
It’s worth mentioning that bigger repositories tend to squash all the commits into one when you merge pull requests to keep the git history shorter - this is called squash merging. Still, the commits can be checked later one by one if you can go back to the pull request or branch, but not on the master branch.
Pros:
A more logical structure
Easier debugging and rollback
Cons:
Still some mixed changes
Reviewers might take it as one big change
Buildability for each commit is not guaranteed
Two or more commits, two PRs
We can go further and submit those commits separately in different pull requests. This might slow you down a bit and there is a higher chance of getting some questions about why you did in the first place. You can still reply though that these changes are the prerequisites of your upcoming pull request introducing the functional changes.
On the other hand, given that you build and test your code as part of your continuous integration pipelines, it’s guaranteed that both the refactoring and the functional steps can be built separately and they pass the tests.
This is a stronger guarantee for granular debugability and easy rollbacks.
Also, you don’t rely on your reviewers’ decisions to review different types of changes separately. After all, it takes effort to select only specific commits to review. By separating your commits into different PRs, you relieve them from this task.
Squash merging policies will not mix your refactoring changes with functional changes.
Pros:
Different kinds of change will be reviewed separately
Guaranteed buildability of refactoring and feature changes separately
Logical structure
Cons:
Slower process
Possible questions about the reason for refactorings
Conclusion
We’ve seen that if you perform some refactoring before introducing functional changes as I suggested in How to find time for refactoring, you have several options to submit the changes for a review.
You might not care about any separation of concerns and you just dump everything into one commit. With that, you don’t help future you or your reviewers. It’s better to break down those changes. If they are not too big or if you are pressed to go fast, you might simply submit several commits into one pull request. But if you want to ensure the best experience for your reviewers and make sure that your refactoring changes are valid on their own, opt for different pull requests.