One of the principal joys of version control is the freedom to experiment without fear. If you make a mess of things, you can always go back to a happier version of your project. We describe several methods of such time travel in link to come. But you must have a good commit to fall back to!
Using a Git commit is like using anchors and other protection when climbing. If you’re crossing a dangerous rock face you want to make sure you’ve used protection to catch you if you fall. Commits play a similar role: if you make a mistake, you can’t fall past the previous commit. Coding without commits is like free-climbing: you can travel much faster in the short-term, but in the long-term the chances of catastrophic failure are high! Like rock climbing protection, you want to be judicious in your use of commits. Committing too frequently will slow your progress; use more commits when you’re in uncertain or dangerous territory. Commits are also helpful to others, because they show your journey, not just the destination.
Let’s talk about this:
use more commits when you’re in uncertain or dangerous territory
When I’m doing something tricky, I often proceed towards my goal in small increments, checking that everything still works along the way. Yes it works? Make a commit. This is my new worst case scenario. Keep going.
What’s not to love?
This can lead to an awful lot of tiny commits. This is absolutely fine and nothing to be ashamed of. But one day you may start to care about the utility and aesthetics of your Git history.
The Repeated Amend is a pattern where, instead of cluttering your history with lots of tiny commits, you build up a “good” commit gradually, by amending.
Yes, there are other ways to do this, e.g. via squashing and interactive rebase, but I think amending is the best way to get started.
Start with your project in a functional state:
- R package? Run your tests or
R CMD check.
- Data analysis? Re-run your script or re-render your
.Rmdwith the new chunk.
- Website or book? Make sure the project still compiles.
- You get the idea.
Make sure your “working tree is clean” and you are synced up with your GitHub remote.
git status should show something like:
~/tmp/myrepo % git status
On branch main
Your branch is up to date with 'origin/main'.
nothing to commit, working tree clean
Imagine we start at commit C, with previous commit B and, before that, A:
Make a small step towards your goal. Re-check that your project “works”.
Stage those changes with and make a commit with the message “WIP”, meaning “work in progress”. Do this in RStudio or in the shell (Appendix A):
git add path/to/the/changed/file
git commit -m "WIP"
The message can be anything, but “WIP” is a common convention. If you use it, whenever you return to a project where the most recent commit message is “WIP”, you’ll know that you were probably in the middle of something. If you push a “WIP” commit, on purpose or by mistake, it signals to other people that more commits might be coming.
Your history now looks like this:
* above signifies a commit that exists only in your local repo, not (yet) on GitHub.
If you called
git status, you’d see something like “Your branch is ahead of ‘origin/main’ by 1 commit.”, which is also displayed in RStudio’s Git pane.
Do a bit more work. Re-check that your project is still in a functional state. Stage and commit again, but this time amend your previous commit. RStudio offers a check box for “Amend previous commit” or in the shell:
git commit --amend --no-edit
--no-edit part retains the current commit message of “WIP”.
Don’t push! Your history now looks like this:
but the changes associated with the
WIP* commit now represent your last two commits, i.e. all the accumulated changes since state C.
Keep going like this.
Let’s say you’ve finally achieved your goal. One last time, check that your project is functional and in a state you’re willing to share with others.
Commit, amending again, but with a real commit message this time. Think of this as commit D. Push. Do this in RStudio or the shell:
git commit --amend -m "Implement awesome feature; closes #43"
Your history – and that on GitHub – look like this:
As far as the world knows, you implemented the feature in one fell swoop. But you got to work on the task incrementally, with the peace of mind that you could never truly break things.
Imagine you’re in the middle of a Repeated Amend workflow:
A -- B -- C -- WIP*
and you make some changes that break your project, e.g. tests start failing.
These bad changes are not yet committed, but they are saved.
You want to fall back to the last good state, represented by
In Git lingo, you want to do a hard reset to the
Your local files will be forcibly reset to their state as of the
With the command line:
git reset --hard
which is implicitly the same as
git reset --hard HEAD
which says: “reset my files to their state at the most recent commit”.
This is also possible in RStudio. In fact, the RStudio way makes it easier to selectively reset only specific files or only certain changes. Click on “Diff” or “Commit”. Select a file with changes you do not want. Use “Discard All” to discard all changes in that file. Use “Discard chunk” to discard specific changes in a file. Repeat this procedure for each affected file until you are back to an acceptable state. Carry on.
If you committed a bad state, go to link to come for more reset scenarios.
Amending a commit is an example of what’s called “rewriting Git history”.
Rewriting history that has already been pushed to GitHub – and therefore potentially pulled by someone else – is a controversial practice. Like most controversial practices, lots of people still indulge in it, as do I.
But there is the very real possibility that you create headaches for yourself and others, so in Happy Git we must recommend that you abstain. Once you’ve pushed something, consider it written in stone and move on.
I told you not to!
But OK here we are.
Let’s imagine you pushed this state to GitHub by mistake:
A -- B -- C -- WIP (85bf30a)
and proceeded to
git commit --amend again locally, leading to this state:
A -- B -- C -- WIP* (6e884e6)
I’m deliberately showing two histories that sort of look the same, in terms of commit messages. But the last SHA reveals they are actually different.
You are in a pickle now, as you can’t do a simple push or pull. A push will be rejected and a pull will probably lead to a merge that you don’t want.
You have two choices:
If you have collaborators who may have pulled the repo at commit
WIP (85bf30a), you have to regard that particular history as being written in stone now. If there is any very precious work that only exists locally, such as a specific file, save a copy of that to a new file path, temporarily. Hard reset your local repo to
git reset --hard HEAD^) and pull from GitHub. GitHub and local history now show this:
A -- B -- C -- WIP (85bf30a)
If you saved some precious work to a temporary file path, bring it back into the repo now; save, stage, commit, and push. GitHub and local history now show this:
A -- B -- C -- WIP (85bf30a) -- E
If you have no collaborators or you have reason to believe they have not pulled, you can rewrite history, even on GitHub. You might as well make sure your local commit has a real, non-“WIP” message at this point. Force push your history to GitHub (
git push --force). GitHub and local history now show this:
A -- B -- C -- D
In both cases, you’ve made the changes you want and your local repo and the GitHub remote are synced up again. The history is nicer in the second case, but that’s a secondary issue.
There are many different ways to rewrite history and rescue some of these situations, but we find the approaches described above to be very approachable.