Introduction
Git is a crucial skill to have whether you’re just a hobbyist or you aim to become a professional web developer. It’s the “save” button on steroids and allows for seamless collaboration. There really aren’t all that many commands for you to learn, but sometimes the real difficulty of Git comes from visualizing what’s happening.
In this lesson, we’ll help with the visualization by diving deeper than just the `git add .`, `git commit`, and `git push` commands you’ve mostly been using. We’ll cover topics such as Remotes, Pointers, and Changing Git History. This will expand your understanding of what’s actually going on under the hood with Git.
It is very important to take a look at all of this before progressing any further with the curriculum. The project work is becoming more and more complex, so using a disciplined Git workflow is no longer optional. Hopefully after going through this lesson you’ll be much more comfortable changing your Git history and have a better understanding of Git as a whole.
Lesson Overview
This section contains a general overview of topics that you will learn in this lesson.
- History changing Git commands
- Different ways of changing history
- Using remotes to change history
- Dangers of history-changing operations
- Best practices of history-changing operations
- Pointers
Changing History
So let’s say you’re comfortable writing good commit messages and you’re working with branches to have a good Git workflow going. But nobody is perfect, and as you’re writing some beautiful code something goes wrong! Maybe you commit too early and are missing a file. Maybe you mess up one of your commit messages and omit a vital detail.
Let’s look at some ways we can change recent and distant history to fit our needs. We’re going to cover how to:
- Change our most recent commit
- Change multiple commit messages
- Reorder commits
- Squash commits together
- Split up commits
Getting Set Up
Before we get started with the lesson, let’s create a Git playground in which we can safely follow along with the code and perform history-changing operations. Go to GitHub and, as you have in the past, create a new repository. Call it whatever you’d like, and clone this repository to your local system. Now, let’s `cd` into the repository we just cloned and create some new files! Once you’re in the repository, follow along with the following commands. Look them up if you’re confused about anything that’s happening.
$ touch test{1..4}.md
$ git add test1.md && git commit -m 'Create first file'
$ git add test2.md && git commit -m 'Create send file'
$ git add test3.md && git commit -m 'Create third file and create fourth file'
Changing The Last Commit
So if we look at the last commit we made, uh-oh! If you type in `git status` and `git log`, you can see we forgot to add a file! Let’s add our missing file and run `git commit --amend`:
$ git add test4.md
$ git commit --amend
What happened here is we first updated the staging area to include the missing file, and then we replaced the last commit with our new one to include the missing file. If we wanted to, we could have changed the message of the commit and it would have overwritten the message of the past commit.
Remember to only amend commits that have not been pushed anywhere! The reason for this is that `git commit --amend` does not simply edit the last commit, it replaces that commit with an entirely new one. This means that you could potentially destroy a commit other developers are basing their work on. When rewriting history, always make sure that you’re doing so in a safe manner, and that your coworkers are aware of what you’re doing.
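To see that an amend really does replace the commit, here is a minimal sketch you can run outside your playground. It assumes a POSIX shell with Git installed; the temporary directory and the `Demo` identity are placeholders chosen for this example only.

```shell
# Work in a throwaway directory so no real repository is touched.
sandbox=$(mktemp -d)
cd "$sandbox" || exit 1
# Placeholder identity so commits succeed without any Git config.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
git init -q .

# Commit too early: test4.md is forgotten.
touch test3.md test4.md
git add test3.md
git commit -q -m 'Create third file and create fourth file'
before=$(git rev-parse HEAD)

# Stage the missing file and amend, keeping the message as it is.
git add test4.md
git commit -q --amend --no-edit
after=$(git rev-parse HEAD)
count=$(git rev-list --count HEAD)

echo "hash before amend: $before"
echo "hash after amend:  $after"
echo "commits in history: $count"
```

The two hashes differ even though the history still contains a single commit, which is why amending a commit that others have already pulled causes trouble.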
Changing Multiple Commits
Now let’s say we have commits further back in our history that we want to modify. This is where the beautiful command `rebase` comes into play! We’re going to get deeper into the complexities of `rebase` later on in this lesson, but for now we’re going to start out with some very basic usage.
`git rebase -i` is a command which allows us to interactively stop after each commit we’re trying to modify, and then make whatever changes we wish. We do have to tell this command which is the last commit we want to edit. For example, `git rebase -i HEAD~2` allows us to edit the last two commits. Let’s see what this looks like in action. Go ahead and type in:
$ git log
$ git rebase -i HEAD~2
You should notice that when rebasing, the commits are listed in the opposite order compared to how we see them when we use `log`. Take a minute to look through all of the options the interactive tool offers you. Now let’s look at the commit messages at the top of the tool. If we wanted to edit one of these commits, we would change the word `pick` to `edit` for the appropriate commit. If we wanted to remove a commit, we would simply remove it from the list, and if we wanted to change their order, we would change their position in the list. Let’s see what an edit looks like!
edit eacf39d Create send file
pick 92ad0af Create third file and create fourth file
This would allow us to edit the typo in the `Create send file` commit to be `Create second file`. Perform similar changes in your interactive rebase tool, but don’t copy and paste the above code, since your commit hashes will be different and it won’t work. Save and exit the editor, which will allow us to edit the commit with the following instructions:
You can amend the commit now, with
git commit --amend
Once you're satisfied with your changes, run
git rebase --continue
So let’s edit our commit by typing `git commit --amend`, fixing the typo in the title, and then finishing the rebase by typing `git rebase --continue`. That’s all there is to it! Have a look at your handiwork by typing `git log` and seeing the changed history. It seems simple, but this is a very dangerous tool if misused, so be careful. Most importantly, remember that if you have to rebase commits in a shared repository, make sure you’re doing so for a very good reason that your coworkers are aware of.
Squashing Commits
Using `squash` for our commits is a very handy way of keeping our Git history tidy. It’s important to know how to `squash`, because this process may be the standard on some development teams. Squashing makes it easier for others to understand the history of your project. What often happens when a feature is merged is that we end up with visually complex logs of every change the feature branch made to the main branch. These commits are important while the feature is in development, but aren’t really necessary when looking through the entire history of your main branch.
Let’s say we want to `squash` the second commit into the first commit on the list, which is `Create first file`. First let’s rebase all the way back to our root commit by typing `git rebase -i --root`. Now what we’ll do is `pick` that first commit, as the one which the second commit is being `squash`ed into:
pick e30ff48 Create first file
squash 92aa6f3 Create second file
pick 05e5413 Create third file and create fourth file
Rename the commit to `Create first and second file`, then finish the rebase. That’s it! Run `git log` and see how the first two commits got squashed together.
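The squash can also be reproduced in a scripted sandbox. This is only a sketch under a few assumptions: a POSIX shell, Git, and GNU `sed`. `GIT_SEQUENCE_EDITOR` replaces the hand edit of the to-do list, and `GIT_EDITOR=true` accepts Git’s default combined message instead of renaming the commit as the lesson does. Directory and identity values are placeholders.

```shell
# Work in a throwaway directory so no real repository is touched.
sandbox=$(mktemp -d)
cd "$sandbox" || exit 1
# Placeholder identity so commits succeed without any Git config.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
git init -q .

touch test1.md test2.md test3.md
git add test1.md && git commit -q -m 'Create first file'
git add test2.md && git commit -q -m 'Create second file'
git add test3.md && git commit -q -m 'Create third file'

# Mark the second to-do line as "squash" so it folds into the first commit.
# GIT_EDITOR=true keeps the default combined commit message unchanged.
GIT_SEQUENCE_EDITOR="sed -i '2s/^pick/squash/'" GIT_EDITOR=true \
  git rebase -i --root >/dev/null 2>&1

count=$(git rev-list --count HEAD)
squashed_files=$(git ls-tree --name-only HEAD~1)

echo "commits after squash: $count"
echo "files in the squashed commit:"
echo "$squashed_files"
```

Three commits become two, and the first commit now contains both `test1.md` and `test2.md`.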
Splitting Up a Commit
Before diving into Remotes, we’re going to have a look at a handy Git command called `reset`. Let’s have a look at the commit `Create third file and create fourth file`. At the moment we’re using blank files for convenience, but let’s say these files contained functionality and the commit was describing too much at once. In that case what we could do is split it up into two smaller commits by, once again, using the interactive `rebase` tool.
We open up the tool just like last time and change `pick` to `edit` for the commit we’re going to split. Now, however, what we’re going to do is run `git reset HEAD^`, which moves HEAD back to the commit right before it while leaving our files untouched in the working directory. This allows us to add and commit the files individually, after which `git rebase --continue` finishes the rebase. All together it would look something like this:
$ git reset HEAD^
$ git add test3.md && git commit -m 'Create third file'
$ git add test4.md && git commit -m 'Create fourth file'
$ git rebase --continue
Let’s start by looking a bit closer at what happened here. When you ran `git reset`, you reset the current branch by pointing HEAD at the commit right before it. At the same time, `git reset` also updated the index (the staging area) with the contents of wherever HEAD is now pointed. So our staging area was also reset to what it was at the prior commit, which is great, because this allowed us to add and commit both files separately.
Now let’s say we want to move where HEAD points to but don’t want to touch the staging area. If we want to leave the index alone, we can use `git reset --soft`. This would only perform the first part of `git reset`, where HEAD is moved to point somewhere else.
The last part of reset we want to touch upon is `git reset --hard`. What this does is it performs all the steps of `git reset`, moving HEAD and updating the index, but it also updates the working directory. This is important to note because it can be dangerous, as it can potentially destroy data. A hard reset overwrites the files in the working directory to make them look exactly like the commit HEAD ends up pointing to. Similarly to `git commit --amend`, a hard reset is a destructive command which overwrites history. This doesn’t mean you should completely avoid it if working with shared repositories on a team with other developers. You should, however, make sure you know exactly why you’re using it, and that your coworkers are also aware of how and why you’re using it.
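The three flavours of reset are easier to compare side by side. The sketch below assumes a POSIX shell with Git installed and uses made-up file names (`a.txt`, `b.txt`) and a placeholder identity. After each reset it captures `git status --short` so you can see what happened to the staging area and working directory.

```shell
# Work in a throwaway directory so no real repository is touched.
sandbox=$(mktemp -d)
cd "$sandbox" || exit 1
# Placeholder identity so commits succeed without any Git config.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
git init -q .

echo a > a.txt && git add a.txt && git commit -q -m 'Add a'
echo b > b.txt && git add b.txt && git commit -q -m 'Add b'

# --soft: only HEAD moves, so b.txt is still staged.
git reset -q --soft HEAD^
soft_status=$(git status --short)
git commit -q -m 'Add b'

# Default (mixed): HEAD and the index move, so b.txt is left untracked.
git reset -q HEAD^
mixed_status=$(git status --short)
git add b.txt && git commit -q -m 'Add b'

# --hard: HEAD, index, and working directory all move, so b.txt is deleted.
git reset -q --hard HEAD^
hard_status=$(git status --short)

echo "soft:  [$soft_status]"
echo "mixed: [$mixed_status]"
echo "hard:  [$hard_status]"
```

Only the hard reset removes `b.txt` from disk, which is the data-loss risk described above.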
Working With Remotes
Thus far you’ve been working with remote repositories each time you’ve pushed or pulled from your own GitHub repository while working on the curriculum’s various projects. In this section we’re going to cover some slightly more advanced topics, which you might not have yet encountered or had to use.
git push --force
Let’s say you’re no longer working on a project all by yourself, but with someone else. You want to push a branch you’ve made changes on to a remote repository. Normally Git will only let you push your changes if you’ve already updated your local branch with the latest commits from this remote.
If you haven’t updated your local branch, and you’re attempting to `git push` a commit which would create a conflict on the remote repository, you’ll get an error message. This is actually a great thing! This is a safety mechanism to prevent you from overwriting commits created by the people you’re working with, which could be disastrous. You get the error because your history is outdated.
You might do a quick search and find the command `git push --force`. This command overwrites the remote repository with your own local history. So what would happen if we used this while working with others? Well, let’s see what would happen when we’re working with ourselves. Type the following commands into your terminal, and when the interactive rebase tool pops up, remove our commit for `Create fourth file`:
$ git push origin main
$ git rebase -i --root
$ git push --force
$ git log
Huh, that’s interesting, we don’t see our fourth file on our local system. Let’s check our GitHub repository. Is our file test4.md there?
No! We just destroyed it, which in this scenario is the danger: you could potentially destroy the work of those you’re collaborating with! `git push --force` is a very dangerous command, and it should be used with caution when collaborating with others. Instead, you can fix your outdated history error by updating your local history using `fetch` and `merge`, and then attempting to `push` again.
Let’s consider a different scenario:
$ touch test4.md
$ git add test4.md && git commit -m "Create fifth file"
$ git push origin main
$ git log
We look at our commit message and realize oops, we made a mistake. We want to undo this commit and are once again tempted to just force the push. But wait, remember, this is a very dangerous command. If we’re ever considering using it, always check if it’s appropriate and if we can use a safer command instead. If we’re collaborating with others and want to undo a commit we just made, we can instead use `git revert`!
$ git revert HEAD
$ git push origin main
Remember when we were working with HEAD, aka the current commit we’re viewing, while rebasing? What this would do is create a new commit that undoes the changes introduced by HEAD, leaving the existing history intact! Then we would push our new commit to whichever branch we’re working on, which in this example is main, even though normally our work would most likely be on a feature branch.
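To make the contrast with rewriting history concrete, here is a small local sketch of a revert. It assumes a POSIX shell with Git installed and skips the push, since no remote is involved; the sandbox and identity are placeholders.

```shell
# Work in a throwaway directory so no real repository is touched.
sandbox=$(mktemp -d)
cd "$sandbox" || exit 1
# Placeholder identity so commits succeed without any Git config.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
git init -q .

touch test1.md && git add test1.md && git commit -q -m 'Create first file'
touch test4.md && git add test4.md && git commit -q -m 'Create fifth file'
mistake=$(git rev-parse HEAD)

# Undo the mistaken commit by adding a new commit on top of it.
git revert --no-edit HEAD >/dev/null

count=$(git rev-list --count HEAD)
echo "commits after revert: $count"
git log --format='%h %s'
```

The mistaken commit is still in the log and a third commit now undoes it, so anyone who already pulled the mistake can simply pull again.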
So now that we’ve learned about the various dangers of `git push --force`, you’re probably wondering why it exists and when to use it. A very common scenario in which developers use `git push --force` is updating pull requests. Collaborative work is covered more in depth in a separate lesson, but the takeaway from this section should be that the `--force` option should be used only when you are certain that it is appropriate. There are also less common scenarios, such as when sensitive information is accidentally uploaded to a repository and you want to remove all occurrences of it.
It is worth giving special mention to `git push --force-with-lease`, a command which in some companies is the default option. The reason for this is that it’s a fail-safe! It checks whether the branch you’re attempting to push to has been updated since you last fetched it, and sends you an error if it has. This gives you an opportunity to, as mentioned before, `fetch` the work and update your local repository.
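The fail-safe can be watched in action without GitHub. The following is a sketch under stated assumptions: a POSIX shell, Git, and a local bare repository (`remote.git`) standing in for the shared remote. The clone names `alice` and `bob`, the file names, and the identity are all invented for this example.

```shell
# Work in a throwaway directory; a local bare repo stands in for GitHub.
sandbox=$(mktemp -d)
cd "$sandbox" || exit 1
# Placeholder identity so commits succeed without any Git config.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

# Alice creates the project and publishes it to the shared remote.
git init -q alice
git -C alice symbolic-ref HEAD refs/heads/main   # name the branch "main"
echo one > alice/file.txt
git -C alice add file.txt
git -C alice commit -q -m 'Initial commit'
git clone -q --bare alice remote.git
git -C alice remote add origin "$sandbox/remote.git"

# Bob clones the project, then Alice pushes new work he has not fetched.
git clone -q remote.git bob
echo two > alice/feature.txt
git -C alice add feature.txt
git -C alice commit -q -m 'Alice adds a feature'
git -C alice push -q origin main

# Bob rewrites his outdated history and tries a "safe" force push.
git -C bob commit -q --amend -m 'Initial commit (reworded)'
if git -C bob push -q --force-with-lease origin main 2>/dev/null; then
  lease_result=pushed
else
  lease_result=rejected
fi

remote_tip=$(git -C remote.git log -1 --format=%s main)
echo "force-with-lease: $lease_result"
echo "remote tip is still: $remote_tip"
```

Bob’s push is refused because the remote branch moved after his last fetch, so Alice’s commit survives. A plain `git push --force` in the same spot would have discarded it.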
Dangers and Best Practices
Let’s review the dangers we’ve addressed so far. I know, I know, it’s scary stuff, but we have to be mindful or our coworkers might end up hating our guts! If you look back through this lesson you’ll see a common thread. `amend`, `rebase`, `reset`, and `push --force` are all especially dangerous when you’re collaborating with others. These commands can destroy work your coworkers have created. So keep that in mind. When attempting to rewrite history, always check the dangers of the particular command you’re using and follow these best practices for the commands we’ve covered:
- If working on a team project, make sure rewriting history is safe to do and others know you’re doing it.
- Ideally, stick to using these commands only on branches that you’re working with by yourself.
- Using the `-f` flag to force something should scare you, and you better have a really good reason for using it.
- Don’t push after every single commit; changing published history should be avoided when possible.
- Regarding the specific commands we’ve covered:
  - For `git commit --amend`, never amend commits that have been pushed to remote repositories.
  - For `git rebase`, never rebase a repository that others may work off of.
  - For `git reset`, never reset commits that have been pushed to remote repositories.
  - For `git push --force`, only use it when appropriate, use it with caution, and preferably default to using `git push --force-with-lease`.
Branches Are Pointers
While the focus of this lesson was more advanced tools for changing Git history, we’re going into another advanced topic that might be hard for some to understand - Pointers. You’ve already learned about branches in the Rock Paper Scissors revisited lesson and how these hold multiple alternate reality versions of our files. Now we’re going to discuss what that actually means under the hood, and what it means for branches to be pointers.
Before we dive into branches, let’s talk about commits. If you recall this Git basics lesson from foundations, commits were described as snapshots. If it helps, think of this in a very literal sense. Every time you type in `git commit`, your computer is taking a picture of all the file contents that have been staged with `git add`. In other words, your entire tracked workspace gets copied.
So what is a branch? Based off of your exposure, you might be visualizing a branch as a group of commits. This actually isn’t the case! A branch is actually a pointer to a single commit! Hearing this, your first thought might be “Well if a branch is just a finger pointing at a single commit, how does that single commit know about all the commits that came before it?” The answer to this question is very simple: Each commit is also a pointer that points to the commit that came before it! Wow. This might be a lot to take in, so let’s take a moment to absorb that fact.
Now that you’ve had a second to gather your thoughts and attempt to wrap your head around this concept, it might help to go back and look at a concrete example of pointers we used in this lesson. Let’s think back to our use of `git rebase -i HEAD~2`. If you can remember, this command lets us edit the last two commits. Do you have any guesses on how Git knew which two commits to edit? That’s right, by using pointers! We start at HEAD, which is a special pointer for keeping track of the branch you’re currently on. HEAD points to our most recent commit in the current branch. That commit points to the commit made directly before it, which we can call commit two. That’s how `git rebase -i HEAD~2` starts with a HEAD pointer, and then follows subsequent pointers to find which two commits to edit.
You might be feeling overwhelmed at this point, so let’s recap what we’ve learned. A branch is simply a pointer to a single commit. A commit is a snapshot, and it’s a pointer to the commit directly behind it in history. That’s it!
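You can follow these pointers yourself with a few of Git’s inspection commands. The sketch below assumes a POSIX shell with Git installed; the sandbox, file names, and identity are placeholders. It asks Git where HEAD points, which commit that branch points at, and which commit that commit points back to.

```shell
# Work in a throwaway directory so no real repository is touched.
sandbox=$(mktemp -d)
cd "$sandbox" || exit 1
# Placeholder identity so commits succeed without any Git config.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
git init -q .

for n in 1 2 3; do
  touch "test$n.md"
  git add "test$n.md"
  git commit -q -m "Create file $n"
done

head_ref=$(git symbolic-ref HEAD)        # the branch HEAD points at
branch_tip=$(git rev-parse "$head_ref")  # the single commit that branch points at
parent=$(git log -1 --format=%P HEAD)    # the commit the tip points back to
two_back=$(git rev-parse HEAD~2)         # reached by following two parent pointers

echo "HEAD -> $head_ref -> $branch_tip"
echo "parent of the tip: $parent"
echo "HEAD~2:            $two_back"
```

The branch resolves to exactly one commit hash, and walking parent pointers from there reaches every earlier commit, ending at the first commit, which has no parent.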
Assignment
- Read through GitHub’s documentation on merge conflicts
  - It’s only a matter of time until you run into one (if you haven’t already)! While merge conflicts might seem intimidating, they’re actually very simple. Take your time with this resource and make sure you look at the two different ways the documentation suggests resolving merge conflicts: on GitHub itself, and on your command line. While you might not need this right now, keeping the source of this documentation in the back of your mind will prove invaluable for when you eventually run into a merge conflict and aren’t sure where to find a simple solution.
- Read think-like-a-git
  - Take your time with this resource as well. It’s very well written and will be very helpful in solidifying your understanding of Git.
- Read the chapter on Rebasing covered by git-scm for an even deeper dive into Rebasing.
- Read the chapter on Reset covered by git-scm for a deeper dive into `git reset`.
Additional Resources
This section contains helpful links to related content. It isn’t required, so consider it supplemental.
- Read this Git Cheat Sheet if you need a reference sheet.
- Watch this video about Rebase & Merge for an example of how to use both rebase and merge.
- Read the chapter on Branches covered by git-scm if you want an even deeper dive into Branches.
Knowledge Check
This section contains questions for you to check your understanding of this lesson on your own. If you’re having trouble answering a question, click it and review the material it links to.
- How can you amend your last commit?
- What are some different ways to rewrite history?
- What is a safe way to push history changes to a remote repository?
- What are the dangers of history-changing operations?
- What are best practices of history-changing operations?
- Explain what it means for branches to be pointers.