An introduction to Git submodules
Over on the Monome online community lines, we’ve been discussing breaking up the monolithic git repository used for the Eurorack modules’ firmware. This is just a brief overview of Git submodules to see if they are a good fit for us.
A submodule allows you to keep another Git repository in a subdirectory of your repository. The other repository has its own history, which does not interfere with the history of the current repository. This can be used to have external dependencies such as third party libraries for example.
The Git submodule man page (emphasis mine)
Let’s work through a simple example to see how it works…
Setup
First, we’ll make our shared library repo lib and add some commits.
mkdir lib
cd lib
git init
touch a
git add a
git commit -m "added a"
git mv a b
git commit -m "mv a b"
cd ..
Now, let’s create a repo repo1 (that will use the library via a submodule).
mkdir repo1
cd repo1
git init
cd ..
This gives us the following files (via tree):
.
├── lib
│   └── b
└── repo1
Let’s add the submodule to repo1 as the directory common.
cd repo1
git submodule add ../lib common
git commit -m "added submodule"
cd ..
And again, the output of tree:
.
├── lib
│   └── b
└── repo1
    └── common
        └── b
Now we have a repo repo1 that contains a submodule in the directory common from repo lib.
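To make the result of git submodule add a little more concrete, here is a minimal sketch that rebuilds a cut-down version of the example in a temporary directory and prints what the parent repo recorded. The demo identity and the temporary directory are assumptions made for illustration. The protocol.file.allow override is there because recent Git releases refuse to clone a submodule from a local path by default.

```shell
#!/bin/sh
# Sketch: inspect what `git submodule add` stored in the parent repo.
set -e
work=$(mktemp -d) && cd "$work"

# Placeholder identity so commits succeed on an unconfigured machine.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q lib
(cd lib && touch a && git add a && git commit -q -m "added a")

git init -q repo1
cd repo1
# Allow the local-path clone that `submodule add` performs.
git -c protocol.file.allow=always submodule add ../lib common
git commit -q -m "added submodule"

# .gitmodules maps the submodule path to the URL it was added from.
sub_url=$(git config -f .gitmodules submodule.common.url)
# The index entry for common is a "gitlink" (mode 160000), not a tree.
sub_mode=$(git ls-files --stage common | cut -d' ' -f1)
# The gitlink names exactly one commit of lib.
pinned=$(git rev-parse HEAD:common)
lib_head=$(git -C ../lib rev-parse HEAD)

echo "url:    $sub_url"
echo "mode:   $sub_mode"
echo "pinned: $pinned"
echo "lib:    $lib_head"
```

The pinned and lib lines should show the same SHA at this point, since the submodule was added at the tip of lib.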
Making changes to lib
Let’s make a change to lib to demonstrate what submodules allow us to do.
cd lib
git mv b c
git commit -m "mv b c"
cd ..
Again, consider the output of tree. Notice how repo1/common still contains b.
.
├── lib
│   └── c
└── repo1
    └── common
        └── b
Let’s clone repo1 to repo1_clone.
git clone repo1 repo1_clone
That gives us:
.
├── lib
│   └── c
├── repo1
│   └── common
│       └── b
└── repo1_clone
    └── common
Hmm, why hasn’t repo1_clone/common got any files in it?
cd repo1_clone
git submodule update --init
cd ..
That’s better:
.
├── lib
│   └── c
├── repo1
│   └── common
│       └── b
└── repo1_clone
    └── common
        └── b
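The clone and the submodule initialisation can also be done in one step with git clone --recurse-submodules. The sketch below shows this on a throwaway copy of the example. The temporary directory and demo identity are assumptions, and the protocol.file.allow override is only needed because the submodule URL here is a local path.

```shell
#!/bin/sh
# Sketch: clone a repo and populate its submodules in a single command.
set -e
work=$(mktemp -d) && cd "$work"

# Placeholder identity so commits succeed on an unconfigured machine.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Rebuild a small lib and a repo1 that embeds it as common.
git init -q lib
(cd lib && touch b && git add b && git commit -q -m "added b")
git init -q repo1
(cd repo1 &&
  git -c protocol.file.allow=always submodule add ../lib common &&
  git commit -q -m "added submodule")

# Equivalent to `git clone` followed by `git submodule update --init`.
git -c protocol.file.allow=always clone --recurse-submodules repo1 repo1_clone

ls repo1_clone/common
```

The final listing should show b, without a separate git submodule update --init.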
The important thing to note is that both repo1 and repo1_clone still contain b rather than c. This is because Git tracks a particular commit for the submodule rather than a branch or HEAD. You need to explicitly update the parent repo to track a new commit. This is great, as it means changes to the upstream repo of the submodule are not forced upon us.
The output of git log --pretty="format:%H %s" in the lib repo is:
1e5717c64e1150ca1da08521a24d8469c2bdde00 mv b c
86a5b293fa8f860730cd96c11b29b5f03fc2a60e mv a b
3c163ca8fcf336907e1b2a121f25bd550a71e5e3 added a
The output of git submodule status in repo1 is:
86a5b293fa8f860730cd96c11b29b5f03fc2a60e common (heads/master)
Notice how the commit SHA for the common entry in git submodule status matches the second entry in the log for lib. This shouldn’t come as a surprise, as our submodule hasn’t been updated to the latest changes in lib yet.
Updating submodules
Let’s update repo1 to incorporate the changes made to lib.
cd repo1/common
git pull
cd ..
git status
On branch master
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

        modified:   common (new commits)

no changes added to commit (use "git add" and/or "git commit -a")
The output of git status tells us that common has been modified, so we need to commit that change to repo1.
git add common
git commit -m "update common"
cd ..
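Before committing a submodule change like this, it can help to see which lib commits the new pointer brings in. One way is git diff --submodule=log, sketched here against a throwaway copy of the example. The temporary directory and demo identity are assumptions for illustration.

```shell
#!/bin/sh
# Sketch: list the submodule commits behind a pending pointer change.
set -e
work=$(mktemp -d) && cd "$work"

# Placeholder identity so commits succeed on an unconfigured machine.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q lib
(cd lib && touch b && git add b && git commit -q -m "added b")
git init -q repo1
(cd repo1 &&
  git -c protocol.file.allow=always submodule add ../lib common &&
  git commit -q -m "added submodule")

# lib moves on, and the submodule checkout pulls the new commit.
(cd lib && git mv b c && git commit -q -m "mv b c")
(cd repo1/common && git pull -q --ff-only)

# Show the unstaged pointer change as a list of lib commits.
cd repo1
summary=$(git diff --submodule=log common)
echo "$summary"
```

The output names the old and new commits for common and lists the subject of each new commit, here mv b c.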
Now the output of tree is:
.
├── lib
│   └── c
├── repo1
│   └── common
│       └── c
└── repo1_clone
    └── common
        └── b
The only thing left to do is to update repo1_clone to reflect those changes:
cd repo1_clone
git pull
git submodule update
The git pull updates repo1_clone to match repo1, but won’t update the common directory. Assuming that common is clean, you can run git submodule update to update common to the correct commit of lib.
The final output of tree is:
.
├── lib
│   └── c
├── repo1
│   └── common
│       └── c
└── repo1_clone
    └── common
        └── c
Taking things further
The common directories in repo1 and repo1_clone are normal Git repos that are cloned from lib. You can do all the normal things inside them that you would in any other Git repo: branch, checkout, commit, pull and even push. So if the work you’re doing on the lib repo is best done in the context of repo1, you can make your changes and commits in repo1/common. You just need to remember to commit the directory common to repo1 when you want repo1 to be updated to reference the new commits you’ve made to lib.
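One detail worth knowing before committing inside a submodule: by default, git submodule update checks out the recorded commit directly, which leaves the submodule on a detached HEAD. The sketch below shows this on a throwaway copy of the example, then switches to a branch, commits inside common, and records the new commit in the parent. The temporary directory, demo identity and the file d are assumptions for illustration.

```shell
#!/bin/sh
# Sketch: commit inside a submodule and record the result in the parent.
set -e
work=$(mktemp -d) && cd "$work"

# Placeholder identity so commits succeed on an unconfigured machine.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q lib
(cd lib && touch c && git add c && git commit -q -m "added c")
# master or main, depending on the local Git configuration.
branch=$(git -C lib symbolic-ref --short HEAD)

git init -q repo1
(cd repo1 &&
  git -c protocol.file.allow=always submodule add ../lib common &&
  git commit -q -m "added submodule")
git clone -q repo1 repo1_clone
cd repo1_clone
git -c protocol.file.allow=always submodule update --init

# symbolic-ref fails when HEAD is not on a branch.
if git -C common symbolic-ref -q HEAD >/dev/null; then
  head_state=attached
else
  head_state=detached
fi
echo "common HEAD is $head_state"

# Switch to a branch first so the new commit is reachable from a ref.
git -C common checkout -q "$branch"
(cd common && touch d && git add d && git commit -q -m "added d")

# Record the new submodule commit in the parent repo.
git add common
git commit -q -m "update common"
pinned=$(git rev-parse HEAD:common)
sub_head=$(git -C common rev-parse HEAD)
echo "parent pins $pinned"
```

The new commit exists only in that checkout of common until it is pushed to wherever lib is hosted. Anyone else who updates to the new pointer needs to be able to fetch it from there.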