Published:
18 January 2012

What Submodules Are

Submodules are repositories inside another repository.

The use case is pretty obvious: You are working on a Symphony project and need a few special extensions to implement a certain feature. Luckily, the official Symphony 2 repository has taught us how to deal with it: Simply add the repository-URL to our project as a submodule and we end up with a fresh copy of the newest version of that extension:

> git submodule add git://github.com/nilshoerrmann/subsectionmanager.git extensions/subsectionmanager
  Cloning into extensions/subsectionmanager...

  ...

> git submodule update --init
  Submodule 'extensions/subsectionmanager' (git://github.com/nilshoerrmann/subsectionmanager.git) registered for path 'extensions/subsectionmanager'

But what's happening behind the curtains?

Git does three things:

  1. It creates the directory you've specified
  2. It clones that repository into that directory
  3. Finally, it appends the just-added URL and path to the .gitmodules file

Yes, it's doing a full clone with all the usual stuff: the .git folder, its full history, a workspace, branches and even a remote called origin. But everything is kept entirely separate from our own superproject.
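
If you want to look at those three steps without touching a real project, here is a minimal sketch that plays them through in a throwaway sandbox. The names upstream, project and demo_ext are made up for illustration, and a local folder stands in for the GitHub repository; the protocol.file.allow setting is only there because recent Git versions refuse local-path submodule clones by default.

```shell
set -eu
sandbox=$(mktemp -d) && cd "$sandbox"
# Throwaway identity so the demo commits work on any machine
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
# Allow cloning submodules from local paths (only needed for this sandbox)
export GIT_CONFIG_COUNT=1 GIT_CONFIG_KEY_0=protocol.file.allow GIT_CONFIG_VALUE_0=always

# "upstream" is a hypothetical stand-in for the extension's repository
git init -q upstream
git -C upstream commit -q --allow-empty -m "Extension: first commit"

# "project" plays the role of our superproject
git init -q project && cd project
git commit -q --allow-empty -m "Project: first commit"
git submodule --quiet add "$sandbox/upstream" extensions/demo_ext

# Steps 1 and 2: the directory exists and holds a clone with its own origin
submodule_remote=$(git -C extensions/demo_ext config remote.origin.url)
echo "submodule origin: $submodule_remote"

# Step 3: URL and path were appended to .gitmodules
cat .gitmodules

# In the superproject's index the path is a "gitlink" entry (mode 160000)
entry_mode=$(git ls-files --stage extensions/demo_ext | cut -d' ' -f1)
echo "index mode: $entry_mode"
```

The mode 160000 entry is how the superproject stores a submodule: it holds a single commit ID instead of the submodule's files.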

And because it's a proper repository itself you can execute all the usual commands in the submodule as well: pulling, pushing, checkouts etc.

To see what I mean go into the submodule and compare the log to the one of your superproject:

> git log
> cd extensions/subsectionmanager
> git log

What submodule update Does

Now that we have added and committed the submodule to our project one particular question arises: What happens when one of my teammates pulls the changes I just made? What if the submodule has been updated in the meantime, will she get the newer version?

The answer lies in the diff of the commit in which we've added the submodule:

+Subproject commit f5e13a069e3533fae5b7b5781ab1b545bb0183cb

That looks strikingly familiar: It looks like a commit-ID. But oddly enough, it is not one from your own project. Instead, it comes from the submodule: Apparently, Git not only adds the URL and path to .gitmodules but also remembers which commit was checked out in the submodule at the time of the commit.

Knowing that, it becomes obvious what Git has to do when one of your teammates wants to enable the submodule you've just committed:

For each new or changed submodule:

  1. Create the path, if necessary
  2. Clone the repository there, if necessary
  3. Checkout that specific commit in the submodule

Number 3 is the important part here. Once somebody on your team has decided which revision of a submodule the project should use, all of your teammates will end up with that exact same revision.

To do all that, you only have to use one single command Git provides:

> git submodule update --init

The annoying part is that Git won't do it automatically after you did a pull. You have to do it manually.
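
To convince yourself that a teammate really ends up on the recorded commit, and not on whatever is newest, you can replay the situation locally. This is only a sketch under assumptions: upstream, project, teammate and demo_ext are invented names, and local folders stand in for the remote repositories.

```shell
set -eu
sandbox=$(mktemp -d) && cd "$sandbox"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
# Allow cloning submodules from local paths (only needed for this sandbox)
export GIT_CONFIG_COUNT=1 GIT_CONFIG_KEY_0=protocol.file.allow GIT_CONFIG_VALUE_0=always

git init -q upstream
git -C upstream commit -q --allow-empty -m "Extension v1"

# We add the submodule and commit it
git init -q project && cd project
git commit -q --allow-empty -m "Project: first commit"
git submodule --quiet add "$sandbox/upstream" extensions/demo_ext
git commit -q -m "Add demo_ext as a submodule"
pinned=$(git ls-tree HEAD extensions/demo_ext | awk '{print $3}')

# Meanwhile the extension gets a newer commit upstream
git -C "$sandbox/upstream" commit -q --allow-empty -m "Extension v2"
upstream_tip=$(git -C "$sandbox/upstream" rev-parse HEAD)

# A teammate clones the project and enables the submodule
git clone -q "$sandbox/project" "$sandbox/teammate"
cd "$sandbox/teammate"
git submodule --quiet update --init
checked_out=$(git -C extensions/demo_ext rev-parse HEAD)

echo "recorded in project: $pinned"
echo "teammate checked out: $checked_out"
echo "newest upstream:      $upstream_tip"
```

The first two IDs should match while the third differs, which is exactly the point of step 3.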

What submodule update Doesn't Do

Wait, update? Doesn't that update all my submodules to the newest version?

No, no, no! Let go of that thought. In fact, git submodule update does quite the contrary: It iterates through all submodules and checks out the revision it thinks they should be at.

It makes sure none of your submodules is too old or too new.
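
Here is a small sandbox sketch of the "too new" case, again with made-up names (upstream, project, demo_ext) and local folders instead of real remotes. It moves the submodule ahead of the recorded commit and then lets git submodule update put it back.

```shell
set -eu
sandbox=$(mktemp -d) && cd "$sandbox"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
# Allow cloning submodules from local paths (only needed for this sandbox)
export GIT_CONFIG_COUNT=1 GIT_CONFIG_KEY_0=protocol.file.allow GIT_CONFIG_VALUE_0=always

git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master   # make sure the branch is called master
git -C upstream commit -q --allow-empty -m "Extension v1"

git init -q project && cd project
git commit -q --allow-empty -m "Project: first commit"
git submodule --quiet add "$sandbox/upstream" extensions/demo_ext
git commit -q -m "Add demo_ext as a submodule"
pinned=$(git ls-tree HEAD extensions/demo_ext | awk '{print $3}')

# The submodule runs ahead of what the project has recorded
git -C "$sandbox/upstream" commit -q --allow-empty -m "Extension v2"
git -C extensions/demo_ext pull -q --ff-only origin master

# A leading "+" marks a submodule whose checkout differs from the recorded commit
status_after_pull=$(git submodule status)
echo "$status_after_pull"

# submodule update checks the recorded commit out again
git submodule --quiet update
now_at=$(git -C extensions/demo_ext rev-parse HEAD)
echo "back at: $now_at"
```

git submodule status is a handy way to spot submodules that have drifted before you run the update.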

How To Update Submodules

So, git submodule update does not update my submodules but puts them back on the commit the project has recorded. How do we update submodules then?

Simple: Since all of your submodules are fully functional repositories themselves you can do a git pull origin master there:

> cd extensions/subsectionmanager
> git pull origin master
> cd ../..

You'll note though that after doing that your submodule and your project are kind of out of sync:

> git diff

  ...

 -Subproject commit f5e13a069e3533fae5b7b5781ab1b545bb0183cb
 +Subproject commit 63cf99ac7854c4e80c1dd18ec84461f5fc1eef14

Your project thinks the extension should be at f5e13a0... while in reality it is at a newer version 63cf99a.... This behaviour is correct and intended.

After updating the extension you'll naturally have to do some testing and maybe some more updates. Only once you've decided the new version is worth keeping do you go ahead and "tell the project" by committing the new submodule version:

> git add extensions/subsectionmanager
> git commit -m "Newest version of SSM"

So, updating extensions is a deliberate and very precise process and doesn't happen at random.
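
If you'd like to see which commits you are about to record before you commit, git diff can list them. The sketch below replays the whole update in a sandbox; the names upstream, project and demo_ext are assumptions for the demo, not part of any real setup.

```shell
set -eu
sandbox=$(mktemp -d) && cd "$sandbox"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
# Allow cloning submodules from local paths (only needed for this sandbox)
export GIT_CONFIG_COUNT=1 GIT_CONFIG_KEY_0=protocol.file.allow GIT_CONFIG_VALUE_0=always

git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master   # make sure the branch is called master
git -C upstream commit -q --allow-empty -m "Extension v1"

git init -q project && cd project
git commit -q --allow-empty -m "Project: first commit"
git submodule --quiet add "$sandbox/upstream" extensions/demo_ext
git commit -q -m "Add demo_ext as a submodule"
pinned=$(git ls-tree HEAD extensions/demo_ext | awk '{print $3}')

# Update the extension inside the submodule
git -C "$sandbox/upstream" commit -q --allow-empty -m "Extension v2"
git -C extensions/demo_ext pull -q --ff-only origin master

# --submodule=log lists the commits between the recorded and the checked-out revision
pending=$(git diff --submodule=log)
echo "$pending"

# Happy with it? Then tell the project
git add extensions/demo_ext
git commit -q -m "Newest version of demo_ext"
recorded=$(git ls-tree HEAD extensions/demo_ext | awk '{print $3}')
echo "project now records: $recorded"
```

Until the final git add and git commit, nothing in the project's history has changed, so you can still back out with git submodule update.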

If you now think "updating all 100 extensions will take ages!" here is a gem for you:

> git submodule foreach git pull origin master

This will do exactly what it sounds like: It will iterate through each and every one of your extensions and run git pull origin master there. Be warned though: side effects may occur. :-)
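
A gentler variant, if those side effects worry you: let foreach only fetch and list what a pull would bring in. This is a sketch in a sandbox with invented names (upstream, project, demo_ext); $name is a variable that git submodule foreach provides to the command it runs.

```shell
set -eu
sandbox=$(mktemp -d) && cd "$sandbox"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
# Allow local-path transport for submodules (only needed for this sandbox)
export GIT_CONFIG_COUNT=1 GIT_CONFIG_KEY_0=protocol.file.allow GIT_CONFIG_VALUE_0=always

git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master   # make sure the branch is called master
git -C upstream commit -q --allow-empty -m "Extension v1"

git init -q project && cd project
git commit -q --allow-empty -m "Project: first commit"
git submodule --quiet add "$sandbox/upstream" extensions/demo_ext
git commit -q -m "Add demo_ext as a submodule"
pinned=$(git ls-tree HEAD extensions/demo_ext | awk '{print $3}')

git -C "$sandbox/upstream" commit -q --allow-empty -m "Extension v2"

# Fetch in every submodule and list the commits a pull would add; nothing is checked out
incoming=$(git submodule --quiet foreach \
  'echo "== $name"; git fetch -q origin && git log --oneline HEAD..origin/master')
echo "$incoming"
```

Because only git fetch runs, every submodule stays on the commit it was on before.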

But if at one point you're not sure you really want to update all those extensions you can simply shove all submodules back into rank and file using

> git submodule update --init

Problems You May Encounter

The most common problem you may have to face is the fact that all submodule remotes must be accessible and readable by everyone in your team.

Read-Only Access

For example, you may be using a submodule that you've written yourself. When you add that submodule to the project, you have to make sure that you use a URL that's accessible by everyone on the project:

> git submodule add git@github.com:nils-werner/dump_db.git extensions/dump_db

will add the extension just as you'd expect. It will work perfectly fine until somebody else has to work on it. In most cases you're the only person able to read from that URL, so your teammates will get permission denied errors.

The correct one would've been the public read-only version: git://github.com/nils-werner/dump_db.git.

Disappearing Users

Other problems may be beyond your control though: Sometimes people leave the community and delete their accounts on GitHub. Or you may have used somebody's fork of an extension on GitHub; a fork that has been deleted in the meantime.

You may be able to find another fork of that extension (obviously one that also has the commit you're interested in) but how do you tell your project and your teammates?

The solution is pretty easy but requires some work by hand:

  1. Edit the entry in .gitmodules, replacing the outdated URL by the new one
  2. Synchronize all your submodule-remotes with the URLs in .gitmodules
  3. Commit the change

> git submodule sync
> git add .gitmodules
> git commit -m "Tracking working repository for Subsection Manager"
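
The three steps above can also be played through in a sandbox. In this sketch, mirror.git is a hypothetical surviving fork, the other names (upstream, project, demo_ext) are made up as well, and git config -f .gitmodules is used instead of a text editor to change the URL.

```shell
set -eu
sandbox=$(mktemp -d) && cd "$sandbox"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
# Allow cloning submodules from local paths (only needed for this sandbox)
export GIT_CONFIG_COUNT=1 GIT_CONFIG_KEY_0=protocol.file.allow GIT_CONFIG_VALUE_0=always

git init -q upstream
git -C upstream commit -q --allow-empty -m "Extension v1"
git clone -q --bare upstream mirror.git        # the fork that is still around

git init -q project && cd project
git commit -q --allow-empty -m "Project: first commit"
git submodule --quiet add "$sandbox/upstream" extensions/demo_ext
git commit -q -m "Add demo_ext as a submodule"
pinned=$(git ls-tree HEAD extensions/demo_ext | awk '{print $3}')

# Make sure the fork really has the commit we depend on (aborts the script if not)
git -C "$sandbox/mirror.git" cat-file -e "$pinned^{commit}"

# 1. Replace the outdated URL in .gitmodules
git config -f .gitmodules submodule.extensions/demo_ext.url "$sandbox/mirror.git"
# 2. Copy the URLs from .gitmodules into the submodule remotes
git submodule --quiet sync
# 3. Commit the change
git add .gitmodules
git commit -q -m "Tracking working repository for demo_ext"

new_origin=$(git -C extensions/demo_ext config remote.origin.url)
echo "submodule origin is now: $new_origin"
```

Your teammates have to run git submodule sync once after pulling that commit, since the URL is also stored in their local configuration.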

Disappearing Code

You're in a bit more trouble if not only the URL of a repo has changed but if you can't find a working fork of that extension at all. It appears you'll be hitting a roadblock and might have to stop using that extension altogether.

But luckily, you're wrong. You do in fact know at least one fork of that extension: The submodule itself, on your machine. That submodule is a perfect clone of the repository you're looking for!

All you need to do is create a repository on, say, GitHub, then change into the submodule folder, add the GitHub repository to its remotes and push to it.

> cd extensions/subsectionmanager
> git remote add myown git@github.com:nils-werner/subsectionmanager.git
> git push myown master

Afterwards, update the URL in .gitmodules and run git submodule sync as described before. Easy.
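
One detail worth checking during such a rescue: after git submodule update the submodule usually sits on a detached HEAD, so its local master branch may not contain the commit your project records. The sketch below therefore pushes whatever is currently checked out. It runs in a sandbox where rescue.git is a hypothetical stand-in for the new, empty GitHub repository, and upstream, project and demo_ext are invented names too.

```shell
set -eu
sandbox=$(mktemp -d) && cd "$sandbox"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
# Allow cloning submodules from local paths (only needed for this sandbox)
export GIT_CONFIG_COUNT=1 GIT_CONFIG_KEY_0=protocol.file.allow GIT_CONFIG_VALUE_0=always

git init -q upstream
git -C upstream commit -q --allow-empty -m "Extension v1"

git init -q project && cd project
git commit -q --allow-empty -m "Project: first commit"
git submodule --quiet add "$sandbox/upstream" extensions/demo_ext
git commit -q -m "Add demo_ext as a submodule"
pinned=$(git ls-tree HEAD extensions/demo_ext | awk '{print $3}')

# The original repository vanishes
rm -rf "$sandbox/upstream"

# Our own empty repository takes its place
git init -q --bare "$sandbox/rescue.git"

cd extensions/demo_ext
git remote add myown "$sandbox/rescue.git"
# Push the checked-out commit as master, which also works from a detached HEAD
git push -q myown HEAD:refs/heads/master

rescued=$(git -C "$sandbox/rescue.git" rev-parse refs/heads/master)
echo "rescued commit: $rescued"
```

If your submodule is on master and up to date, the plain git push myown master from above does the same job.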

Deleting Submodules

Oh the pain when you've accidentally added a submodule to a wrong folder or you've found out you won't be needing it in your project. There is no git submodule rm command and Git will be quite annoying if you removed the folder by hand. One could be tempted to start all over again if that happens.

While it might be a bit less comfortable than a submodule rm command, the solution is actually pretty simple too:

First, we have to remove the following from .git/config:

[submodule "extensions/subsectionmanager"]
        url = git://github.com/nilshoerrmann/subsectionmanager.git

then a pretty similar block from .gitmodules:

[submodule "extensions/subsectionmanager"]
        path = extensions/subsectionmanager
        url = git://github.com/nilshoerrmann/subsectionmanager.git

Afterwards we delete the folder and commit the changes:

> rm -rf extensions/subsectionmanager
> git add -u
> git commit -m "Subsection Manager removed"
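
If your Git is recent enough, you may not have to edit those files by hand. The following sketch assumes a Git version that ships git submodule deinit and a git rm that understands submodules (both arrived in the 1.8.x series, as far as I know). It runs in a sandbox with invented names (upstream, project, demo_ext).

```shell
set -eu
sandbox=$(mktemp -d) && cd "$sandbox"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
# Allow cloning submodules from local paths (only needed for this sandbox)
export GIT_CONFIG_COUNT=1 GIT_CONFIG_KEY_0=protocol.file.allow GIT_CONFIG_VALUE_0=always

git init -q upstream
git -C upstream commit -q --allow-empty -m "Extension v1"

git init -q project && cd project
git commit -q --allow-empty -m "Project: first commit"
git submodule --quiet add "$sandbox/upstream" extensions/demo_ext
git commit -q -m "Add demo_ext as a submodule"

# Drop the entry from .git/config and empty the submodule folder
git submodule --quiet deinit -f extensions/demo_ext
# Remove the folder, the recorded commit and the block in .gitmodules, all staged
git rm -q -f extensions/demo_ext
# Discard the cached clone that newer Git versions keep inside .git/modules
rm -rf .git/modules/extensions/demo_ext
git commit -q -m "demo_ext removed"

git status --short
```

On older Git versions the manual route described above remains the way to go.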

Conclusion

As you can see, handling submodules is a pretty straightforward process if you treat them like the things they are: Two entirely separate repositories, both with their own workspace, log and remotes.

You can do all the usual commands in both of them without disturbing the other. The only difference is: Your project always tracks what revision your submodules are at and asks you to commit the new state after you've updated one of them.

Author

phoque Nils Werner Germany
