SVN externs are terrible and you shouldn’t use them. Also, don’t use SVN at all if you can avoid it, though most modern organizations have likely moved off it already.
What is SVN?
Before digging into the meat of why svn externs are terrible, let’s start off with what SVN (Subversion) is. SVN is a system used to store source code and assets for software projects. The main repository is stored on a remote server and is represented as a collection of files and folders. You, as an SVN user, can check out any subfolder or file within a repository. If an organization wants multiple concurrent projects, they’re stored in separate folders.
For one repository there is only a single history stored on the server. It marches forward linearly, tracked by a revision id, which is just a number counting up from the first commit. So you will see r1 (revision 1) all the way up to whatever stage the current history is in, e.g. r14234. What about tagging a copy of the source code, or branching it to do separate development which later on gets merged? Well, SVN 'supports' those features, but at the same time it doesn’t really… A tag or a branch is just a copy of the files in a project under another named folder. There are conventions about the names and locations of those new folders, but it’s IMO a fairly weak feature.
What are SVN externs?
Let’s say that you have in your SVN repository two projects:
    /
      project1/
        trunk/
        branches/
      project2/
        trunk/
        branches/
Perhaps project 2 uses project 1, or in other words it depends on that project’s code and resources. Dependency management is tricky and there are a number of approaches, one of which is to copy code from project 1 into project 2, so when you build project 2, you end up building both projects together. However, if you copy code from project 1, the copy can get out of date. So you can reference the code from project 1 instead of copying it. That’s a good thing, right?
Well…… no, not from my point of view, at least not without additional semantics. Just inserting a reference isn’t a great thing, because when you’re working on project 2 you shouldn’t be negatively impacted by changes in project 1. You don’t want breaking changes to interrupt your development unless you’ve agreed to try them out. SVN externs (formally, the svn:externals property set on a folder) will take a copy of whatever you point them at, which can be used responsibly.
Consider the previous example again:
    /
      project1/
        trunk/
          src/
            library.h
        branches/
          branch1/
            src/
              library.h
        tags/
          release_v1/
            src/
              library.h
          release_v2/
            src/
              library.h
      project2/
        trunk/
          dependencies/
            $external_resource
        branches/
Here $external_resource can point to a known working copy of project 1 which isn’t expected to change, e.g. /project1/tags/release_v2. If that’s the target, then when release_v3 is tagged, someone in project 2 can update the library they’ve chosen to use to /project1/tags/release_v3. But here’s where svn externs get to be evil: $external_resource can be /project1/trunk/, /project1/branches/branch1, an individual file like /project1/trunk/src/library.h, or even a reference to a different SVN repository that you don’t control at all, e.g. https://example.com/totally/not/your/repository/library.h.
If it’s any of those locations, the developers on project 1 can make changes which, without any release process, are immediately reflected in project 2. Why is that bad? Well, SVN doesn’t tell the developers on project 1 who is using their code. Who knows if a single file that you made is essentially getting copied all over the place? Who knows if your self-consistent build process is going to be scrambled by being only partially copied elsewhere?
Since svn externs make it just as easy to grab a piece of a library as the whole thing, and a live version as a tagged one, they encourage other projects to use internals (they don’t need the whole library) and development copies (who wants to update versions later on?), which ends poorly for the people working on the project that’s getting sliced up. Those developers are left guessing who is using their code and how, since unless you write additional tooling it’s near impossible to know how many externs there are within a given repo.
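As a rough idea of what that "additional tooling" could look like, here is a minimal Python sketch that sorts extern definitions into pinned, tagged, and floating ones. It assumes the newer definition format (Subversion 1.5 and later), where each definition reads `[-r REV] TARGET[@PEG] LOCALDIR`, and it assumes you have already collected the definitions as plain strings, for instance from the output of svn propget. The names `classify_external` and `floating_externals` are made up for this example, and quoted paths containing spaces are not handled.

```python
import re


def classify_external(definition):
    """Classify one extern definition as 'pinned', 'tag' or 'floating'.

    Assumes the format: [-r REV] TARGET[@PEG] LOCALDIR
    """
    tokens = definition.split()
    pinned = False
    rest = []
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok == "-r" and i + 1 < len(tokens):
            # "-r 14234": the revision number is the next token
            pinned = True
            i += 2
            continue
        if re.fullmatch(r"-r\d+", tok):
            # "-r14234": flag and revision number in one token
            pinned = True
            i += 1
            continue
        rest.append(tok)
        i += 1

    if len(rest) != 2:
        raise ValueError("unrecognized extern definition: %r" % definition)
    target, _local_dir = rest

    # A peg revision such as "...@14234" also fixes the content.
    if re.search(r"@\d+$", target):
        pinned = True

    if pinned:
        return "pinned"
    if "/tags/" in target:
        # Only a naming convention: nothing stops commits under tags/.
        return "tag"
    return "floating"


def floating_externals(definitions):
    """Return the definitions that follow a moving target."""
    return [d for d in definitions if classify_external(d) == "floating"]


if __name__ == "__main__":
    sample = [
        "^/project1/tags/release_v2 project1",
        "^/project1/trunk/src/library.h library.h",
        "-r 14234 ^/project1/branches/branch1 branch1",
    ]
    for definition in floating_externals(sample):
        print("floating extern:", definition)
```

Treating anything under tags/ as stable is a judgment call. Since a tag is just another folder, only a revision number truly freezes what an extern pulls in.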
If you think that’s bad, let me remark that recursive SVN externs are a thing, and I’m thankful that they appear to be rarely used. At least, that’s my guess based on both git-svn and git-svn-ext struggling with those cases. Heck, the last time I tried it, git-svn barely supported SVN externs at all.
Along the same lines of tooling breaking on externs, even good old command-line svn starts to get ugly with externs. svn status, like other SCM status tools, will output one line per file that’s modified.
So you might see:
    D oldcode.rb
    A newcode.py
    M workingcode.sh
That status would indicate that you deleted a ruby file, added a python file, and modified a shell script. The repository can contain many more files, but if they’re unmodified then you don’t get any status information on them, because obviously you only care about what’s changed. Well, welcome to the world of SVN externs. Every Single Time you run svn status in a repository that has an extern, it will show an X status on the file/folder that’s an extern reference. It doesn’t matter if it’s modified or not.
So for a repository where you haven’t made any modifications, you might see:
    X internal/project1
    X vendor/libA
    X vendor/libB
    X vendor/libC
    Performing status on external item at 'internal/project1':
    X internal/project1/recursive-extern
    Performing status on external item at 'internal/project1/recursive-extern':
    Performing status on external item at 'vendor/libA':
    Performing status on external item at 'vendor/libB':
    Performing status on external item at 'vendor/libC':
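If you’re stuck with output like that, one workaround is to filter it yourself. The Python sketch below drops the X lines and the "Performing status on external item" headers while keeping everything else, including real changes inside an extern. The function name `hide_external_noise` is made up for this example, and it assumes the status letter sits in the first column of each line, as in the listing above. svn status also accepts an --ignore-externals option; how much of the noise it removes is worth checking against your client version.

```python
def hide_external_noise(status_output):
    """Return the svn status lines that describe actual changes."""
    kept = []
    for line in status_output.splitlines():
        if not line.strip():
            # Blank separator lines carry no information.
            continue
        if line.startswith("Performing status on external item at"):
            # Header printed once per extern, modified or not.
            continue
        if line[0] == "X":
            # First-column X only marks an extern reference.
            continue
        kept.append(line)
    return kept


if __name__ == "__main__":
    sample = "\n".join([
        "X       internal/project1",
        "M       workingcode.sh",
        "",
        "Performing status on external item at 'internal/project1':",
        "X       internal/project1/recursive-extern",
    ])
    for line in hide_external_noise(sample):
        print(line)
```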
Not to mention that working with svn externs isn’t done via some 'svn extern' command; it’s done via 'svn propget', 'svn propset', etc. It’s an unpleasant experience through and through.
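To make that concrete, here is a small Python sketch that only builds the command lines involved in pointing a folder at an extern, without running anything. The helper name `externals_commands` and the folder and target values are made up for illustration, and the definition uses the newer `TARGET LOCALDIR` format with a `^/` path relative to the repository root.

```python
import shlex


def externals_commands(directory, target, local_name):
    """Build the svn invocations for inspecting and setting an extern.

    Nothing is executed; each command is returned as an argv list.
    """
    definition = "%s %s" % (target, local_name)
    return [
        # Show whatever extern definitions the folder already has.
        ["svn", "propget", "svn:externals", directory],
        # Replace the property value with the new definition.
        ["svn", "propset", "svn:externals", definition, directory],
        # Fetch the extern into the working copy.
        ["svn", "update", directory],
    ]


if __name__ == "__main__":
    commands = externals_commands(
        "dependencies", "^/project1/tags/release_v2", "project1"
    )
    for argv in commands:
        print(shlex.join(argv))
```

Note that propset replaces the whole property value, so a folder with several externs needs all of its definitions passed together (or edited with svn propedit), and the property change still has to be committed like any other change.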
In conclusion, next time you hear someone complain about git submodules, feel free to say "hey, at least it isn’t svn externs".