Nov 2008

Coherence & Groupthink

Charles Petzold, one of the most famous authors of Windows programming books out there, wrote a great entry on his blog over a year ago that I’ve been meaning to comment on:

Once you’ve restricted yourself to information that turns up in Google searches, you begin having a very distorted view of the world.

On the Internet, everything is in tiny pieces. The typical online article or blog entry is 500, 1000, maybe 1500 words long. Sometimes somebody will write an extended “tutorial” on a topic, possibly 3,000 words in length, maybe even 5,000.

It’s easy to convince oneself that these bite-sized chunks of prose represent the optimum level of information granularity. It is part of the utopian vision of the web that this plethora of loosely-linked pages synergistically becomes all the information we need.

This illusion is affecting the way we learn, and I fear that we’re not getting the broader, more comprehensive overview that only a book can provide. A good author will encounter an unwieldy jungle of information and cut a coherent path through it, primarily by imposing a kind of narrative over the material. This is certainly true of works of history, biography, science, mathematics, philosophy, and so forth, and it is true of programming tutorials as well.

Sometimes you see somebody attempting to construct a tutorial narrative by providing a series of successive links to different web pages, but it never really works well because it lacks an author who has spent many months (or a year or more) primarily structuring the material into a narrative form.

For example, suppose you wanted to learn about the American Civil War. You certainly have plenty of online access to Wikipedia articles, blog entries, even scholarly articles. But I suggest that assembling all the pieces into a coherent whole is something best handled by a trained professional, and that’s why reading a book such as James McPherson’s Battle Cry of Freedom will give you a much better grasp of the American Civil War than hundreds of disparate articles.

If I sound elitist, it’s only because the time and difficulty required for wrapping a complex topic into a coherent narrative is often underestimated by those who have never done it. A book is not 150 successive blog entries, just like a novel isn’t 150 character sketches, descriptions, and scraps of dialog.

A related point I’d like to make is that people tend to read things that reinforce their viewpoints, and avoid things that go against their beliefs. If you’re a left-wing commie pinko in Sydney, you’re probably more likely to read the Sydney Morning Herald as your newspaper; if you’re a right-wing peacenik, you’ll probably prefer The Australian instead. If you’re a functional programming maven who sneers at C, you probably hang around Haskell or O’Caml or Erlang or Scheme geeks. If you’re a Mac programmer, you talk all day about how beautiful and glorious the Cocoa frameworks are, and probably have a firm hatred of C++ (even though there’s a decent chance you’ve never even used the language).

Hang around with other cultures sometimes. Like travelling, it’s good for you; it broadens your perspective, and gives you a better understanding of your own culture. The human nature of seeking confirmation of your own viewpoints, combined with Petzold’s astute observations about learning in bite-sized chunks, means that it’s incredibly easy to find information on the Internet that only explains one side of the story. How many people on your frequented mailing lists, IRC channels, Web forums or Twitter friends have similar opinions to you, and how many people in those communities truly understand other systems and have been shot down whenever they’ve tried to justify something valid that’s contrary to the community’s popular opinion? I’m not saying that hanging around like-minded communities is a bad idea; I’m simply saying to be aware of groupthink and self-reinforcing systems, and break out of your comfort zone sometimes to learn something totally different and contrary to what you’re used to. Make the effort to find out the whole picture; don’t settle for some random snippets of short tidbits that you read somewhere on the Web. Probably the best article I’ve ever read on advocacy is Mark-Jason Dominus’s Why I Hate Advocacy piece, written eight years ago in 2000. It still holds true today.


git-svn & svn:externals

I’ve written before about git-svn and why I use it, but a major stumbling block with git-svn has been a lack of support for svn:externals. If your project’s small and you have full control over the repository, you may be fortunate enough to not have any svn:externals definitions, or perhaps you can restructure your repository so you don’t need them anymore and live in git and Subversion interoperability bliss.

However, many projects absolutely require svn:externals, and once you start having common libraries and frameworks that are shared amongst multiple projects, it becomes very difficult to avoid svn:externals. What to do for the git-svn user?

If you Google around, it’s easy enough to find solutions out there, such as git-me-up, step-by-step tutorials, explanations about using git submodules, and an overview of all the different ways you can integrate the two things nicely. However, I didn’t like any of those solutions: either they required too much effort, were too fragile and could break easily if you did something wrong with your git configuration, or were simply too complex for such a seemingly simple problem. (Ah, I do like dismissing entire classes of solutions by hand-waving them as over-engineering.)

So, in the great spirit of scratching your own itch, here’s my own damn solution:


This is a very simple shell script to make git-svn clone your svn:externals definitions. Place the script in a directory where you have one or more svn:externals definitions, run it, and it will:

  • git svn clone each external into a .git_externals/ directory.
  • symlink the cloned repository in .git_externals/ to the proper directory name.
  • add the symlink and .git_externals/ to the .git/info/exclude file, so that you’re not pestered about it when performing a git status.
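
For the curious, the three steps above could be sketched roughly like this. It is an illustration only, not the script itself: the function names (add_exclude, clone_external, clone_externals) are made up for this example, and it assumes each externals definition is a plain "<local-dir> <URL>" line with a single-component directory name, read from standard input while sitting at the top level of the git repository.

```shell
#!/bin/sh
# Illustrative sketch only. Assumptions: one "<local-dir> <URL>" pair per line
# on stdin, single-component directory names, run from the repository top level.

# Append a pattern to .git/info/exclude unless it is already listed there.
add_exclude() {
    mkdir -p .git/info
    touch .git/info/exclude
    grep -qxF -- "$1" .git/info/exclude || printf '%s\n' "$1" >> .git/info/exclude
}

# Clone one external into .git_externals/ and symlink it to its proper name.
clone_external() {
    local_dir=$1
    url=$2
    mkdir -p .git_externals
    # Skip the clone if an earlier run already fetched this external.
    [ -d ".git_externals/$local_dir" ] || git svn clone "$url" ".git_externals/$local_dir"
    # Only create the symlink if nothing is in the way.
    [ -e "$local_dir" ] || ln -s ".git_externals/$local_dir" "$local_dir"
    # Keep both the clone directory and the symlink out of git status.
    add_exclude ".git_externals"
    add_exclude "$local_dir"
}

# Read externals definitions from stdin, skipping blank lines and comments.
clone_externals() {
    while read -r local_dir url; do
        case $local_dir in ''|'#'*) continue ;; esac
        clone_external "$local_dir" "$url"
    done
}
```

With those functions loaded, you would feed it the definitions, for example clone_externals < externals.txt, where externals.txt is a hypothetical file holding the "<local-dir> <URL>" lines. Real svn:externals properties come in more than one format, so expect to massage them into that shape first.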

That’s pretty much about it. Low-tech and cheap and cheery, but I couldn’t find anything else like it after extensive Googling, so hopefully some other people out there with low-tech minds like mine will find this useful.

You could certainly make the script a lot more complex and do things such as share svn:externals repositories between different git repositories, traverse through the entire git repository to detect svn:externals definitions instead of having to place the script in the correct directory, etc… but this works, it’s simple, and it does just the one thing, unlike a lot of other git/svn integration scripts that I’ve found. I absolutely do welcome those features, but I figured I’d push this out since it works for me and is probably useful for others.

The source is on at Have fun subverting your Subversion overlords!