All Hail the DeathStation 9000
While learning about the immaculate DeathStation 9000, I came across the homepage of Armed Response Technologies. That page nearly had me snort coffee out of my nose on my beautiful new 30” LCD monitor. Make very sure you see their bloopers page.
(This is quite possibly the best thing to ever come out of having so much stupidit… uhh, I mean, undefined behaviour, in C.)
The Problem with Threads
If you haven’t had much experience with the wonderful world of multithreading and don’t yet believe that threads are evil1, Edward A. Lee has an excellent essay named “The Problem with Threads”, which challenges you to solve a simple problem: write a thread-safe Observer design pattern in Java. Good luck. (Non-Java users who scoff at Java will often fare even worse, since Java is one of the few languages with some measure of in-built concurrency control primitives—even if those primitives still suck.)
His paper’s one of the best introductory essays I’ve read about the problems with shared state concurrency. (I call it an essay since it really reads a lot more like an essay than a research paper. If you’re afraid of academia and its usual jargon and formal style, don’t be: this paper’s an easy read.) For those who aren’t afraid of a bit of formal theory and maths, he presents a simple, convincing explanation of why multithreading is an inherently complex problem, using the good ol’ explanation of computational interleavings of sets of states.
His essay covers far more than just the problem of inherent complexity, however: Lee then discusses how bad threading actually is in practice, along with some software engineering improvements such as OpenMP, Tony Hoare’s idea of Communicating Sequential Processes2, Software Transactional Memory, and Actor-style languages such as Erlang. Most interestingly, he discusses why programming languages aimed at concurrency, such as Erlang, won’t succeed in the main marketplace.
Of course, how can you refuse to read a paper that has quotes such as these?
- “… a folk definition of insanity is to do the same thing over and over again and to expect the results to be different. By this definition, we in fact require that programmers of multithreaded systems be insane. Were they sane, they could not understand their programs.”
- “I conjecture that most multi-threaded general-purpose applications are, in fact, so full of concurrency bugs that as multi-core architectures become commonplace, these bugs will begin to show up as system failures. This scenario is bleak for computer vendors: their next generation of machines will become widely known as the ones on which many programs crash.”
- “Syntactically, threads are either a minor extension to these languages (as in Java) or just an external library. Semantically, of course, they rhoroughly disrupt the essential determinism of the languages. Regrettably, programmers seem to be more guided by syntax than semantics.”
- “… non-trivial multi-threaded programs are incomprehensible to humans. It is true that the programming model can be improved through the use of design patterns, better granularity of atomicity (e.g. transactions), improved languages, and formal methods. However, these techniques merely chip away at the unnecessarily enormous non-determinism of the threading model. The model remains intrinsically intractable.” (Does that “intractable” word remind you of anyone else?)
- “… adherents to… [a programming] language are viewed as traitors if they succumb to the use of another language. Language wars are religious wars, and few of these religions are polytheistic.”
If you’re a programmer and aren’t convinced yet that shared-state concurrency is evil, please, read the paper. Please? Think of the future. Think of your children.
1 Of course, any non-trivial exposure to multithreading automatically implies that you understand they are evil, so the latter part of that expression is somewhat superfluous.
2 Yep, that Tony Hoare—you know, the guy who invented Quicksort?
svk and the Psychological Effect of Fast Commits
svk—a distributed Subversion client by Chia Liang Kao and company—is now an essential part of my daily workflow. I’ve been using it almost exclusively for the past year on the main projects that I work with, and it’s fantastic being able to code when you’re on the road and do offline commits, syncing back to the main tree when you’re back online. Users of other distributed revision control systems do, of course, get these benefits, but svk’s ability to work with existing Subversion repositories is the killer reason to use it. (I’m aware that Bazaar has some Subversion integration now, but it’s still considered alpha, whereas svk has been very solid for a long time now.)
The ability to do local checkins with a distributed revision control client has a nice side-effect: commits are fast. They typically take around two seconds with svk. A checkin from a non-distributed revision control client such as Subversion requires a round-trip to the server. This isn’t too bad on a LAN, but even for a small commit, it can take more than 10 or 15 seconds to a server on the Internet. The key point is that these fast commits have a psychological effect: having short commit times encourages you to commit very regularly. I’ve found that since I’ve switched to svk, not only can I commit offline, but I commit much more often: sometimes half a dozen times inside of 10 minutes. (svk’s other cool feature of dropping files from the commit by deleting them from the commit message also helps a lot here.) Regular commits are always better than irregular commits, because either (1) you’re committing small patches that are easily reversible, and/or (2) you’re working very prolifically. Both of these are a win!
So, if you’re still using Subversion, try svk out just to get the benefits of this and its other nifty features. The svk documentation is quite sparse, but there are some excellent tutorials that are floating around the ‘net.