Extensible Design in C++

Namespaces vs Static Methods

In C++, if you want a bunch of plain functions that you can put into a library, you can either do this:

class UtilityFunctions
{
public:
  static void Foo();
  static void Bar();
};

or this:

namespace UtilityFunctions
{
  void Foo();
  void Bar();
}

Either way, you call the functions with the same syntax:

void SomeFunction()
{
  UtilityFunctions::Foo();
  UtilityFunctions::Bar();
}

So, what’s the difference between a class full of static methods and a namespace? While I’m sure there are plenty of differences in how they’re implemented internally, the big difference is that namespaces are extensible, while a class full of static methods isn’t. A namespace can be reopened as many times as you like, so in another file, you can just add more functions to it:

namespace UtilityFunctions
{
  void Baz();
  void Quux();
}

but you can’t add more static methods to the class.
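To make that concrete, here's a minimal sketch that squeezes both "files" into one listing. The function bodies and the std::string return types are my own assumptions for illustration (the declarations above return void); the only point is that the second namespace block adds to the first rather than replacing it.

```cpp
#include <cassert>
#include <string>

// "File" 1: the library's original functions (hypothetical bodies).
namespace UtilityFunctions
{
  inline std::string Foo() { return "Foo"; }
  inline std::string Bar() { return "Bar"; }
}

// "File" 2: user code reopens the same namespace and adds to it.
// Existing members such as Foo() are visible without qualification.
namespace UtilityFunctions
{
  inline std::string Baz() { return Foo() + "+Baz"; }
}

// Callers can't tell which file a function came from.
inline std::string CallAll()
{
  return UtilityFunctions::Foo() + "," +
         UtilityFunctions::Bar() + "," +
         UtilityFunctions::Baz();
}

// Hypothetical usage check.
inline void NamespaceDemo()
{
  assert(CallAll() == "Foo,Bar,Foo+Baz");
}
```

Try the same trick with a second class UtilityFunctions { ... }; block and the compiler will reject it as a redefinition.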

The Expression Problem

This is rather handy, since it means that C++ can quite nicely solve the expression problem, a problem that plagues nearly all modern programming languages, even ones with expressive type systems such as Haskell and OCaml. (Note that the expression problem specifically concerns statically typed languages, so while there are solutions for it in modern dynamic languages such as Python, Perl and Ruby, they don’t really count since they’re not statically typed. It’s easy to solve the problem if you’re prepared to throw away all notions of type safety at compile time!)

The expression problem is basically this:

  • You want to be able to add new data types, and have existing functions work on those data types. This is easy in an object-oriented language: just subclass the existing data types, and all existing functions will work just fine with your new subclass. This is (very) hard in a functional language, because if you add new cases to a variant type, you must update every pattern match to work properly with the new case.
  • However, you also want to add new functions that will work with those data types. This is very easy in a functional language: just define a new function. This is solvable in an object-oriented language, but isn’t very elegant, because most object-oriented languages can’t add new methods to existing classes (Objective-C is a notable exception; see the footnote below). This means that you are forced to declare a function when you really wanted to add a new method to the class, or, in OO languages which don’t even have normal functions (e.g. Java), you have to declare a totally new class with a static method instead. Ouch.

However, since C++ provides (1) objects, (2) normal functions, and (3) extensible namespaces, you can solve the expression problem nicely using the above techniques: subclass to add new data types, and reopen the namespace to add new functions. It still requires some forethought, since you have to plan to use a namespace for any set of functions that you expect to extend, but it’s an elegant solution to the expression problem, as opposed to no solution or a crappy solution. (And I thought I’d never say “C++” and “elegant” in the same sentence).
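Here's a rough sketch of both directions of extension in one listing. Shape, Square, Rect and the ShapeFunctions namespace are all names I've made up for the example, and I'm assuming a C++11 compiler for override. Note that the new function is still an ordinary free function, so it can only use the public interface of the types it works on.

```cpp
#include <cassert>

// Library code: an abstract data type, one concrete type, one function.
class Shape
{
public:
  virtual ~Shape() {}
  virtual double Area() const = 0;
};

class Square : public Shape
{
public:
  explicit Square(double side) : side_(side) {}
  double Area() const override { return side_ * side_; }
private:
  double side_;
};

namespace ShapeFunctions
{
  inline double DoubleArea(const Shape& s) { return 2.0 * s.Area(); }
}

// User code, direction 1: a new data type. The existing DoubleArea()
// works on it unchanged.
class Rect : public Shape
{
public:
  Rect(double w, double h) : w_(w), h_(h) {}
  double Area() const override { return w_ * h_; }
private:
  double w_, h_;
};

// User code, direction 2: a new function, added to the same namespace.
// It works on the library's types and the user's types alike.
namespace ShapeFunctions
{
  inline bool IsLargerThan(const Shape& a, const Shape& b)
  {
    return a.Area() > b.Area();
  }
}

// Hypothetical usage check.
inline void ExpressionProblemDemo()
{
  Square sq(2.0);
  Rect r(2.0, 3.0);
  assert(ShapeFunctions::DoubleArea(r) == 12.0);  // old function, new type
  assert(ShapeFunctions::IsLargerThan(r, sq));    // new function, old type
}
```

Neither extension touched the library's source, and everything is still checked at compile time.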

Extensible Object Factories

There’s one more piece to the puzzle, however. If you’re making your own new subclass, you also want to be able to create objects of that class. However, what if you only know the exact type of object you want to create at runtime? In that case you can’t hard-code the constructor call, so use a runtime-extensible object factory instead.

Let’s say you’re designing an extensible image library, to read a bunch of image formats such as JPG, PNG, GIF, etc. You can design an abstract Image class that a JPGImage, PNGImage, or GIFImage can then subclass. If you want a uniform interface to create such images, you can use the factory design pattern:

Image* image = ImageFactory::CreateImage("/path/to/image");

In this case, CreateImage() is a factory function that will return you an appropriate Image* object. (Well, if you’re really disciplined, you’ll be using the wonderful boost::shared_ptr rather than an evil raw pointer, but I digress…)

Now, let’s say you want to make this library extensible, so users could add in their own JPEG2000Image subclass outside of your library. How, then, do you let the CreateImage() function know about the user’s new JPEG2000Image class?

There are plenty of solutions to this, but since this is meant to be a didactic post, here’s a cheap’n’cheery solution for you: use a data structure (a map, say) to hold references to the functions that are responsible for creating each different type of image (JPGImage, PNGImage, etc.). You can then add to the data structure at runtime (this is usually called registering the creation function). The CreateImage() function can then look up the registered functions in the extensible data structure and call the appropriate one, no matter whether the image class is provided by your library (JPG, PNG), or by the user (JPEG2000).
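A minimal sketch of that idea follows, with a few assumptions of my own: the registry is keyed by file extension, CreateImage() returns a std::unique_ptr rather than a raw Image*, and Register(), Registry(), Format() and the two RegisterXxxFormats() helpers are hypothetical names. A real library would more likely sniff the file's contents than trust its extension.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

class Image
{
public:
  virtual ~Image() {}
  virtual std::string Format() const = 0;
};

namespace ImageFactory
{
  // A creation function takes the path and returns a new Image.
  typedef std::function<std::unique_ptr<Image>(const std::string&)> Creator;

  // The extensible data structure. A function-local static is
  // constructed on first use, so registration is safe at any time.
  inline std::map<std::string, Creator>& Registry()
  {
    static std::map<std::string, Creator> registry;
    return registry;
  }

  inline void Register(const std::string& extension, Creator creator)
  {
    Registry()[extension] = std::move(creator);
  }

  // Looks up the creator by file extension; returns null if none matches.
  inline std::unique_ptr<Image> CreateImage(const std::string& path)
  {
    const std::string::size_type dot = path.rfind('.');
    if (dot == std::string::npos) return nullptr;
    auto it = Registry().find(path.substr(dot + 1));
    if (it == Registry().end()) return nullptr;
    return it->second(path);
  }
}

// Library side: a built-in format.
class PNGImage : public Image
{
public:
  std::string Format() const override { return "PNG"; }
};

inline void RegisterBuiltinFormats()
{
  ImageFactory::Register("png", [](const std::string&) {
    return std::unique_ptr<Image>(new PNGImage);
  });
}

// User side: a new format the library has never heard of.
class JPEG2000Image : public Image
{
public:
  std::string Format() const override { return "JPEG2000"; }
};

inline void RegisterUserFormats()
{
  ImageFactory::Register("jp2", [](const std::string&) {
    return std::unique_ptr<Image>(new JPEG2000Image);
  });
}

// Hypothetical usage check.
inline void FactoryDemo()
{
  RegisterBuiltinFormats();
  RegisterUserFormats();
  assert(ImageFactory::CreateImage("/path/to/a.png")->Format() == "PNG");
  assert(ImageFactory::CreateImage("/path/to/b.jp2")->Format() == "JPEG2000");
  assert(ImageFactory::CreateImage("/path/to/c.bmp") == nullptr);
}
```

CreateImage() never mentions JPEG2000Image by name; it only ever sees whatever happens to be in the registry when it's called.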

If you put together all the above techniques, what you have is a fully extensible framework. A user can:

  • register new data types with the library at run-time,
  • use exactly the same interface to create new types of objects,
  • add new functions to your library without the awkwardness of using a different namespace,
  • … and still retain complete static type safety.

Footnote: Objective-C has a particularly interesting solution to the expression problem, via categories, which are statically type-checked despite Objective-C being a “dynamic” language.
