A Call For A Filesystem Abstraction Layer

Filesystems are fundamental things for computer systems: after all, you need to store your data somewhere, somehow. Modern operating systems largely use the same concepts for filesystems: a file’s just a bucket that holds some bytes, and files are organised into directories, which can be hierarchical. Some fancier filesystems keep track of file versions and have record-based I/O, and many filesystems now have multiple streams and extended attributes. However, filesystem organisation ideas have stayed largely the same over the past few decades.

I’d argue that the most important stuff on your machine is your data. There are designated places to put this data. On Windows, all the most important stuff on my computer should really live in C:\Documents and Settings\Andre Pang\My Documents. In practice, because of lax permissions, almost everyone I know doesn’t store their documents there: they splatter them over a bunch of random directories hanging off C:\, resulting in a giant lovely mess of directories where some are owned by applications and the system, and some are owned by the user.

Mac OS X is better, but only because I have discipline: /Users/andrep is where I put my stuff. However, I’ve seen plenty of Mac users have data and personal folders hanging off their root directory. I’m pretty sure that my parents have no idea that /Users/your-name-here is where you are meant to put your stuff, and to this day, I’m not quite sure where my dad keeps all his important documents on his Mac. I hope it’s in ~/Documents, but if not, can I blame him? (UNIX only gets this right because it enforces permissions on you. Try saving to / and it won’t work. If you argue that this is a good thing, you’re missing the point of this entire article.)

One OS that actually got this pretty much correct was classic Mac OS: all system stuff went into the System folder (which all the Cool Kids named “System ƒ”, of course). The system essentials were contained in just two files: System, and Finder, and you could even copy those files to any floppy disk and make a bootable disk (wow, imagine that). The entire rest of the filesystem was yours: with the exception of the System folder, you organised the filesystem as you pleased, rather than the filesystem enforcing a hierarchy on you. The stark difference in filesystem organisation between classic Mac OS and Mac OS X is largely due to a user-centric approach for Mac OS from the ground up, whereas Mac OS X had to carry all its UNIX weight with it, so it had to compromise and use a more traditionally organised computer filesystem.

As an example, in Mac OS X, if you want to delete Photoshop’s preferences, you delete the ~/Library/Preferences/com.adobe.Photoshop.plist file. Or maybe you should call it the Bibliothèque folder in France (because that’s what it’s displayed as in the Finder if you switch to French)… and why isn’t the Preferences folder name localised too, and what mere mortal is going to understand why it’s called com.adobe.Photoshop.plist? On a technical level, I completely understand why the Photoshop preferences file is in the ~/Library/Preferences/ directory. But at a user experience level, this is a giant step backwards from Mac OS, where you simply went to the System folder and you trashed the Adobe Photoshop Preferences file there. How is this progress?

I think the fundamental problem is that Windows Explorer, Finder, Nautilus and all the other file managers in the world are designed, by definition, to browse the filesystem. However, what we really want is an abstraction level for users that hides the filesystem from them, and only shows them relevant material, organised in a way that’s sensible for them. The main “file managers” on desktop OSs (Finder and Windows Explorer) should be operating at an abstraction level above the filesystem. The operating system should figure out where to put files on a technical (i.e. filesystem) level, but the filesystem hierarchy should be completely abstracted so that a user doesn’t even realise their stuff is going into /Users/joebloggs.

iTunes and iPhoto are an example of what I’m advocating, because they deal with all the file management for you. You don’t need to worry where your music is or how your photos are organised on the filesystem: you just know about songs and photos. There’s no reason why this can’t work for other types of documents too, and there’s no reason why such an abstracted view of the filesystem can’t work on a systemwide basis. It’s time for the operating system to completely abstract out the filesystem from the user experience, and to turn our main interaction with our documents—i.e. the Finder, Windows Explorer et al—into something that abstracts away the geek details to a sensible view of the documents that are important to us.
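To make the idea a bit more concrete, here is a minimal sketch of what such a layer could look like if you squint. Everything in it is hypothetical: the `Document` and `DocumentLibrary` names, the hashed on-disk layout and the in-memory index are my own assumptions for illustration, not how iTunes, iPhoto or any real OS does it. The point is only that callers deal in kinds, titles and tags, and never pick a path.

```python
import hashlib
import tempfile
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class Document:
    """What the user sees: a kind, a title and some tags. No path."""
    kind: str
    title: str
    tags: set = field(default_factory=set)


class DocumentLibrary:
    """Hypothetical abstraction layer: callers never choose a path."""

    def __init__(self, root):
        self._root = Path(root)
        self._index = {}  # (kind, title) -> (Document, Path)

    def save(self, doc, data):
        # The library, not the user, decides where the bytes live.
        # Hashing the kind and title is an arbitrary layout choice.
        name = hashlib.sha1(f"{doc.kind}/{doc.title}".encode("utf-8")).hexdigest()
        path = self._root / doc.kind / name
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)
        self._index[(doc.kind, doc.title)] = (doc, path)

    def load(self, kind, title):
        _doc, path = self._index[(kind, title)]
        return path.read_bytes()

    def find(self, kind=None, tag=None):
        """Browse by kind and/or tag, sorted by title."""
        hits = [doc for doc, _path in self._index.values()
                if (kind is None or doc.kind == kind)
                and (tag is None or tag in doc.tags)]
        return sorted(hits, key=lambda d: d.title)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        library = DocumentLibrary(root)
        library.save(Document("photo", "Dunedin harbour", {"holiday"}), b"jpeg bytes")
        library.save(Document("song", "Demo track"), b"audio bytes")
        print([d.title for d in library.find(tag="holiday")])
```

A real version would need a persistent index and some answer to two documents sharing a title, but the shape of the interface is the interesting part: `save`, `load` and `find` take document-level information only.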

One modern operating system has already done this: iOS. iOS is remarkable for being an OS that I often forget is a full-fledged UNIX at its heart. In the iOS user experience, the notion of files is completely gone: the only filenames you ever see are usually email attachments. You think about things as photos, notes, voice memos, mail, and media; not files. I’d argue that this is a huge reason that users find an iPhone and iPad much simpler than a “real” computer: the OS organises the files for them, so they don’t have to think that a computer deals with files. A computer deals with photos and music instead.

There are problems with the iOS approach: the enforced sandboxing per app means that you can’t share files between apps, which is one of the most powerful (and useful) aspects of desktop operating systems. This is a surmountable problem, though, and I don’t think it’d be a difficult task to store documents that can be shared between apps. After all, it’s what desktop OSs do today: the main challenge is in presenting a view of the files that is sensible for the user. I don’t think we can, nor should, banish files, since we still need to serialise all of a document’s data into a form that’s easily transportable. However, a file manager should be metadata-centric and display document titles, keywords, and tags rather than filenames. For many documents, you can derive a filename from the document’s metadata that you can then use to transport the file around.
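Deriving a filename from metadata is simple enough to sketch. The function below is hypothetical (its name, its parameters and its rules are my own choices): it joins an optional author with the title, folds accented characters down to ASCII, and replaces characters that are awkward on common filesystems. Dropping everything non-ASCII is a deliberately conservative assumption for transportability; it would be the wrong call for, say, Japanese titles.

```python
import re
import unicodedata


def filename_from_metadata(title, extension, author=None, max_length=80):
    """Build a transport-friendly filename from document metadata."""
    parts = [p for p in (author, title) if p]
    text = " - ".join(parts)
    # Decompose accented characters, then drop what ASCII cannot represent.
    text = unicodedata.normalize("NFKD", text).encode("ascii", "ignore").decode("ascii")
    # Replace path separators, reserved punctuation and control characters.
    text = re.sub(r'[\\/:*?"<>|\x00-\x1f]', " ", text)
    # Collapse runs of whitespace and trim stray spaces and dots.
    text = re.sub(r"\s+", " ", text).strip(" .")
    stem = text[:max_length].rstrip(" .") or "Untitled"
    return f"{stem}.{extension.lstrip('.')}"


if __name__ == "__main__":
    print(filename_from_metadata("Budget: Q3/Q4 Report?", "pdf"))
    print(filename_from_metadata("Bibliothèque", ".txt", author="Andre Pang"))
```

The filename here is purely an export artefact: the title, keywords and tags stay the source of truth, and the name is regenerated whenever the document has to leave the system.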

We’ve tried making the filesystem more amenable to a better user experience by adding features such as extended attributes (think Mac OS type/creator information), and querying and indexing features, à la BFS. However, with the additional complexity of issues such as display names (i.e. localisation), requiring directory hierarchies that should remain invisible to users, and the simple but massive baggage of supporting traditional filesystem structures (/bin/ and /lib/ aren’t going away anytime soon, and make good technical sense), I don’t think we can shoehorn a filesystem browser anymore into something that’s friendly for users. We need a filesystem abstraction layer that’s system-wide. iOS has proven that it can be done. With Apple’s relentless progress march and willingness to change system APIs, Linux’s innovation in the filesystem arena and experimentation with the desktop computing metaphor, and Microsoft’s ambitious plans for Windows 8, maybe we can achieve this sometime in the next decade.


Linux Audio and the Paradox of Choice

Mike Melanson—a primary author of the Linux Flash plugin, xine, ffmpeg, and a general crazy-good multimedia hacker—on the state of Linux Audio APIs:

There are 2 primary methods of sending audio data to a DAC under Linux: OSS and ALSA. OSS came first; ALSA supplanted OSS. Despite this, and as stated above, there are numerous different ways to do the DAC send. There are libraries and frameworks that live at a higher level than OSS and ALSA. In the end, they all just send the data out through OSS or ALSA.

The zaniest part is that some of these higher level libraries can call each other, sometimes in a circular manner. Library A supports sending audio through both OSS and ALSA, and library B does the same. But then library A also has a wrapper to send audio through library B and vice versa. For that matter, OSS and ALSA both have emulation layers for each other. I took the time to map out all of the various libraries that I know of that operate on Linux and are capable of nudging that PCM data out to a DAC:

Barry Schwartz would be shaking his head, methinks. And yes, I’m well aware of efforts to unify this mess. That doesn’t excuse that this jungle has been the state of Linux audio for the past ten years. I love the comments too: instead of admitting how dumbass this is, they give suggestions for using even more APIs (“try KDE4’s Phonon! That’ll fix everything!”)… totally missing the amusing irony, and also missing the point that Mike needs something that works on as many Linux distributions as possible.


On Civil Debate

Compare the response given by David Heinemeier Hansson to Alex Payne in the recent Rails and scaling controversy, to Ingo Molnar’s response to Con Kolivas regarding the new Modular Scheduler Core in Linux. Which community would you rather be part of based on this little sample?

(Somewhat ironic since it was Hansson himself that said “It’s no[t] the event that matters, but the reaction to it”, as well as being an evangelist for the It Just Doesn’t Matter principle.)


Pushing the Limits

OK, this is both ridiculous and cool at the same time. I need to write code for Mac OS X, Windows and Linux for work, and I like to work offline at cafes since I actually tend to get more work done when I’m not on the Internet (totally amazing, I know). This presents two problems:

  1. I need a laptop that will run Windows, Mac OS X and Linux.
  2. I need to work offline when we use Subversion for our revision control system at work.

Solving problem #1 turns out to be quite easy: get a MacBook (Pro), and run Mac OS X on it. Our server runs fine on Darwin (Mac OS X’s UNIX layer), and I can always run Windows and Linux with Parallels Desktop if I need to.

For serious Windows coding and testing, though, I actually need to boot into real Windows from time to time (since the program I work on, cineSync, requires decent video card support, which Parallels doesn’t virtualise very well yet). Again, no problem: use Apple’s Boot Camp to boot into Windows XP. Ah, but our server requires a UNIX environment and won’t run under Windows! Again, no problem: just install coLinux, a not very well known but truly awesome port of the Linux kernel that runs as a process on Windows at blazing speeds (with full networking support!).

Problem #2 — working offline with Subversion — is also easily solved. Download and install svk, and bingo, you have a fully distributed Subversion repository. Hack offline, commit your changes offline, and push them back to the master repository when you’re back online. Done.

Where it starts to get stupid is when I want to:

  • check in changes locally to the SVK repository on my laptop when I’m on Mac OS X…
  • and use those changes from the Mac partition’s SVK repository while I’m booted in Windows.

Stumped, eh? Not quite! Simply:

  • purchase one copy of MacDrive 6, which lets you read Mac OS X’s HFS+ partitions from Windows XP,
  • install SVK for Windows, and
  • set the %SVKROOT% environment variable in Windows to point to my home directory on the Mac partition.

Boom! I get full access to my local SVK repository from Windows, can commit back to it, and push those changes back to our main Subversion server whenever I get my lazy cafe-loving arse back online. So now, I can code up and commit changes for both Windows and the Mac while accessing a local test server when I’m totally offline. Beautiful!

But, the thing is… I’m using svk — a distributed front-end to a non-distributed revision control system — on a MacBook Pro running Windows XP — a machine intended to run Mac OS X — while Windows is merrily accessing my Mac HFS+ partition, and oh yeah, I need to run our server in Linux, which is actually coLinux running in Windows… which is, again, running on Mac. If I said this a year ago, people would have given me a crazy look. (Then again, I suppose I’m saying it today and people still give me crazy looks.) Somehow, somewhere, I think this is somewhat toward the evil end of the scale.

Comments 2006, Day 3 (Wednesday)

Right, it’s definitely shaping up to be one of those kinds of weeks. I finally managed to catch some sleep this morning by missing the morning tutorials: considering I had a huge three and a half hours of sleep last night, about 6 hours the night before, and no sleep on the night before I flew off to New Zealand, I think it was about time to let my poor body recover for a while. I normally feel rather seedy and tired (in that unproductive-tired way) when I wake up at midday, but it was all good today.

So, I got my lazy ass over to the conference (O, the hardship of 5 minutes’ walk through beautiful university grounds) and actually managed to see all the talks that day, huzzah. Andrew Tridgell’s talk on Samba 4 absolutely rocked as you’d expect, even if you, like me, weren’t interested in Samba 4 at all. Being able to write JavaScript to script server-related Windows RPC calls is crazy enough, but remotely editing a Windows machine’s registry via an AJAX-style interface in your Web browser was something else. Oh yes, and my little tip about inverting your screen to make it more readable also really saves your battery life: I was easily getting over 3 hours of battery out of my 3-year-old Powerbook. The temperature today’s a bit more like what the forecasts predicted, too: much cooler, being around 14℃ in the morning and night, and around 22℃ in the afternoon. I’m glad I brought along some long-sleeve tops!

Of course, it was after the conference proper when the fun started. Google were holding a round of drinks for conference delegates at night at the Bennu bar in Dunedin, so of course a lot of people came along to try to completely empty out the bar. It was meant to go from 9-10 only (so hurry up and get completely plastered in an hour) but it turns out that offering only beer for free makes a tab go a long way, so we were all still drinking courtesy of Google well past midnight. I managed to get several rounds of free vodka shots off the Google folks too, so overall, I didn’t do too badly considering I’m a Cadbury’s boy: four beers and three vodka shots left me in quite the happy mood when we left there some time after midnight. It was, again, damn good to catch up and socialise with everyone, and even more so when free beer’s offered! It’s a good week to be in Dunedin indeed :).

Comments 2006, Day 2 (Tuesday)

One interesting rule of thumb that Damian Conway mentioned in his presentation skills session is that, on average, it takes 8 hours of preparation per one hour of talking. I initially raised my eyebrows at this figure, but it turns out that Damian’s likely right (as usual): I ended up staying up until around 4:30am to finish off our slides, and got up rather excruciatingly at 8am to grab breakfast at Jeff and Pia’s again. For those interested: Weet-bix + three pancakes (although I made a token attempt to be too polite to have their lovely pancakes, since I’d already had the Weet-bix…). Ah yes, and the excellent Otago Daily Times’s front page story today was about how all the poor Dunedin citizens were all pasty-white thanks to a lack of sun this summer at the beach. I read this as a very nice excuse to slap pictures of three hot chicks in two-pieces on the front page of the paper.

I actually decided to skip the morning talks that day to work on the slides, so I ended up holing myself up in the (lovely) apartment until around 12. Yes, I think Damian’s 8 hours of preparation was correct indeed: Anthony and I probably spent around 10 hours of prep in total, though I’m fairly type A when it comes to making sure all the details are nailed down right.

I ended up having an energy bar and a Coke for lunch (sorry about that mum!) and managed to catch the end of Conrad’s talk on CMMLWiki when I got back, as well as watch parts of Keith Packard’s hilarious talk on Linux-powered rockets (complete with pictures of rockets hitting the earth at 800 km/h, and stories of their failed recovery of a CompactFlash card inside said rocket…).

I’m glad to report that I think our talk went pretty well: we had around 30 people attending, and Anthony and I got to chat afterwards with some people who were pretty interested in the stuff we were doing. Hopefully we’ll be able to get a videotape of it some time in the future, and I can place it here as a contribution to embarrassing myself more.

Since the talk was over, we decided that having some dinner was in order soon. No-one had any plans, so I made an executive decision to meet up at the Terrace at 7 o’clock, and the 4 or 5 folks who decided to go there grew to 8, then 10, then about 18. Hooray for lots of company! I had a most excellent mixed grill on hot rocks and more Speights beer for dinner, and enjoyed the merry company of all the other geeks until around midnight. Catching up with everyone here is truly great; since I’ve moved over to using a Mac as my main platform, I’m not so involved with the Linux community these days, and I forget from time to time how awesome everyone is (both from a social standpoint, and just how damn good these people are at what they do).

P.S. Linus is here, for those fanboys who are interested. The more amusing thing is that he’s really sunburnt. For the serious geeks, Van Jacobson (yep, that Van Jacobson) is also giving a talk. You can bet I’ll be attending that one.



We needed to set up some Annodex servers for demos this week, and our server software currently runs best on Linux. So, what to do if you’re using Windows machines which you can’t install Linux on for whatever reason, political or technical? Run Linux inside Windows, of course, via coLinux.

coLinux is great. No, scratch that — coLinux is really great. Not only does it work, it works really well: it’s fast (I really don’t think I’ve ever seen a Debian system boot up in 2-3 seconds), it’s stable, and it even uses a pretty small amount of memory, since Linux servers tend to be on the trim side. A full-blown Linux installation for us with Apache serving multi-megabyte multimedia streams to multiple Windows clients was using up less than 30MB of Windows’s memory pool. Low fat.

If you must have Windows on your desktop/laptop for whatever reason, but need Linux and are getting sick of doing the reboot dance just to switch OSs, give coLinux a whirl. And, if you want to get geek cred points, watch your friends’ jaws drop when they see X11 applications hosted on coLinux displaying in Cygwin/X; it’s pretty scary just how well it all works. Now, whither my coLinux for Mac OS X port (and flying car)?