Spirit: Parser Combinators for C++

28 December 2004, 04:52 AM Filed in: …on Coding

I was thinking of writing a parser combinator library for C++ today so that I could write a C++ parser in a style similar to Daan Leijen’s awesome Parsec library for Haskell. Then I came across Spirit, part of the excellent C++ Boost libraries. Of course, it’s advertised as a template-based parser framework rather than a parser combinator library, since C++ programmers will go blank in the face when you say ‘parser combinators’. If you’re not familiar with parser combinators, here is all the motivation you should need for using Spirit, from its introduction page:

A simple EBNF grammar snippet:

    group       ::= '(' expression ')'
    factor      ::= integer | group
    term        ::= factor (('*' factor) | ('/' factor))*
    expression  ::= term (('+' term) | ('-' term))*

is approximated using Spirit's facilities as seen in this code snippet:

    group       = '(' >> expression >> ')';
    factor      = integer | group;
    term        = factor >> *(('*' >> factor) | ('/' >> factor));
    expression  = term >> *(('+' >> term) | ('-' >> term));

Mapping EBNF directly onto the host language’s syntax: ahh, so good. If only more people realised that the whole embedded domain-specific language approach is so nice!
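For anyone wondering how `>>`, `|` and `*` can be made to read like EBNF, here is a minimal sketch of the parser combinator idea in plain C++. It is not how Spirit is implemented (Spirit builds its parsers with templates), and every name in it (`Parser`, `ch`, `integer`, `ref`, `matches`) is made up for illustration. It assumes a C++17 compiler, only recognises input rather than building a result, and does not skip whitespace.

```cpp
#include <cctype>
#include <cstddef>
#include <functional>
#include <optional>
#include <string>

// Hypothetical toy type, not Spirit's. A parser takes the input and a start
// position and returns the position just past the match, or nullopt on failure.
struct Parser {
    std::function<std::optional<std::size_t>(const std::string&, std::size_t)> run;
};

// Match one literal character.
Parser ch(char c) {
    return {[c](const std::string& s, std::size_t i) -> std::optional<std::size_t> {
        if (i < s.size() && s[i] == c) return i + 1;
        return std::nullopt;
    }};
}

// Match one or more decimal digits.
Parser integer() {
    return {[](const std::string& s, std::size_t i) -> std::optional<std::size_t> {
        std::size_t j = i;
        while (j < s.size() && std::isdigit(static_cast<unsigned char>(s[j]))) ++j;
        if (j == i) return std::nullopt;
        return j;
    }};
}

// Refer to a rule that is filled in later, so grammars can be recursive.
// The referenced Parser must outlive every parser built from it.
Parser ref(const Parser& p) {
    return {[&p](const std::string& s, std::size_t i) { return p.run(s, i); }};
}

// Sequence: a followed by b.
Parser operator>>(Parser a, Parser b) {
    return {[a, b](const std::string& s, std::size_t i) -> std::optional<std::size_t> {
        auto m = a.run(s, i);
        if (!m) return std::nullopt;
        return b.run(s, *m);
    }};
}

// Alternative: try a, and fall back to b if a fails.
Parser operator|(Parser a, Parser b) {
    return {[a, b](const std::string& s, std::size_t i) -> std::optional<std::size_t> {
        if (auto m = a.run(s, i)) return m;
        return b.run(s, i);
    }};
}

// Kleene star: zero or more repetitions of a. Never fails.
Parser operator*(Parser a) {
    return {[a](const std::string& s, std::size_t i) -> std::optional<std::size_t> {
        while (auto m = a.run(s, i)) {
            if (*m == i) break;  // stop if a matched without consuming input
            i = *m;
        }
        return i;
    }};
}

// The expression grammar from the snippet above; true if the whole input matches.
bool matches(const std::string& input) {
    Parser expression;  // declared first because group refers back to it
    Parser group  = ch('(') >> ref(expression) >> ch(')');
    Parser factor = integer() | group;
    Parser term   = factor >> *((ch('*') >> factor) | (ch('/') >> factor));
    expression    = term >> *((ch('+') >> term) | (ch('-') >> term));
    auto end = expression.run(input, 0);
    return end && *end == input.size();
}
```

With this sketch, `matches("1+2*(3-4)")` accepts and `matches("(1+2")` rejects. The only place the toy grammar departs from the Spirit snippet is that literals are wrapped in `ch(...)` and the recursive reference goes through `ref(...)`.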

Custom Screen Sizes with NVidia Chipsets

13 December 2004, 12:46 AM Filed in: …on Software

Well, here’s something I had absolutely no idea existed before today: you can add your own custom screen resolutions with NVidia’s video drivers.

Control Panels -> Display Properties -> Settings tab -> Advanced button -> your nvidia chipset tab

Select the Screen Resolutions & Refresh Rates menu item on the ‘drawer’ next to the dialog box

Click on the Add button, and add away.

This is great for those monitors which can’t quite push it to 1600×1200 comfortably (e.g. being either too blurry or just having too low a refresh rate at such high resolutions). I’m running my old-ish 21” CRT at 1400×1050 now at 85Hz: quite a decent amount more desktop real estate than 1280×1024, with a refresh rate where I won’t be tearing my eyes out. Nice.

I'm .cgi

06 December 2004, 09:53 AM Filed in: About Me

Which File Extension are You?

Logitech V500

03 December 2004, 09:25 PM Filed in: …on Gadgets

Aww jeah, I gotta have me one of these. And it’s even accompanied by the most impressive advertising I’ve ever seen for a mouse. (I particularly like the “see the scroll panel in action” demo. Look at that spreadsheet fly!)

RDF, the Semantic Web, Filesystems, and Databases

03 December 2004, 09:17 PM Filed in: …on Coding

The propellerheads at Lambda have an interesting discussion that started with RDF (at least, interesting if you’re already familiar with RDF), and evolved to cover not only RDF, but also the semantic web, data schemas, ReiserFS and file systems.

Numerics Support in Programming Languages

02 December 2004, 06:18 AM Filed in: …on Coding

A nice quote I found from a comment on Slashdot:

Languages like OCAML and Java are defective as general purpose languages because they don’t support efficient data abstraction for numerical types. The fact that their designers just don’t get that fact is a testament to the ignorance of their designers. It’s also what people really mean when they say that those kinds of languages are “just not efficient as C/C++”: it means that in C/C++, you can get whatever code you write to run fast, while in OCAML or Java, there are always problems where you have to drop down to C.

I’ll insert the usual “I agree” here. This is especially a problem with language benchmarks, which typically operate on huge numbers of tiny bits of data. This usually destroys any chance that a functional language has of competing with a very low-level language like C/C++, because these big arrays are often represented as a series of pointers to the actual data (boxed), rather than simply big arrays that directly contain the data (unboxed). This means one more pointer indirection for every index operation in the array, blowing away your cache hits and potentially making your program run many times slower. (If your language is also lazy, like Haskell is, you basically cannot work around this performance restriction unless you make your data structure strict … in which case, well, you’re not using laziness any more.)
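To make the two layouts concrete, here is a small C++ sketch of the same array stored both ways. The function names are made up for illustration, and `std::vector<std::unique_ptr<double>>` is only a stand-in for how a boxed array might look in memory, not a claim about how any particular OCaml, Java or Haskell runtime lays things out. It assumes C++14 or later for `std::make_unique`.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Unboxed: the doubles sit next to each other inside the vector's own buffer.
std::vector<double> make_unboxed(std::size_t n) {
    std::vector<double> v(n);
    for (std::size_t i = 0; i < n; ++i) v[i] = static_cast<double>(i);
    return v;
}

// Boxed: the vector holds pointers, and each double is its own heap allocation.
std::vector<std::unique_ptr<double>> make_boxed(std::size_t n) {
    std::vector<std::unique_ptr<double>> v;
    v.reserve(n);
    for (std::size_t i = 0; i < n; ++i)
        v.push_back(std::make_unique<double>(static_cast<double>(i)));
    return v;
}

// Summing the unboxed array reads the elements straight out of the buffer.
double sum_unboxed(const std::vector<double>& v) {
    double total = 0.0;
    for (double x : v) total += x;
    return total;
}

// Summing the boxed array loads a pointer, then follows it to reach the value.
double sum_boxed(const std::vector<std::unique_ptr<double>>& v) {
    double total = 0.0;
    for (const auto& p : v) total += *p;
    return total;
}
```

Both sums give the same answer; the difference is only in how many memory accesses each element costs and how scattered those accesses are. How much that matters in practice depends on the allocator and the machine, so measure before drawing conclusions.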

This problem needs to be solved without forcing the programmer to spend lots of time annotating exactly which data structures should be unboxed and which should be boxed, and which functions are OK to work with unboxed/strict data structures. Otherwise, people just aren’t going to turn to these other languages for processing large quantities of small data. And in this day and age of computing, processing large quantities of small data is required for quite a lot of applications …

There are also some interesting comments about Python 2.4’s new generator expressions, and how they are similar to yet different from lambda expressions/anonymous functions: in particular, how they appear to give rather nice performance benefits. I haven’t given generators much thought at all yet, assuming they were a less elegant, ad-hoc way of representing lazy data structures. Sounds like I have some investigation to do!

Stay Hungry, Stay Foolish

André Pang's Weblog
