Parsing Schemas

March 18, 2008

For a while now I’ve been playing with the idea of parsing the LCFG schemas and converting them to a proper object-orientated representation using Moose. The first attempt was rapidly getting out of control so I decided to start again.

The legendary Damian Conway wrote a Perl module for text parsing named Parse::RecDescent. It has been around for a long time but it has never been surpassed. The big advantage here is that the module can be fed a yacc-like grammar while the results can still be manipulated with all the power of Perl. The script for my second attempt is noticeably smaller. It doesn’t yet do everything that the first script does, but the principal code is done. It still needs to store the resulting parse tree and then push it through Template Toolkit rather than just run a load of print statements. All the power is now in the grammar. I don’t claim to be great at writing grammars, so this is probably a lot messier than it could be, but it does work. As a simple example, I have parsed the schema for gdm; the output shows what the script has discovered.
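To make the idea of "grammar rules plus host-language actions" a bit more concrete, here is a rough sketch of a tiny hand-written recursive-descent parser. It is written in Python rather than Perl, it does not use Parse::RecDescent, and the input format (`name: type = "default";`) is a toy invented purely for illustration; it is not the real LCFG schema syntax. The names `tokenize`, `Parser` and `parse_schema` are likewise hypothetical. The point is only that each rule returns a data structure instead of printing, so the whole tree can be handed to a templating step later.

```python
import re

# Toy grammar (an assumption for illustration, NOT the real LCFG syntax):
#   schema : entry*
#   entry  : NAME ':' NAME ( '=' STRING )? ';'
TOKEN = re.compile(
    r'\s*(?:(?P<name>[A-Za-z_]\w*)|(?P<punct>[:=;])|(?P<string>"[^"]*"))'
)


def tokenize(text):
    """Split the input into (kind, value) pairs, skipping whitespace."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected input at offset {pos}")
        kind = match.lastgroup
        tokens.append((kind, match.group(kind)))
        pos = match.end()
    return tokens


class Parser:
    """One method per grammar rule; each returns a piece of the parse tree."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        if self.pos < len(self.tokens):
            return self.tokens[self.pos]
        return (None, None)

    def expect(self, kind, value=None):
        tok_kind, tok_value = self.peek()
        if tok_kind != kind or (value is not None and tok_value != value):
            raise SyntaxError(f"expected {value or kind}, got {tok_value!r}")
        self.pos += 1
        return tok_value

    def schema(self):
        # schema : entry*
        entries = []
        while self.peek()[0] is not None:
            entries.append(self.entry())
        return entries

    def entry(self):
        # entry : NAME ':' NAME ( '=' STRING )? ';'
        name = self.expect("name")
        self.expect("punct", ":")
        type_name = self.expect("name")
        default = None
        if self.peek() == ("punct", "="):
            self.pos += 1
            default = self.expect("string")[1:-1]  # strip the quotes
        self.expect("punct", ";")
        # The "action": build a record rather than printing anything.
        return {"name": name, "type": type_name, "default": default}


def parse_schema(text):
    return Parser(tokenize(text)).schema()
```

Calling `parse_schema('greeter: string = "gdmlogin";')` on this made-up input gives back a list holding one dictionary per entry, which is the sort of structure that could then be passed to a template instead of being printed as it is discovered.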

This took a few hours of work, most of it spent getting my head around Parse::RecDescent, but the result is already much more capable than my previous attempt.