LCFG XML Profile changes

As part of the LCFG v4 client project I am working on converting the XML profile parsing over to using the libxml2 library. Recent testing has revealed a number of shortcomings in the way the LCFG XML profiles are generated which break parsers which are stricter than the old W3C code upon which the current client is based. In particular the encoding of entities has always been done in a style which is more suitable for HTML than XML. There is really only a small set of characters that must be encoded for XML, those are: single-quote, double-quote, left-angle-bracket, right-angle-bracket and ampersand (in some contexts the set can be even smaller). The new XML parser was barfing on unknown named entities which would be supported by a typical web browser. It is possible to educate an XML parser about these entities but it’s not really necessary. A better solution is to emit XML which is utf-8 compliant which avoids the needs for additional encoding. Alongside this problem of encoding more than was necessary the server was not encoding significant whitespace, e.g. newlines, carriage returns and tabs. By default a standards compliant XML parser will ignore such whitespace. An LCFG resource might well contain such whitespace so it was necessary to add encoding support to the server. In the process of making these changes to the LCFG::Server::Profile::XML module I merged all the calls to the encoder into a call to a single new EncodeData subroutine so that it is now trivial to tweak the encoding as required. These changes will be going out in version 3.3.0 of the LCFG-Compiler package in the next stable release. As always, please let us know if these changes break anything.

Comments are closed.