python and string encodings

May 21, 2013

I’ve recently finished the User accessible login reports project. After the initial roll-out to users I had a few reports of people getting server errors when certain sets of data were viewed. This website is written in Python and uses the Django framework. During the template processing stage we were getting error messages like the following:

DjangoUnicodeDecodeError: 'utf8' codec can't decode byte 0xe0 in position 30: invalid continuation byte.

It appears that not all data coming from the whois service is encoded in the same way (see RFC 3912 for a discussion of the issue). In this case it was using a latin1 encoding but whois is quite an old service which has no support for declaring the content encoding used so we can never know what we are going to have to handle in advance.

A bit of searching around revealed the chardet module which can be used to automatically detect the encoding used in a string. So, I just added the following code and the problem was solved.

import chardet
enc = chardet.detect(val)['encoding']
if enc != 'utf-8':
    val = val.decode(enc)
val = val.encode('ascii','replace')

The final result is that I am guaranteed to have the string from whois as an ascii string with any unsupported characters replaced by a question mark (?). It’s not a perfect representation but it is web safe and is good enough for my needs.

LCFG Client Refactoring: starting the daemon

May 21, 2013

Following on from my previous work on fixing the way in which the UDP socket is opened for receiving notification messages I have been looking at why the LCFG component just hangs when the rdxprof process fails to daemonise.

It turns out that the LCFG client component uses an obscure ngeneric feature of the Start function which is that the final step is to call a StartWait function if it has been defined. In the client component this StartWait function sits waiting forever for a client context change even when the rdxprof process failed to start…

I think the problem comes from an expectation that the call to the Daemon function, which starts and backgrounds the rdxprof process, will fail if rdxprof fails to run. It does not fail ($? is zero) and the PID of the rdxprof process is always accessible through the $! variable, even if it was very short-lived.

There is, thankfully, a very simple solution here. The client component already has a IsProcessRunning function which can be used to check if the process associated with a PID is still active. This has to be used carefully, I have put a short sleep after the daemonisation stage to ensure that the process is fully started before doing the check. The check is also fairly naive so there is the slight risk that if the system is under resource pressure the rdxprof process could fail and then the PID could be immediately reused. For now I think it’s reasonable to just accept the risks attached and revisit the issue later if it causes us problems. Associated with this, clearly the StartWait function really ought to eventually give up.

LCFG Client Refactor: notification handling

May 21, 2013

One long-standing issue with running the LCFG client (rdxprof) in daemon mode has been that if another process has already acquired the UDP socket which it wants (port 732) then it does not fail at startup but just hangs. This is clearly rather undesirable behaviour as it leaves the machine in an unmanageable state but because the client process appears to be running it’s difficult to notice that anything is actually wrong.

Yesterday I spent a while looking at this problem. I reduced it to the most simple case of a script with a while-loop listening for messages on a UDP socket and then printing the messages to the screen. Running multiple processes at the same time revealed that there was nothing preventing multiple binds on the socket. I eventually discovered that this is caused by the SO_REUSEADDR option being set on the socket.

When using TCP setting this option is often necessary. It allows a process to reacquire access to a socket when it restarts. Sometimes processes may restart too quickly and the socket would otherwise not be ready. For UDP this option is only necessary if you want to listen on a broadcast or multicast address and have multiple listeners on the same machine, that’s a fairly unusual scenario.

Disabling the SO_REUSEADDR option does exactly what we want. Attempting to run two rdxprof processes now results in it exiting with status 1 and this message:

[FAIL] client: can't bind UDP socket
[FAIL] client: Address already in use

There is a further problem with the LCFG client component not returning control to the caller when it fails to start rdxprof and I will have to do some further investigations into that problem.