# What's Chris been doing?

Successes and failures at inf.ed.ac.uk

## intel at last

Quick recap: automatic sleep is working happily with SL5.3 on 745s and 755s if they use the intel video driver. I want to get the 745 working also if it uses the i810 video driver. (The 755 with i810 doesn’t resume reliably.)

So, this morning’s experiments so far:

• Revive a 745 from near death and get it up and running as a healthy-looking DICE machine.
• With intel video driver, sleep it with what I’ve found to be the correct sleep command: /usr/sbin/pm-suspend --quirk-vbemode-restore
• Yes, it resumes cleanly
• Log in, and I don’t see the “Resume Problem” error. Good, this is as expected. Log out again.
• Switch to i810.
• Sleep with the same command.
• It doesn’t resume. Reboot the machine.
• Try again without quirks: /usr/sbin/pm-suspend
• This resumes cleanly.
• Log in, and as expected I see the “Resume Problem” error message. Good.

If this behaviour – different quirks for different video drivers on the same model – is representative, it leaves me with the irritating problem of using different sleep commands for the same model depending on which video driver it’s using. It’s irritating because it doesn’t fit the current idea of setting the exact suspend command on a per-model basis in the sleep.h defaults header. Also, that header is currently included before the header which sets the video driver, so the information about which video driver is in use isn’t available to the sleep defaults header. So I’ll have to do what I did with the business of checking the video driver, and somehow get the component itself to decide on the fly which command to use in which circumstance. Things will have to be reorganised *again* and further complicated. Gah. I’m getting a bit fed up of redesigning the software to get round bugs in other people’s code.
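To make the “decide on the fly” idea concrete, here is a rough shell sketch of the kind of lookup the component might do. It is not the lcfg-sleep code: the function names xorg_driver and suspend_command_for are made up for illustration, and the driver-to-command mapping is just this morning’s result from a single 745, so treat it as an assumption rather than a tested table.

```shell
#!/bin/sh
# Hypothetical helper (name invented for this sketch): print the video
# driver named in an xorg.conf file, looking only for i810 or intel.
xorg_driver() {
    conf="${1:-/etc/X11/xorg.conf}"
    # Other Driver lines (keyboard, mouse) are skipped; stop at the first match.
    awk '$1 == "Driver" {
             gsub(/"/, "", $2)
             if ($2 == "i810" || $2 == "intel") { print $2; exit }
         }' "$conf"
}

# Hypothetical helper: map a driver name to the suspend command that
# worked in the experiments above (one 745 only, so an assumption).
suspend_command_for() {
    case "$1" in
        intel) echo "/usr/sbin/pm-suspend --quirk-vbemode-restore" ;;
        i810)  echo "/usr/sbin/pm-suspend" ;;
        *)     echo "no known suspend command for driver '$1'" >&2
               return 1 ;;
    esac
}
```

The component could then call something like suspend_command_for "$(xorg_driver)" at sleep time and refuse to sleep if no command comes back, instead of relying on a per-model default fixed when the headers are processed.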

But anyway, I can at least now test the suppression of the error message. This can be done with gconf. One way to alter gconf settings is with the command line tool gconftool-2. The gconftool-2 man page mentions the --type or -t option (which specifies the type of the data you’re setting a preference key to) but then doesn’t include it in its list of options. It does list some similar-looking options, --list-type, --car-type and --cdr-type, but none of them work with -s or --set, the option you use to set a value. And if you use -s without setting a type it tells you “Must specify a type when setting a value”. Luckily --type does turn out to exist; it’s just not listed on the man page. So this is the first non-error-producing command to try, to stop the error message you get when you log in after what the system thinks is an imperfect suspend and resume:

gconftool-2 -s /apps/gnome-power-manager/notify_hal_error -t bool false

You can check that you’ve changed the value by examining it before and afterwards using -g or --get:

gconftool-2 -g /apps/gnome-power-manager/notify_hal_error

In this case it’ll print out “true” or “false”.
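The set-then-check routine can be wrapped up in a small function. This is only a sketch: set_gconf_bool is a made-up name, and the GCONFTOOL variable is an assumption added so that the tool can be swapped for a stand-in when testing. It uses only the -s, -t and -g options shown above.

```shell
#!/bin/sh
# Hypothetical wrapper: set a boolean gconf key, then read it back to
# confirm that the change took effect.
# GCONFTOOL is an assumed override; it defaults to the real gconftool-2.
set_gconf_bool() {
    key="$1"
    value="$2"    # expected to be "true" or "false"
    tool="${GCONFTOOL:-gconftool-2}"

    # Set the value, giving the type explicitly as the tool demands.
    "$tool" -s "$key" -t bool "$value" || return 1

    # Read it back; succeed only if the stored value matches.
    [ "$("$tool" -g "$key")" = "$value" ]
}
```

For the case here it would be called as set_gconf_bool /apps/gnome-power-manager/notify_hal_error false, and the exit status says whether the new value really stuck.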

So, after doing this on the test machine, I repeat the suspend (with pm-suspend) and resume. This time it doesn’t resume cleanly.
Blast.
Is this because I’ve changed that GNOME setting? Surely not. I’m assuming that sleep and resume on 745/i810/5.3 is just unreliable: it sometimes works and sometimes doesn’t. Maybe I’ll go back later and undo things and try again, but for now I’ll have to limit sleep support to the machines using the intel driver.

Later. I switched the same machine back to the intel driver and then left it. When I went back to it an hour or two later it had gone to sleep, but had hung. So it hangs when using previously reliable sleep commands with both the i810 and intel drivers. I’d say there’s something wrong with that machine. Right enough, when I revived it earlier today it had had filesystem damage; I repaired that with fsck but perhaps that wasn’t enough. I’ve now initiated a complete reinstall, with fresh filesystems, to see if that changes it back to the expected behaviour.

In the meantime I tried a different tack: to find out why we can’t move to the intel driver, and try to shift that barrier. My memory was that we had stuck with the i810 driver because that was the only one which worked with our old and creaky version of Webots, which is needed for teaching. Stephen confirms this memory. I talked to Graham, Mr. Webots, and it turns out that he now has authorisation to move us up to Webots version 6, which doesn’t exhibit any of the bad behaviour of the elderly version we have. Hooray! He’s optimistic that Webots v6 will work with the intel driver on 5.3 745s and 755s, but he’ll test it to check. In the meantime, he points out, we can change the 5.3 machines to the intel driver anyway as no 5.3 machines are yet used for teaching, and Stephen adds that Webots isn’t going to be needed anyway until at least September. Excellent! So I’ve altered the dell_optiplex_gx745.h and dell_optiplex_755.h headers to exclude develop machines from the inclusion of lcfg/options/video_i810.h. This seems to have the desired effect on a test 5.3 745: /etc/X11/xorg.conf is rebuilt with no mention of “i810” and one mention of “intel” drivers. Thus my lcfg-sleep test pool has gone up from 3-4 machines to 30-40 at least. Excellent. Perhaps it’s about time I figured out how to monitor their sleep patterns, then. For now I’ve changed the sleep.ng_logrotate resource in dice/options/sleep.h to have the logs mailed to me until I figure out something more satisfactory. 30-40 machines each mailing me two log files once a week shouldn’t be too bad.

A quick inspection of the sleep log on a random test machine revealed that the component still hadn’t started, so I’ve gone round all of the test machines and started it. On most it hadn’t started but started successfully at my command. Some were down or unavailable; half a dozen or so were already running it.

Written by Chris Cooke

May 28, 2009 at 1:46 pm

