Multiseat and anaconda bugs
A year ago, I put together a post about the multiseat Fedora systems we’re using in our school. Over the past month, I’ve been putting together an upgrade from our Fedora 19 image to Fedora 21.
While doing the upgrade, I ran into a few bugs, and the first one was a doozy! Roughly half the time our multiseat systems started, the login screen would only show on two or three of the four seats. The only way to fix it was to restart the display manager, and even that only had a 50% chance of success.
At first I tried bodging around the bug by staggering the timing of Xorg’s startup, but that only made things worse. So I started looking at the logs and then at the Xorg code. It became obvious that the problem was that the first seat (seat0) would try to claim all the GPUs on the system. If it beat the other seats to their GPUs, they would, oddly enough, refuse to start. I put together a patch, filed a bug, and watched as those who know a lot more about Xorg’s internals took my ugly patch and made it beautiful. This patch has been merged into Xorg 1.17, and I’m hoping we’ll get it backported for F20 and F21, as I really don’t want to have to maintain internal Xorg packages until we switch to F22.
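To make the failure mode a bit more concrete, here is a toy Python model of the race. It is not Xorg's code (which is C), and the `Gpu` class, `claim_gpus`, `start_seats` and the `greedy` flag are all names made up for this sketch. The only thing taken from the real bug is the behaviour: seat0 grabs GPUs that belong to other seats, and a seat that finds its GPU already taken does not start.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Gpu:
    name: str
    seat: str                          # seat this GPU is assigned to
    claimed_by: Optional[str] = None   # seat that currently holds it


def claim_gpus(seat, gpus, greedy):
    """Claim free GPUs for `seat` and return the ones it got.

    greedy=True models the bug: seat0 takes every free GPU it sees.
    greedy=False models the fixed behaviour: a seat only takes its own.
    (Hypothetical model, not the logic of the actual Xorg patch.)
    """
    claimed = []
    for gpu in gpus:
        if gpu.claimed_by is not None:
            continue                   # another seat got there first
        if gpu.seat == seat or (greedy and seat == "seat0"):
            gpu.claimed_by = seat
            claimed.append(gpu)
    return claimed


def start_seats(order, greedy):
    """Start the seats in `order`; a seat with no GPU fails to start."""
    gpus = [Gpu(f"card{i}", f"seat{i}") for i in range(4)]
    return {seat: bool(claim_gpus(seat, gpus, greedy)) for seat in order}


# seat0 wins the race against everyone: only seat0 comes up.
print(start_seats(["seat0", "seat1", "seat2", "seat3"], greedy=True))
# seat1 sneaks in first: two seats come up, two do not.
print(start_seats(["seat1", "seat0", "seat2", "seat3"], greedy=True))
# With each seat limited to its own GPU, start order no longer matters.
print(start_seats(["seat0", "seat1", "seat2", "seat3"], greedy=False))
```

In this model the number of working seats depends purely on who starts first, which matches how random the real failures looked.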
There do seem to be a couple of other bugs related to lightdm/Xorg, but they’re far rarer and I haven’t spent much time tracking them down, much less filing bugs. Occasionally lightdm starts the X server but never gets a signal back saying that it’s ready, so both processes sit there waiting for each other. And far more rarely, the greeter crashes, which causes lightdm to shut down the seat. I think lightdm should retry a few times, but either it doesn’t or I haven’t found the right config option yet.
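For what it's worth, the retry behaviour I have in mind is nothing fancier than the sketch below. This is not how lightdm works internally and it doesn't correspond to any lightdm option that I know of. `run_with_retries` and `start_greeter` are hypothetical names, and `start_greeter` stands in for whatever launches the greeter and returns its exit status.

```python
def run_with_retries(start_greeter, max_attempts=3):
    """Call start_greeter() until it exits cleanly or attempts run out.

    start_greeter is a hypothetical callable returning an exit status,
    where 0 means the greeter ended normally and anything else is a crash.
    Returns the attempt number that succeeded, or None if all failed.
    """
    for attempt in range(1, max_attempts + 1):
        if start_greeter() == 0:
            return attempt
    return None                # give up; only now shut down the seat


# Example: a greeter that crashes once and then behaves.
statuses = iter([1, 0])
print(run_with_retries(lambda: next(statuses)))
```

Capping the number of attempts matters: a greeter that crashes every time shouldn't be restarted forever.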
We did run into one interesting race condition in anaconda when we started mass-installing F21 on our systems. We use iPXE and Fedora’s PXE network install images with a custom kickstart to do the install (in graphical mode, because pretty installs make it less likely that a student will press the reset button while the install is progressing). On some systems, I’d get an error message that basically said that a repository that was supposed to be enabled had disappeared, which would crash anaconda.
Thanks to anaconda’s wonderful debugging tools, I was able to work out which list was being emptied, and finally tracked it down to a race between the backend filling the frontend with its list of repositories and the frontend telling the backend to remove any repositories that aren’t in the frontend’s list. If the frontend’s request lands while its list is still only partly filled, the backend throws away repositories it was supposed to keep. I’ve attached another ugly patch to the bug report, and we’ll see what happens with this one. At least I’m able to rebuild the squashfs installer image so the bug is fixed for us internally.
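The shape of that race can be reproduced in a few lines of Python. This is a simplified model and not anaconda's code: `Backend`, `Frontend`, `fill_steps`, `prune_backend` and the `guard` flag are all invented for the sketch, the repository names are placeholders, and a generator stands in for the real threads so the bad interleaving happens every time instead of only on some machines. The guard is just one possible fix, not necessarily what the patch on the bug report does.

```python
class Backend:
    """Holds the repositories the installer will actually use."""

    def __init__(self, repos):
        self.repos = list(repos)

    def keep_only(self, wanted):
        # Drop every repository the frontend does not list.
        self.repos = [r for r in self.repos if r in wanted]


class Frontend:
    """Holds the repository list shown in the UI."""

    def __init__(self, guard):
        self.repos = []
        self.fill_done = False
        self.guard = guard         # True = wait for the fill before pruning

    def fill_steps(self, backend):
        # A generator, so the caller can interleave work between steps.
        for repo in list(backend.repos):
            self.repos.append(repo)
            yield
        self.fill_done = True

    def prune_backend(self, backend):
        if self.guard and not self.fill_done:
            return                 # list is still incomplete, do nothing
        backend.keep_only(self.repos)


def run(guard):
    backend = Backend(["fedora", "updates", "school"])
    frontend = Frontend(guard)
    steps = frontend.fill_steps(backend)
    next(steps)                        # frontend has seen only "fedora"
    frontend.prune_backend(backend)    # the prune arrives mid-fill
    for _ in steps:                    # the fill finishes
        pass
    frontend.prune_backend(backend)    # a later, legitimate prune
    return backend.repos


print(run(guard=False))   # ['fedora'] : two repositories have vanished
print(run(guard=True))    # ['fedora', 'updates', 'school']
```

Without the guard, the early prune copies the frontend's half-filled list back onto the backend, and by the time the fill completes the missing repositories are already gone.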
So most of our computers have now been upgraded to Fedora 21 and the reaction from our students has been positive. Now to get some Fedora 22 test systems built…
Comments
Elliott
Sunday, Feb 1, 2015
Jonathan Dieter
Sunday, Feb 1, 2015