simple examples of how to

Saturday, October 10, 2009

Debugging Linux kernel

1 Debugging Linux Kernel Lockup / Panic / Oops

Here are some notes on how to debug Linux kernel lockups – both "hard lockups" and "soft lockups" – and other panic, BUG, and oops situations. I am not an expert in this, but I figured incomplete information was better than no information, so here we go:

1. One way of confirming that you are the victim of a lockup is to note that the keyboard “caps lock” light does not respond to the “caps lock” key. Similarly, the “num lock” light won’t respond to the “num lock” key. Furthermore, the machine will not respond to ctrl-alt-delete.

Some people take this symptom as their definition of a hard lockup ... but beware that there is a situation that the kernel calls a soft lockup that exhibits the same symptom.

One way a soft lockup can occur is when the machine goes into a loop with interrupts turned off. This commonly happens if a device driver uses spinlocks improperly.

2. It is good to enable "Detect Soft Lockups" in the kernel. I believe everybody should do this, routinely, even if they are not expecting kernel bugs. To enable this:
        make menuconfig
          \--> Kernel Hacking
            \--> Detect Soft Lockups

and then of course recompile your kernel, install the newly compiled kernel, and reboot.

For slightly more information, see the help text associated with this configuration option. As it says (in part) there:

Say Y here to enable the kernel to detect "soft lockups", which are bugs that cause the kernel to loop in kernel mode for more than 10 seconds, without giving other tasks a chance to run.

When a soft-lockup is detected, the kernel will print the current stack trace (which you should report), but the system will stay locked up. This feature has negligible overhead.
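Once the new kernel is running, it is handy to confirm that the detector is really there. Here is a minimal sketch that reads the detector's threshold back from /proc. It assumes the threshold shows up as /proc/sys/kernel/watchdog_thresh or /proc/sys/kernel/softlockup_thresh; which of these exists (if either) depends on your kernel version, so treat both paths as assumptions. The function names are made up for this example.

```c
#include <stdio.h>

/* Read a single integer from a small text file such as a /proc/sys entry.
 * Returns 0 on success and stores the value in *out; returns -1 otherwise. */
int read_long_from_file(const char *path, long *out)
{
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;              /* file missing: option probably not built in */
    int ok = (fscanf(f, "%ld", out) == 1);
    fclose(f);
    return ok ? 0 : -1;
}

/* Try the candidate paths in order. Both paths are assumptions about
 * the running kernel. Returns the raw sysctl value, or -1 if neither
 * file could be read. */
long soft_lockup_sysctl(void)
{
    static const char *candidates[] = {
        "/proc/sys/kernel/watchdog_thresh",
        "/proc/sys/kernel/softlockup_thresh",
    };
    long value;
    for (size_t i = 0; i < sizeof candidates / sizeof candidates[0]; i++)
        if (read_long_from_file(candidates[i], &value) == 0)
            return value;
    return -1;
}
```

Call soft_lockup_sysctl() from a small main() and print the result. A return of -1 suggests that neither file exists on your kernel; how the raw value relates to the reported lockup time also varies between kernel versions.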

3. If there is a kernel “panic” or “BUG” or “oops”, you will want to capture the stack trace.

In some smallish subset of cases, the stack trace will be saved in the log files, but you should not count on this.

Far and away the best way to do this is to set up a “serial console”. That is, you arrange for console i/o (including oops messages) to appear on a serial port.

Getting this to work requires the following steps:

        make menuconfig
          \--> Device Drivers
            \--> Character devices
              \--> Serial drivers
                \--> Console on 8250/16550 and compatible serial port

Then, in your /boot/grub/menu.lst file, add a boot option, namely

          console=ttyS0,115200 

or more explicitly, you need a grub stanza something like this:

     title Linux (serial console)
         root (hd0,2)
         kernel /boot/vmlinuz-2.6.99 ro root=/dev/sda3 console=ttyS0,115200 console=tty0

Here tty0 refers to “the” PC screen (i.e. the one hooked to “the” graphics card via the VGA interface or some such). Meanwhile, ttyS0 refers to the lowest-numbered serial line. Note that ttyS0 is what Microsoft calls com1, and ttyS1 is what they call com2, et cetera; the MS numbers are systematically one unit higher.

You are not required to explicitly specify the baudrate (115200) of the serial line, but I recommend you do so. Of course you are free to use another serial line such as ttyS1 if you prefer. In any case, you must use the correct capitalization (capital S). Note that you can specify more than one console=... option, as in the example above. If you specify none, you get tty0 by default. If you specify only ttyS0, you get that instead of tty0. If you want both, you must specify both.
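After rebooting, you can double-check which console= options the kernel actually received. The sketch below picks them out of a command-line string such as the contents of /proc/cmdline. The helper name find_consoles and the MAX_NAME limit are inventions for this example, not anything the kernel provides.

```c
#include <stdio.h>
#include <string.h>

#define MAX_NAME 64

/* Scan a kernel command line for console=... options.
 * Copies up to max values (the text after "console=") into names[]
 * and returns how many were found. */
int find_consoles(const char *cmdline, char names[][MAX_NAME], int max)
{
    static const char key[] = "console=";
    const size_t keylen = sizeof key - 1;
    int count = 0;
    const char *p = cmdline;

    while (*p != '\0' && count < max) {
        /* Skip the whitespace that separates options. */
        p += strspn(p, " \t\n");
        size_t len = strcspn(p, " \t\n");   /* length of this option */
        if (len > keylen && strncmp(p, key, keylen) == 0) {
            size_t vlen = len - keylen;
            if (vlen >= MAX_NAME)
                vlen = MAX_NAME - 1;        /* truncate overlong values */
            memcpy(names[count], p + keylen, vlen);
            names[count][vlen] = '\0';
            count++;
        }
        p += len;
    }
    return count;
}
```

To use it, read /proc/cmdline into a buffer with fgets() and pass the buffer in. For the grub stanza above you should get two entries, "ttyS0,115200" and "tty0".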

Tangential remark: Choosing to log kernel messages to the serial port is independent of choosing to permit logins on that serial port; you can choose either or both or neither.

If you choose both, it allows you to administer a system that has no screen at all.

Edit /etc/inittab to tell init to spawn a getty on the chosen serial line. I recommend you leave at least one runlevel where the getty is not spawned, for convenience if you ever need to use that serial port for something else. You may also need to edit /etc/securetty if you want to permit root logins on the serial line.

If you want to interact with the grub menu via the serial line, you must reconfigure grub accordingly. See the grub info pages. (You can skip this task if you are content to let grub boot the default kernel without interaction, which is often the case. Just don’t make a mistake with your grub configuration, or you’ll be locked out until you hook up a screen.)

Then of course you must hook up a serial cable from your computer (#1) to some other computer (#2). We assume computer #2 will remain running even if/when computer #1 crashes. On computer #2, run some communication program such as Kermit to allow you to talk to the serial line, and log the traffic to a disk file.
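If computer #2 is a Linux box and you would rather not install a communication program, a few lines of C can do the logging. This is only a sketch: it assumes 115200 baud, 8 data bits, no parity, and no flow control, and it relies on cfmakeraw() and B115200, which are glibc/BSD extensions rather than strict POSIX. The function names are made up for this example.

```c
#include <fcntl.h>
#include <termios.h>
#include <unistd.h>

/* Open a serial device for reading at 115200 baud, 8N1, raw mode.
 * Returns a file descriptor, or -1 on failure. */
int open_serial_115200(const char *dev)
{
    /* O_NONBLOCK keeps open() from waiting for a carrier-detect signal. */
    int fd = open(dev, O_RDONLY | O_NOCTTY | O_NONBLOCK);
    if (fd < 0)
        return -1;

    struct termios tio;
    if (tcgetattr(fd, &tio) < 0) {
        close(fd);
        return -1;
    }
    cfmakeraw(&tio);                    /* 8 bits, no parity, no line editing */
    cfsetispeed(&tio, B115200);
    cfsetospeed(&tio, B115200);
    tio.c_cflag &= ~CSTOPB;             /* one stop bit */
    tio.c_cflag |= CLOCAL | CREAD;      /* ignore modem lines, enable receiver */
    if (tcsetattr(fd, TCSANOW, &tio) < 0) {
        close(fd);
        return -1;
    }

    /* Switch back to blocking reads now that CLOCAL is set. */
    int flags = fcntl(fd, F_GETFL);
    if (flags < 0 || fcntl(fd, F_SETFL, flags & ~O_NONBLOCK) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Copy bytes from in_fd to out_fd until end-of-file or an error.
 * Returns the number of bytes copied, or -1 on error. */
long copy_stream(int in_fd, int out_fd)
{
    char buf[256];
    long total = 0;
    ssize_t n;

    while ((n = read(in_fd, buf, sizeof buf)) > 0) {
        ssize_t done = 0;
        while (done < n) {              /* write() may be partial */
            ssize_t w = write(out_fd, buf + done, (size_t)(n - done));
            if (w < 0)
                return -1;
            done += w;
        }
        total += n;
    }
    return (n < 0) ? -1 : total;
}
```

A main() would open the serial device (say /dev/ttyS1, if that is where the cable is plugged in), open a log file with O_WRONLY | O_CREAT | O_APPEND, and hand both descriptors to copy_stream(). Writing each chunk straight through with write() means the log is as complete as possible at the moment computer #1 dies.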

Computer #2 doesn’t need to be a Linux box. If it is a Windows box, you can install Kermit for Windows, or just use the built-in “HyperTerminal” application to make the connection and log the traffic.

As for the cable itself, you need “null modem” functionality. This just involves crossing a couple of wires. In many cases, if the cable has female connectors on both ends, it will have this functionality built in. In particular, a so-called LapLink cable has null-modem functionality built in. Conversely, if the cable looks like an extension cord (male on one end, female on the other) it most likely does not have null-modem functionality, and you will need a separate dongle (both to perform the sex-change operation and to cross the required wires).

To test that it is working, try something like

       echo "Hi there." > /dev/console 

and verify that the message is seen by computer #2.
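Once the traffic is being logged on computer #2, you will eventually want to find the interesting part of a long log. Here is a rough sketch of a scanner that flags lines that look like the start of a crash report. The marker strings are my guesses at what a typical kernel prints, so adjust the list to match what you actually see; the function names are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Return 1 if the line contains one of the marker strings, else 0.
 * The markers are assumptions about typical kernel messages. */
int is_crash_line(const char *line)
{
    static const char *markers[] = {
        "Oops",
        "BUG:",
        "Kernel panic",
        "Call Trace:",
    };
    for (size_t i = 0; i < sizeof markers / sizeof markers[0]; i++)
        if (strstr(line, markers[i]) != NULL)
            return 1;
    return 0;
}

/* Print every matching line of the log, prefixed by its line number
 * (lines longer than the buffer are counted in chunks).
 * Returns the number of matches, or -1 if the file cannot be opened. */
int scan_log(const char *path)
{
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;
    char line[1024];
    int lineno = 0, hits = 0;
    while (fgets(line, sizeof line, f) != NULL) {
        lineno++;
        if (is_crash_line(line)) {
            printf("%d: %s", lineno, line);
            hits++;
        }
    }
    fclose(f);
    return hits;
}
```

Call scan_log() with the name of the file your communication program is logging to. The stack trace itself normally follows the flagged line, so use the printed line number as a starting point rather than as the whole story.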

If you have two computers, you can use each to ride herd on the other. All you need is two cables. Just use ttyS0 as the console on each one, and monitor it with ttyS1 on the other. Presumably they won’t both crash at the same time. If you have a large number of computers, you can connect them in a big daisy chain: A→B→C→D→E→A. If you have an even number of machines, you might consider connecting them in pairs, but the daisy chain is just as easy, and isn’t limited to even numbers. If machine N crashes, you can ssh to machine N+1 (via its Ethernet interface) to collect the logged information; we don’t need to rely on the serial links for all of our communication.
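The daisy-chain cabling rule is just "machine i talks to machine i+1, and the last one wraps around to the first". As a tiny sketch (the numbering from 0 and the function names are my own choices for illustration):

```c
#include <stdio.h>

/* In a ring of n machines numbered 0..n-1, machine i sends its console
 * output on ttyS0 to ttyS1 of the machine returned here. */
int monitor_of(int i, int n)
{
    return (i + 1) % n;
}

/* Print the cabling plan for the whole ring, one cable per line. */
void print_ring(int n)
{
    for (int i = 0; i < n; i++)
        printf("machine %d ttyS0 --> machine %d ttyS1\n", i, monitor_of(i, n));
}
```

With n = 2 this reduces to the two-computer case, where each machine monitors the other.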

4. If you are debugging Linux device drivers, additional steps are needed. The problem is that the normal Linux serial-port driver is interrupt driven, so if your driver crashes with the interrupts off, you’ll never see the stack trace on the serial console. The fix for this is simple:
        make menuconfig
          \--> Kernel Hacking
            \--> Early printk

The point here is that by selecting this option, you get a non-interrupt-dependent printk (not just an “early” printk). This trick is not very well documented or widely known, so be glad that somebody told you about it.

There are some mild downsides to the early printk option; see the menuconfig help text for this option for details.

5. I’m not entirely sure what the kernel calls “hard” lockup. I suppose it is any lockup so horrible that it cannot be detected by the aforementioned soft lockup detector.

The simplest way to escape from a hard lockup and get a stack trace is by means of a watchdog timer. For info on watchdog timers, read /usr/src/linux/Documentation/watchdog/*.txt.

If you are running on a system that has an Intel 82801 “I/O Controller Hub” chip (which includes most of the reasonably modern Intel-based systems) then life is simple: you can use the TCO timer and route it to the processor’s NMI line (Non-Maskable Interrupt).

To make this happen:

        make menuconfig
          \--> Device Drivers
            \--> Character devices
              \--> Watchdog Cards
                \--> Intel i8xx TCO Timer/Watchdog
                \--> Intel TCO Timer/Watchdog

Make it a module. Load it with modprobe iTCO-wdt.

Note that in some older kernels the option was named differently:

        make menuconfig
          \--> Device Drivers
            \--> Character devices
              \--> Watchdog Cards
                \--> Intel i8xx TCO Timer/Watchdog

The module was loaded with modprobe i8xx-tco.

You can tickle it with the simple userspace program in section 2, or the even simpler program mentioned in /usr/src/linux/Documentation/watchdog/watchdog.txt. That program is advertised as “Example Watchdog Driver” but it’s not a driver in the usual sense of the word; it’s really an “Example Watchdog Daemon” or something like that.

Alternatively, you can tickle it using something like echo > /dev/watchdog every so often. Use echo -n V > /dev/watchdog to make the watchdog stop watching (so you can stop tickling, without causing a reboot).
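Besides plain writes, the watchdog device also accepts ioctl calls declared in linux/watchdog.h. The sketch below uses WDIOC_GETTIMEOUT to ask the driver how long it will wait, and WDIOC_KEEPALIVE to tickle it, so that the tickling interval can be chosen to fit the timeout instead of being hard-coded. Not every driver supports every ioctl, and the helper names and the "half the timeout" rule are my own assumptions for illustration.

```c
#include <linux/watchdog.h>
#include <sys/ioctl.h>

/* Ask the watchdog driver for its timeout in seconds.
 * Returns the timeout, or -1 if the ioctl fails (for example because
 * fd is not a watchdog device or the driver lacks this feature). */
int watchdog_timeout(int fd)
{
    int seconds = 0;
    if (ioctl(fd, WDIOC_GETTIMEOUT, &seconds) < 0)
        return -1;
    return seconds;
}

/* Tickle the watchdog via ioctl instead of write(). Returns 0 or -1. */
int watchdog_keepalive(int fd)
{
    return ioctl(fd, WDIOC_KEEPALIVE, 0) < 0 ? -1 : 0;
}

/* Pick a sleep interval comfortably shorter than the timeout:
 * half of it, but at least one second. If the timeout is unknown,
 * fall back to 10 seconds (an arbitrary choice for this sketch). */
int tickle_interval(int timeout_seconds)
{
    if (timeout_seconds <= 0)
        return 10;
    int half = timeout_seconds / 2;
    return half < 1 ? 1 : half;
}
```

Remember that merely opening /dev/watchdog starts the timer, so don't experiment with this on a machine you can't afford to have reboot.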

If you don’t have an 82801 chip, you’ll have to buy one of the hardware cards described in the aforementioned watchdog.txt file.

6. There is also a thing called “softdog” aka “soft watchdog” aka “software watchdog” ... but I’ve never figured out what it’s good for.
  • For soft lockups, it is not needed; the aforementioned soft lockup detector works fine.
  • For hard lockups, it is not effective.
  • I guess you could use it to check the health of some critical userspace application ... but in this case I would think that userspace timers would be a more appropriate solution.

If you’re still interested, you can find it at:

        make menuconfig
          \--> Device Drivers
            \--> Character devices
              \--> Watchdog Cards
                \--> Software watchdog

7. Another way of performing the “watchdog” task is via an external power controller, aka controlled power strip. An example is the RPC-S6, which has six independently-controlled power outlets, and accepts control signals via a serial port. There are similar products that accept control signals via a parallel port or via IP.

There are at least two ways to proceed:

  • You can have two or more power controllers, such that power to each machine comes from a strip controlled by some other machine.
  • Suppose you have only one power controller, and it is controlled by machine A. You can set up a watchdog function (aka heartbeat function) on one of the outlets, and use that outlet to power machine A. That means that if machine A ever hangs, the power controller will cycle power to that outlet, causing a reboot.

    Of course if machine A is not hung, you have programmatic control of all the machines plugged into the power controller. This includes control of machine A itself. Beware that any command to power down machine A is irreversible, unless the same command brings the power back up later.

2 Example Watchdog Daemon Program

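#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

typedef void (*sighandler_t)(int);

int fd;   /* watchdog device; global so the signal handler can reach it */

/* On a termination signal, write the magic character 'V' so the
 * watchdog stops watching, then exit without causing a reboot. */
void handler(int sig) {
    write(fd, "V", 1);
    fprintf(stdout, "Bye (%d).\n", sig);
    exit(0);
}

/* Install handler() for the given signal, or die trying. */
void inst(const int sig) {
    sighandler_t rslt = signal(sig, handler);
    if (rslt == SIG_ERR) {
        fprintf(stderr, "Could not set up signal handler: ");
        perror(0);
        exit(1);
    }
}

int main(int argc, const char *argv[]) {

    inst(SIGHUP);   // hangup
    inst(SIGINT);   // often tied to ^C
    inst(SIGTERM);  // default for kill command

    fd = open("/dev/watchdog", O_WRONLY);
    if (fd == -1) {
        fprintf(stderr, "Could not open /dev/watchdog: ");
        perror(0);
        exit(1);    /* no point tickling a device we could not open */
    }
    /* Tickle the watchdog every 10 seconds, forever. */
    while (1) {
        write(fd, "\0", 1);
        fsync(fd);
        sleep(10);
    }
}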

3  References

1. Glenn Turner, “Remote Serial Console HOWTO”, http://www.tldp.org/HOWTO/Remote-Serial-Console-HOWTO/
