Oops, Linux crashes

From ArmadeusWiki
Revision as of 18:35, 21 October 2007 by JulienB (Talk | contribs) (Backtrace & System.map)

You've just started kernel development and got something like this:

Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = c0004000
[00000000] *pgd=00000000
Internal error: Oops: 817 [#1] PREEMPT
Modules linked in:
CPU: 0    Not tainted  (2.6.23.1 #14)
pc : [<c0025720>]    lr : [<c019dca0>]    psr: 20000013
sp : c037dcfc  ip : c037dd0c  fp : c037dd08
r10: 00000000  r9 : 003c0000  r8 : 003c0000
r7 : c029eccc  r6 : 00000002  r5 : 00020000  r4 : 000fa000
r3 : 00000000  r2 : 00000002  r1 : c33e0000  r0 : c33c0000
Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
Control: c000717f  Table: 08e2c000  DAC: 00000017
Process pdflush (pid: 64, stack limit = 0xc037c258)
Stack: (0xc037dcfc to 0xc037e000)
dce0:                                                                c037dd18
dd00: c037dd0c c019dca0 c00256f0 c037dd9c c037dd1c c0199950 c019dc90 c037dd48
...
dfe0: 00000000 00000000 00000000 c037dff8 c00347dc c004711c e58d7000 e51b303c
Backtrace:
Function entered at [<c00256e0>] from [<c019dca0>]
Function entered at [<c019dc80>] from [<c0199950>]
Function entered at [<c019983c>] from [<c019c0b0>]
Function entered at [<c019bdb4>] from [<c0197dd0>]
...

Don't worry: this page looks at how to interpret this beast.

ARM registers

To understand a kernel Oops you must understand how the underlying processor architecture works (here, ARM). ARM processors have a set of 32-bit registers:

  • general-purpose registers: r0 to r10
  • pc: program counter, which holds the memory address of the instruction currently being executed
  • lr: link register, which holds the address the program counter is set to when a "return from subroutine" instruction is reached. In short, lr holds the return address, which lies inside the function that called the currently executing code.
  • sp: stack pointer, which holds the current position in the stack

The register contents at the time of the error are shown at the beginning of the Oops.
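The "Flags:" line of the Oops is simply a decoded view of the psr value printed next to pc and lr. As a rough sketch (not the kernel's own code), the Python snippet below decodes 0x20000013 using the classic ARM status register bit layout; the function name `decode_psr` and the mode names in the table are chosen for illustration, and the kernel prints the mode with a `_32` suffix.

```python
# Decode an ARM program status register value as printed on the "psr:" line.
# Assumed layout (classic ARM CPSR): N, Z, C, V in bits 31..28,
# I (IRQ disable), F (FIQ disable), T (Thumb) in bits 7..5, mode in bits 4..0.

MODES = {
    0x10: "USER", 0x11: "FIQ", 0x12: "IRQ", 0x13: "SVC",
    0x17: "ABT", 0x1B: "UND", 0x1F: "SYS",
}

def decode_psr(psr):
    # Upper-case letter means the flag is set, lower-case means it is clear,
    # which mirrors the "nzCv" notation used in the Oops.
    flags = "".join(
        name.upper() if psr & (1 << bit) else name
        for name, bit in (("n", 31), ("z", 30), ("c", 29), ("v", 28))
    )
    return {
        "flags": flags,
        "irqs_on": not (psr & (1 << 7)),   # I bit set means IRQs are masked
        "fiqs_on": not (psr & (1 << 6)),   # F bit set means FIQs are masked
        "thumb": bool(psr & (1 << 5)),     # T bit clear means ARM instructions
        "mode": MODES.get(psr & 0x1F, "unknown"),
    }

info = decode_psr(0x20000013)
print(info)
```

For the example Oops this gives the flags string "nzCv", IRQs and FIQs enabled, ARM (not Thumb) state and SVC mode, which agrees with the "Flags:" line of the log.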

Backtrace & System.map

Backtrace:
Function entered at [<c00256e0>] from [<c019dca0>]
Function entered at [<c019dc80>] from [<c0199950>]

Immediate information about where the kernel crashed can be deduced from the backtrace. The backtrace lists the functions that were called before the crash, most recent first.
In our example we can deduce that:

  • the kernel crashed at 0xc0025720 (pc)
  • the faulty function starts at 0xc00256e0 and was called by a function starting at 0xc019dc80
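The deductions above can be reproduced mechanically. The following Python sketch extracts pc and the first two backtrace frames from an excerpt of the example Oops and computes how far into each function the relevant addresses lie. It takes the "Function entered at" addresses at face value as function start addresses, as the text does; the variable names are illustrative only.

```python
import re

# Excerpt of the example Oops: the register line and the first two frames.
OOPS = """\
pc : [<c0025720>]    lr : [<c019dca0>]    psr: 20000013
Backtrace:
Function entered at [<c00256e0>] from [<c019dca0>]
Function entered at [<c019dc80>] from [<c0199950>]
"""

# Address of the instruction that faulted.
pc = int(re.search(r"pc : \[<([0-9a-f]+)>\]", OOPS).group(1), 16)

# Each frame is (function entry address, address it was called from).
frames = [
    (int(entry, 16), int(called_from, 16))
    for entry, called_from in re.findall(
        r"Function entered at \[<([0-9a-f]+)>\] from \[<([0-9a-f]+)>\]", OOPS
    )
]

faulty_entry, return_addr = frames[0]
caller_entry = frames[1][0]

pc_offset = pc - faulty_entry             # distance of the crash from function start
call_offset = return_addr - caller_entry  # distance of the return address from caller start

print(f"crash at {pc:#x}, {pc_offset:#x} bytes into the function at {faulty_entry:#x}")
print(f"called from {return_addr:#x}, {call_offset:#x} bytes into the function at {caller_entry:#x}")
```

Note that the "from" address of the first frame is the same value as lr in the register dump, which is consistent with lr holding the return address into the caller.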

To translate these memory addresses into the "real" function names, we have to look in the System.map file, which lists the address of every Linux kernel symbol. So to get a function's name, just run:

[armadeus] $ grep c00256e0 buildroot/build_arm_nofpu/linux-2.6.X.X/System.map

Each kernel build can generate a new System.map, so you must be sure to use the System.map that belongs to the kernel that crashed.
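A plain grep only matches addresses that are exactly the start of a symbol. Addresses such as pc or lr usually fall inside a function, so the symbol to look for is the one with the highest address that is still less than or equal to the address. The Python sketch below illustrates that lookup, assuming the usual three-column System.map layout (address, type, name). The symbol names in the sample data are invented for illustration, and `load_symbols` and `resolve` are hypothetical helper names.

```python
import bisect

# Hypothetical System.map excerpt: the addresses are taken from the Oops,
# but the symbol names are made up for this example.
SYSTEM_MAP = """\
c00256e0 T example_faulty_function
c0025760 T example_next_function
c019dc80 t example_caller
c019dcc0 t example_other_function
"""

def load_symbols(text):
    """Return a sorted list of (address, name) from System.map-style text."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3:  # address, symbol type, symbol name
            symbols.append((int(parts[0], 16), parts[2]))
    symbols.sort()
    return symbols

def resolve(symbols, address):
    """Return (name, offset) of the closest symbol at or below address."""
    addresses = [addr for addr, _ in symbols]
    index = bisect.bisect_right(addresses, address) - 1
    if index < 0:
        return None  # address lies below the first known symbol
    start, name = symbols[index]
    return name, address - start

symbols = load_symbols(SYSTEM_MAP)
for address in (0xc0025720, 0xc019dca0):
    name, offset = resolve(symbols, address)
    print(f"{address:08x} -> {name}+{offset:#x}")
```

With a real kernel, the same functions could be fed the contents of the System.map file generated by the build of the crashing kernel instead of the sample string.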