Oops, Linux crashes

Latest revision as of 15:18, 14 February 2012

You've just started kernel development and got something like this:

 Unable to handle kernel NULL pointer dereference at virtual address 00000000
 pgd = c0004000
 [00000000] *pgd=00000000
 Internal error: Oops: 817 [#1] PREEMPT
 Modules linked in:
 CPU: 0    Not tainted  (2.6.23.1 #14)
 pc : [<c0025720>]    lr : [<c019dca0>]    psr: 20000013
 sp : c037dcfc  ip : c037dd0c  fp : c037dd08
 r10: 00000000  r9 : 003c0000  r8 : 003c0000
 r7 : c029eccc  r6 : 00000002  r5 : 00020000  r4 : 000fa000
 r3 : 00000000  r2 : 00000002  r1 : c33e0000  r0 : c33c0000
 Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
 Control: c000717f  Table: 08e2c000  DAC: 00000017
 Process pdflush (pid: 64, stack limit = 0xc037c258)
 Stack: (0xc037dcfc to 0xc037e000)
 dce0:                                                                c037dd18
 dd00: c037dd0c c019dca0 c00256f0 c037dd9c c037dd1c c0199950 c019dc90 c037dd48
 ...
 dfe0: 00000000 00000000 00000000 c037dff8 c00347dc c004711c e58d7000 e51b303c
 Backtrace:
 Function entered at [<c00256e0>] from [<c019dca0>]
 Function entered at [<c019dc80>] from [<c0199950>]
 Function entered at [<c019983c>] from [<c019c0b0>]
 Function entered at [<c019bdb4>] from [<c0197dd0>]
 ...

Don't worry, we will look at how to interpret this beast.

ARM registers

To understand a kernel Oops you must understand how the underlying processor architecture works (ARM here). ARM processors have a set of 32-bit registers:

  • general-purpose registers: r0 to r10
  • pc: program counter, which holds the memory address of the instruction currently being executed
  • lr: link register, which holds the address the program counter will be set to when a "return from subroutine" instruction is reached. In short, lr points into the function that called the currently executing code.
  • sp: stack pointer, which holds the current position in the stack

The register contents at the time of the error are shown at the beginning of the Oops.
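
As an illustration, the register lines of an Oops can be turned into a small lookup table. This is only a minimal sketch: the two input lines are copied from the Oops at the top of this page, while the names `REGISTER_RE` and `parse_registers` are made up for this example.

```python
import re

# Two register lines copied from the Oops shown at the top of this page.
OOPS_REGISTERS = """\
pc : [<c0025720>]    lr : [<c019dca0>]    psr: 20000013
sp : c037dcfc  ip : c037dd0c  fp : c037dd08
"""

# A register name, a colon, then an 8-digit hex value that the kernel
# wraps in [<...>] when it is a code address.
REGISTER_RE = re.compile(
    r"\b(pc|lr|psr|sp|ip|fp|r\d+)\s*:\s*(?:\[<)?([0-9a-f]{8})(?:>\])?"
)


def parse_registers(text):
    """Return a dict mapping register names to integer values."""
    return {name: int(value, 16) for name, value in REGISTER_RE.findall(text)}


registers = parse_registers(OOPS_REGISTERS)
for name in ("pc", "lr", "sp"):
    print(f"{name} = 0x{registers[name]:08x}")
```

Here pc is the address where the crash happened and lr is an address inside the calling function, which is the pair of values used in the next section.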

Backtrace & System.map

 Backtrace:
 Function entered at [<c00256e0>] from [<c019dca0>]
 Function entered at [<c019dc80>] from [<c0199950>]

The backtrace gives immediate information about where the kernel crashed. It lists all the functions called before the crash, in reverse order (most recent call first).
In our example we can deduce that:

  • the kernel crashed at 0xc0025720 (pc)
  • the faulty function starts at 0xc00256e0 and was called by a function starting at 0xc019dc80
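
The same deduction can be sketched in a few lines of Python. The backtrace text is copied from the example above; `FRAME_RE` and `parse_backtrace` are hypothetical helper names, and the pattern only covers the "Function entered at ... from ..." format printed by this kernel version.

```python
import re

# Backtrace lines copied from the example above.
BACKTRACE = """\
Backtrace:
Function entered at [<c00256e0>] from [<c019dca0>]
Function entered at [<c019dc80>] from [<c0199950>]
"""

FRAME_RE = re.compile(
    r"Function entered at \[<([0-9a-f]+)>\] from \[<([0-9a-f]+)>\]"
)


def parse_backtrace(text):
    """Return (function_start, called_from) pairs, most recent call first."""
    return [
        (int(entry, 16), int(caller, 16))
        for entry, caller in FRAME_RE.findall(text)
    ]


frames = parse_backtrace(BACKTRACE)
for depth, (entry, caller) in enumerate(frames):
    print(f"#{depth}: function at 0x{entry:08x}, called from 0x{caller:08x}")
```

The "called from" address of one frame (0xc019dca0) lies just after the start of the function in the next frame (0xc019dc80), which is how the caller is identified.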

To translate these memory addresses into "real" function names, we have to look in the System.map file, which stores the address of every Linux kernel function. To get a function's name, just do:

 [armadeus] $ grep c00256e0 $ARMADEUS_LINUX_DIR/System.map

As each kernel build can generate a new System.map, be sure to use the System.map of the kernel that crashed.
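
Note that grep only finds an address that is exactly the start of a function. For an address in the middle of a function, such as pc (0xc0025720), you need the closest symbol at or below it. Here is a minimal sketch of that lookup, assuming the usual "address type name" layout of System.map. The sample symbol names are placeholders invented for this example, not the real names from this kernel, and `load_symbols` and `resolve` are hypothetical helpers.

```python
import bisect

# Made-up excerpt in System.map format ("address type name"); the symbol
# names are placeholders, not the real ones for this kernel.
SAMPLE_SYSTEM_MAP = """\
c00256a0 T placeholder_before
c00256e0 T placeholder_faulty_function
c0025760 T placeholder_after
"""


def load_symbols(lines):
    """Parse System.map lines into (address, name) pairs sorted by address."""
    symbols = []
    for line in lines:
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        symbols.append((int(parts[0], 16), parts[2]))
    symbols.sort()
    return symbols


def resolve(symbols, address):
    """Return (name, offset) of the closest symbol at or below address."""
    addresses = [addr for addr, _ in symbols]
    index = bisect.bisect_right(addresses, address) - 1
    if index < 0:
        return None  # address lies before the first symbol
    start, name = symbols[index]
    return name, address - start


symbols = load_symbols(SAMPLE_SYSTEM_MAP.splitlines())
name, offset = resolve(symbols, 0xc0025720)  # pc from the Oops
print(f"pc is in {name}+0x{offset:x}")
```

With a real kernel you would pass the opened System.map file to `load_symbols` instead of the sample text. Addresses inside loadable modules are not listed in System.map, so they cannot be resolved this way.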

Automatic Oops decoding with ksymoops

As you probably don't want to decode the whole Oops trace manually, there is a pretty good tool to do it for you: ksymoops.

Installation

$ apt-get install ksymoops

Usage

  • copy your Oops text output to a text file (oops.txt, for example) on your Host:
Unable to handle kernel NULL pointer dereference at virtual address 00000002
pgd = c09d4000
[00000002] *pgd=08b7b031, *pte=00000000, *ppte=00000000
Internal error: Oops: 17 [#1] PREEMPT
Modules linked in: fpga_serial 8250 snd_imx_alsa_tsc2102 tsc2102 hwmon spi_imx snd_pcm_oss snd_mixer_oss snd_pcm snd_page_alloc snd_timer snd soundcore
CPU: 0    Not tainted  (2.6.23.1 #9)
pc : [<bf03f068>]    lr : [<bf041058>]    psr: 80000093
sp : c09fdc04  ip : c09fdc14  fp : c09fdc10
r10: c09fc000  r9 : fffffffe  r8 : a0000013
r7 : bf042888  r6 : bf042888  r5 : 00000002  r4 : 00000000
r3 : 00000000  r2 : 00000001  r1 : 00000002  r0 : bf042888
Flags: Nzcv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment user
Control: c000717f  Table: 089d4000  DAC: 00000015
Process insmod (pid: 311, stack limit = 0xc09fc258)
Stack: (0xc09fdc04 to 0xc09fe000)
dc00:          c09fdc44 c09fdc14 bf041058 bf03f010 c0e14760 00000001 00000001
dc20: 00000002 bf042888 c0e14760 bf0422dc 00000000 c09fc000 c09fdcc8 c09fdc48
dc40: c0187d18 bf040fc8 c0152510 c01534a4 c09fdc6c c09fdc60 c0e14778 c0152500
dc60: c09fdc80 c09fdc70 c018a294 c0189774 c0dcc400 c09fdc98 c09fdc84 c018a4f0
dc80: c018a288 c0e14778 bf042888 c09fdca8 c09fdc9c c01741a4 c018a468 c09fdcc8
dca0: 00000001 00000002 00000000 bf042888 bf0440a8 c09fdd04 00000000 c09fdcec
dcc0: c09fdccc bf041d74 c0187c44 bf0440a8 00000000 bf0441d0 bf0440a0 c02b6e64
dce0: c09fddb0 c09fdcf0 bf041e4c bf041bf8 c02e4c1c c00bcf24 c0b548c0 c02f2e00 
dd00: bf042324 ea000010 00000000 00000041 001c2000 00000000 00020100 00000000
dd20: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
dd40: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 10000000
dd60: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 bf0440a8
dd80: 00000000 00000000 bf0440a8 00000000 bf04231c bf0440a8 c02b6e64 00000000
dda0: 00000000 c09fddc0 c09fddb4 c018e22c bf041db0 c09fdde4 c09fddc4 c018c320
ddc0: c018e21c 00000000 c09fddf8 c018c3c4 bf0440a8 bf04414c c09fddf4 c09fdde8
dde0: c018c3d4 c018c248 c09fde20 c09fddf8 c018b654 c018c3d4 c02a0544 c02a0544
de00: bf042370 bf0440a8 bf044168 00000000 bf044110 c09fde38 c09fde24 c018c44c
de20: c018b614 bf0440a8 c02a0448 c09fde54 c09fde3c c018b980 c018c3e8 00000000
de40: bf0440a8 ea000014 c09fde8c c09fde58 c0189d5c c018b950 ea000024 00000000
de60: c02a05e8 bf0440a0 bf0440a0 ea000014 00000000 c1868ee4 c1858000 000002a0
de80: c09fdeb0 c09fde90 c018e03c c0189a0c 00000001 bf0440a0 ea000022 ea000014
dea0: ea000024 c09fdec4 c09fdeb4 c018e0d0 c018df68 ea000012 c09fdee4 c09fdec8
dec0: bf046094 c018e0c0 bf044240 00000000 bf044240 bf000000 c09fdfa4 c09fdee8
dee0: c0054f58 bf046010 00000000 c0242474 000004b0 c0242474 000004b0 c0033e28
df00: c186e0a8 00000000 0000002b 0000002b 00000028 c09fc000 00000000 00000000
df20: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
df40: 00000006 00000000 0000000a 00000000 00000000 00000000 0000001d 0000001c
df60: c186e0b4 c0a83580 000b8008 00000003 c09fc000 00000000 00000000 000161b7
df80: 0008b3f4 be96ddc4 00000080 c0022024 c09fc000 400a393c 00000000 c09fdfa8
dfa0: c0021e80 c0053e38 0008b3f4 be96ddc4 00900080 000b8018 000161b7 000b8008
dfc0: 000161b7 0008b3f4 be96ddc4 00000000 be96deab 00000002 400a393c 0000d768
dfe0: 4005ae10 be96dce4 00049508 4005ae20 20000010 00900080 bf042420 00000160
Backtrace:
Function entered at [<bf03f000>] from [<bf041058>] 
Function entered at [<bf040fb8>] from [<c0187d18>]
Function entered at [<c0187c34>] from [<bf041d74>]
Function entered at [<bf041be8>] from [<bf041e4c>]
 r8:c02b6e64 r7:bf0440a0 r6:bf0441d0 r5:00000000 r4:bf0440a8
Function entered at [<bf041da0>] from [<c018e22c>]
Function entered at [<c018e20c>] from [<c018c320>]
Function entered at [<c018c238>] from [<c018c3d4>]
 r8:bf04414c r7:bf0440a8 r6:c018c3c4 r5:c09fddf8 r4:00000000
Function entered at [<c018c3c4>] from [<c018b654>]
Function entered at [<c018b604>] from [<c018c44c>]
 r7:bf044110 r6:00000000 r5:bf044168 r4:bf0440a8
Function entered at [<c018c3d8>] from [<c018b980>]
 r5:c02a0448 r4:bf0440a8
Function entered at [<c018b940>] from [<c0189d5c>]
 r6:ea000014 r5:bf0440a8 r4:00000000
Function entered at [<c01899fc>] from [<c018e03c>]
Function entered at [<c018df58>] from [<c018e0d0>]
 r7:ea000024 r6:ea000014 r5:ea000022 r4:bf0440a0
Function entered at [<c018e0b0>] from [<bf046094>]
 r4:ea000012
Function entered at [<bf046000>] from [<c0054f58>]
 r7:bf000000 r6:bf044240 r5:00000000 r4:bf044240
Function entered at [<c0053e28>] from [<c0021e80>]
Code: e5903000 e5d33001 ea000001 e5903004 (e7d33001)
note: insmod[311] exited with preempt_count 1
BUG: scheduling while atomic: insmod/0x40000002/311
Function entered at [<c0025a94>] from [<c002d9b0>]
Function entered at [<c002d980>] from [<c023ef6c>]
Function entered at [<c023ef00>] from [<c0030010>]
Function entered at [<c002fff8>] from [<c023fd08>]
Function entered at [<c023fcc4>] from [<c006632c>]
 r4:c09fc000
Function entered at [<c0065d74>] from [<c006be70>]
Function entered at [<c006bddc>] from [<c0030678>]
 r8:c0292b50 r7:00000001 r6:c0cb1700 r5:c0f4a0a0 r4:c0f4a060
Function entered at [<c0030634>] from [<c0035418>]
 r6:c0cb1700 r5:c0f4a094 r4:c0f4a060
Function entered at [<c0035334>] from [<c0035a60>]
 r6:c0cb1700 r5:0000000b r4:c09fc000
Function entered at [<c0035864>] from [<c0025d98>]
Function entered at [<c0025af0>] from [<c00271b4>]
Function entered at [<c0027148>] from [<c002749c>]
 r7:c09fdbbc r6:00000017 r5:c0cb1700 r4:ffffffff
Function entered at [<c0027298>] from [<c002120c>]
Function entered at [<c00211d0>] from [<c0021a40>]
Exception stack(0xc09fdbbc to 0xc09fdc04)
dba0:                                                                bf042888
dbc0: 00000002 00000001 00000000 00000000 00000002 bf042888 bf042888 a0000013
dbe0: fffffffe c09fc000 c09fdc10 c09fdc14 c09fdc04 bf041058 bf03f068 80000093
dc00: ffffffff
 r8:a0000013 r7:bf042888 r6:bf042888 r5:c09fdbf0 r4:ffffffff
Function entered at [<bf03f000>] from [<bf041058>]
Function entered at [<bf040fb8>] from [<c0187d18>]
Function entered at [<c0187c34>] from [<bf041d74>]
Function entered at [<bf041be8>] from [<bf041e4c>]
 r8:c02b6e64 r7:bf0440a0 r6:bf0441d0 r5:00000000 r4:bf0440a8
Function entered at [<bf041da0>] from [<c018e22c>]
Function entered at [<c018e20c>] from [<c018c320>]
Function entered at [<c018c238>] from [<c018c3d4>]
 r8:bf04414c r7:bf0440a8 r6:c018c3c4 r5:c09fddf8 r4:00000000
Function entered at [<c018c3c4>] from [<c018b654>]
Function entered at [<c018b604>] from [<c018c44c>]
 r7:bf044110 r6:00000000 r5:bf044168 r4:bf0440a8
Function entered at [<c018c3d8>] from [<c018b980>]
 r5:c02a0448 r4:bf0440a8
Function entered at [<c018b940>] from [<c0189d5c>]
 r6:ea000014 r5:bf0440a8 r4:00000000
Function entered at [<c01899fc>] from [<c018e03c>]
Function entered at [<c018df58>] from [<c018e0d0>]
 r7:ea000024 r6:ea000014 r5:ea000022 r4:bf0440a0
Function entered at [<c018e0b0>] from [<bf046094>]
 r4:ea000012
Function entered at [<bf046000>] from [<c0054f58>]
 r7:bf000000 r6:bf044240 r5:00000000 r4:bf044240
Function entered at [<c0053e28>] from [<c0021e80>]
Segmentation fault
  • launch:
$ ksymoops -v buildroot/build_armv4t/linux-2.6.23.1/vmlinux -m buildroot/build_armv4t/linux-2.6.23.1/System.map --no-ksyms --no-lsmod \
           -o buildroot/project_build_armv4t/armadeus/root/lib/modules/2.6.23.1/ oops.txt
  • and after some processing you will get something like this:
Trace; c0025a94 <dump_stack+0/14>
Trace; c002d9b0 <__schedule_bug+30/38>
Trace; c002d980 <__schedule_bug+0/38>
Trace; c023ef6c <schedule+6c/354>
Trace; c023ef00 <schedule+0/354>
Trace; c0030010 <__cond_resched+18/3c>
Trace; c002fff8 <__cond_resched+0/3c>
Trace; c023fd08 <cond_resched+44/50>
Trace; c023fcc4 <cond_resched+0/50>
Trace; c006632c <unmap_vmas+5b8/614>
Trace; c0065d74 <unmap_vmas+0/614>
Trace; c006be70 <exit_mmap+94/170>
Trace; c006bddc <exit_mmap+0/170>
Trace; c0030678 <mmput+44/f0>

>>r8; c0292b50 <init_pid_ns+0/14>

Trace; c0030634 <mmput+0/f0>
Trace; c0035418 <exit_mm+e4/e8>
Trace; c0035334 <exit_mm+0/e8>
Trace; c0035a60 <do_exit+1fc/83c>
Trace; c0035864 <do_exit+0/83c>
Trace; c0025d98 <die+2a8/2f0>
Trace; c0025af0 <die+0/2f0>
Trace; c00271b4 <__do_kernel_fault+6c/7c>
Trace; c0027148 <__do_kernel_fault+0/7c>
Trace; c002749c <do_page_fault+204/220>
Trace; c0027298 <do_page_fault+0/220>
Trace; c002120c <do_DataAbort+3c/a0>
Trace; c00211d0 <do_DataAbort+0/a0>
Trace; c0021a40 <__dabt_svc+40/60>
Trace; bf03f000 Before first symbol
Trace; bf041058 Before first symbol
Trace; bf040fb8 Before first symbol
Trace; c0187d18 <uart_add_one_port+e4/2b0>
Trace; c0187c34 <uart_add_one_port+0/2b0>
Trace; bf041d74 Before first symbol
Trace; bf041be8 Before first symbol
Trace; bf041e4c Before first symbol

>>r8; c02b6e64 <__key.0+0/0>

Trace; bf041da0 Before first symbol
Trace; c018e22c <platform_drv_probe+20/24>
Trace; c018e20c <platform_drv_probe+0/24>
Trace; c018c320 <driver_probe_device+e8/18c>
Trace; c018c238 <driver_probe_device+0/18c>
Trace; c018c3d4 <__device_attach+10/14>

>>r6; c018c3c4 <__device_attach+0/14>

Trace; c018c3c4 <__device_attach+0/14>
Trace; c018b654 <bus_for_each_drv+50/8c>
Trace; c018b604 <bus_for_each_drv+0/8c>
Trace; c018c44c <device_attach+74/a8>
Trace; c018c3d8 <device_attach+0/a8>
Trace; c018b980 <bus_attach_device+40/98>

>>r5; c02a0448 <platform_bus_type+0/178>

Trace; c018b940 <bus_attach_device+0/98>
Trace; c0189d5c <device_add+360/5b4>
Trace; c01899fc <device_add+0/5b4>
Trace; c018e03c <platform_device_add+e4/158>
Trace; c018df58 <platform_device_add+0/158>
Trace; c018e0d0 <platform_device_register+20/24>
Trace; c018e0b0 <platform_device_register+0/24>
Trace; bf046094 Before first symbol
Trace; bf046000 Before first symbol
Trace; c0054f58 <sys_init_module+1130/11c8>
Trace; c0053e28 <sys_init_module+0/11c8>
Trace; c0021e80 <ret_fast_syscall+0/2c>
  • which is more understandable, isn't it? ;-) Read it from bottom to top. (Here my driver is crashing when trying to add a new serial port to the Linux infrastructure.)
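
To make the bottom-to-top reading easier, the ksymoops output can be condensed into a call chain. The following is only a sketch: the input is the last few "Trace;" lines copied from the output above, `call_chain` is a hypothetical helper, and all "Before first symbol" entries (addresses ksymoops could not resolve) are simply shown as `<unresolved>`.

```python
import re

# The last few lines copied from the ksymoops output above.
KSYMOOPS_OUTPUT = """\
Trace; c018e0d0 <platform_device_register+20/24>
Trace; c018e0b0 <platform_device_register+0/24>
Trace; bf046094 Before first symbol
Trace; bf046000 Before first symbol
Trace; c0054f58 <sys_init_module+1130/11c8>
Trace; c0053e28 <sys_init_module+0/11c8>
Trace; c0021e80 <ret_fast_syscall+0/2c>
"""

# Group 1 is the symbol name, or None for "Before first symbol" lines.
TRACE_RE = re.compile(
    r"^Trace; [0-9a-f]+ (?:<([^+>]+)\+|Before first symbol)"
)


def call_chain(text):
    """Return function names in call order (outermost caller first)."""
    chain = []
    for line in reversed(text.splitlines()):
        match = TRACE_RE.match(line)
        if not match:
            continue  # skip ">>r8; ..." annotations and blank lines
        name = match.group(1) or "<unresolved>"
        # Each function appears twice (call site and entry point), so
        # consecutive duplicates are collapsed into one entry.
        if not chain or chain[-1] != name:
            chain.append(name)
    return chain


print(" -> ".join(call_chain(KSYMOOPS_OUTPUT)))
```

Collapsing consecutive duplicates also merges neighbouring unresolved addresses into one entry, which hides how many distinct unresolved functions were involved.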

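Links

  • ksymoops man page: http://linux.die.net/man/8/ksymoops
  • http://www.dedoimedo.com/computers/crash-analyze.html#mozTocId34663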