What is a core dump?
A core dump is a snapshot of the state of a process at a point in time, such as its memory contents and registers. On Linux, core dump files are written in ELF format.
Triggers that generate a core dump:
- A core dump is triggered automatically (depending on the system's core dump configuration) when a process hits a fatal error.
- A core dump can be generated manually using the Linux command gcore.
- A core dump can be generated manually using gdb facilities.
- A core dump can be generated programmatically from inside a process.
Developers can use a core dump offline to debug fatal errors of a process, and to inspect process state even for non-critical software issues.
The offline debugging ability offered by core dumps helps:
- debug software issues which are infrequently reproducible
- debug software issues where access to the affected devices is limited/unavailable (such as customer environment)
Are core dumps generated by default?
It depends on the configuration. The system must be configured to generate core dumps, and whether this works out of the box depends on the system's default core dump settings.
How to configure a system to generate a core dump in the event of fatal errors?
Set the maximum size of core files using the following command:
ulimit -c <max_core_file_size>
When set to 0, core files are not generated. When the core file being generated exceeds this size, it is truncated to this size. To set the core file size to unlimited, use the following command:
ulimit -c unlimited
You can check the current value using the following command:
root@babu-VirtualBox:~/tools/core_dump# ulimit -a | grep core
core file size (blocks, -c) 0
root@babu-VirtualBox:~/tools/core_dump#
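The three cases described above (0, a finite size, unlimited) can be told apart in a script. The following is only a minimal sketch: describe_core_limit is a hypothetical helper name of my own, and the block assumes a POSIX-style shell whose ulimit builtin accepts -S (soft limit) and -H (hard limit), as bash and dash do.

```shell
# Hypothetical helper: turn a value printed by `ulimit -c` into a sentence.
describe_core_limit() {
    case $1 in
        0)         echo "core dumps disabled" ;;
        unlimited) echo "core dumps enabled, no size limit" ;;
        *)         echo "core dumps enabled, truncated beyond $1 blocks" ;;
    esac
}

# Report the soft limit of the current shell.
describe_core_limit "$(ulimit -S -c)"

# Raise the soft limit as far as the hard limit allows. A non-root shell
# cannot go beyond the hard limit, so this is the most it can ask for.
ulimit -S -c "$(ulimit -H -c)" || echo "could not change the core limit"
describe_core_limit "$(ulimit -S -c)"
```

The limit is per process and is inherited, so it applies to the current shell and to the programs started from it afterwards, not to processes that are already running.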
The core file location & format can be configured using the following command:
echo "/tmp/core_files/core.%p" > /proc/sys/kernel/core_pattern
If no directory is specified (unlike /tmp/core_files above), the generated core file is stored in the current directory of the process.
%p above is a format specifier in the core file name. Here are the commonly used format specifiers (core(5) lists the full set):
%p: pid of the process dumped
%u: uid of the process dumped
%g: gid of the process dumped
%s: signal number causing core dump
%t: timestamp of core dump, specified in seconds since 0:00h, 1 Jan 1970
%h: hostname (same as uname -n output)
%e: executable filename
The file location & format configured above is not retained across reboots. To keep the configuration across reboots, add the following line to /etc/sysctl.conf:
kernel.core_pattern=/tmp/core_files/core.%p
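To get a feel for how a pattern turns into a file name, here is a small sketch that expands the specifiers listed above by hand. expand_core_pattern is a hypothetical helper of my own, not a kernel or distro tool; it only knows the seven specifiers from the list plus a literal "%%", and the values passed to it are made up for illustration.

```shell
# Hypothetical helper: preview the name a core_pattern would expand to.
# Usage: expand_core_pattern PATTERN PID UID GID SIGNAL EPOCH HOST EXE
# The body runs in a subshell "( ... )" so its variables do not leak.
expand_core_pattern() (
    rest=$1
    pid=$2 uid=$3 gid=$4 sig=$5 epoch=$6 host=$7 exe=$8
    out=
    while [ -n "$rest" ]; do
        tail=${rest#?}         # everything after the first character
        ch=${rest%"$tail"}     # the first character itself
        rest=$tail
        if [ "$ch" != "%" ]; then
            out=$out$ch        # ordinary character: copy it
            continue
        fi
        tail=${rest#?}
        spec=${rest%"$tail"}   # the character following "%"
        rest=$tail
        case $spec in
            p) out=$out$pid ;;
            u) out=$out$uid ;;
            g) out=$out$gid ;;
            s) out=$out$sig ;;
            t) out=$out$epoch ;;
            h) out=$out$host ;;
            e) out=$out$exe ;;
            %) out=$out% ;;
            *) ;;              # anything else is dropped in this sketch
        esac
    done
    printf '%s\n' "$out"
)

expand_core_pattern "/tmp/core_files/core.%p" 2832 0 0 11 1388534400 babu-VirtualBox latencytop
```

With the pattern configured above and pid 2832, this prints /tmp/core_files/core.2832. A pattern such as core.%e.%p.%s makes the file name carry the executable name and the signal number as well.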
The currently configured core file location & format can be viewed using the following command:
root@babu-VirtualBox:~/tools/core_dump# cat /proc/sys/kernel/core_pattern
core
root@babu-VirtualBox:~/tools/core_dump#
or
root@babu-VirtualBox:~/tools/core_dump# sysctl -a | grep core_
kernel.core_pattern = core
kernel.core_pipe_limit = 0
kernel.core_uses_pid = 1
root@babu-VirtualBox:~/tools/core_dump#
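Once the pattern is known, the next question is usually where the core file will actually land. Here is a minimal sketch that answers it from the pattern string. core_destination is a hypothetical helper of my own. Besides the two cases discussed above (absolute path, bare name in the current directory), it also recognizes a pattern starting with "|", which core(5) describes as piping the core to a program instead of writing a file; the handler path in the example is made up.

```shell
# Hypothetical helper: say where a core would go for a given core_pattern
# value and the working directory of the crashing process.
# Usage: core_destination PATTERN PROCESS_CWD
core_destination() (
    pattern=$1 cwd=$2
    case $pattern in
        "|"*)
            handler=${pattern#?}          # drop the leading pipe symbol
            echo "piped to handler: $handler" ;;
        /*)
            dir=${pattern%/*}             # directory part of an absolute path
            echo "${dir:-/}" ;;
        */*)
            echo "$cwd/${pattern%/*}" ;;  # relative path with a subdirectory
        *)
            echo "$cwd" ;;                # bare name: current directory of the process
    esac
)

core_destination "$(cat /proc/sys/kernel/core_pattern 2>/dev/null)" "$PWD"
core_destination "/tmp/core_files/core.%p" "$PWD"
core_destination "|/usr/local/bin/handler %p" "$PWD"
```

On the system shown above, where the pattern is just "core", the first call prints the current directory.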
In addition to the above commands, I found two more core dump related configuration facilities:
root@babu-VirtualBox:~/tools/core_dump# cat /proc/sys/kernel/core_pipe_limit
0
root@babu-VirtualBox:~/tools/core_dump# cat /proc/sys/kernel/core_uses_pid
1 <-- A non-zero value appends .<pid> to the core file name when /proc/sys/kernel/core_pattern does not contain %p
root@babu-VirtualBox:~/tools/core_dump#
or
root@babu-VirtualBox:~/tools/core_dump# sysctl -a | grep core_
kernel.core_pattern = core
kernel.core_pipe_limit = 0
kernel.core_uses_pid = 1
root@babu-VirtualBox:~/tools/core_dump#
I am yet to explore the purpose of these configuration facilities.
How to manually generate a core dump?
- CLI method:
root@babu-VirtualBox:~/tools/core_dump# ls
latencytop latencytop.c latencytop.o Makefile
root@babu-VirtualBox:~/tools/core_dump# pidof latencytop
2832
root@babu-VirtualBox:~/tools/core_dump# gcore 2832
0xb76e5424 in __kernel_vsyscall ()
Saved corefile core.2832
root@babu-VirtualBox:~/tools/core_dump# ls
core.2832 latencytop latencytop.c latencytop.o Makefile
root@babu-VirtualBox:~/tools/core_dump#
Note: This method generates a core file irrespective of the ulimit configuration.
or
root@babu-VirtualBox:~/tools/core_dump# ls
ex latencytop latencytop.c latencytop.o Makefile
root@babu-VirtualBox:~/tools/core_dump# kill -s SIGSEGV `pidof latencytop` <--- SIGABRT can be used instead of SIGSEGV
root@babu-VirtualBox:~/tools/core_dump# ls
core.3669 ex latencytop latencytop.c latencytop.o Makefile
root@babu-VirtualBox:~/tools/core_dump#
- GDB method:
root@babu-VirtualBox:~/tools/core_dump# ls
latencytop latencytop.c latencytop.o Makefile
root@babu-VirtualBox:~/tools/core_dump# pidof latencytop
2832
root@babu-VirtualBox:~/tools/core_dump# gdb -p 2832
GNU gdb (GDB) 7.6.1-ubuntu
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Attaching to process 2832
Reading symbols from /home/babu/tools/core_dump/latencytop...done.
Reading symbols from /lib/i386-linux-gnu/libc.so.6...Reading symbols from /usr/lib/debug/lib/i386-linux-gnu/libc-2.17.so...done.
done.
Loaded symbols for /lib/i386-linux-gnu/libc.so.6
Reading symbols from /lib/ld-linux.so.2...Reading symbols from /usr/lib/debug/lib/i386-linux-gnu/ld-2.17.so...done.
done.
Loaded symbols for /lib/ld-linux.so.2
0xb76e5424 in __kernel_vsyscall ()
(gdb) generate-core-file
Saved corefile core.2832
(gdb) quit
A debugging session is active.
Inferior 1 [process 2832] will be detached.
Quit anyway? (y or n) y
Detaching from program: /home/babu/tools/core_dump/latencytop, process 2832
root@babu-VirtualBox:~/tools/core_dump# ls
core.2832 latencytop latencytop.c latencytop.o Makefile
root@babu-VirtualBox:~/tools/core_dump#
Note: This method generates a core file irrespective of the ulimit configuration.
- Programmatic method:
Invoke abort() wherever the core dump should be generated. Of course, the process is terminated after the core is generated. Another point to remember: the system must be configured to generate core dumps. Otherwise, abort() will not generate a core.
Another approach, which generates a core dump without terminating the process, is to fork the process, let the child invoke abort() & let the parent continue with normal execution.
Note: This method generates a core file only if ulimit is configured to a non-zero size.
Steps to ANALYZE core dump file offline
gdb is used to analyze a core dump file. Once the core file is loaded into gdb using the steps below, gdb's inspection commands can be used to inspect/debug it.
Step1:
Identify the name of the application from the core file.
If %e is configured in /proc/sys/kernel/core_pattern, then the core file name itself indicates the application that generated the core. If not, then we can use the following command to get the application name:
root@babu-VirtualBox:~/tools/core_dump# file core.5240
core.5240: ELF 32-bit LSB core file Intel 80386, version 1 (SYSV), SVR4-style, from './latencytop'
root@babu-VirtualBox:~/tools/core_dump#
or
root@babu-VirtualBox:~/tools/core_dump# strings core.5240 | tail -n 1
./latencytop
root@babu-VirtualBox:~/tools/core_dump#
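When there are many core files to sort through, the application name can be pulled out of the file output in a script. This is a minimal sketch: core_origin is a hypothetical helper of my own, and it assumes the output contains the from '...' part exactly as in the sample above (other versions of file may format the line differently).

```shell
# Hypothetical helper: read `file` output on stdin and print the command
# recorded between the quotes after "from".
core_origin() {
    sed -n "s/.*from '\([^']*\)'.*/\1/p"
}

# Sample line copied from the transcript above.
printf '%s\n' "core.5240: ELF 32-bit LSB core file Intel 80386, version 1 (SYSV), SVR4-style, from './latencytop'" | core_origin
```

In real use the input would come from the command itself, for example: file core.5240 | core_origin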
Step2:
To debug a core file using gdb, we need binaries with debug symbols. However, in embedded systems, binaries built without debug information or stripped of debug symbols are typically used in QA environments and customer deployments. So, it's most likely that we first need to get a debug-enabled application binary and its shared/static libraries. We also need to ensure that the debug binary is built from the same code as the binaries on which the software issue was observed.
To demo the core dump tool, let me use the application latencytop (https://github.com/babuneelam/gcov_uspace_tests/tree/master/latencytop). I have enabled the debug flag (-g) in the Makefile, and then generated a core file using the kill command.
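Before loading a binary into gdb, it helps to confirm that it really carries debug information. Here is a minimal sketch, assuming readelf (from binutils) is installed and that the debug information is stored in the binary itself as a .debug_info section, which is what gcc -g normally produces. has_debug_info is a hypothetical helper name of my own.

```shell
# Hypothetical helper: succeed if the given ELF file has a .debug_info
# section. Fails for stripped binaries, non-ELF files and missing files.
has_debug_info() {
    readelf -S "$1" 2>/dev/null | grep -q '\.debug_info'
}

if has_debug_info ./latencytop; then
    echo "latencytop: debug information present"
else
    echo "latencytop: no debug information (built without -g, stripped, or not found)"
fi
```

This only checks that debug information exists. It cannot tell whether the binary was built from the same code as the one that produced the core.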
Step3:
Feed the core file & debug-enabled/unstripped binary to gdb to debug the core file:
root@babu-VirtualBox:~/tools/core_dump# gdb latencytop
GNU gdb (GDB) 7.6.1-ubuntu
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/babu/tools/core_dump/latencytop...done.
(gdb) set solib-search-path /lib/i386-linux-gnu/ <--- this can be a list of colon separated PATHs
(gdb) core-file core.3678
[New LWP 3678]
warning: Can't read pathname for load map: Input/output error.
Core was generated by `./latencytop'.
#0 0xb7725424 in __kernel_vsyscall ()
(gdb) bt
#0 0xb7725424 in __kernel_vsyscall ()
#1 0xb7612740 in __nanosleep_nocancel ()
at ../sysdeps/unix/syscall-template.S:81
#2 0xb7612563 in __sleep (seconds=0) at ../sysdeps/unix/sysv/linux/sleep.c:137
#3 0x08048d81 in main (argc=1, argv=0xbfd2a354) at latencytop.c:173
(gdb) list latencytop.c:173
168
169 //abort();
170
171 while ((iterations == 0) || (count++ < iterations)) {
172
173 sleep(delay);
174
175 e = NULL;
176 if (pid) {
177 if (tid) {
(gdb)
solib-search-path provides the location of the shared libraries the binary uses. Without it, gdb cannot find the shared libraries, and the stack trace doesn't display the symbol names that help narrow down the crashing code path. In this case, I am setting the shared library path to /lib/i386-linux-gnu/. However, it may be different on different systems. For embedded systems where the development & build machines are different, developers need to spend significant effort in setting up the library path. This is even more challenging if the build system does not place all shared libraries in a common build directory.
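The interactive session above can also be run in one shot, which is handy when the same backtrace has to be taken from many core files. Here is a minimal sketch that only prints the gdb command line, using gdb's -batch and -ex options. core_backtrace_cmd is a hypothetical helper of my own, and it assumes none of the paths contain spaces or quotes.

```shell
# Hypothetical helper: print a non-interactive gdb command line that sets
# the shared library search path, loads the core file and prints a backtrace.
# Usage: core_backtrace_cmd BINARY COREFILE SOLIB_PATH
core_backtrace_cmd() (
    binary=$1 core=$2 solib_path=$3
    printf 'gdb -batch -ex "set solib-search-path %s" -ex "core-file %s" -ex bt %s\n' \
        "$solib_path" "$core" "$binary"
)

core_backtrace_cmd latencytop core.3678 /lib/i386-linux-gnu/
```

This prints: gdb -batch -ex "set solib-search-path /lib/i386-linux-gnu/" -ex "core-file core.3678" -ex bt latencytop. The -ex commands run in the order given, so the library path is in place before the core file is loaded, just like in the manual session.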
How does the stack trace look if we don't feed the location of the libraries to gdb during core dump analysis:
root@babu-VirtualBox:~/tools/core_dump# gdb latencytop
GNU gdb (GDB) 7.6.1-ubuntu
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/babu/tools/core_dump/latencytop...done.
(gdb) set solib-search-path /lib1/ <--- Providing incorrect shared library path for demonstration purpose
(gdb) core-file core.3678
[New LWP 3678]
warning: Can't read pathname for load map: Input/output error.
warning: Could not load shared library symbols for 2 libraries, e.g. /lib/i386-linux-gnu/libc.so.6.
Use the "info sharedlibrary" command to see the complete listing.
Do you need "set solib-search-path" or "set sysroot"?
Core was generated by `./latencytop'.
#0 0xb7725424 in __kernel_vsyscall ()
(gdb) bt
#0 0xb7725424 in __kernel_vsyscall ()
#1 0xb7612740 in ?? ()
#2 0xb7612563 in ?? ()
#3 0xbfd2a0bc in ?? ()
#4 0x00000000 in ?? () <--- symbol names are not displayed!! As we can see, this is not limited to the symbols of the shared libraries; it even impacts the symbols of the binary itself !!
(gdb)
How does the stack trace look if we don't compile the binary with the debug (-g) flag, or if we strip the debug symbols:
root@babu-VirtualBox:~/tools/core_dump# gdb latencytop
GNU gdb (GDB) 7.6.1-ubuntu
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/babu/tools/core_dump/latencytop...(no debugging symbols found)...done.
(gdb) set solib-search-path /lib/i386-linux-gnu/
(gdb) core core.5188
[New LWP 5188]
warning: Can't read pathname for load map: Input/output error.
Core was generated by `./latencytop'.
Program terminated with signal 11, Segmentation fault.
#0 0xb77b5424 in __kernel_vsyscall ()
(gdb) bt
#0 0xb77b5424 in __kernel_vsyscall ()
#1 0xb76a2740 in __nanosleep_nocancel ()
at ../sysdeps/unix/syscall-template.S:81
#2 0xb76a2563 in __sleep (seconds=0) at ../sysdeps/unix/sysv/linux/sleep.c:137
#3 0x08048d81 in main () <-- With debug flags enabled, we even got info about file & line number - "latencytop.c:173". This is missing without the debug flags!!
(gdb)
TBD
solib-absolute-prefix - TBD
Core dump tool - Internals
Following is the sequence of events/code-flows that create a core dump of a process:
- A user space application encounters a fatal error/exception. This leads the kernel to raise a signal for the process. The signal could also be raised manually, using the kill command or other methods.
- The kernel basically sets a flag in the process descriptor indicating that a signal has been raised & its handling is pending.
- The kernel then handles the signal. Pending signals are checked every time the process is about to resume execution in user space, for example on return from a system call or an interrupt.
- If there are pending signals and a custom signal handler is registered, the custom handling is done instead of a core dump. Otherwise the kernel performs the default action for the signal: if the signal is compile-time configured to dump core, a core dump is generated. No user space code is invoked to accomplish this.
- The signals for which core dump is compile-time configured can be obtained from http://lxr.linux.no/linux+v3.12.6/include/linux/signal.h#L331.
- signal.c triggers the core dump by calling the do_coredump function. This function in turn invokes the core dump handler of the registered binary format (say, ELF).
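To make the "compile-time configured" part concrete, here is a small sketch that encodes which signals dump core by default. dumps_core_by_default is a hypothetical helper of my own, and the list is my reading of signal(7) for Linux, not a copy of the kernel header. Architecture-specific signals such as SIGEMT are left out, so please check it against the header linked above before relying on it.

```shell
# Hypothetical helper: succeed if the default action of the named signal on
# Linux is to terminate the process and dump core. Accepts names with or
# without the SIG prefix.
dumps_core_by_default() {
    case ${1#SIG} in
        QUIT|ILL|TRAP|ABRT|IOT|BUS|FPE|SEGV|SYS|XCPU|XFSZ) return 0 ;;
        *) return 1 ;;
    esac
}

for sig in SIGSEGV SIGABRT SIGTERM SIGKILL; do
    if dumps_core_by_default "$sig"; then
        echo "$sig: core dump by default"
    else
        echo "$sig: no core dump by default"
    fi
done
```

This matches the manual method shown earlier: SIGSEGV and SIGABRT sent with kill produce a core, while SIGTERM or SIGKILL terminate the process without one.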
References:
http://ss64.com/bash/ulimit.html
http://man7.org/linux/man-pages/man2/getrlimit.2.html
http://man7.org/linux/man-pages/man5/core.5.html