Strace – Reverse Engineering – System Calls
If there is one recurring problem that I often see clogging the forums, it is “library missing”, or often “installed libraries which a program doesn’t find”.
I decided to share a simple debugging technique which could save the day, or at least a few hours… Google might not be the right choice all the time when you have strace at your fingertips.
1. System Calls
To understand strace, you first need to understand what a system call is. So what is a system call? A system call is simply a kernel function that a user program can request: it executes in kernel mode and forms the interface between user code and the kernel.
Whenever you call the function open() in a C program, you are actually calling a C library function “open”, which in turn switches from user mode to kernel mode and runs the kernel’s “open” system call.
So the concept of switching is very important to our understanding of system calls and functions. The switch is usually triggered by a software interrupt, a call gate or a trap instruction (on x86 Linux, for example, int 0x80 or the sysenter/syscall instructions).
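The same layering can be observed from almost any language. As a minimal sketch (Python is used here only for brevity; the article's own example is C), os.open is a thin wrapper around the open system call, and a failure comes back as the same error code strace prints. The helper name try_open and the path below are made up for illustration.

```python
import errno
import os


def try_open(path):
    """Ask the kernel to open `path` read-only and report the outcome.

    Returns (fd, None) on success, or (-1, "ERRNAME") on failure,
    mirroring the "= -1 ENOENT" style that strace displays.
    """
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError as exc:
        # The kernel refused; Python exposes the errno it returned.
        return -1, errno.errorcode.get(exc.errno, str(exc.errno))
    os.close(fd)
    return fd, None


# Hypothetical path that should not exist on any system.
print(try_open("/definitely/not/here/librt.so.1"))
```

Running this script under strace should show the corresponding open (or openat) call with the same ENOENT result.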
2. Reverse Engineering with Strace
First, let’s break our system to set things up.
[root@web01 ~]# ldd /bin/ls
        linux-gate.so.1 => (0x009c2000)
        librt.so.1 => /lib/librt.so.1 (0x003d0000)
        libacl.so.1 => /lib/libacl.so.1 (0x003b3000)
        libselinux.so.1 => /lib/libselinux.so.1 (0x00b20000)
        libc.so.6 => /lib/libc.so.6 (0x00243000)
        libpthread.so.0 => /lib/libpthread.so.0 (0x0038e000)
        /lib/ld-linux.so.2 (0x00225000)
        libattr.so.1 => /lib/libattr.so.1 (0x003ac000)
        libdl.so.2 => /lib/libdl.so.2 (0x00388000)
        libsepol.so.1 => /lib/libsepol.so.1 (0x00110000)
As we can see, those are the libraries the dynamic loader must load before “ls” can run, make its system calls and give us the usual pretty output.
Let’s move librt.so.1 out of /lib to our backup folder in /root/libBackup.
Then execute ls at the command line:
[root@web01 ~]# ls
ls: error while loading shared libraries: librt.so.1: cannot open shared object file: No such file or directory
Of course, the error message here is pretty obvious… ls needs “librt.so.1” to run, and as good systems administrators we all know where to look for shared libraries, right?
Anyway, for the sake of this exercise, let’s assume we have no clue that librt.so.1 is supposed to be in /lib…
(Now, for the fun of it, google the ls error above and be amazed at how many people have reported it on forums.)
So let’s use our strace magic here and see how we can fix the problem.
[root@web01 ~]# strace /bin/ls
execve("/bin/ls", ["/bin/ls"], [/* 21 vars */]) = 0
brk(0) = 0x88d2000
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=22984, ...}) = 0
mmap2(NULL, 22984, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7f88000
close(3) = 0
open("/lib/librt.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/lib/tls/i686/sse2/librt.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
As we can see, “ls” tries to open librt.so.1 at the line open("/lib/librt.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)… Reading the rest of the output, we see how the loader then looks for the library file in other library directories: its default search path, plus anything set in the LD_LIBRARY_PATH shell variable.
The solution would therefore be to “MOVE” our librt.so.1 file back to our /lib folder and resolve our headaches.
(I emphasize MOVE, since cp relies on this library too… so copying would be broken as well at this point.)
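The directory-by-directory lookup visible in the trace can be sketched as follows. This is only an illustration under simplifying assumptions: the real dynamic loader also consults /etc/ld.so.cache and hardware-specific subdirectories, and the function name find_library and the demo directory names are invented for the example.

```python
import os
import tempfile


def find_library(name, search_dirs):
    """Probe each directory in order, like the open() attempts in the trace.

    Returns (found_path_or_None, list_of_paths_tried).
    """
    tried = []
    for directory in search_dirs:
        candidate = os.path.join(directory, name)
        tried.append(candidate)
        if os.path.isfile(candidate):
            return candidate, tried
    return None, tried


# Demo in a throwaway directory tree: the library sits in the second
# directory, so the first probe fails, just as /lib did in the trace.
with tempfile.TemporaryDirectory() as root:
    first = os.path.join(root, "lib")
    second = os.path.join(root, "backup")
    os.mkdir(first)
    os.mkdir(second)
    open(os.path.join(second, "librt.so.1"), "wb").close()

    found, tried = find_library("librt.so.1", [first, second])
    print("tried:", len(tried), "paths")
    print("found in:", os.path.basename(os.path.dirname(found)))
```

Every failed probe in such a search corresponds to one "-1 ENOENT" line in the strace output.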
—–
Now, let us spice things up: let’s erase the content of librt.so.1 (make sure to back up the original).
[root@web01 ~]# echo "" > /lib/librt.so.1
Let’s try ls again… and…
ls: error while loading shared libraries: /lib/librt.so.1: file too short
Now things are getting interesting… you may wonder what in the world “file too short” could possibly mean.
The error gives you the path “/lib”, so we know the file is there, since the loader doesn’t complain that it can’t find it. So let’s strace it and see what is really happening.
[root@web01 ~]# strace /bin/ls
execve("/bin/ls", ["/bin/ls"], [/* 21 vars */]) = 0
brk(0) = 0x8147000
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=22984, ...}) = 0
mmap2(NULL, 22984, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7f13000
close(3) = 0
open("/lib/librt.so.1", O_RDONLY) = 3
read(3, "\n", 512) = 1
close(3) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7f12000
writev(2, [{"/bin/ls", 7}, {": ", 2}, {"error while loading shared libra"..., 36}, {": ", 2}, {"/lib/librt.so.1", 15}, {": ", 2}, {"file too short", 14}, {"", 0}, {"", 0}, {"\n", 1}], 10/bin/ls: error while loading shared libraries: /lib/librt.so.1: file too short
) = 79
Now… thanks again to strace, we have our answer.
To make sense of all this gibberish, let’s go through the most essential calls.
1. execve executes the program named by its filename argument, passing it the argv array and the environment
2. brk(0) – brk called with the argument 0 simply returns the current program break, i.e. the end of the heap; malloc and free (memory management) build on top of this
3. mmap2 here maps the contents of /etc/ld.so.cache (file descriptor 3) into memory at 0xb7f13000
Then comes the open call we saw earlier, which this time succeeds and is followed by a “read(3, "\n", 512) = 1”.
Now… let’s break here and go back to our error “file too short”….
read() – ssize_t read(int fd, void *buf, size_t count) – reads up to “count” bytes from the file into the buffer buf and returns the number of bytes actually read. In our result we see two things: the buffer starts with “\n” and the number of bytes returned is 1… whereas the loader asked for 512, enough to cover the header of a real library file.
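The short read is easy to reproduce. The sketch below, with a hypothetical helper read_up_to and a temporary file standing in for the emptied library, asks for 512 bytes from a file that only holds the newline written by echo "".

```python
import os
import tempfile


def read_up_to(path, count=512):
    """Issue a single read() for up to `count` bytes and return what arrived."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.read(fd, count)
    finally:
        os.close(fd)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "librt.so.1")
    with open(path, "wb") as f:
        f.write(b"\n")  # same content as: echo "" > file

    data = read_up_to(path)
    print(len(data), data)  # prints: 1 b'\n'
```

The request was for 512 bytes, but read returns only what the file contains, which is exactly the "= 1" seen in the trace.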
A shared library is also supposed to start with an ELF header… which in this case it doesn’t (of course it doesn’t, we erased the library’s content a few lines earlier).
For a valid library, the read would therefore look like read(3, "\177ELF……", 512).
The buffer starting with “\n” and the single byte read therefore mean that our file is effectively empty: it holds nothing but a newline.
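The same check can be done by hand without strace. Here is a minimal sketch, assuming only that an ELF file starts with the four bytes 0x7f 'E' 'L' 'F' (the "\177ELF" strace shows); the function names looks_like_elf and diagnose are illustrative.

```python
ELF_MAGIC = b"\x7fELF"  # strace prints this as "\177ELF" (octal 177 = 0x7f)


def looks_like_elf(header):
    """True if the given bytes begin with the ELF magic number."""
    return header[:4] == ELF_MAGIC


def diagnose(path):
    """Read the first 512 bytes of a file and describe what was found."""
    with open(path, "rb") as f:
        header = f.read(512)
    if looks_like_elf(header):
        return "ELF header present"
    return f"no ELF header: read {len(header)} byte(s), starting with {header[:4]!r}"


print(looks_like_elf(b"\x7fELF\x01\x01\x01"))  # True
print(looks_like_elf(b"\n"))                   # False
```

Calling diagnose on the emptied /lib/librt.so.1 from this exercise would report a single byte read and no ELF header.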
- Problem solved -
3. Other cases where Strace can help
Feeling like some programs run slowly? Run strace and look at the paths tried for each library… this will tell you about your library search path (LD_LIBRARY_PATH included) and potentially what to optimize.
Another common case is a hanging system call: the call never returns, strace’s output stops at that call, and you then know where to continue debugging with other tools.
I hope that was useful.
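With a long trace, picking out the failed lookups by eye gets tedious. The sketch below filters saved strace output for open (or openat) calls that returned an error; the regular expression is an assumption based on the line format shown in this article, and the function name failed_opens is made up.

```python
import re

# Matches lines such as:
#   open("/lib/librt.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
#   openat(AT_FDCWD, "/lib/x.so", O_RDONLY) = -1 ENOENT (...)
FAILED_OPEN = re.compile(
    r'^open(?:at)?\((?:[A-Z_]+, )?"(?P<path>[^"]+)".*\) = -1 (?P<err>E[A-Z0-9]+)'
)


def failed_opens(trace_text):
    """Return (path, error_name) for every failed open/openat in the trace."""
    results = []
    for line in trace_text.splitlines():
        match = FAILED_OPEN.match(line.strip())
        if match:
            results.append((match.group("path"), match.group("err")))
    return results


# Lines taken from the first trace in this article.
SAMPLE = '''\
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY) = 3
open("/lib/librt.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/lib/tls/i686/sse2/librt.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
'''

for path, err in failed_opens(SAMPLE):
    print(err, path)
```

A long list of failed paths before each successful open is a hint that the library search path has more entries than it needs.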