Archive for the ‘Unix / Linux’ Category

OpenVZ/Virtuozzo YUM ?

January 30th, 2009 Ali Abbas

I was cleaning up my hard disk and came across a tar file I put together a while back when dealing with Virtuozzo containers needing Yum.

http://alouche.net/centos/yum-centos4-64.tar

The RPM files are for CentOS 4.x, 64-bit. For anyone looking at CentOS 5.x, the required packages should be the same, so fetching the same RPM names (with the matching arch and version) should be a piece of cake.

- untar
- rpm -ivh *
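
The two steps above could be wrapped in a small helper, sketched here under a few assumptions: the function name install_rpm_tarball is made up for illustration, the layout inside the tarball is not known (so the sketch searches for .rpm files at any depth), and the RPM variable is only there so the command can be previewed before anything is installed.

```shell
# Hypothetical helper: unpack a tarball of RPMs and install them in one go.
# RPM defaults to "rpm"; set RPM=echo to print the command instead of running it.
install_rpm_tarball() {
    tarball=$1
    workdir=$(mktemp -d) || return 1
    tar -xf "$tarball" -C "$workdir" || return 1
    # Assumption: the archive layout is unknown, so look for .rpm files anywhere.
    find "$workdir" -name '*.rpm' | sort > "$workdir/rpms.list"
    if [ ! -s "$workdir/rpms.list" ]; then
        echo "no .rpm files found in $tarball" >&2
        return 1
    fi
    # Installing everything in one rpm call lets rpm sort out the install order.
    xargs ${RPM:-rpm} -ivh < "$workdir/rpms.list"
}

# Preview first, then install for real (as root, inside the container):
#   RPM=echo install_rpm_tarball yum-centos4-64.tar
#   install_rpm_tarball yum-centos4-64.tar
```

Passing all packages to a single rpm -ivh matters here, since the packages depend on each other and would fail if installed one at a time in the wrong order.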

and voila!

Hope that helps!

Ali

Categories: Redhat/Centos

Centos 5.3 to be released soon

January 21st, 2009 Ali Abbas

As we all know, Red Hat Enterprise Linux 5.3 was released yesterday (20th Jan 09)…

While the upstream packages are being rebuilt for the CentOS release, you may want to follow further updates on Twitter through http://twitter.com/CentOS

Wondering what is new in this release? Head over to the official Red Hat changelogs:

Red Hat Enterprise Linux 5.3 GA Announcement:
https://www.redhat.com/archives/rhelv5-list/2009-January/msg00092.html

Kernel (2.6.18-128.el5) changelog:
http://rhn.redhat.com/errata/RHSA-2009-0225.html

Release notes:
http://www.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/5/html/Release_Notes/index.html

Press release:
http://www.redhat.com/about/news/prarchive/2009/rhel_5_3.html

Package manifest (LOTS of errors in this doc):
http://www.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/5/html/Release_Manifest/index.html

Categories: General, Redhat/Centos

Strace – Reverse Engineering – System Calls

December 17th, 2008 Ali Abbas

If there is one recurring problem that I often see clogging the forums, it is “library missing”, or just as often “installed libraries which a program doesn’t find”.

I decided to share a simple debugging technique which could save the day, or at least a few hours… Google might not be the right choice all the time when you have strace at your fingertips.

1. System Calls

To understand strace, you first need to understand what a system call is. So what is a system call? A system call is simply a function provided by the kernel: it executes in kernel mode and forms the interface between user code and the kernel.

Whenever you call the function open() in a C program, you are in fact calling a C library function “open”, which in turn just switches from user mode to kernel mode and runs the kernel’s “open” system call.

So the concept of switching is very important to our understanding of system calls and functions. The switch is usually triggered by a software interrupt, a call gate or a trap instruction (on x86 Linux, for example, int 0x80 or sysenter).

2. Reverse Engineering with Strace

First, let’s break our system to set up the exercise. Start by checking which libraries ls depends on:


[root@web01 ~]# ldd /bin/ls
linux-gate.so.1 =>  (0x009c2000)
librt.so.1 => /lib/librt.so.1 (0x003d0000)
libacl.so.1 => /lib/libacl.so.1 (0x003b3000)
libselinux.so.1 => /lib/libselinux.so.1 (0x00b20000)
libc.so.6 => /lib/libc.so.6 (0x00243000)
libpthread.so.0 => /lib/libpthread.so.0 (0x0038e000)
/lib/ld-linux.so.2 (0x00225000)
libattr.so.1 => /lib/libattr.so.1 (0x003ac000)
libdl.so.2 => /lib/libdl.so.2 (0x00388000)
libsepol.so.1 => /lib/libsepol.so.1 (0x00110000)

As we can see, those are the libraries “ls” must load before it can run and give us the usual pretty output.

Let’s move librt.so.1 out of /lib to our backup folder in /root/libBackup

Execute ls at the command line


[root@web01 ~]# ls
ls: error while loading shared libraries: librt.so.1:
cannot open shared object file: No such file or directory

Of course, the error message here is pretty obvious… ls needs “librt.so.1” to run and, as good systems administrators, we all know where to look for shared libraries, right?

Anyway, for the sake of this exercise, let’s assume we have no clue that librt.so.1 is supposed to be in /lib…

(Now, for the fun of it, google the ls error above and be amazed at how many people have reported it on forums.)

So let’s use our strace magic here and see how we can fix the problem.


[root@web01 ~]# strace /bin/ls
execve("/bin/ls", ["/bin/ls"], [/* 21 vars */]) = 0
brk(0)                                  = 0x88d2000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY)      = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=22984, ...}) = 0
mmap2(NULL, 22984, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7f88000
close(3)                                = 0
open("/lib/librt.so.1", O_RDONLY)       = -1 ENOENT (No such file or directory)
open("/lib/tls/i686/sse2/librt.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)

As we can see, “ls” tries to open librt.so.1 with open("/lib/librt.so.1", O_RDONLY), which returns -1 ENOENT (No such file or directory)… reading on through the output, we see the dynamic loader looking for the library in the other directories on its search path (the default library directories and their hardware-specific subfolders, plus LD_LIBRARY_PATH if it is set).

The solution would therefore be to “MOVE” our librt.so.1 file back to our /lib folder and end our headaches.

(I stress MOVE because cp relies on this library… so copying would be broken as well at this point.)
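
On a long trace, the failed lookups are easy to miss by eye. Here is a minimal sketch of a filter that keeps only the paths the program failed to open; the function name failed_opens is made up for this post, and it assumes plain strace output (no -f, so no "[pid N]" prefixes) in the format shown above.

```shell
# Hypothetical filter: read strace output on stdin and print every path
# that an open()/openat() call failed to find (ENOENT).
failed_opens() {
    grep -E '^(open|openat)\(.*= -1 ENOENT' |
        sed -E 's/^[^"]*"([^"]*)".*/\1/'
}

# strace writes its trace to stderr, hence the 2>&1:
#   strace /bin/ls 2>&1 | failed_opens
```

Fed with the trace above, it prints /lib/librt.so.1 and /lib/tls/i686/sse2/librt.so.1, which is exactly the list of places the loader looked for the library.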

—–

Now let us spice things up: let’s erase the content of librt.so.1 (make sure to back up the original first).


[root@web01 ~]# echo "" > /lib/librt.so.1

Let’s try ls again… and…


ls: error while loading shared libraries: /lib/librt.so.1: file too short

Now things are getting interesting… you may wonder what in the world “file too short” could possibly mean.

The error gives you the path “/lib/librt.so.1”, so we know the file is there, since the loader doesn’t complain that it can’t find it. So let’s strace it and see what is really happening.


[root@web01 ~]# strace /bin/ls
execve("/bin/ls", ["/bin/ls"], [/* 21 vars */]) = 0
brk(0)                                  = 0x8147000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY)      = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=22984, ...}) = 0
mmap2(NULL, 22984, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7f13000
close(3)                                = 0
open("/lib/librt.so.1", O_RDONLY)       = 3
read(3, "\n", 512)                      = 1
close(3)                                = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7f12000
writev(2, [{"/bin/ls", 7}, {": ", 2}, {"error while loading shared libra"..., 36}, {": ", 2}, {"/lib/librt.so.1", 15}, {": ", 2}, {"file too short", 14}, {"", 0}, {"", 0}, {"\n", 1}], 10/bin/ls: error while loading shared libraries: /lib/librt.so.1: file too short
) = 79

Now… thanks again to strace, we have our answer.

For the sake of understanding all this gibberish, let’s go through the most essential calls.

1. execve executes the program named by its filename argument, passing it the argv array and the environment
2. brk(0) – brk moves the program break, i.e. the end of the process’s data segment; called with the argument 0 it just returns the current break. malloc and free (memory management) build on this
3. mmap2 maps a file into memory; here it maps /etc/ld.so.cache (file descriptor 3) at 0xb7f13000

Then comes the open call we saw earlier, which this time succeeds and is followed by a “read(3, "\n", 512) = 1”.

Now… let’s stop here and go back to our error “file too short”…

read() – ssize_t read(int fd, void *buf, size_t count) – reads up to “count” bytes from the file descriptor fd into the buffer buf and returns the number of bytes actually read. In our result we see 2 things: the buffer contains only “\n” and the number of bytes read is 1… whereas the loader asked for 512, and a real library is much larger than that, so a healthy file would have returned 512.

A shared library is also supposed to start with an ELF header… which in this case it doesn’t (of course it doesn’t, we erased the library’s content a few lines earlier :) )

With a valid library, the call would look like read(3, "\177ELF…", 512)

The “\n” buffer and the 1 byte read therefore mean that our file is effectively empty :)
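
The same check can be done without strace by looking at the first four bytes of the file. A minimal sketch, assuming od is available; the function name is_elf is made up for this post.

```shell
# Hypothetical helper: succeed if FILE starts with the ELF magic bytes
# 0x7f 'E' 'L' 'F' (what strace displays as "\177ELF").
is_elf() {
    # -An: no offsets, -tx1: one hex byte at a time, -N4: first 4 bytes only
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    [ "$magic" = "7f454c46" ]
}

# Usage:
#   is_elf /lib/librt.so.1 && echo "looks like ELF" || echo "not an ELF file"
```

On our emptied librt.so.1 the only byte is the newline written by echo, so the check fails, which matches the “file too short” complaint from the loader.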

- Problem solved -

3. Other cases where Strace can help

Feeling like some programs run slowly? Run them under strace and look at the paths tried for each library… this tells you how much time is spent searching LD_LIBRARY_PATH and the other library directories, and potentially what to optimize.

Another common case is a hanging system call: strace shows the call that never returns, which in turn tells you where to keep debugging with other tools.

I hope that was useful :)

Categories: Unix / Linux

How to read a load output

December 16th, 2008 Ali Abbas

This is a very quick post to explain to all Linux power users how to read a load output, since it came to my attention, after talking with a friend this afternoon, that the great majority of Linux users do not know how to.

Open a cli and type in “uptime” or “w” or why not “top”.

[root@dev1 ~]# uptime
14:19:49 up 27 min,  1 users, load average: 6.63, 5.25, 3.09

*** this uptime is made up for tutorial purposes ***

Now… to better understand, imagine this table

| 1 min | 5 min | 15 min |
| 6.63 | 5.25 | 3.09 |

Each of those values is the average load over that time window… 6.63 is thus the average load during the last minute, 5.25 during the last 5 minutes and of course 3.09 during the last 15 minutes.

Now that we know what each value is associated with, we need to understand the two phases of a process.

When I say 2 phases, I am only referring here to the cpu “running” and “awaiting” time. In other words, a process is either being executed OR waiting to be executed.

So those average load numbers simply represent the number of processes either being executed or waiting to be executed on the system (on Linux, processes in uninterruptible sleep, typically waiting on disk I/O, are counted as well).

Starting to be clear? :)

Now… you may wonder, when do we say “there is too much load” and when “there isn’t”.

Let’s imagine our system has only a single cpu… according to our previous load output, it would mean:

During the last minute, 6.63 processes on average were either running or waiting… since a single cpu handles one process at a time, that gives us 1 process being executed and 5.63 waiting, which means the system was overloaded by 563%.

I stress “overloaded” because this is the factor you should be watching, rather than the total average load. On a production system, what is sitting in the queue matters more than what is currently being processed.

Think of it as a line of customers waiting at the grocery store… the cashier deals with only one customer at a time. When the line keeps growing and she can’t keep up with the flow of customers, she will start to panic and stress, get tired and probably, in the end, just close the line. A large number of customers will then result in a second cashier joining to help.

Well, it isn’t really like this on a unix system :) ! However, it gives the idea.

Now say we had 4 cpus… with a load of 6.63, 4 processes would be running and 2.63 waiting, so the system would be about 66% over capacity (2.63 / 4).
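
The arithmetic is simple enough to script. A minimal sketch, where the function name load_excess is made up for this post: it takes a load average and a cpu count, and prints how many processes were waiting on average and how far over capacity that is, in percent.

```shell
# Hypothetical helper: load_excess LOAD CPUS
# Prints "<waiting processes> <percent over capacity>".
load_excess() {
    awk -v load="$1" -v cpus="$2" 'BEGIN {
        excess = load - cpus            # processes that had to wait
        if (excess < 0) excess = 0      # spare capacity is not an overload
        printf "%.2f %.0f\n", excess, excess / cpus * 100
    }'
}

# The two cases from the text:
#   load_excess 6.63 1    # prints: 5.63 563
#   load_excess 6.63 4    # prints: 2.63 66
#
# On a live Linux box (assuming /proc/loadavg and getconf are available):
#   load_excess "$(cut -d' ' -f1 /proc/loadavg)" "$(getconf _NPROCESSORS_ONLN)"
```

Dividing by the number of cpus is what makes the figure comparable between machines: the same load of 6.63 is a heavy overload on one cpu and a moderate one on four.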

I hope that was helpful and intuitive.

Categories: Unix / Linux

Create your own manpage

December 12th, 2008 Ali Abbas

Those who know me, know that I usually tend to document every little hack, server modification, kernel update and events that I happened to correct, install, modify, hack or fix!

I mainly do this to stay productive, but also to keep other admins updated on the modifications made to the server.

Having said that! Most of us often find ourselves in front of a console terminal and sometimes in need of a memory refresh… whether it is us who can’t remember how we fixed a specific problem or another system administrator who wonders “why is this perl script here and what does it do”.

Now… one could set up a homepage or create documentation (a wiki) and say “check this url for further info”. While this is a cool alternative, it gets trickier when you are sitting in the server room with only the holy black console. Your server could have no connectivity, leaving you unable to fetch any remote documentation.

That is one of many reasons why manpages are a cool trick, and since we are cool people, why not!

Now, let’s imagine I have a bash/perl script called “MysqlNas”… this script does one thing, which is to back up the server’s mysql databases onto a NAS through a cron job.

Now let’s create a manpage for our script, so that anyone who types “man mysqlnas” will get a rich set of info / warnings / version updates OR even the server hacks we had to do to make the script work (well, let’s imagine :) ).

We are going to use a free utility called txt2man… the utility can be downloaded from the FSF website at http://directory.fsf.org/project/txt2man/

txt2man is a shell script, so you will call it as in ./txt2man

Create a txt file which will contain your program infos


vi mysqlnas.txt

Then type in (adjust at your will) the following structure


NAME
mysqlnas - backup the mysql databases onto the nas server

SYNOPSIS
/usr/bin/mysqlnas.pl [database_name]

DESCRIPTION
MysqlNas is a cool perl script which will recursively backup every database in /var/lib/mysql

A command argument can be passed to the script to backup a single database... no ARG will result in the backup of all databases

ENVIRONMENT

You need to run the script as root

BUGS

Yet to be discovered

AUTHOR

Ali Abbas (alouche07@gmail.com)

Once saved, run the following command


./txt2man -t Mysql2Nas -s 8 mysqlnas.txt > mysqlnas.8

And voila, we have created our manpage :) … Now we need to move the mysqlnas.8 page into /usr/share/man/(language)/manX/ for our man command to locate it. The page is named mysqlnas.8 because man generally expects the file name to end with the section number.

** (language): corresponds to your system’s language; in our case, it would be “en”
** (X): corresponds to the manpage section… we have set “-s 8”, meaning we are creating our manpage in section 8 of the manpage sections (for more info, refer to the Wikipedia article on manpage sections)

so


mv mysqlnas.8 /usr/share/man/en/man8/

now type in your command line


man mysqlnas

enjoy :)

Categories: Unix / Linux