October 21, 2009

PHP OPCodes Cached with APC - part 1

Many caching systems are used to speed up the execution of PHP scripts on busy web and database servers. Today we will focus on OPcode caching with APC.

Before we start… what is an OPcode?

An OPcode is a low-level instruction that the PHP engine produces when it compiles a script. By default this happens on every request: each time you visit a webpage, PHP (running under a webserver such as Apache) parses and compiles the script serving your request into OPcodes all over again. OPcodes are simply C data structures, which are then executed by the PHP virtual machine (the Zend Engine).

As you can imagine, regenerating the OPcodes on every request is a drain on the server, and wasted work if the code rarely changes. This is where an OPcode cache comes into play; but before we go on, let’s look at some OPcode examples using the Vulcan Logic Disassembler (VLD).

First, we create a file test.php that runs the Unix ls -l command:

test.php - <?php system("ls -l"); ?>

wrk01:/var/www# php -d vld.active=1 test.php
Branch analysis from position: 0
Return found
filename:       /var/www/test.php
function name:  (null)
number of ops:  4
compiled vars:  none
line     #  op                           fetch          ext  return  operands
   2     0  SEND_VAL                                                 'ls+-l'
         1  DO_FCALL                                      1          'system'
   4     2  RETURN                                                   1
         3* ZEND_HANDLE_EXCEPTION

Now let’s replace the contents of test.php with a simple echo "hello world!":

wrk01:/var/www# php -d vld.active=1 test.php
Branch analysis from position: 0
Return found
filename:       /var/www/test.php
function name:  (null)
number of ops:  3
compiled vars:  none
line     #  op                           fetch          ext  return  operands
   2     0  ECHO                                                     'hello+world%21'
   4     1  RETURN                                                   1
         2* ZEND_HANDLE_EXCEPTI

In our first example we had 4 OPcodes, and in our second, 3. The difference between 3 and 4 is negligible, but it would be interesting to estimate how many OPcodes an application such as WordPress generates :-). As you may have guessed, the more OPcodes there are to generate, the longer each request spends compiling before it can run, which slows things down.
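To get a rough figure for a larger script, one option is to capture the VLD output and add up its "number of ops" lines. The sketch below does just that; the function name countVldOps is made up for illustration, and it only assumes the text format shown in the two dumps above.

```php
<?php
// Hypothetical helper: sums every "number of ops:" figure found in a VLD dump.
function countVldOps($vldOutput)
{
    preg_match_all('/number of ops:\s+(\d+)/', $vldOutput, $matches);
    return array_sum(array_map('intval', $matches[1]));
}

// Sample text taken from the two dumps above.
$sample = "number of ops:  4\ncompiled vars:  none\n"
        . "number of ops:  3\ncompiled vars:  none\n";

echo countVldOps($sample), "\n"; // 4 + 3 = 7
```

This only counts what VLD reports for the code it actually compiled, so treat the result as an estimate rather than an exact measure.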

Now, what is APC?

APC stands for Alternative PHP Cache. It caches those OPcodes in shared memory, and provides a locking mechanism plus a set of routines to add new cache entries, remove expired ones and pull the cached OPcodes back out of memory. Its API also exposes statistics: which scripts are cached, which have been executed, which lookups missed the cache, etc.

APC is built as an extension for the Zend Engine, so you can simply add it to your current PHP install as a module. Furthermore, it requires no changes to your scripts: APC is transparent to your PHP developers.

Before we go on, let’s install APC

On Debian

wrk01:~# apt-get install php-apc

You can also install it using PECL

wrk01:~# pecl install apc
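Whichever route you take, it can be worth confirming that PHP actually loaded the extension. Here is a minimal sketch, assuming the extension registers under the name apc (with a PECL build you may first need an extension=apc.so line in php.ini). The helper name apcStatus is made up for illustration.

```php
<?php
// Hypothetical helper: reports whether the APC extension is loaded and enabled.
function apcStatus()
{
    $loaded = extension_loaded('apc');
    return array(
        'loaded'  => $loaded,
        // ini_get() returns false when the directive is unknown (extension missing).
        'enabled' => $loaded && (bool) ini_get('apc.enabled'),
        'version' => $loaded ? phpversion('apc') : null,
    );
}

print_r(apcStatus());
```

Run it through the webserver rather than the command line if you want to see the settings your site really uses, since the CLI can read a different php.ini.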

APC is now enabled with its default settings. Let’s spend some time on the locking mechanism used by APC, as it will let us explore APC further.

APC Support       enabled
Version           3.0.19
MMAP Support      Enabled
MMAP File Mask    no value
Locking type      pthread mutex Locks
Revision          $Revision: 3.154.2.5 $

As we can see in our phpinfo() output, APC is currently using “pthread mutex Locks”. Keep in mind that other locking mechanisms are available, such as IPC (inter-process communication) semaphores, Linux futex locks and spin locks.

Why have a locking mechanism at all? Because APC caches the OPcodes in shared memory, several processes can read and modify the same cache at once. Locking controls that access and thus protects the integrity of the data.
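To make the idea concrete, here is a small illustration of a read-modify-write on shared state guarded by an exclusive lock. It is not how APC does it internally (APC locks shared memory from C); in this sketch a temporary file stands in for the shared segment and PHP’s flock() stands in for the mutex. The function name incrementShared is made up.

```php
<?php
// Illustration only: increments a counter stored in a file while holding
// an exclusive lock, so concurrent processes cannot lose each other's updates.
function incrementShared($path)
{
    $fh = fopen($path, 'c+');          // read/write, create if missing, no truncation
    if ($fh === false) {
        return false;
    }
    flock($fh, LOCK_EX);               // block until we own the lock
    $value = (int) stream_get_contents($fh);
    $value++;
    ftruncate($fh, 0);                 // rewrite the counter from scratch
    rewind($fh);
    fwrite($fh, (string) $value);
    fflush($fh);                       // flush before releasing the lock
    flock($fh, LOCK_UN);
    fclose($fh);
    return $value;
}

$path = tempnam(sys_get_temp_dir(), 'lock');
incrementShared($path);
echo incrementShared($path), "\n";
unlink($path);
```

Without the lock, two processes could both read the same old value and one increment would be lost, which is exactly the kind of corruption a cache in shared memory has to avoid.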

In the second part of this post, I will go over how to tweak and configure APC using the “apc.ini” configuration file.

If you recall, I said earlier that APC is transparent to PHP developers. While that is true, APC is not limited to transparent OPcode caching: it also lets you store variables, such as the results of SQL queries, directly in the cache. This is referred to as the User Data Cache.

User Data Cache

Imagine you have a WordPress blog with more than 1000 posts, and you want to show various statistics over those posts, such as the number of comments or the number of votes. If your stats script queried the database on every visit, it could easily become a bottleneck once more than 1000 users are visiting the blog at the same time. Ideally, you would cache the computed stats instead.

The following APC functions let you manipulate the APC cache directly.

Info

  1. array apc_cache_info ([ string $cache_type [, bool $limited ]] )
  2. array apc_sma_info ([ bool $limited ] )

Insertion

  1. bool apc_store ( string $key , mixed $var [, int $ttl ] )
  2. bool apc_add ( string $key , mixed $var [, int $ttl ] )
  3. bool apc_compile_file ( string $filename )

Retrieve data

  1. mixed apc_fetch ( string $key )

Deletion

  1. bool apc_delete ( string $key )
  2. bool apc_clear_cache ([ string $cache_type ] )
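As a small illustration of the Info functions, the sketch below turns the counters returned by apc_cache_info() into a hit ratio. It assumes the returned array carries num_hits and num_misses keys; the helper name cacheHitRatio and the fallback sample figures are made up so the snippet also runs where APC is not loaded.

```php
<?php
// Hypothetical helper: hit ratio (0..1) from an apc_cache_info()-style array.
function cacheHitRatio($info)
{
    $hits   = isset($info['num_hits'])   ? (float) $info['num_hits']   : 0.0;
    $misses = isset($info['num_misses']) ? (float) $info['num_misses'] : 0.0;
    $total  = $hits + $misses;
    return $total > 0 ? $hits / $total : 0.0;
}

// Use live numbers when APC is loaded, otherwise made-up sample figures.
$info = function_exists('apc_cache_info')
    ? apc_cache_info('user', true)
    : array('num_hits' => 9, 'num_misses' => 3);

printf("user cache hit ratio: %.2f\n", cacheHitRatio($info));
```

A ratio that stays low after the cache has warmed up is usually a hint that entries expire too quickly or that the cache is too small.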

Quick Example of APC User Data Cache

We assume that we have a function called gatherStats() which queries the database and returns a value such as the number of posts.

The variable holding the number of posts is $myStats, and we want to cache it for 30 minutes. We will be using 2 APC functions: bool apc_store ( string $key , mixed $var [, int $ttl ] ) and mixed apc_fetch ( string $key ).

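// Try the cache first; apc_fetch() returns false when the key is missing.
$myStats = apc_fetch('myStats');
if ($myStats === false)
{
    // Cache miss: query the database, then keep the result for 30 minutes (1800 s).
    $myStats = gatherStats();
    apc_store('myStats', $myStats, 1800);
}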

We first try to fetch the key myStats from the APC cache with apc_fetch, which returns false when the key is missing. In that case we query our database and store the value in the cache using the apc_store function.
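The same fetch-or-compute pattern can be wrapped in a small reusable helper. This is only a sketch: the helper name cached, the dummy gatherStats() body and the stand-in functions at the top are all made up for illustration. The stand-ins are not part of APC; they exist only so the snippet runs where the extension is not loaded, and they ignore the TTL.

```php
<?php
// Stand-ins for machines without APC (NOT part of APC, TTL is ignored).
if (!function_exists('apc_fetch')) {
    $GLOBALS['fakeApc'] = array();
    function apc_fetch($key) {
        return array_key_exists($key, $GLOBALS['fakeApc'])
            ? $GLOBALS['fakeApc'][$key]
            : false;
    }
    function apc_store($key, $var, $ttl = 0) {
        $GLOBALS['fakeApc'][$key] = $var;
        return true;
    }
}

// Hypothetical helper: return the cached value for $key, or compute it with
// $producer, cache it for $ttl seconds, and return it.
function cached($key, $ttl, $producer)
{
    $value = apc_fetch($key);
    if ($value === false) {
        $value = call_user_func($producer);
        apc_store($key, $value, $ttl);
    }
    return $value;
}

// Dummy standing in for the real database query; the figures are invented.
function gatherStats()
{
    return array('posts' => 1000, 'comments' => 4200);
}

$myStats = cached('myStats', 1800, 'gatherStats');
echo $myStats['posts'], "\n";
```

One limitation of this sketch: a value that is legitimately false cannot be told apart from a cache miss. apc_fetch accepts an optional by-reference success flag as a second argument that can be used to distinguish the two cases.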

I hope that was informative. I will be writing Part 2 soon, which will tackle the general configuration and tweaking of APC.

Cheers,