May 15, 2010

Linux Kernel Route Cache

To understand the importance of the route cache, it helps to visualize the three main lookup structures the kernel uses for routing decisions: the Route Cache (the subject of this post), the Routing Policy Database (RPDB) and the Route Table. The network subsystem queries them in that order to make a forwarding decision. To display the route cache, simply issue the "ip route show cache" command.

[ kernel network subsystem ] --> Route Cache || [ If no match ] --> RPDB || [ If no match ] --> Route Table
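The lookup order above can be illustrated with a small user-space sketch. This is not kernel code: the RPDB and the route table are collapsed into a single hypothetical "slow path", and every name here (`route_lookup`, `slow_path_lookup`, `CACHE_BUCKETS`, the toy routes) is an assumption made for illustration. The point is only that a cache hit skips the slow path entirely, while a miss falls through and then populates the cache.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_BUCKETS 8u            /* toy size; must be a power of two */

struct cache_entry {
    int      valid;
    uint32_t dst;                   /* destination address, host byte order */
    int      out_if;                /* chosen output interface index */
};

struct route {
    uint32_t prefix;
    uint32_t mask;
    int      out_if;
};

static struct cache_entry cache[CACHE_BUCKETS];
static unsigned long slow_path_calls;

/* Hypothetical stand-in for the RPDB + route table. */
static const struct route table[] = {
    { 0x0A000000u, 0xFF000000u, 1 },    /* 10.0.0.0/8  -> interface 1 */
    { 0x0A010000u, 0xFFFF0000u, 2 },    /* 10.1.0.0/16 -> interface 2 */
    { 0x00000000u, 0x00000000u, 3 },    /* default     -> interface 3 */
};

/* Slow path: scan every route and keep the match with the longest mask. */
static int slow_path_lookup(uint32_t dst)
{
    int best = -1;
    uint32_t best_mask = 0;
    size_t i;

    slow_path_calls++;
    for (i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
        if ((dst & table[i].mask) == table[i].prefix &&
            (best < 0 || table[i].mask >= best_mask)) {
            best = table[i].out_if;
            best_mask = table[i].mask;
        }
    }
    return best;
}

/* Fast path first: consult the cache, fall through to the slow path on a miss. */
int route_lookup(uint32_t dst)
{
    struct cache_entry *e = &cache[dst & (CACHE_BUCKETS - 1u)];

    if (e->valid && e->dst == dst)
        return e->out_if;               /* cache hit */

    e->out_if = slow_path_lookup(dst);  /* cache miss: do the full lookup */
    e->dst = dst;
    e->valid = 1;                       /* remember the answer for next time */
    return e->out_if;
}

/* Number of times the slow path has been taken so far. */
unsigned long route_slow_path_count(void)
{
    return slow_path_calls;
}
```

Calling `route_lookup(0x0A010203u)` twice from a `main` should take the slow path only on the first call; the second call is answered from the cache.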

When the IP layer of the kernel is initialized, ip_init calls ip_rt_init:

[sourcecode language="c"]
void __init ip_init(void)
{
	dev_add_pack(&ip_packet_type);

	ip_rt_init();
	inet_initpeers();

#ifdef CONFIG_IP_MULTICAST
	proc_net_create("igmp", 0, ip_mc_procinfo);
#endif
}
[/sourcecode]

**(source: linux/net/ipv4/ip_output.c)**

Part of the job of ip_rt_init is to allocate the memory used to cache network routes, and also to initialize global variables such as rt_hash_table, rt_hash_rnd and many others.

You can view the details of the ip_rt_init function in **/net/ipv4/route.c**.

rt_hash_table is the hash table backing the route cache, and rt_hash_mask holds its size in the form of a bit mask (the number of buckets minus one). An easy way to check the size of the route cache hash table is to look in dmesg, grepping for "IP route":

$ dmesg | grep "IP route"
[    1.814492] IP route cache hash table entries: 262144 (order: 9, 2097152 bytes)
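The three numbers in that log line can be cross-checked against each other. The sketch below assumes a 4096-byte page size (typical on x86, but not stated in the log) and the usual meaning of "order" as a power-of-two count of pages; the helper names are made up for this example. The resulting 8 bytes per bucket would be consistent with each bucket holding a single pointer on a 64-bit machine, which is an inference from the figures rather than something dmesg reports.

```c
/* Bytes covered by an allocation of the given order: 2^order pages. */
unsigned long bytes_for_order(unsigned int order, unsigned long page_size)
{
    return (1UL << order) * page_size;
}

/* Bytes per hash bucket implied by the total size and the entry count. */
unsigned long bytes_per_bucket(unsigned long total_bytes, unsigned long entries)
{
    return total_bytes / entries;
}

/* Bucket selection for a power-of-two table, where mask = entries - 1. */
unsigned long bucket_index(unsigned long hash, unsigned long mask)
{
    return hash & mask;
}
```

With the figures from the log, `bytes_for_order(9, 4096)` gives 2097152 and `bytes_per_bucket(2097152, 262144)` gives 8. The mask for this table would be 262143, so any hash value is folded into a valid bucket with a single AND.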

The maximum size of the route cache is configurable through /proc/sys/net/ipv4/route/max_size. That being said, the current size is determined at boot in conjunction with the amount of RAM available. Furthermore, to keep the number of cached entries from exceeding max_size, the kernel runs a garbage collector, which kicks in once the cache grows past the gc_thresh threshold.
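These tunables are plain text files containing a single integer, so they can be read from a program as easily as with cat. The following is a minimal sketch; `read_long_from_file` and `print_route_cache_limits` are hypothetical helper names, and the files may be absent or behave differently on kernels that no longer ship the IPv4 route cache.

```c
#include <stdio.h>

/*
 * Read a single integer from a procfs-style text file.
 * Returns 0 on success and stores the value in *out, or -1 on failure.
 */
int read_long_from_file(const char *path, long *out)
{
    FILE *f = fopen(path, "r");
    int ok;

    if (f == NULL)
        return -1;
    ok = (fscanf(f, "%ld", out) == 1);
    fclose(f);
    return ok ? 0 : -1;
}

/* Print max_size and gc_thresh, or a note if a file cannot be read. */
void print_route_cache_limits(void)
{
    long value;

    if (read_long_from_file("/proc/sys/net/ipv4/route/max_size", &value) == 0)
        printf("max_size  = %ld\n", value);
    else
        printf("max_size  is not readable on this system\n");

    if (read_long_from_file("/proc/sys/net/ipv4/route/gc_thresh", &value) == 0)
        printf("gc_thresh = %ld\n", value);
    else
        printf("gc_thresh is not readable on this system\n");
}
```

Calling `print_route_cache_limits()` from `main` prints whatever the running kernel exposes; no particular values are assumed here.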

When an interface goes down or up, or another change takes place that affects the route cache, the kernel executes rt_cache_flush, which in turn executes rt_run_flush. Many events, such as the removal of an IP address or of an interface, trigger a flush of the route cache. Keep in mind, however, that the cache is also flushed periodically, driven by **rt_secret_timer**. The interval is configurable in /proc/sys/net/ipv4/route/secret_interval.

To trigger a route cache flush, issue either

echo -1 > /proc/sys/net/ipv4/route/flush

or

ip route flush cache
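The echo variant is just a write to a file, so a program can do the same thing. The sketch below assumes the process runs as root and that the flush file exists on the running kernel; `write_string_to_file` and `flush_route_cache` are hypothetical names chosen for this example.

```c
#include <stdio.h>

/*
 * Write a string to a file, replacing its contents.
 * Returns 0 on success, -1 if the file cannot be opened or written.
 */
int write_string_to_file(const char *path, const char *value)
{
    FILE *f = fopen(path, "w");
    int failed;

    if (f == NULL)
        return -1;
    failed = (fputs(value, f) == EOF);
    if (fclose(f) == EOF)       /* buffered data is written out here */
        failed = 1;
    return failed ? -1 : 0;
}

/* Equivalent of: echo -1 > /proc/sys/net/ipv4/route/flush */
int flush_route_cache(void)
{
    return write_string_to_file("/proc/sys/net/ipv4/route/flush", "-1\n");
}
```

A return value of -1 from `flush_route_cache()` most likely means missing privileges or a kernel without this file.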

ip_rt_min_delay and ip_rt_max_delay define the time window within which the flush occurs. Setting ip_rt_min_delay to 0 ensures that the cache is flushed immediately when rt_run_flush is triggered.

I have only tackled a tiny aspect of the route cache, and hopefully this gives some basic ideas on which to build a grounded understanding of how the routing subsystem operates within the network stack. Having said that, the route cache is only one subset of all its functions.