December 1, 2015

OpenNebula in a server with two network interfaces. Organizing the traffic with iproute2

During the installation of OpenNebula on our sandbox server, we decided to split the traffic between its two network interfaces: one for administration, and one for the VMs' traffic to the physical network.

There are many ways of handling the traffic, but for our convenience we decided to use the iproute2 routes and rules.



We started by adding the interface (eth1) to the bridge used by the VMs (Vbr0).

To make the examples easier to follow, assume the following context:
  • Our gateway is 192.168.7.254
  • The subnet of the physical network is 192.168.7.0/24
  • Our eth0 is using a static IP 192.168.7.1
  • eth1 is connected to the physical network but has no IP of its own; instead it is part of the Vbr0 bridge
  • Vbr0 is a bridge that bridges eth1 and the vnet<n> interfaces created for the virtual machines
  • Vbr0 has two static IPs, 192.168.7.2, and an aliased IP (Vbr0:1) 192.168.254.254
  • The subnet of the VMs is 192.168.254.0/24
  • The gateway of the VMs is 192.168.254.254 (we are still evaluating whether it is a good idea to have this gateway, or whether to add a route on the VMs for the 192.168.7.0/24 subnet instead)

First, as we will have two network interfaces connected to the same subnet, we should enable the arp_filter to avoid the ARP flux issue (reference: http://linux-ip.net/html/ether-arp.html)
# sysctl -w net.ipv4.conf.all.arp_filter=1
# sysctl -w net.ipv4.conf.eth0.arp_filter=1
# sysctl -w net.ipv4.conf.Vbr0.arp_filter=1
# echo "net.ipv4.conf.all.arp_filter = 1" >> /etc/sysctl.conf
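One caveat with the echo line above: appending with >> adds a duplicate entry every time the command is repeated. The following is a minimal sketch of an idempotent alternative. The helper name ensure_line is hypothetical (it is not part of sysctl or iproute2), and the demo writes to a scratch file that stands in for /etc/sysctl.conf.

```shell
#!/bin/sh
# ensure_line FILE LINE: append LINE to FILE only if no identical line exists.
# Hypothetical helper, for illustration only.
ensure_line() {
    file=$1
    line=$2
    # -x: match the whole line, -F: fixed string (no regex), -q: quiet
    if ! grep -qxF -- "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Demo against a scratch file (assumption: stands in for /etc/sysctl.conf).
demo_conf=$(mktemp)
ensure_line "$demo_conf" "net.ipv4.conf.all.arp_filter = 1"
ensure_line "$demo_conf" "net.ipv4.conf.all.arp_filter = 1"   # second call is a no-op
demo_count=$(grep -c "arp_filter" "$demo_conf")
echo "lines written: $demo_count"
rm -f "$demo_conf"
```

On the real server the call would be ensure_line /etc/sysctl.conf "net.ipv4.conf.all.arp_filter = 1", run as root.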
Then, to separate the routing tables for the traffic of each interface, we defined two routing tables (note that the table IDs 253, 254 and 255 are reserved for the default, main and local tables):
# echo "1 eth0" >> /etc/iproute2/rt_tables
# echo "200 Vbr0" >> /etc/iproute2/rt_tables
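Before using a table name in ip route or ip rule, it can help to confirm that the name is really declared. The sketch below looks a name up in an rt_tables-style file. The helper name table_id is hypothetical, and the demo uses a scratch copy with the reserved entries plus the two tables above, not the real /etc/iproute2/rt_tables.

```shell
#!/bin/sh
# table_id FILE NAME: print the numeric ID that NAME maps to in an
# rt_tables-style file ("<id> <name>" per line, '#' starts a comment).
# Hypothetical helper; returns non-zero when the name is not declared.
table_id() {
    awk -v name="$2" '
        /^[[:space:]]*#/ { next }                  # skip comment lines
        $2 == name { print $1; found = 1; exit }   # first match wins
        END { exit found ? 0 : 1 }
    ' "$1"
}

# Demo against a scratch file (assumption: mirrors the layout of rt_tables).
demo_tables=$(mktemp)
cat > "$demo_tables" <<'EOF'
# reserved values
255 local
254 main
253 default
0 unspec
1 eth0
200 Vbr0
EOF
table_id "$demo_tables" Vbr0
table_id "$demo_tables" VMFarm || echo "VMFarm is not declared"
rm -f "$demo_tables"
```

The second lookup is there on purpose: the VMFarm table used later in the interfaces file needs its own entry, otherwise the ip commands that reference it by name fail.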
Now that the routing tables exist, we can add the proper routes and remove the obsolete ones:
# ip route add 192.168.7.0/24 dev eth0 src 192.168.7.1 table eth0
# ip route add 192.168.7.0/24 dev Vbr0 src 192.168.7.2 table Vbr0
# ip route add 192.168.254.0/24 dev Vbr0
# ip route del 192.168.7.0/24 dev eth0
# ip route del 192.168.7.0/24 dev Vbr0
A short explanation of why we removed the routes to the 192.168.7.0/24 subnet at the end: it avoids duplicates, since the rules shown below send the IP traffic through the respective table depending on its source. Because we do not specify which table the routes should be deleted from, the ip route command assumes the main table (the default routing tables are local, main and default).

Apart from that, the routes we created in the tables specify which source address to use, since the rules only send traffic coming from that IP to each table. Again, this is just a preventive configuration, but forcing it makes errors easier to detect while testing the configuration.

Now it is time for the rules. In our case they are easy to understand: we send the traffic to each table depending on its source:
# ip rule add from 192.168.7.1 table eth0
# ip rule add from 192.168.7.2 table Vbr0
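The same three steps repeat for each interface: a route in the dedicated table, a rule selecting that table by source, and the removal of the main-table route. A minimal dry-run sketch of that pattern follows. The function name source_route_cmds is hypothetical, and it only prints the commands so they can be reviewed before anything changes. Printing the rule before the deletion is a choice of this sketch, not the order used above.

```shell
#!/bin/sh
# source_route_cmds DEV ADDR SUBNET TABLE: print (do not run) the commands
# that bind traffic sourced from ADDR to its own routing table.
# Hypothetical helper, for illustration only.
source_route_cmds() {
    dev=$1
    addr=$2
    subnet=$3
    table=$4
    echo "ip route add $subnet dev $dev src $addr table $table"
    echo "ip rule add from $addr table $table"
    echo "ip route del $subnet dev $dev"
}

# Values taken from the context described in this article.
source_route_cmds eth0 192.168.7.1 192.168.7.0/24 eth0
source_route_cmds Vbr0 192.168.7.2 192.168.7.0/24 Vbr0
```

Once the printed commands look right, they can be run by hand as root.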
As all these changes were performed online, remember that the ARP cache and the route cache should be flushed on all the devices reaching this server, as their ARP tables were probably filled with incorrect MACs (both IPs pointing to the same MAC address). In Linux this can be done with the ip commands:
# ip neigh flush all
# ip route flush cache
The question at this stage (after testing everything and confirming that it actually works) is probably how to make these changes permanent. This can be achieved in many ways, but since our rules and routes are still simple and easy to understand, one recommended option is to use post-up in the network configuration to add our custom rules and routes. The following is our /etc/network/interfaces file. It also uses a third table, VMFarm, which has to be declared in /etc/iproute2/rt_tables like the other two. Note how many times we flush the ARP and route caches after adding new routes: this is just a preventive step, as in some cases the cache was stale and identifying the errors was really confusing:

# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).

# The loopback network interface
auto lo
iface lo inet loopback

# The primary network interface
auto eth0
iface eth0 inet static
    address 192.168.7.1
    netmask 255.255.255.0
    gateway 192.168.7.254
    #post-up ip route add default via 192.168.7.254 dev eth0 table eth0
    post-up ip route add 192.168.7.0/24 dev eth0 src 192.168.7.1 table eth0
    post-up ip rule add from 192.168.7.1 table eth0
    post-up ip route delete 192.168.7.0/24 dev eth0 proto kernel scope link src 192.168.7.1
    post-up ip route flush cache
    post-up ip neigh flush all

# bridge_maxwait 5
auto Vbr0
iface Vbr0 inet static
    address 192.168.7.2
    netmask 255.255.255.0
    network 192.168.7.0
    pre-up ip link set eth1 down
    pre-up brctl addbr Vbr0
    pre-up brctl addif Vbr0 eth1
    pre-up ip addr flush dev eth1
    pre-up brctl stp Vbr0 on
    pre-up brctl setmaxage Vbr0 12
    pre-up brctl sethello Vbr0 2
    pre-up brctl setfd Vbr0 9
    #post-up ip route add default via 192.168.7.254 dev Vbr0 table Vbr0
    post-up ip route add 192.168.7.0/24 dev Vbr0 src 192.168.7.2 table Vbr0
    post-up ip rule add from 192.168.7.2 table Vbr0
    post-up ip link set eth1 up
    post-up ip route delete 192.168.7.0/24 dev Vbr0 proto kernel scope link src 192.168.7.2
    post-up ip route add 192.168.7.0/24 dev Vbr0 table VMFarm
    post-up ip route flush cache
    post-up ip neigh flush all
    post-down ip link set eth1 down
    post-down ip link set Vbr0 down
    post-down brctl delif Vbr0 eth1
    post-down brctl delbr Vbr0

auto Vbr0:1
iface Vbr0:1 inet static
    address 192.168.254.254
    netmask 255.255.255.0
    network 192.168.254.0
    post-up ip route delete 192.168.254.0/24 dev Vbr0 proto kernel scope link src 192.168.254.254
    post-up ip route add 192.168.254.0/24 dev Vbr0
    post-up ip route flush cache
    post-up ip neigh flush all
    post-up ip rule add from 192.168.254.0/24 to 192.168.7.0/24 table VMFarm


I hope this was clear enough to let the reader start understanding the capabilities and usage of iproute2. This example is just a simple one; http://www.linux-ip.net is a good reference on network administration in Linux, with lots of documentation and more advanced examples.

Also, if you find any mistakes in this article or have improvements to suggest, please let me know so it can be improved for future readers.

EDIT: I thought it could also be useful for the reader if I shared the way I reasoned about the configuration before actually applying it on the server, as I consider the power of abstraction essential when resolving issues.

Basically I drew a small table with the rules and the routes in parallel, and then manually tested how the rules would act with IPs from different subnets and/or classes.

The next picture is a photo of the paper where I represented the rules, next to the ip routes and rules on the server, with everything matched by letters.
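The same paper exercise can be approximated in a few lines of shell. The sketch below models only rule selection (which table a packet is sent to), not the route lookup inside that table. The function names are hypothetical, and the selectors are copied from the rules in this article. Since the three selectors do not overlap, the order in which they are checked does not change the result.

```shell
#!/bin/sh
# ip_to_int A.B.C.D: print the IPv4 address as a 32-bit integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1            # split the address on dots into $1..$4
    IFS=$old_ifs
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# in_cidr ADDR NET/LEN: succeed when ADDR falls inside the prefix.
in_cidr() {
    addr=$(ip_to_int "$1")
    net=$(ip_to_int "${2%/*}")
    len=${2#*/}
    mask=$(( (4294967295 << (32 - len)) & 4294967295 ))
    [ $(( addr & mask )) -eq $(( net & mask )) ]
}

# pick_table SRC DST: print the table selected for a packet; "main" is the
# fallback when none of the custom rules match.
pick_table() {
    src=$1
    dst=$2
    if [ "$src" = 192.168.7.1 ]; then
        echo eth0
    elif [ "$src" = 192.168.7.2 ]; then
        echo Vbr0
    elif in_cidr "$src" 192.168.254.0/24 && in_cidr "$dst" 192.168.7.0/24; then
        echo VMFarm
    else
        echo main
    fi
}

pick_table 192.168.7.1 192.168.7.254      # prints eth0
pick_table 192.168.254.10 192.168.7.50    # prints VMFarm
pick_table 192.168.254.10 8.8.8.8         # prints main
```

On the server itself, ip rule show and ip route get remain the authoritative way to check what the kernel actually does.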


