{"id":111,"date":"2014-07-26T13:43:59","date_gmt":"2014-07-26T03:43:59","guid":{"rendered":"https:\/\/icicimov.com\/blog\/?p=111"},"modified":"2017-01-02T14:57:48","modified_gmt":"2017-01-02T03:57:48","slug":"vipeip-fail-over-with-keepalived-in-amazon-vpc-across-availability-zones","status":"publish","type":"post","link":"https:\/\/icicimov.com\/blog\/?p=111","title":{"rendered":"VIP(EIP) fail over with Keepalived in Amazon VPC across availability zones"},"content":{"rendered":"<p>This example covers VIP failover in an AWS VPC across AZs with Keepalived. The main problem in AWS is that the provider blocks <code>multicast<\/code> traffic inside VPCs. To work around this we need to switch to unicast for the Keepalived (VRRP) cluster communication. Another issue is the virtual environment itself, more specifically the VIP failover. In the virtual world it is not enough to move the VIP from one host to another; we also need to inform the hypervisor platform of the physical host (Xen, KVM etc.) about the change so that traffic is correctly routed to the new destination via its SDN (Software Defined Network).<\/p>\n<p>The first problem is solved with the <code>unicast_src_ip<\/code> and <code>unicast_peer<\/code> options, which tell Keepalived to use <code>unicast<\/code> for communication. This does the job but is a fairly limiting solution, since we need to specify the IPs of the nodes in the setup. For the second problem, VIP failover, where the VIP in the case of AWS is an <code>EIP<\/code> (Elastic IP), we need a <code>notify_master<\/code> script that implements this function via the AWS CLI utilities.<\/p>\n<h2>Nodes preparation<\/h2>\n<p>We have set up two nodes, one in each AZ (Availability Zone). The service using the VIP in this case is HAProxy. Each AZ has two internal networks, and each node is connected to both networks in its AZ. 
There are routing tables set up in the VPC so the appropriate subnets can see each other: one routing table for <code>10.18.16.0\/24<\/code> and <code>10.18.18.0\/24<\/code> and a separate one for <code>10.18.17.0\/24<\/code> and <code>10.18.19.0\/24<\/code>.<\/p>\n<h3>On host01<\/h3>\n<p>The primary network interface eth0 has the EIP <code>54.226.x.x<\/code> associated with it. The internal IPs for both interfaces on this server and the network config are given below:<\/p>\n<pre><code>user@host01:~$ ip addr show \n1: lo: &lt;LOOPBACK,UP,LOWER_UP&gt; mtu 16436 qdisc noqueue state UNKNOWN\n    link\/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00\n    inet 127.0.0.1\/8 scope host lo\n    inet6 ::1\/128 scope host\n       valid_lft forever preferred_lft forever\n2: eth0: &lt;BROADCAST,MULTICAST,UP,LOWER_UP&gt; mtu 1500 qdisc pfifo_fast state UP qlen 1000\n    link\/ether 06:12:0b:e6:d1:86 brd ff:ff:ff:ff:ff:ff\n    inet 10.18.16.11\/24 brd 10.18.16.255 scope global eth0\n    inet6 fe80::412:bff:fee6:d186\/64 scope link\n       valid_lft forever preferred_lft forever\n3: eth1: &lt;BROADCAST,MULTICAST,UP,LOWER_UP&gt; mtu 1500 qdisc pfifo_fast state UP qlen 1000\n    link\/ether 06:27:ca:b8:b1:98 brd ff:ff:ff:ff:ff:ff\n    inet 10.18.17.11\/24 brd 10.18.17.255 scope global eth1\n    inet6 fe80::427:caff:feb8:b198\/64 scope link\n       valid_lft forever preferred_lft forever\n\nuser@host01:~$ cat \/etc\/network\/interfaces\n# This file describes the network interfaces available on your system\n# and how to activate them. 
For more information, see interfaces(5).\n\n# The loopback network interface\nauto lo \niface lo inet loopback\n\n# The primary network interface\nauto eth0 \niface eth0 inet dhcp \n\n# The secondary network interface\nauto eth1 \niface eth1 inet static \naddress 10.18.17.11\nnetmask 255.255.255.0\n\npost-up ip route add 10.18.19.0\/24 dev eth1 via 10.18.17.1 || true \npost-down ip route del 10.18.19.0\/24 dev eth1 via 10.18.17.1 || true<\/code><\/pre>\n<h3>On host02<\/h3>\n<p>The primary network interface eth0 has the EIP <code>54.219.x.x<\/code> associated with it. The internal IPs for both interfaces on this server and the network config are given below:<\/p>\n<pre><code>user@host02:~$ ip addr show \n1: lo: &lt;LOOPBACK,UP,LOWER_UP&gt; mtu 16436 qdisc noqueue state UNKNOWN\n    link\/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00\n    inet 127.0.0.1\/8 scope host lo\n    inet6 ::1\/128 scope host\n       valid_lft forever preferred_lft forever\n2: eth0: &lt;BROADCAST,MULTICAST,UP,LOWER_UP&gt; mtu 1500 qdisc pfifo_fast state UP qlen 1000\n    link\/ether 02:46:47:62:60:0c brd ff:ff:ff:ff:ff:ff\n    inet 10.18.18.11\/24 brd 10.18.18.255 scope global eth0\n    inet6 fe80::46:47ff:fe62:600c\/64 scope link\n       valid_lft forever preferred_lft forever\n3: eth1: &lt;BROADCAST,MULTICAST,UP,LOWER_UP&gt; mtu 1500 qdisc pfifo_fast state UP qlen 1000\n    link\/ether 02:f3:3f:20:8d:cb brd ff:ff:ff:ff:ff:ff\n    inet 10.18.19.11\/24 brd 10.18.19.255 scope global eth1\n    inet6 fe80::f3:3fff:fe20:8dcb\/64 scope link\n       valid_lft forever preferred_lft forever\n\nuser@host02:~$ cat \/etc\/network\/interfaces\n#This file describes the network interfaces available on your system\n#and how to activate them. 
For more information, see interfaces(5)\n\n#The loopback network interface\nauto lo \niface lo inet loopback\n\n#The primary network interface\nauto eth0 \niface eth0 inet dhcp\n\n# The secondary network interface\nauto eth1 \niface eth1 inet static\naddress 10.18.19.11\nnetmask 255.255.255.0\n\npost-up ip route add 10.18.17.0\/24 dev eth1 via 10.18.19.1 || true \npost-down ip route del 10.18.17.0\/24 dev eth1 via 10.18.19.1 || true<\/code><\/pre>\n<h2>Keepalived cluster setup<\/h2>\n<p>There is a free EIP <code>54.192.x.x<\/code> available that will be used as the VIP for the load balancing. As in any other VIP scenario, this EIP floats between the servers depending on which one is in active or passive mode. For services in separate AZs in Amazon, however, moving the VIP between servers is not enough. Since the routing is done by AWS, the VIP (the EIP in this case) has to be moved via the Amazon API so the infrastructure actually knows how and where to route it on switchover. For this we need to write a custom script <code>\/etc\/keepalived\/vrrp.sh<\/code> that does the switchover the Amazon way.<\/p>\n<p>Basically, Keepalived monitors the local haproxy process, and in case of failure on the <code>MASTER<\/code> node the <code>BACKUP<\/code> node takes over the EIP and becomes MASTER itself. The <code>notify_master<\/code> script also re-assigns the old EIP back to the BACKUP server so it can still be accessed remotely. The value of <code>virtual_ipaddress<\/code> does not matter in this case.<\/p>\n<p>One very important setting is the priority of the VRRP instance, which needs to be higher on one of the nodes. If we have set the MASTER and BACKUP states, the MASTER needs to have the higher priority. The difference between the priorities of the two nodes also has to be lower than the weight of the health check script. With the values used below (priorities 102 and 101, weight 20), the passing check raises a healthy MASTER to 122 and a healthy BACKUP to 121; when haproxy dies on the MASTER its priority falls back to 102, below the BACKUP, so the BACKUP takes over. 
Getting this relationship right is essential for the failover decision.<\/p>\n<h3>On host01<\/h3>\n<p>We create a new <code>\/etc\/keepalived\/keepalived.conf<\/code> config file:<\/p>\n<pre><code>vrrp_script haproxy-check {\n    script \"killall -0 haproxy\"\n    interval 2\n    weight 20\n}\n \nvrrp_instance haproxy-vip {\n    state MASTER\n    priority 102\n    interface eth0\n    virtual_router_id 47\n    advert_int 3\n \n    unicast_src_ip 10.18.16.11\n    unicast_peer {\n        10.18.18.11\n    }\n \n    notify_master \"\/etc\/keepalived\/vrrp.sh 54.192.x.x start\"\n \n    virtual_ipaddress {\n        10.15.85.31\n    }\n \n    track_script {\n        haproxy-check weight 20\n    }\n}<\/code><\/pre>\n<p>The API script for EIP switchover, <code>\/etc\/keepalived\/vrrp.sh<\/code>:<\/p>\n<pre><code>#!\/bin\/bash\n[ -f \/etc\/keepalived\/aws.conf ] && . \/etc\/keepalived\/aws.conf\n. \/lib\/lsb\/init-functions\n \nENI_ID=\"eni-2dxxxxxx\"\nALOC_ID=\"eipalloc-39xxxxxx\"\nINST_ID=\"i-acxxxxxx\"\nELASTIC_IP=$1\n \ncase $2 in\n    start)\n        ec2-associate-address -n $ENI_ID -a $ALOC_ID --allow-reassociation\n        echo started\n        ;;\n    stop)\n        #ec2-disassociate-address $ELASTIC_IP\n        echo stopped\n        ;;\n    status)\n        ec2-describe-addresses | grep \"$ELASTIC_IP\" | grep \"$INST_ID\" > \/dev\/null\n        [ $? 
-eq 0 ] && echo OK || echo FAIL\n        ;;\nesac<\/code><\/pre>\n<h3>On host02<\/h3>\n<p>We create a new <code>\/etc\/keepalived\/keepalived.conf<\/code> config file:<\/p>\n<pre><code>vrrp_script haproxy-check {\n    script \"killall -0 haproxy\"\n    interval 2\n    weight 20\n}\n \nvrrp_instance haproxy-vip {\n    state BACKUP\n    priority 101\n    interface eth0\n    virtual_router_id 47\n    advert_int 3\n \n    unicast_src_ip 10.18.18.11\n    unicast_peer {\n        10.18.16.11\n    }\n \n    notify_master \"\/etc\/keepalived\/vrrp.sh 54.192.x.x start\"\n \n    virtual_ipaddress {\n        10.15.85.31\n    }\n \n    track_script {\n        haproxy-check weight 20\n    }\n}<\/code><\/pre>\n<p>The API script for EIP switchover, <code>\/etc\/keepalived\/vrrp.sh<\/code>:<\/p>\n<pre><code>#!\/bin\/bash\n[ -f \/etc\/keepalived\/aws.conf ] && . \/etc\/keepalived\/aws.conf\n. \/lib\/lsb\/init-functions\n \nENI_ID=\"eni-eaxxxxxx\"\nALOC_ID=\"eipalloc-39xxxxxx\"\nINST_ID=\"i-72xxxxxx\"\nELASTIC_IP=$1\n \ncase $2 in\n    start)\n        ec2-associate-address -n $ENI_ID -a $ALOC_ID --allow-reassociation\n        echo started\n        ;;\n    stop)\n        #ec2-disassociate-address $ELASTIC_IP\n        echo stopped\n        ;;\n    status)\n        ec2-describe-addresses | grep \"$ELASTIC_IP\" | grep \"$INST_ID\" > \/dev\/null\n        [ $? -eq 0 ] && echo OK || echo FAIL\n        ;;\nesac<\/code><\/pre>\n<p>The <code>\/etc\/keepalived\/aws.conf<\/code> file holds the EC2 credentials for a user whose permissions are limited to associating and describing addresses.<\/p>\n<h2>References<\/h2>\n<p><a href=\"https:\/\/github.com\/acassen\/keepalived\/blob\/master\/doc\/keepalived.conf.SYNOPSIS\">Keepalived Github project<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>This example covers VIP failover in AWS VPC across AZ&#8217;s with Keepalived. The main problem in AWS is that this provider is blocking the multicast traffic in the VPC&#8217;s. 
To circumvent this we need to switch to unicast for the&#8230;<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[9],"tags":[],"class_list":["post-111","post","type-post","status-publish","format-standard","hentry","category-high-availability"],"_links":{"self":[{"href":"https:\/\/icicimov.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/111","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/icicimov.com\/blog\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/icicimov.com\/blog\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/icicimov.com\/blog\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/icicimov.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=111"}],"version-history":[{"count":19,"href":"https:\/\/icicimov.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/111\/revisions"}],"predecessor-version":[{"id":159,"href":"https:\/\/icicimov.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/111\/revisions\/159"}],"wp:attachment":[{"href":"https:\/\/icicimov.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=111"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/icicimov.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=111"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/icicimov.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=111"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}