{"id":269,"date":"2016-09-16T11:59:14","date_gmt":"2016-09-16T01:59:14","guid":{"rendered":"https:\/\/icicimov.com\/blog\/?p=269"},"modified":"2017-03-03T16:00:36","modified_gmt":"2017-03-03T05:00:36","slug":"proxmox-clustering-and-nested-virtualization","status":"publish","type":"post","link":"https:\/\/icicimov.com\/blog\/?p=269","title":{"rendered":"Proxmox clustering and nested virtualization"},"content":{"rendered":"<p>The motivation for creating this setup is the possibility of having Encompass private virtualization cloud deployed in any third party infrastructure provider DC, like for example <a href=\"http:\/\/www.softlayer.com\/\">SoftLayer<\/a> that we already use to host our product on Bare-Metal serves. The solution is based on Proxmox PVE (Proxmox Virtualization Environment) version 4.1.x (upgraded to 4.2 later) with KVM as hypervisor. As stated on their website, <code>Proxmox VE<\/code> is a powerful and lightweight open source server virtualization software, optimized for performance and usability. For maximum flexibility, Proxmox VE supports two virtualization technologies &#8211; Kernel-based Virtual Machine (KVM) and container-based virtualization with Linux Containers (LXC).<\/p>\n<p>The HA on hypervisor level is being provided with the PVE&#8217;s built in clustering feature. It also provides live migration for the VM&#8217;s (not supported for LXC containers) when created with root disk on shared storage which means we can move VM&#8217;s from one node to another without any downtime for the running instances. 
The cluster will also automatically migrate the VM&#8217;s from a node that has crashed or has been put into maintenance mode.<\/p>\n<p>To name some of the key PVE 4 features:<\/p>\n<ul>\n<li>Based on Debian 8 &#8211; 64 bit<\/li>\n<li>Broad hardware support<\/li>\n<li>Support for Linux and Windows guests, 32 and 64 bit<\/li>\n<li>Support for the latest Intel and AMD chipsets<\/li>\n<li>Bare-metal virtualization optimized to support high workloads<\/li>\n<li>Web management with all the features necessary to create and manage a virtual infrastructure<\/li>\n<li>Management through the web interface without needing any client software<\/li>\n<li>Combination of two virtualization technologies, KVM and LXC<\/li>\n<li>Clustering for HA<\/li>\n<\/ul>\n<h2>Host Preparation<\/h2>\n<p>The setup has been fully tested on a Proxmox PVE cluster of two Proxmox instances launched on our office Virtualization Server running Proxmox-3.1 and kernel 3.10:<\/p>\n<pre><code>root@virtual:~# pveversion -v\nproxmox-ve-2.6.32: 3.1-114 (running kernel: 3.10.0-18-pve)\npve-manager: 3.1-21 (running version: 3.1-21\/93bf03d4)\npve-kernel-2.6.32-20-pve: 2.6.32-100\npve-kernel-3.10.0-18-pve: 3.10.0-46\npve-kernel-2.6.32-26-pve: 2.6.32-114\nlvm2: 2.02.98-pve4\nclvm: 2.02.98-pve4\ncorosync-pve: 1.4.5-1\nopenais-pve: 1.1.4-3\nlibqb0: 0.11.1-2\nredhat-cluster-pve: 3.2.0-2\nresource-agents-pve: 3.9.2-4\nfence-agents-pve: 4.0.0-2\npve-cluster: 3.0-8\nqemu-server: 3.1-8\npve-firmware: 1.0-23\nlibpve-common-perl: 3.0-8\nlibpve-access-control: 3.0-7\nlibpve-storage-perl: 3.0-17\npve-libspice-server1: 0.12.4-2\nvncterm: 1.1-4\nvzctl: 4.0-1pve4\nvzprocps: 2.0.11-2\nvzquota: 3.1-2\npve-qemu-kvm: 1.4-17\nksm-control-daemon: 1.1-1\nglusterfs-client: 3.4.1-1\n<\/code><\/pre>\n<p>This server has one physical CPU with 6 cores, which due to hyper-threading is presented as 12 CPU&#8217;s in KVM and other hypervisors.<\/p>\n<pre><code>root@virtual:~# egrep -c '(vmx|svm)' 
\/proc\/cpuinfo\n12\n<\/code><\/pre>\n<p>We can see all 12 CPU cores are virtualization enabled, supporting the Intel VMX extension in this case.<\/p>\n<p>We need the 3.10.x PVE kernel since it has the nested virtualization feature, which is not available in the current PVE 2.6.32 kernel:<\/p>\n<pre><code>root@virtual:~# modinfo kvm_intel\nfilename:       \/lib\/modules\/2.6.32-26-pve\/kernel\/arch\/x86\/kvm\/kvm-intel.ko\nlicense:        GPL\nauthor:         Qumranet\nsrcversion:     672265D1CCD374958DD573E\ndepends:        kvm\nvermagic:       2.6.32-26-pve SMP mod_unload modversions\nparm:           bypass_guest_pf:bool\nparm:           vpid:bool\nparm:           flexpriority:bool\nparm:           ept:bool\nparm:           unrestricted_guest:bool\nparm:           eptad:bool\nparm:           emulate_invalid_guest_state:bool\nparm:           yield_on_hlt:bool\nparm:           vmm_exclusive:bool\nparm:           ple_gap:int\nparm:           ple_window:int\n<\/code><\/pre>\n<p>I did an upgrade using the kernel from the PVE test repository:<\/p>\n<pre><code>root@virtual:~# wget -q http:\/\/download.proxmox.com\/debian\/dists\/wheezy\/pvetest\/binary-amd64\/pve-kernel-3.10.0-18-pve_3.10.0-46_amd64.deb\nroot@virtual:~# dpkg -i pve-kernel-3.10.0-18-pve_3.10.0-46_amd64.deb\n<\/code><\/pre>\n<p>Then, to avoid some reported issues with this kernel causing panics and failed server start-ups on systems with SCSI drives (failed SCSI bus scanning), we change the default kernel command line in <code>\/etc\/default\/grub<\/code> from:<\/p>\n<pre><code>GRUB_CMDLINE_LINUX_DEFAULT=\"quiet\"\n<\/code><\/pre>\n<p>to<\/p>\n<pre><code>GRUB_CMDLINE_LINUX_DEFAULT=\"quiet scsi_mod.scan=sync rootdelay=10\"\n<\/code><\/pre>\n<p>and reboot the server to activate the new kernel. 
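Editing <code>\/etc\/default\/grub<\/code> alone is not enough; the grub configuration has to be regenerated (with <code>update-grub<\/code> on Debian) before the reboot for the new command line to reach <code>\/boot\/grub\/grub.cfg<\/code>. A minimal sketch of the change, run here against a scratch copy of the file so nothing on a live system is touched:

```shell
# Stand-in for /etc/default/grub in this sketch; on the real host
# edit the file in place instead.
echo 'GRUB_CMDLINE_LINUX_DEFAULT="quiet"' > /tmp/grub.example

# Append the SCSI workaround flags to the default kernel command line.
sed -i 's/^\(GRUB_CMDLINE_LINUX_DEFAULT="quiet\)"/\1 scsi_mod.scan=sync rootdelay=10"/' /tmp/grub.example

cat /tmp/grub.example
# → GRUB_CMDLINE_LINUX_DEFAULT="quiet scsi_mod.scan=sync rootdelay=10"

# On the real host, regenerate the config and reboot afterwards:
# update-grub && reboot
```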
After it comes back, we can see our kernel module now has nested capability:<\/p>\n<pre><code>root@virtual:~# modinfo kvm_intel\nfilename:       \/lib\/modules\/3.10.0-18-pve\/kernel\/arch\/x86\/kvm\/kvm-intel.ko\nlicense:        GPL\nauthor:         Qumranet\nrhelversion:    7.2\nsrcversion:     9F7B2EB3976CBA6622D41D4\nalias:          x86cpu:vendor:*:family:*:model:*:feature:*0085*\ndepends:        kvm\nintree:         Y\nvermagic:       3.10.0-18-pve SMP mod_unload modversions\nparm:           vpid:bool\nparm:           flexpriority:bool\nparm:           ept:bool\nparm:           unrestricted_guest:bool\nparm:           eptad:bool\nparm:           emulate_invalid_guest_state:bool\nparm:           vmm_exclusive:bool\nparm:           fasteoi:bool\nparm:           enable_apicv:bool\nparm:           enable_shadow_vmcs:bool\nparm:           nested:bool\nparm:           pml:bool\nparm:           ple_gap:int\nparm:           ple_window:int\nparm:           ple_window_grow:int\nparm:           ple_window_shrink:int\nparm:           ple_window_max:int\n<\/code><\/pre>\n<p>Before we start any LXC container or VM on the Proxmox server we need to enable KVM <code>nested<\/code> virtualization so we can run containers and VM&#8217;s inside the nested hosts. We reload the kernel module:<\/p>\n<pre><code>root@virtual:~# modprobe -r -v kvm_intel\nroot@virtual:~# modprobe -v kvm_intel nested=1\n<\/code><\/pre>\n<p>To make the feature persist across reboots:<\/p>\n<pre><code>root@virtual:~# echo \"options kvm-intel nested=y\" &gt; \/etc\/modprobe.d\/kvm-intel.conf\n<\/code><\/pre>\n<p>We can now create our nested Proxmox instances and choose <code>host<\/code> for <code>CPU Type<\/code> so they can inherit the KVM features. 
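Whether the reload actually enabled nesting can be confirmed from sysfs via <code>\/sys\/module\/kvm_intel\/parameters\/nested<\/code>. The reported value differs across kernel versions (Y\/N on older kernels, 1\/0 on newer ones), so a small helper is handy; the <code>nested_enabled<\/code> function below is my own sketch, not a PVE tool:

```shell
# nested_enabled: succeed if the given kvm_intel "nested" parameter
# value means nesting is on ("Y" on older kernels, "1" on newer ones).
nested_enabled() {
  case "$1" in
    Y|y|1) return 0 ;;
    *)     return 1 ;;
  esac
}

# On the host the value comes from sysfs:
# nested_enabled "$(cat /sys/module/kvm_intel/parameters/nested)" && echo "nested OK"
nested_enabled Y && echo "nested OK"
# → nested OK
```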
Then we can see on the nested Proxmox hosts after startup:<\/p>\n<pre><code>root@proxmox01:~# egrep -c '(vmx|svm)' \/proc\/cpuinfo\n12\n<\/code><\/pre>\n<p>the CPU virtualization features of the host are passed on to the launched instances.<\/p>\n<p>A note for the case when we want to run nested Proxmox instances under <code>libvirt<\/code>: the cpu mode needs to be set to <code>host-passthrough<\/code> by editing the domain&#8217;s XML file:<\/p>\n<pre><code>&lt;cpu mode='host-passthrough'&gt;\n<\/code><\/pre>\n<p>otherwise the nested virtualization will not work. Selecting the <code>Copy Cpu Configuration<\/code> option in VirtManager sets the cpu mode to <code>host-model<\/code>, which does not enable this feature although the name suggests it should.<\/p>\n<p>The Host is already part of our office network <code>192.168.0.0\/24<\/code>, which is presented to the running instances as an external bridged network for internet access. I have created two additional isolated virtual bridge networks for clustering purposes, <code>10.10.1.0\/24<\/code> and <code>10.20.1.0\/24<\/code>; the relevant setup in <code>\/etc\/network\/interfaces<\/code>:<\/p>\n<pre><code># Create private network bridge with DHCP server\nauto vmbr1\niface vmbr1 inet static\n  address 10.10.1.1\n  netmask 255.255.255.0\n  bridge_ports vmbr1tap0\n  bridge_waitport 0\n  bridge_fd 0\n  bridge_stp off\n  pre-up \/usr\/sbin\/tunctl -t vmbr1tap0\n  pre-up \/sbin\/ifconfig vmbr1tap0 up\n  post-down \/sbin\/ifconfig vmbr1tap0 down\n  post-up dnsmasq -u root --strict-order --bind-interfaces \\\n  --pid-file=\/var\/run\/vmbr1.pid --conf-file= \\\n  --except-interface lo --listen-address 10.10.1.1 \\\n  --dhcp-range 10.10.1.10,10.10.1.20 \\\n  --dhcp-leasefile=\/var\/run\/vmbr1.leases\n\n# Create private network bridge with DHCP server\nauto vmbr2\niface vmbr2 inet static\n  address 10.20.1.1\n  netmask 255.255.255.0\n  bridge_ports vmbr2tap0\n  bridge_waitport 0\n  bridge_fd 0\n  bridge_stp off\n  pre-up 
\/usr\/sbin\/tunctl -t vmbr2tap0\n  pre-up \/sbin\/ifconfig vmbr2tap0 up\n  post-down \/sbin\/ifconfig vmbr2tap0 down\n  post-up dnsmasq -u root --strict-order --bind-interfaces \\\n  --pid-file=\/var\/run\/vmbr2.pid --conf-file= \\\n  --except-interface lo --listen-address 10.20.1.1 \\\n  --dhcp-range 10.20.1.10,10.20.1.20 \\\n  --dhcp-leasefile=\/var\/run\/vmbr2.leases\n<\/code><\/pre>\n<p>For the bridges to be recognized in Proxmox they need to be named <code>vmbrX<\/code>, where X is a digit. A DHCP service is provided on both private networks via dnsmasq. The resulting network configuration is as follows:<\/p>\n<pre>\nroot@virtual:~# ip addr show vmbr1\n6: vmbr1: &lt;BROADCAST,MULTICAST,UP,LOWER_UP&gt; mtu 1500 qdisc noqueue state UP\n    link\/ether 32:35:2d:99:67:b5 brd ff:ff:ff:ff:ff:ff\n    inet 10.10.1.1\/24 brd 10.10.1.255 scope global vmbr1\n       valid_lft forever preferred_lft forever\n    inet6 fe80::3035:2dff:fe99:67b5\/64 scope link\n       valid_lft forever preferred_lft forever\n \nroot@virtual:~# ip addr show vmbr2\n8: vmbr2: &lt;BROADCAST,MULTICAST,UP,LOWER_UP&gt; mtu 1500 qdisc noqueue state UP\n    link\/ether 72:a4:b7:53:a2:c5 brd ff:ff:ff:ff:ff:ff\n    inet 10.20.1.1\/24 brd 10.20.1.255 scope global vmbr2\n       valid_lft forever preferred_lft forever\n    inet6 fe80::70a4:b7ff:fe53:a2c5\/64 scope link\n       valid_lft forever preferred_lft forever\n<\/pre>\n<p>Actually, one more bridge has been configured:<\/p>\n<pre>\nroot@virtual:~# ip addr show vmbr3\n38: vmbr3: &lt;NO-CARRIER,BROADCAST,MULTICAST,UP&gt; mtu 1500 qdisc noqueue state DOWN\n    link\/ether b6:35:b1:ca:8e:74 brd ff:ff:ff:ff:ff:ff\n    inet 10.30.1.1\/24 brd 10.30.1.255 scope global vmbr3\n       valid_lft forever preferred_lft forever\n<\/pre>\n<p>but we will come back to this one later.<\/p>\n<h2>Proxmox-4.1 Cluster Setup<\/h2>\n<p>On both nested Proxmox instances I have installed bare-metal PVE-4.1 from the ISO image. 
It is based on Debian-8 (Jessie) and comes with the 4.2.6 kernel:<\/p>\n<pre><code>root@proxmox01:~# pveversion -v\nproxmox-ve: 4.1-26 (running kernel: 4.2.6-1-pve)\npve-manager: 4.1-1 (running version: 4.1-1\/2f9650d4)\npve-kernel-4.2.6-1-pve: 4.2.6-26\nlvm2: 2.02.116-pve2\ncorosync-pve: 2.3.5-2\nlibqb0: 0.17.2-1\npve-cluster: 4.0-29\nqemu-server: 4.0-41\npve-firmware: 1.1-7\nlibpve-common-perl: 4.0-41\nlibpve-access-control: 4.0-10\nlibpve-storage-perl: 4.0-38\npve-libspice-server1: 0.12.5-2\nvncterm: 1.2-1\npve-qemu-kvm: 2.4-17\npve-container: 1.0-32\npve-firewall: 2.0-14\npve-ha-manager: 1.0-14\nksm-control-daemon: 1.2-1\nglusterfs-client: 3.5.2-2+deb8u1\nlxc-pve: 1.1.5-5\nlxcfs: 0.13-pve1\ncgmanager: 0.39-pve1\ncriu: 1.6.0-1\nzfsutils: 0.6.5-pve6~jessie\nopenvswitch-switch: 2.3.0+git20140819-3\n<\/code><\/pre>\n<p>To keep PVE up to date we need to enable the no-subscription repository:<\/p>\n<pre><code>$ echo 'deb http:\/\/download.proxmox.com\/debian jessie pve-no-subscription' | tee \/etc\/apt\/sources.list.d\/pve-no-subscription.list\n<\/code><\/pre>\n<p>To upgrade everything to the latest PVE, which is 4.2 since 27\/04\/2016:<\/p>\n<pre><code># apt-get -y update &amp;&amp; apt-get -y upgrade &amp;&amp; apt-get -y dist-upgrade\n# reboot\n<\/code><\/pre>\n<p>The instances are attached to both private networks created on the Host as described in the previous section:<\/p>\n<pre><code>root@proxmox01:~# ifconfig\neth0      Link encap:Ethernet  HWaddr c2:04:26:bd:ae:23 \n          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1\n          RX packets:15984 errors:0 dropped:0 overruns:0 frame:0\n          TX packets:3967 errors:0 dropped:0 overruns:0 carrier:0\n          collisions:0 txqueuelen:1000\n          RX bytes:1539185 (1.4 MiB)  TX bytes:732810 (715.6 KiB)\n\neth1      Link encap:Ethernet  HWaddr 06:97:e4:a3:7b:be \n          inet addr:10.10.1.185  Bcast:10.10.1.255  Mask:255.255.255.0\n          inet6 addr: fe80::497:e4ff:fea3:7bbe\/64 
Scope:Link\n          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1\n          RX packets:1932 errors:0 dropped:0 overruns:0 frame:0\n          TX packets:2094 errors:0 dropped:0 overruns:0 carrier:0\n          collisions:0 txqueuelen:1000\n          RX bytes:372103 (363.3 KiB)  TX bytes:447392 (436.9 KiB)\n\neth2      Link encap:Ethernet  HWaddr e2:55:6a:54:23:63 \n          inet addr:10.20.1.185  Bcast:10.20.1.255  Mask:255.255.255.0\n          inet6 addr: fe80::e055:6aff:fe54:2363\/64 Scope:Link\n          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1\n          RX packets:572 errors:0 dropped:0 overruns:0 frame:0\n          TX packets:13 errors:0 dropped:0 overruns:0 carrier:0\n          collisions:0 txqueuelen:1000\n          RX bytes:44180 (43.1 KiB)  TX bytes:1062 (1.0 KiB)\n\nvmbr0     Link encap:Ethernet  HWaddr c2:04:26:bd:ae:23 \n          inet addr:192.168.0.185  Bcast:192.168.0.255  Mask:255.255.255.0\n          inet6 addr: fe80::c004:26ff:febd:ae23\/64 Scope:Link\n          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1\n          RX packets:15848 errors:0 dropped:0 overruns:0 frame:0\n          TX packets:3968 errors:0 dropped:0 overruns:0 carrier:0\n          collisions:0 txqueuelen:0\n          RX bytes:1305985 (1.2 MiB)  TX bytes:732928 (715.7 KiB)\n\n\nroot@proxmox02:~# ifconfig\neth0      Link encap:Ethernet  HWaddr 1a:dc:cf:9c:40:f5 \n          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1\n          RX packets:15415 errors:0 dropped:0 overruns:0 frame:0\n          TX packets:2639 errors:0 dropped:0 overruns:0 carrier:0\n          collisions:0 txqueuelen:1000\n          RX bytes:1515062 (1.4 MiB)  TX bytes:551935 (538.9 KiB)\n\neth1      Link encap:Ethernet  HWaddr 7a:ff:59:17:9d:94 \n          inet addr:10.10.1.186  Bcast:10.10.1.255  Mask:255.255.255.0\n          inet6 addr: fe80::78ff:59ff:fe17:9d94\/64 Scope:Link\n          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1\n          RX packets:2628 errors:0 dropped:0 
overruns:0 frame:0\n          TX packets:1897 errors:0 dropped:0 overruns:0 carrier:0\n          collisions:0 txqueuelen:1000\n          RX bytes:489533 (478.0 KiB)  TX bytes:394310 (385.0 KiB)\n\neth2      Link encap:Ethernet  HWaddr 3e:a1:05:95:4f:6e \n          inet addr:10.20.1.186  Bcast:10.20.1.255  Mask:255.255.255.0\n          inet6 addr: fe80::3ca1:5ff:fe95:4f6e\/64 Scope:Link\n          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1\n          RX packets:907 errors:0 dropped:0 overruns:0 frame:0\n          TX packets:13 errors:0 dropped:0 overruns:0 carrier:0\n          collisions:0 txqueuelen:1000\n          RX bytes:72827 (71.1 KiB)  TX bytes:1062 (1.0 KiB)\n\nvmbr0     Link encap:Ethernet  HWaddr 1a:dc:cf:9c:40:f5 \n          inet addr:192.168.0.186  Bcast:192.168.0.255  Mask:255.255.255.0\n          inet6 addr: fe80::18dc:cfff:fe9c:40f5\/64 Scope:Link\n          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1\n          RX packets:15262 errors:0 dropped:0 overruns:0 frame:0\n          TX packets:2625 errors:0 dropped:0 overruns:0 carrier:0\n          collisions:0 txqueuelen:0\n          RX bytes:1288482 (1.2 MiB)  TX bytes:530571 (518.1 KiB)\n<\/code><\/pre>\n<p>PVE takes over the primary interface <code>eth0<\/code> and moves its setup to the <code>vmbr0<\/code> Linux bridge. We start by creating the cluster, i.e. the <code>Corosync<\/code> configuration, which PVE uses for its cluster messaging. I&#8217;m opting here for a dual Corosync ring in passive mode:<\/p>\n<pre><code>root@proxmox01:~# pvecm create proxmox -bindnet0_addr 192.168.0.185 -ring0_addr 192.168.0.185 -bindnet1_addr 10.10.1.185 -ring1_addr 10.10.1.185 -rrp_mode passive\n<\/code><\/pre>\n<p>We run this on one node only. 
Then we add the second one, proxmox02, to the cluster:<\/p>\n<pre><code>root@proxmox02:~# pvecm add 192.168.0.185 -ring0_addr 192.168.0.186 -ring1_addr 10.10.1.186\nThe authenticity of host '192.168.0.185 (192.168.0.185)' can't be established.\nECDSA key fingerprint is 43:8d:e3:79:70:88:0f:c8:e3:26:73:f8:c3:67:43:ef.\nAre you sure you want to continue connecting (yes\/no)? yes\nroot@192.168.0.185's password:\ncopy corosync auth key\nstopping pve-cluster service\nbackup old database\nwaiting for quorum...OK\ngenerating node certificates\nmerge known_hosts file\nrestart services\nsuccessfully added node 'proxmox02' to cluster.\nroot@proxmox02:~#\n<\/code><\/pre>\n<p>This will also add each node&#8217;s root user SSH key to the other&#8217;s <code>authorized_keys<\/code> file. If we now check the cluster state:<\/p>\n<pre><code>root@proxmox02:~# pvecm status\nQuorum information\n------------------\nDate:             Fri Mar  4 16:15:27 2016\nQuorum provider:  corosync_votequorum\nNodes:            2\nNode ID:          0x00000002\nRing ID:          8\nQuorate:          Yes\n\nVotequorum information\n----------------------\nExpected votes:   2\nHighest expected: 2\nTotal votes:      2\nQuorum:           2 \nFlags:            Quorate\n\nMembership information\n----------------------\n    Nodeid      Votes Name\n0x00000001          1 192.168.0.185\n0x00000002          1 192.168.0.186 (local)\n<\/code><\/pre>\n<p>we can see the cluster is quorate. We can also see the Corosync process running on both nodes and its configuration created by PVE:<\/p>\n<pre><code>root@proxmox02:~# cat \/etc\/pve\/corosync.conf\nlogging {\n  debug: off\n  to_syslog: yes\n}\nnodelist {\n  node {\n    name: proxmox02\n    nodeid: 2\n    quorum_votes: 1\n    ring0_addr: 192.168.0.186\n    ring1_addr: 10.10.1.186\n  }\n  node {\n    name: proxmox01\n    nodeid: 1\n    quorum_votes: 1\n    ring0_addr: 192.168.0.185\n    ring1_addr: 10.10.1.185\n  }\n}\nquorum {\n  provider: corosync_votequorum\n}\ntotem {\n  cluster_name: proxmox\n  config_version: 2\n  ip_version: 
ipv4\n  rrp_mode: passive\n  secauth: on\n  version: 2\n  interface {\n    bindnetaddr: 192.168.0.185\n    ringnumber: 0\n  }\n  interface {\n    bindnetaddr: 10.10.1.185\n    ringnumber: 1\n  }\n}\n<\/code><\/pre>\n<p>The PVE cluster uses a <code>Watchdog<\/code> for fencing. If no hardware watchdog is configured on the nodes it will use the Linux <code>softdog<\/code> by default, which makes it possible to run the solution inside VM&#8217;s as well as on real hardware.<\/p>\n<p>After installation the Proxmox GUI will be available on any of the cluster servers, so <code>https:\/\/192.168.0.185:8006<\/code> and <code>https:\/\/192.168.0.186:8006<\/code> will both work. After logging in we can see both nodes added to the <code>Datacenter<\/code>. When we click on the <code>HA<\/code> tab we will see <code>quorum ok<\/code> status.<\/p>\n<p>If we make manual changes to the <code>\/etc\/pve\/corosync.conf<\/code> file (there is a special procedure described on the PVE website for this) we will need to restart all cluster services:<\/p>\n<pre><code># systemctl restart corosync.service\n# systemctl restart pve-cluster.service\n# systemctl restart pvedaemon.service\n# systemctl restart pveproxy.service\n<\/code><\/pre>\n<p>but a reboot is cleaner and recommended.<\/p>\n<p>It is very important to note that the PVE configuration lives in <code>\/etc\/pve<\/code>, which is a FUSE-mounted file system in user space:<\/p>\n<pre><code>root@proxmox02:~# mount | grep etc\n\/dev\/fuse on \/etc\/pve type fuse (rw,nosuid,nodev,relatime,user_id=0,group_id=0,default_permissions,allow_other)\n<\/code><\/pre>\n<p>and becomes read-only when the cluster loses quorum. One way to regain write access and edit files under <code>\/etc\/pve<\/code> is to disable the cluster by setting:<\/p>\n<pre><code>DAEMON_OPTS=\"-l\"\n<\/code><\/pre>\n<p>in the <code>\/etc\/default\/pve-cluster<\/code> file and reboot the node so it comes up in local mode. 
Then edit and make the changes, remove the <code>-l<\/code> option again and reboot.<\/p>\n<p>If we need to change <code>\/etc\/pve\/corosync.conf<\/code> on a node with no quorum, we can run:<\/p>\n<pre><code># pvecm expected 1\n<\/code><\/pre>\n<p>to set the expected vote count to 1. This makes the cluster quorate and you can fix your config, or revert it to the backup. If that wasn&#8217;t enough (e.g. corosync is dead) use:<\/p>\n<pre><code># systemctl stop pve-cluster\n# pmxcfs -l\n<\/code><\/pre>\n<p>to start the <code>pmxcfs<\/code> (proxmox cluster file system) in local mode. We now have write access, so we need to be very careful with changes! After restarting, the file system should merge the changes, provided there is no big merge conflict that could result in a split-brain.<\/p>\n<p>Finally, we install some software we are going to need later:<\/p>\n<pre><code># apt-get install -y uml-utils openvswitch-switch dnsmasq\n<\/code><\/pre>\n<h2>Adding VM or LXC container as HA instance<\/h2>\n<p>As mentioned before, PVE supports High Availability for the launched instances. 
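Before registering a service with the HA stack it is worth confirming the cluster is quorate, since the HA manager will not start services without quorum. A tiny sketch that checks the <code>pvecm status<\/code> output shown earlier (the <code>is_quorate<\/code> helper is my own, not a PVE tool):

```shell
# is_quorate: succeed if the given `pvecm status` output reports "Quorate: Yes".
is_quorate() {
  printf '%s\n' "$1" | grep -q '^Quorate:[[:space:]]*Yes'
}

# Typical usage on a cluster node would be:
# is_quorate "$(pvecm status)" && echo "quorum OK"
is_quorate 'Quorate:          Yes' && echo "quorum OK"
# → quorum OK
```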
This is fairly simple using the CLI tools; the following procedure shows how to add a VM to the HA manager:<\/p>\n<pre><code>root@proxmox01:~# qm list\n      VMID NAME                 STATUS     MEM(MB)    BOOTDISK(GB) PID      \n       102 vm01                 running    1024               8.00 7995\n\nroot@proxmox01:~# ha-manager add vm:102\n\nroot@proxmox01:~# ha-manager status\nquorum OK\nmaster proxmox01 (active, Fri Apr 29 14:46:26 2016)\nlrm proxmox01 (active, Fri Apr 29 14:46:20 2016)\nlrm proxmox02 (active, Fri Apr 29 14:46:28 2016)\nservice ct:100 (proxmox01, started)\nservice ct:101 (proxmox02, started)\nservice ct:103 (proxmox01, started)\nservice vm:102 (proxmox01, started)\n<\/code><\/pre>\n<p>The same operation can be performed via the GUI too.<\/p>\n<h2>Note on VM Templates<\/h2>\n<p>It is common practice to create a Template from a base image VM that we can then use to launch new VM&#8217;s of the same type, let&#8217;s say Ubuntu-14.04, quickly and easily. After creating a template from a VM, and before launching any instances from it, it is best to remove the CD drive we used to mount the installation <code>iso<\/code> media and then, under the template&#8217;s <code>Options<\/code> tab, set <code>KVM hardware virtualization<\/code> to <code>no<\/code> (we don&#8217;t need this in VM&#8217;s launched inside an already nested VM) and set <code>Qemu Agent<\/code> to <code>yes<\/code>, which is used to freeze the guest file system when making a backup (assuming the <code>qemu-guest-agent<\/code> package is installed).<\/p>\n","protected":false},"excerpt":{"rendered":"<p>The motivation for creating this setup is the possibility of deploying the Encompass private virtualization cloud in any third-party infrastructure provider DC, such as SoftLayer, which we already use to host our product on Bare-Metal servers. 
The solution&#8230;<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[17,22,13],"tags":[24,23],"class_list":["post-269","post","type-post","status-publish","format-standard","hentry","category-cluster","category-kvm","category-virtualization","tag-kvm","tag-proxmox"],"_links":{"self":[{"href":"https:\/\/icicimov.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/269","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/icicimov.com\/blog\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/icicimov.com\/blog\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/icicimov.com\/blog\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/icicimov.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=269"}],"version-history":[{"count":6,"href":"https:\/\/icicimov.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/269\/revisions"}],"predecessor-version":[{"id":432,"href":"https:\/\/icicimov.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/269\/revisions\/432"}],"wp:attachment":[{"href":"https:\/\/icicimov.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=269"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/icicimov.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=269"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/icicimov.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=269"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}