{"id":242,"date":"2016-03-01T13:59:32","date_gmt":"2016-03-01T02:59:32","guid":{"rendered":"https:\/\/icicimov.com\/blog\/?p=242"},"modified":"2017-01-02T18:24:46","modified_gmt":"2017-01-02T07:24:46","slug":"242","status":"publish","type":"post","link":"https:\/\/icicimov.com\/blog\/?p=242","title":{"rendered":"Highly Available iSCSI Storage with SCST, Pacemaker, DRBD and OCFS2 &#8211; Part1"},"content":{"rendered":"<p><div class=\"fx-toc fx-toc-id-242\"><h2 class=\"fx-toc-title\">Table of contents<\/h2><ul class='fx-toc-list level-1'>\n\t<li>\n\t\t<a href=\"https:\/\/icicimov.com\/blog\/?p=242#iscsi-target-servers-setup\">iSCSI Target Servers Setup<\/a>\n\t\t<ul class='toc-even level-2'>\n\t\t\t<li>\n\t\t\t\t<a href=\"https:\/\/icicimov.com\/blog\/?p=242#scst\">SCST<\/a>\n\t\t\t<\/li>\n\t\t\t<li>\n\t\t\t\t<a href=\"https:\/\/icicimov.com\/blog\/?p=242#drbd\">DRBD<\/a>\n\t\t\t<\/li>\n\t\t\t<li>\n\t\t\t\t<a href=\"https:\/\/icicimov.com\/blog\/?p=242#lvm\">LVM<\/a>\n\t\t\t<\/li>\n\t\t\t<li>\n\t\t\t\t<a href=\"https:\/\/icicimov.com\/blog\/?p=242#corosync-and-pacemaker\">Corosync and Pacemaker<\/a>\n\t\t\t<\/li>\n<\/ul>\n<\/ul>\n<\/div>\n<br \/>\n<a href=\"http:\/\/scst.sourceforge.net\/\">SCST<\/a>, the generic SCSI target subsystem for Linux, allows creation of sophisticated storage devices from any Linux box. Those devices can provide advanced functionality, like replication, thin provisioning, deduplication, high availability, automatic backup, etc. SCST devices can use any link which supports SCSI-style data exchange: iSCSI, Fibre Channel, FCoE, SAS, InfiniBand (SRP), Wide (parallel) SCSI, etc.<\/p>\n<p>What we are going to set up is shown in the ASCII chart below. The storage stack will provide <code>iSCSI<\/code> service to the clients by exporting <code>LUNs<\/code> via a single target. The clients (iSCSI initiators) will access the LUNs via multipathing for high-availability and keep the files in sync via the <code>OCFS2<\/code> clustered file system. 
On the server side, DRBD is tasked to sync the block storage on both servers and provide fail-over capacity. Both clusters will be managed by <code>Pacemaker<\/code> cluster stack.<\/p>\n<pre>\n                                                                        192.168.0.0\/24            +----------+\n--------------------------------------------------------------------------------------------------|  router  |-----------\n          drbd01                                 drbd02                                |          +----------+\n+----------+  +----------+             +----------+  +----------+                      |\n|  Service |  |  Service |             |  Service |  |  Service |                      |\n+----------+  +----------+             +----------+  +----------+                      |\n     ||            ||                       ||            ||                           |\n+------------------------+             +------------------------+                      |\n|         ocfs2          |< ~~~~~~~~~~>|          ocfs2         |                      |\n+------------------------+             +------------------------+                      |\n|       multipath        |             |       multipath        |                      |\n+------------------------+             +------------------------+                      |\n|        sdc,sdd         |             |        sdc,sdd         |                      |\n+------------------------+             +------------------------+                      |\n|         iscsi          |             |         iscsi          |                      |\n+------------------------+             +------------------------+                      |\n  |   |   |                               |   |   |                  10.10.1.0\/24      |\n----------+---------------------------------------+-------------------------------     |\n  |   |                                   |   |                      10.20.1.0\/24      
|\n------+--------------+------------------------+--------------+--------------------     |\n  |                  |                    |                  |                         |\n--+-------------+-------------------------+-------------+-------------------------------\n                |    |                                  |    |\n+------------------------+             +------------------------+\n|         iscsi          |             |          iscsi         |\n+------------------------+             +------------------------+\n|        lv_vol          |             |         lv_vol         |\n+------------------------+             +------------------------+\n|   volume group vg1     |             |    volume group vg1    |\n+------------------------+             +------------------------+\n|     physical volume    |             |     physical volume    |\n+------------------------+             +------------------------+\n|        drbd r0         |< ~~~~~~~~~~>|        drbd r0         |\n+------------------------+             +------------------------+\n|          sdb           |             |          sdb           |\n+------------------------+             +------------------------+\n        centos01                               centos02\n    <\/pre>\n<p>Each client server (drbd01 and drbd02) has three interfaces each connected to three separate networks of which only the external one, <code>192.168.0.0\/24<\/code> is routed. The other two, <code>10.10.1.0\/24<\/code> and  <code>10.20.1.0\/24<\/code>, are private networks dedicated to the DRBD and iSCSI traffic respectively. 
All 4 servers are connected to the <code>10.20.1.0\/24<\/code> network, i.e. the storage network.<\/p>\n<p>Here is the hosts configuration in the <code>\/etc\/hosts<\/code> file on all 4 nodes:<\/p>\n<pre><code>...\n10.10.1.17      drbd01.virtual.local    drbd01\n10.10.1.19      drbd02.virtual.local    drbd02\n10.20.1.17      centos01.virtual.local  centos01\n10.20.1.11      centos02.virtual.local  centos02\n<\/code><\/pre>\n<h1><span id=\"iscsi-target-servers-setup\">iSCSI Target Servers Setup<\/span><\/h1>\n<p>The servers will be running CentOS-6.7 since iSCSI and Pacemaker are generally better supported on RHEL\/CentOS distributions. There are several iSCSI target implementations, i.e. IET, STGT, LIO and SCST, and I&#8217;ve chosen SCST for its stability, speed and array of features that the others don&#8217;t provide, e.g. ALUA. Unfortunately it has not been merged into the upstream Linux kernel (LIO got that privilege in the latest kernels), so it needs installation from source and kernel patching for best results. Helper scripts are provided in the source to make this otherwise tedious task very simple.<\/p>\n<p>The <code>sdb<\/code> virtual disks attached to the server VMs and used as backing devices for the iSCSI volumes are created with <code>write-through<\/code> cache. This, or no cache at all, is the best choice when data security is of highest importance, at the expense of speed. 
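<\/p>\n<p>As a side note, the post does not say which hypervisor hosts the VMs, but if they were running under libvirt\/KVM the cache mode of the backing virtual disk would be set per device in the domain XML, roughly like this (the image file path is a placeholder):<\/p>\n<pre><code>&lt;disk type='file' device='disk'&gt;\n  &lt;driver name='qemu' type='raw' cache='writethrough'\/&gt;\n  &lt;source file='\/var\/lib\/libvirt\/images\/centos01-sdb.img'\/&gt;\n  &lt;target dev='sdb' bus='scsi'\/&gt;\n&lt;\/disk&gt;\n<\/code><\/pre>\n<p>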
We&#8217;ll be using the <code>vdisk_fileio<\/code> mode for the SCST devices since it performs better than <code>vdisk_blockio<\/code> in virtual environments, although we are still presenting the LUNs as block devices to the initiators, letting them format the drive.<\/p>\n<h2><span id=\"scst\">SCST<\/span><\/h2>\n<p>We start by installing some needed software (on both nodes, centos01 and centos02):<\/p>\n<pre><code>[root@centos01 ~]# yum install svn asciidoc newt-devel xmlto rpm-build redhat-rpm-config gcc make \\\n                       patchutils elfutils-libelf-devel elfutils-devel zlib-devel binutils-devel \\\n                       python-devel audit-libs-devel bison hmaccalc perl-ExtUtils-Embed rng-tools \\\n                       ncurses-devel kernel-devel\n<\/code><\/pre>\n<p>We need <code>rngd<\/code> since we are running in VMs and don&#8217;t have enough entropy, for generating the Corosync authentication key for example. Edit the <code>\/etc\/sysconfig\/rngd<\/code> config file:<\/p>\n<pre><code>...\nEXTRAOPTIONS=\"-r \/dev\/urandom\"\n<\/code><\/pre>\n<p>start it up:<\/p>\n<pre><code>[root@centos01 ~]# service rngd start\n[root@centos01 ~]# chkconfig rngd on\n<\/code><\/pre>\n<p>Fetch the SCST source from trunk:<\/p>\n<pre><code>[root@centos01 ~]# svn checkout svn:\/\/svn.code.sf.net\/p\/scst\/svn\/trunk scst-trunk\n[root@centos01 ~]# cd scst-trunk\n<\/code><\/pre>\n<p>Then we run:<\/p>\n<pre><code>[root@centos01 scst-trunk]# .\/scripts\/rebuild-rhel-kernel-rpm\n<\/code><\/pre>\n<p>This script builds a version of the RHEL\/CentOS\/SL kernel we are currently running, with the SCST patches applied on top. 
Then we install the new kernel and boot into it:<\/p>\n<pre><code>[root@centos01 scst-trunk]# rpm -ivh ..\/*.rpm\n[root@centos01 scst-trunk]# shutdown -r now\n<\/code><\/pre>\n<p>The last step is to compile and install the SCST services and modules we need:<\/p>\n<pre><code>[root@centos01 scst-trunk]# make scst scst_install\n[root@centos01 scst-trunk]# make iscsi iscsi_install\n[root@centos01 scst-trunk]# make scstadm scstadm_install\n<\/code><\/pre>\n<p>Now we check if everything is working properly. We need to set up a minimal config file to start with:<\/p>\n<pre><code>[root@centos01 ~]# vi \/etc\/scst.conf\nTARGET_DRIVER iscsi {\n    enabled 1\n}\n<\/code><\/pre>\n<p>and start the service:<\/p>\n<pre><code>[root@centos01 ~]# service scst start\nLoading and configuring SCST                               [  OK  ]\n<\/code><\/pre>\n<p>and check that everything has been started and loaded properly:<\/p>\n<pre><code>[root@centos01 ~]# lsmod | grep scst\nisert_scst             73646  3\niscsi_scst            191131  4 isert_scst\nrdma_cm                36354  1 isert_scst\nib_core                81507  6 isert_scst,rdma_cm,ib_cm,iw_cm,ib_sa,ib_mad\nscst                 2117799  2 isert_scst,iscsi_scst\ndlm                   148135  1 scst\nlibcrc32c               1246  3 iscsi_scst,drbd,sctp\ncrc_t10dif              1209  2 scst,sd_mod\n\n[root@centos01 ~]# ps aux | grep scst\nroot      3008  0.0  0.0      0     0 ?        S    16:28   0:00 [scst_release_ac]\nroot      3009  0.0  0.0      0     0 ?        S    16:28   0:00 [scst_release_ac]\nroot      3010  0.0  0.0      0     0 ?        S    16:28   0:00 [scst_release_ac]\nroot      3011  0.0  0.0      0     0 ?        S    16:28   0:00 [scst_release_ac]\nroot      3012  0.0  0.0      0     0 ?        S&lt;   16:28   0:00 [scst_uid]\nroot      3013  0.0  0.0      0     0 ?        S    16:28   0:00 [scstd0]\nroot      3014  0.0  0.0      0     0 ?        S    16:28   0:00 [scstd1]\nroot      3015  0.0  0.0      0     0 ?        S    16:28   0:00 [scstd2]\nroot      3016  0.0  0.0      0     0 ?        S    16:28   0:00 [scstd3]\nroot      3017  0.0  0.0      0     0 ?        S&lt;   16:28   0:00 [scst_initd]\nroot      3019  0.0  0.0      0     0 ?        S&lt;   16:28   0:00 [scst_mgmtd]\nroot      3054  0.0  0.0   4152   648 ?        Ss   16:28   0:00 \/usr\/local\/sbin\/iscsi-scstd\n<\/code><\/pre>\n<p>All looks good so the SCST part is finished. At the end we check if the SCST service has been added to auto-start, and if so we disable it since we want it to be under Pacemaker control:<\/p>\n<pre><code>[root@centos01 scst-trunk]# chkconfig --list scst\n[root@centos01 scst-trunk]# chkconfig scst off\n[root@centos01 scst-trunk]# chkconfig --list scst\nscst            0:off   1:off   2:off   3:off   4:off   5:off   6:off\n<\/code><\/pre>\n<h2><span id=\"drbd\">DRBD<\/span><\/h2>\n<p>We start by installing DRBD-8.4 on both servers:<\/p>\n<pre><code># yum install -y drbd84-utils kmod-drbd84\n<\/code><\/pre>\n<p>Then we create our DRBD resource that will provide the backing device for the volume group and logical volume we want to create. 
Create a new file <code>\/etc\/drbd.d\/vg1.res<\/code>:<\/p>\n<pre><code>resource vg1 {\n    startup {\n        wfc-timeout 30;\n        degr-wfc-timeout 20;\n        outdated-wfc-timeout 10;\n    }\n    syncer {\n        rate 40M;\n    }\n    disk {\n        on-io-error detach;\n        fencing resource-and-stonith;\n    }\n    handlers {\n        fence-peer              \"\/usr\/lib\/drbd\/crm-fence-peer.sh\";\n        after-resync-target     \"\/usr\/lib\/drbd\/crm-unfence-peer.sh\";\n        outdate-peer            \"\/usr\/lib\/heartbeat\/drbd-peer-outdater\";\n    }\n    options {\n        on-no-data-accessible io-error;\n        #on-no-data-accessible suspend-io;\n    }\n    net {\n        timeout 60;\n        ping-timeout 30;\n        ping-int 30;\n        cram-hmac-alg \"sha1\";\n        shared-secret \"secret\";\n        max-epoch-size 8192;\n        max-buffers 8912;\n        after-sb-0pri discard-zero-changes;\n        after-sb-1pri discard-secondary;\n        after-sb-2pri disconnect;\n    }\n    volume 0 {\n       device      \/dev\/drbd0;\n       disk        \/dev\/sdb;\n       meta-disk   internal;\n    }\n    on centos01 {\n       address     10.20.1.17:7788;\n    }\n    on centos02 {\n       address     10.20.1.11:7788;\n    }\n}\n<\/code><\/pre>\n<p>Then we load the <code>drbd<\/code> kernel module on both nodes:<\/p>\n<pre><code># modprobe drbd\n<\/code><\/pre>\n<p>create the metadata:<\/p>\n<pre><code># drbdadm create-md vg1\n# drbdadm up vg1\n<\/code><\/pre>\n<p>and then on one server only we promote the resource to primary state:<\/p>\n<pre><code>[root@centos01 ~]# drbdadm primary --force vg1\n<\/code><\/pre>\n<p>The initial sync of the block device will start and when finished we have (seen here from the secondary node):<\/p>\n<pre><code>[root@centos02 ~]# cat \/proc\/drbd\nversion: 8.4.7-1 (api:1\/proto:86-101)\nGIT-hash: 3a6a769340ef93b1ba2792c6461250790795db49 build by mockbuild@Build64R6, 2016-01-12 13:27:11\n 0: cs:Connected ro:Secondary\/Primary ds:UpToDate\/UpToDate C r-----\n    ns:0 nr:25992756 
dw:25992756 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0\n<\/code><\/pre>\n<p>All is <code>UpToDate<\/code> and we are done with DRBD; we can stop it and disable it on both nodes since it will be managed by Pacemaker:<\/p>\n<pre><code># service drbd stop\n# chkconfig drbd off\n<\/code><\/pre>\n<p>We should, however, only do this after we have executed the next step, where we create the logical volume.<\/p>\n<h2><span id=\"lvm\">LVM<\/span><\/h2>\n<p>We use this layer so we can easily extend our storage: we just add a new volume to the DRBD resource and extend the PV, VG and LV that we create on top of it. On both nodes:<\/p>\n<pre><code># yum install -y lvm2\n<\/code><\/pre>\n<p>then we tell LVM to look for VGs on our system disk and the DRBD device only, by setting the filter option. We also make sure the write cache and the volume activation list are set properly; the file is <code>\/etc\/lvm\/lvm.conf<\/code>:<\/p>\n<pre><code>...\n    filter = [ \"a|\/dev\/sda*|\", \"a|\/dev\/drbd*|\", \"r|.*|\" ]\n    write_cache_state = 0\n    volume_list = [ \"vg_centos\", \"vg1\", \"vg2\" ]\n...\n<\/code><\/pre>\n<p>and remove a possibly stale cache file:<\/p>\n<pre><code># rm -f \/etc\/lvm\/cache\/.cache\n<\/code><\/pre>\n<p>then we can create our VG and LV (on one node only since DRBD will replicate this for us):<\/p>\n<pre><code>[root@centos01 ~]# vgcreate vg1 \/dev\/drbd0\n[root@centos01 ~]# lvcreate --name lun1 -l 100%vg vg1\n[root@centos01 ~]# vgchange -aey vg1\n<\/code><\/pre>\n<p>We are done with this part.<\/p>\n<h2><span id=\"corosync-and-pacemaker\">Corosync and Pacemaker<\/span><\/h2>\n<p>Corosync is the cluster messaging layer and provides communication for the Pacemaker cluster nodes. 
On both nodes:<\/p>\n<pre><code># yum install -y pacemaker corosync ipmitool openais cluster-glue fence-agents scsi-target-utils OpenIPMI OpenIPMI-libs freeipmi freeipmi-bmc-watchdog freeipmi-ipmidetectd\n<\/code><\/pre>\n<p>then we add the HA-Cluster repository to install <code>crmsh<\/code> from, create <code>\/etc\/yum.repos.d\/ha-clustering.repo<\/code> file:<\/p>\n<pre><code>[haclustering]\nname=HA Clustering\nbaseurl=http:\/\/download.opensuse.org\/repositories\/network:\/ha-clustering:\/Stable\/CentOS_CentOS-6\/\nenabled=0\ngpgcheck=0\n<\/code><\/pre>\n<p>and run:<\/p>\n<pre><code># yum --enablerepo haclustering install -y crmsh\n<\/code><\/pre>\n<p>Now we setup Corosync with dual ring and active mode in the <code>\/etc\/corosync\/corosync.conf<\/code> file:<\/p>\n<pre><code>totem {\n    version: 2\n\n    # How long before declaring a token lost (ms)\n    token: 3000\n\n    # How many token retransmits before forming a new configuration\n    token_retransmits_before_loss_const: 10\n\n    # How long to wait for join messages in the membership protocol (ms)\n    join: 60\n\n    # How long to wait for consensus to be achieved before starting a new round of membership configuration (ms)\n    consensus: 3600\n\n    # Turn off the virtual synchrony filter\n    vsftype: none\n\n    # Number of messages that may be sent by one processor on receipt of the token\n    max_messages: 20\n\n    # Stagger sending the node join messages by 1..send_join ms\n    send_join: 45\n\n    # Limit generated nodeids to 31-bits (positive signed integers)\n    clear_node_high_bit: yes\n\n    # Disable encryption\n     secauth: off\n\n    # How many threads to use for encryption\/decryption\n     threads: 0\n\n    # Optionally assign a fixed node id (integer)\n    # nodeid: 1234\n\n    # This specifies the mode of redundant ring, which may be none, active, or passive.\n     rrp_mode: active\n\n     interface {\n        member {\n            memberaddr: 10.20.1.17\n        }\n        
member {\n            memberaddr: 10.20.1.11\n        }\n        ringnumber: 0\n        bindnetaddr: 10.20.1.0\n        mcastaddr: 226.94.1.1\n        mcastport: 5404\n    }\n    interface {\n        member {\n            memberaddr: 192.168.0.178\n        }\n        member {\n            memberaddr: 192.168.0.179\n        }\n        ringnumber: 1\n        bindnetaddr: 192.168.0.0\n        mcastaddr: 226.94.41.1\n        mcastport: 5405\n   }\n   transport: udpu\n}\namf {\n    mode: disabled\n}\nservice {\n     # Load the Pacemaker Cluster Resource Manager\n     # if 0: start pacemaker\n     # if 1: don't start pacemaker\n     ver:       1\n     name:      pacemaker\n}\naisexec {\n        user:   root\n        group:  root\n}\nlogging {\n        fileline: off\n        to_stderr: yes\n        to_logfile: no\n        to_syslog: yes\n    syslog_facility: daemon\n        debug: off\n        timestamp: on\n        logger_subsys {\n                subsys: QUORUM\n                debug: off\n                tags: enter|leave|trace1|trace2|trace3|trace4|trace6\n        }\n}\n<\/code><\/pre>\n<p>Note that <code>bindnetaddr<\/code> should be the network address, not a host IP, so the same config file can be used unchanged on both nodes. We start the service on both nodes and check:<\/p>\n<pre><code># service corosync start\n\n# corosync-cfgtool -s\nPrinting ring status.\nLocal node ID 184620042\nRING ID 0\n    id    = 10.20.1.11\n    status    = ring 0 active with no faults\nRING ID 1\n    id    = 192.168.0.179\n    status    = ring 1 active with no faults\n<\/code><\/pre>\n<p>All looks good. 
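<\/p>\n<p>Since we are running redundant rings, it is worth knowing that if one of the rings ever gets marked FAULTY (after a cable or switch failure on one of the networks, for example), the redundant ring state can be reset cluster-wide once the underlying problem is fixed:<\/p>\n<pre><code># corosync-cfgtool -s\n# corosync-cfgtool -r\n# corosync-cfgtool -s\n<\/code><\/pre>\n<p>Here <code>-s<\/code> prints the ring status, <code>-r<\/code> resets the redundant ring state after the fault has been repaired, and the final check should again report both rings active with no faults.<\/p>\n<p>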
We set Corosync to auto-start and we are done:<\/p>\n<pre><code># chkconfig corosync on\n# chkconfig --list corosync\ncorosync           0:off    1:off    2:on    3:on    4:on    5:on    6:off\n<\/code><\/pre>\n<p>Now we start Pacemaker on both nodes and check its status:<\/p>\n<pre><code>[root@centos01 ~]# service pacemaker start\n\n[root@centos01 ~]# crm status\nLast updated: Fri Feb 26 12:49:06 2016\nLast change: Fri Feb 26 12:48:47 2016\nStack: classic openais (with plugin)\nCurrent DC: centos01 - partition with quorum\nVersion: 1.1.11-97629de\n2 Nodes configured, 2 expected votes\n12 Resources configured\n\nOnline: [ centos01 centos02 ]\n<\/code><\/pre>\n<p>All looks good (this status output was captured later, after the 12 resources described below had already been configured), so we enable it to auto-start on both nodes:<\/p>\n<pre><code># chkconfig pacemaker on\n# chkconfig --list pacemaker\npacemaker          0:off    1:off    2:on    3:on    4:on    5:on    6:off\n<\/code><\/pre>\n<p>Now comes the main configuration. We need to set up in Pacemaker all the resources we have created so far. 
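<\/p>\n<p>Before adding the resources, the global cluster properties seen in the final configuration can be set with <code>crmsh<\/code>; for a two-node lab cluster without fencing hardware that means disabling STONITH and ignoring loss of quorum (again, never do this in production), for example:<\/p>\n<pre><code>[root@centos01 ~]# crm configure property stonith-enabled=false\n[root@centos01 ~]# crm configure property no-quorum-policy=ignore\n[root@centos01 ~]# crm configure property expected-quorum-votes=2\n<\/code><\/pre>\n<p>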
I got the OCF agents <code>SCSTLun<\/code> and <code>SCSTTarget<\/code> from <a href=\"https:\/\/github.com\/rbicelli\/scst-ocf\">scst-ocf<\/a> and placed them under a new directory I created, <code>\/usr\/lib\/ocf\/resource.d\/scst\/<\/code>, since I could see they provide more functionality than the ones bundled in the SCST svn source.<\/p>\n<p>When I was done, the full config looked like this:<\/p>\n<pre><code>[root@centos01 ~]# crm configure show\nnode centos01\nnode centos02 \\\n        attributes standby=off\nprimitive p_drbd_vg1 ocf:linbit:drbd \\\n        params drbd_resource=vg1 \\\n        op start interval=0 timeout=240 \\\n        op promote interval=0 timeout=90 \\\n        op demote interval=0 timeout=90 \\\n        op notify interval=0 timeout=90 \\\n        op stop interval=0 timeout=100 \\\n        op monitor interval=30 timeout=20 role=Slave \\\n        op monitor interval=10 timeout=20 role=Master\nprimitive p_email_admin MailTo \\\n        params email=\"igorc@encompasscorporation.com\" subject=\"Cluster Failover\"\nprimitive p_ip_vg1 IPaddr2 \\\n        params ip=192.168.0.180 cidr_netmask=24 nic=eth1 \\\n        op monitor interval=10s\nprimitive p_ip_vg1_2 IPaddr2 \\\n        params ip=10.20.1.180 cidr_netmask=24 nic=eth2 \\\n        op monitor interval=10s\nprimitive p_lu_vg1_lun1 ocf:scst:SCSTLun \\\n        params iscsi_enable=true target_iqn=\"iqn.2016-02.local.virtual:virtual.vg1\" \\\n               iscsi_lun=0 path=\"\/dev\/vg1\/lun1\" handler=vdisk_fileio device_name=VDISK-LUN01 \\\n        additional_parameters=\"nv_cache=1 write_through=0 thin_provisioned=0 threads_num=4\" wait_timeout=60 \\\n        op monitor interval=10s timeout=120s\nprimitive p_lvm_vg1 LVM \\\n        params volgrpname=vg1 \\\n        op monitor interval=60 timeout=30 \\\n        op start timeout=30 interval=0 \\\n        op stop timeout=30 interval=0 \\\n        meta target-role=Started\nprimitive p_portblock_vg1 portblock \\\n        params 
ip=192.168.0.180 portno=3260 protocol=tcp action=block \\\n        op monitor timeout=10s interval=10s depth=0\nprimitive p_portblock_vg1_2 portblock \\\n        params ip=10.20.1.180 portno=3260 protocol=tcp action=block \\\n        op monitor timeout=10s interval=10s depth=0\nprimitive p_portblock_vg1_2_unblock portblock \\\n        params ip=10.20.1.180 portno=3260 protocol=tcp action=unblock \\\n        op monitor timeout=10s interval=10s\nprimitive p_portblock_vg1_unblock portblock \\\n        params ip=192.168.0.180 portno=3260 protocol=tcp action=unblock \\\n        op monitor timeout=10s interval=10s\nprimitive p_target_vg1 ocf:scst:SCSTTarget \\\n        params iscsi_enable=true iqn=\"iqn.2016-02.local.virtual:virtual.vg1\" \\\n               portals=\"192.168.0.180 10.20.1.180\" wait_timeout=60 additional_parameters=\"DefaultTime2Retain=60 DefaultTime2Wait=5\" \\\n        op monitor interval=10s timeout=120s \\\n        meta target-role=Started\ngroup g_vg1 p_lvm_vg1 p_target_vg1 p_lu_vg1_lun1 p_ip_vg1 p_ip_vg1_2 p_portblock_vg1 p_portblock_vg1_unblock \\\n            p_portblock_vg1_2 p_portblock_vg1_2_unblock p_email_admin\nms ms_drbd_vg1 p_drbd_vg1 \\\n        meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 resource-stickiness=100 interleave=true notify=true target-role=Started\ncolocation c_vg1_on_drbd +inf: g_vg1 ms_drbd_vg1:Master\norder o_drbd_before_vg1 +inf: ms_drbd_vg1:promote g_vg1:start\norder o_lun_before_ip +inf: p_lu_vg1_lun1 p_ip_vg1\norder o_lvm_before_lun +inf: p_lvm_vg1 p_lu_vg1_lun1\nproperty cib-bootstrap-options: \\\n        dc-version=1.1.11-97629de \\\n        cluster-infrastructure=\"classic openais (with plugin)\" \\\n        expected-quorum-votes=2 \\\n        stonith-enabled=false \\\n        no-quorum-policy=ignore \\\n        last-lrm-refresh=1456448175\n<\/code><\/pre>\n<p>We can play with the SCST parameters to find the optimal setup for best speed, for example we can set <code>threads_num=4<\/code>  for 
the storage device on the fly, since we have 4 CPUs, without stopping SCST, and test its impact:<\/p>\n<pre><code>[root@centos01 ~]# scstadmin -set_dev_attr VDISK-LUN01 -attributes threads_num=4\n<\/code><\/pre>\n<p>The whole cluster in running state looks like this:<\/p>\n<pre><code>[root@centos01 ~]# crm status\nLast updated: Fri Feb 26 12:49:06 2016\nLast change: Fri Feb 26 12:48:47 2016\nStack: classic openais (with plugin)\nCurrent DC: centos01 - partition with quorum\nVersion: 1.1.11-97629de\n2 Nodes configured, 2 expected votes\n12 Resources configured\n\nOnline: [ centos01 centos02 ]\n\nFull list of resources:\n\n Master\/Slave Set: ms_drbd_vg1 [p_drbd_vg1]\n     Masters: [ centos02 ]\n     Slaves: [ centos01 ]\n Resource Group: g_vg1\n     p_lvm_vg1    (ocf::heartbeat:LVM):    Started centos02\n     p_target_vg1    (ocf::scst:SCSTTarget):    Started centos02\n     p_lu_vg1_lun1    (ocf::scst:SCSTLun):    Started centos02\n     p_ip_vg1    (ocf::heartbeat:IPaddr2):    Started centos02\n     p_ip_vg1_2    (ocf::heartbeat:IPaddr2):    Started centos02\n     p_portblock_vg1    (ocf::heartbeat:portblock):    Started centos02\n     p_portblock_vg1_unblock    (ocf::heartbeat:portblock):    Started centos02\n     p_portblock_vg1_2    (ocf::heartbeat:portblock):    Started centos02\n     p_portblock_vg1_2_unblock    (ocf::heartbeat:portblock):    Started centos02\n     p_email_admin    (ocf::heartbeat:MailTo):    Started centos02\n<\/code><\/pre>\n<blockquote><p>\n  <strong>NOTE<\/strong>: In production, use of fencing, i.e. STONITH, in Pacemaker is a MUST. We don&#8217;t use it here since our setup is running on VMs, but no production cluster should be running without it.\n<\/p><\/blockquote>\n<p>We can see that all resources have been started on the second node, centos02 in this case. 
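<\/p>\n<p>At this point a manual fail-over test is a good idea: putting the active node into standby should move the DRBD master role and the whole <code>g_vg1<\/code> group to the other node, for example:<\/p>\n<pre><code>[root@centos01 ~]# crm node standby centos02\n[root@centos01 ~]# crm status\n[root@centos01 ~]# crm node online centos02\n<\/code><\/pre>\n<p>During the transition the <code>portblock<\/code> resources keep TCP port 3260 blocked on the VIPs, so the initiators simply see a temporarily unreachable path until the target comes up on the other node.<\/p>\n<p>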
In short, the DRBD-backed volume is presented as a LUN via the SCST target and made available via two VIPs managed by Pacemaker, one for each of the public 192.168.0.0\/24 and storage 10.20.1.0\/24 networks. From the Ubuntu client nodes this will be seen as:<\/p>\n<pre><code>root@drbd01:~# iscsiadm -m discovery -I default -t st -p 192.168.0.180\n192.168.0.180:3260,1 iqn.2016-02.local.virtual:virtual.vg1\n10.20.1.180:3260,1 iqn.2016-02.local.virtual:virtual.vg1\n\nroot@drbd02:~# iscsiadm -m discovery -I default -t st -p 192.168.0.180\n192.168.0.180:3260,1 iqn.2016-02.local.virtual:virtual.vg1\n10.20.1.180:3260,1 iqn.2016-02.local.virtual:virtual.vg1\n<\/code><\/pre>\n<p>Then I logged in:<\/p>\n<pre><code>root@drbd01:~# iscsiadm -m node -T iqn.2016-02.local.virtual:virtual.vg1 -p 192.168.0.180 --login\nLogging in to [iface: default, target: iqn.2016-02.local.virtual:virtual.vg1, portal: 192.168.0.180,3260] (multiple)\nLogin to [iface: default, target: iqn.2016-02.local.virtual:virtual.vg1, portal: 192.168.0.180,3260] successful.\n<\/code><\/pre>\n<p>And could confirm a new block device was created on the client:<\/p>\n<pre><code>root@drbd01:~# fdisk -l \/dev\/sdc\n\nDisk \/dev\/sdc: 21.5 GB, 21470642176 bytes\n64 heads, 32 sectors\/track, 20476 cylinders, total 41934848 sectors\nUnits = sectors of 1 * 512 = 512 bytes\nSector size (logical\/physical): 512 bytes \/ 4096 bytes\nI\/O size (minimum\/optimal): 4096 bytes \/ 524288 bytes\nDisk identifier: 0x00000000\nDisk \/dev\/sdc doesn't contain a valid partition table\n<\/code><\/pre>\n<p>which is our 20GB LUN from the target. We can now format and mount <code>\/dev\/sdc<\/code> as we would do with any block storage device.<\/p>\n<p>[serialposts]<\/p>\n","protected":false},"excerpt":{"rendered":"<p>SCST, the generic SCSI target subsystem for Linux, allows creation of sophisticated storage devices from any Linux box. 
Those devices can provide advanced functionality, like replication, thin provisioning, deduplication, high availability, automatic backup, etc. SCST devices can use any link&#8230;<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[17,9,16],"tags":[21,18,19,20],"class_list":["post-242","post","type-post","status-publish","format-standard","hentry","category-cluster","category-high-availability","category-storage","tag-drbd","tag-iscsi","tag-ocfs2","tag-pacemaker"],"_links":{"self":[{"href":"https:\/\/icicimov.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/242","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/icicimov.com\/blog\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/icicimov.com\/blog\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/icicimov.com\/blog\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/icicimov.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=242"}],"version-history":[{"count":5,"href":"https:\/\/icicimov.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/242\/revisions"}],"predecessor-version":[{"id":253,"href":"https:\/\/icicimov.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/242\/revisions\/253"}],"wp:attachment":[{"href":"https:\/\/icicimov.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=242"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/icicimov.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=242"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/icicimov.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=242"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}