
Fork-Lifting VMs on vm1.sea.rg.net from ESXI to Debian with Ganeti

With the lessons learned from Fork-Lifting VMs on vm0.sea.rg.net from ESXI to Debian with Ganeti, the intrepid crew foolishly embarks on doing the same on vm1.sea.rg.net.


vm1.sea.rg.net Hardware Platform

Cisco R210-2121605W - part 74-7341-02 - serial QCI1549A9AY

  • Effectively: 72GB RAM, 4TB datastore1, 8 cores
  • UCS C210 M2 Svr, 2x E5640, 2x4GB, SAS Expand, 1PS
  • 2 x 2.66GHz Xeon E5640 80W CPU/12MB cache/DDR3 1066MHz
  • 4 x 16GB DDR3-1066MHz RDIMM/PC3-8500/quad rank/Low-Dual Volt
  • 2 x 4GB DDR3-1333MHz RDIMM/PC3-10600/dual rank 1Gb DRAMs
  • LSI 6G MegaRAID 9261-8i card (RAID 0,1,5,6,10,60) - 512WC
  • 8 x 500GB 6Gb SATA 7.2K RPM SFF hot plug/drive sled mounted
  • Intel Quad port GbE Controller (E1G44ETG1P20)

First, record disk and memory allocation for each VM, configured size, not utilization.

Hostname             RAM  Disk  IP Address    Owner
build-u.rpki.net     1G   100G  147.28.0.28   Rob Austein <sra@…>
ca0.rpki.net         2G   100G  147.28.0.85   Randy Bush <randy@…>
cache0.sea.rpki.net  2G   100G  147.28.0.84   Randy Bush <randy@…>
hans.rg.net          3G   256G  147.28.0.42   Hans Kuhn <hak@…>
nic0.net.lb          2G   100G  147.28.0.44   Samer Khalil <samerk1@…>
nlring.sea.rg.net    2G   32G   147.28.0.89   Randy Bush <randy@…>
proto0.sea.rpki.net  2G   100G  147.28.0.100  Iain Phillips <I.W.Phillips@…>
xmpp.rg.net          2G   100G  147.28.0.6    Randy Bush <randy@…>
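
As a quick sanity check that the configured sizes fit the hardware, the RAM and Disk columns can be totalled. This is only a sketch: it assumes the table above has been saved to a file with whitespace-separated Hostname, RAM, and Disk as the first three columns, and the function name sum_alloc is made up for illustration.

```shell
# sum_alloc FILE: total the configured RAM and disk columns (values like 2G, 100G).
# Expects lines of the form: hostname RAM Disk ...
# Lines whose second and third fields are not sizes (e.g. the header) are skipped.
sum_alloc() {
    awk '$2 ~ /^[0-9]+G$/ && $3 ~ /^[0-9]+G$/ {
             ram += $2 + 0; disk += $3 + 0; n++
         }
         END { printf "%d guests, %dG RAM, %dG disk\n", n, ram, disk }' "$1"
}
```

Run against the eight guests listed above, this reports 16G of RAM and 888G of disk, comfortably inside the 72GB and 4TB the host has.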

Users May or May Not Need to Pre-Configure

FreeBSD Users Hack Configuration Aspects Which Will Change

FreeBSD Disk and Network Interface naming may change from the ESXI guest environment to the Ganeti/KVM environment. Owners of FreeBSD guests should either

  • Make config changes just before shutting down their machines, so that when they come back up in the new environment they will boot usefully. In the new environment, FreeBSD guests seem to use /dev/ada for the disk drives.

root@fbsd0:~ # more /etc/fstab
# Device        Mountpoint      FStype  Options Dump    Pass#
/dev/ada0p2     /               ufs     rw      1       1
/dev/ada0p3     none            swap    sw      0       0

FreeBSD drives on ESXI seem to be /dev/da, so users will have to change their /etc/fstab just before the fork-lift (s/da/ada/g).

  • Or tell the VM SysAdmins, so we can hack the ganeti configs so that you can keep your old disk and NIC names.
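
A blanket s/da/ada/g would also rewrite any other "da" in the file, including lines that already say ada. A narrower substitution might look like the sketch below. It assumes the disks appear as /dev/daN at the start of an fstab line; the function name da_to_ada is hypothetical, and it writes to stdout rather than editing the file in place.

```shell
# da_to_ada: read an fstab on stdin and write it to stdout, renaming
# /dev/daN device names at the start of a line to /dev/adaN.
# Lines that already say /dev/ada, and comments, are left alone.
da_to_ada() {
    sed -E 's#^/dev/da([0-9])#/dev/ada\1#'
}
```

For example, da_to_ada < /etc/fstab > /etc/fstab.new, then eyeball the result before moving it into place.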

Linux Guests Should Need No Modification

Linux Guests should be able to find their disks as UUIDs and mount as /dev/sdaN. And Ethernet seems to be a pretty constant eth0.

Copy VMs to an NFS Mounted Filesystem

Create a /data/nfs directory on raid1.psg.com and NFS export it to vm1.sea using hacks/advice from:

Mount raid1.psg.com:/data/nfs on vm1.sea.rg.net as an NFS datastore in Configuration / Storage / Add Storage.

Stop and power off all guest VMs on vm1.sea.rg.net. We can actually do this one by one.

Record the md5 checksum of each and every guest VM .vmdk file.

Use VMware vSphere Client on my laptop to move each guest VM from vm1.sea.rg.net:datastore1 to the NFS datastore.

Take the md5 checksum of each and every .vmdk file on the NFS datastore and compare to that of the original from vm1.sea.rg.net:datastore1.
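
The record-and-compare steps above can be sketched as two small shell functions. This assumes an md5sum that behaves like the GNU one is available wherever the files are visible; the function names and the checksum file are made up for illustration.

```shell
# record_md5 DIR OUTFILE: checksum every .vmdk under DIR, with names
# recorded relative to DIR so the list can be checked from another mount.
record_md5() {
    ( cd "$1" && find . -type f -name '*.vmdk' -exec md5sum {} + | sort -k 2 ) > "$2"
}

# verify_md5 DIR SUMFILE: re-check the recorded sums against the copies under DIR.
# Returns non-zero if any file is missing or its checksum differs.
verify_md5() {
    ( cd "$1" && md5sum -c ) < "$2"
}
```

The idea is to run record_md5 against the original datastore1 directory before the move and verify_md5 against the NFS copy afterwards. Only destroy the original once every line says OK.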

It is now safe to destroy and rebuild vm1.sea.rg.net

Build a Debian/Ganeti System on vm1.sea.rg.net

Boot into Adaptec BIOS and configure the drives as one big RAID5. The hack to get an INSert key on the MacBook is Windows, Accessories, Ease of Access, On-Screen Keyboard.

Install Debian

  • Boot Debian CD/ISO
  • Choose Install
  • Choose English, UK (so you can get UTC)
  • Choose American English
  • Name the host
  • Choose root password
  • Choose user name and password
  • Partition
    • Choose Manual Partitioning
    • Select the drive
    • Create new empty partition table
    • Select Free Space
    • Create new partition, primary, 1GB, beginning, bios, no use, bios
    • Done
    • Select Free Space
    • Create new partition, primary, 1GB, beginning, /boot, ext4, bootable
    • Done
    • Select Free Space again
    • Create a new partition
    • Accept whatever size is shown (the rest of the disk)
    • Primary, physical volume for LVM
    • Done
  • Configure LVM
    • Configure LVM accepting write changes to disks
    • Create volume group
      • Volume group name: ganeti
      • Devices for the new volume group: select only the LVM partition
    • Create Logical Volume: on ganeti, root, 16G
    • Create Logical Volume: on ganeti, swap, 16G
    • Create Logical Volume: on ganeti, var, 16G
    • Edit the Logical Volumes to be ext4 /, swap, and ext4 /var
    • Finish partitioning and write changes
  • Finish partitioning and write changes to disk
  • Be sure it will not boot CD-ROM, and Reboot from the installed system

Finish Debian Installation

Clean up from CDROM sources

vi /etc/apt/sources.list

and delete the two CDROM entries at the top

Install homey things (it's not a computer without emacs:)

apt-get update
apt-get upgrade
apt-get install emacs23-nox
apt-get install rsync
apt-get install gcc
apt-get install bridge-utils vlan
apt-get install sudo
apt-get install unbound
usermod -G sudo -a randy

Fix hostname

echo vm1.sea.rg.net > /etc/hostname
hostname `cat /etc/hostname`

Fix /etc/unbound/unbound.conf

        access-control: 127.0.0.0/8 allow
        access-control: 147.28.0.0/16 allow
        access-control: 198.180.150.0/24 allow
        access-control: 198.180.152.0/24 allow
        access-control: 0.0.0.0/0 refuse
        access-control: ::1 allow
        access-control: ::ffff:127.0.0.1 allow
        access-control: 2001:418:1::0/48 allow
        access-control: 2001:418:3807::0/48 allow
        access-control: 2001:418:8006::0/48 allow
        access-control: ::0/0 refuse

Install Unattended Upgrading

Debian Ganeti Specific Configuration

Edit /etc/hosts to have the real address of the host, e.g.

127.0.0.1       localhost
147.28.0.3      vm0.sea.rg.net  vm0
147.28.0.15     vm1.sea.rg.net vm1
147.28.0.100    gnt0.sea.rg.net gnt0

Fix /etc/network/interfaces

Make eth0 hang off of whatever your bridge will be called

# The loopback network interface
auto lo
iface lo inet loopback

# Management interface
auto eth0
iface eth0 inet manual

auto br-lan
iface br-lan inet static
        address         147.28.0.15
        netmask         255.255.255.0
        gateway         147.28.0.1
        bridge_ports    eth0
        bridge_stp      off
        bridge_fd       0
        bridge_maxwait  0

# VLAN 100
auto eth0.100
iface eth0.100 inet manual

auto br-rep
iface br-rep inet static
        address         147.28.0.101
        netmask         255.255.255.0
        bridge_ports    eth0.100
        bridge_stp      off
        bridge_fd       0
        bridge_maxwait  0

# VLAN 255
auto eth0.255
iface eth0.255 inet manual

auto br-svc
iface br-svc inet manual
        bridge_ports    eth0.255
        bridge_stp      off
        bridge_fd       0
        bridge_maxwait  0

Check /etc/resolv.conf

In theory, this looks like

               -------------+--------------
                            |
                          br-lan 
                            |         this host
                  +---------+---------+
                  |        eth0       |
                  |                   |
                  |eth0.255   eth0.100|
                  +--+-----------+----+
                     |           |
                   br-svc      br-rep
                     |           |
         VMs --------+           +------> to other ganeti hosts

Also, put the following in /etc/sysctl.conf:

net.bridge.bridge-nf-call-ip6tables = 0
net.bridge.bridge-nf-call-iptables = 0
net.bridge.bridge-nf-call-arptables = 0

Install Ganeti

Set up to get Ganeti from backports

cat >> /etc/apt/sources.list.d/wheezy-backports.list <<EOF
deb http://cdn.debian.net/debian/ wheezy-backports main
EOF

And then install it

apt-get update
apt-get install ganeti/wheezy-backports

Fix up drbd

echo "options drbd minor_count=128 usermode_helper=/bin/true" > /etc/modprobe.d/drbd.conf
rmmod drbd      # ignore any error
modprobe drbd

Add vm1 to the Ganeti Cluster

On vm0.sea.rg.net, the existing ganeti single-node cluster, run

gnt-node add vm1.sea.rg.net

This will SSH as root to vm1, set up ssh keys, and do all the right things to make vm1 part of the cluster.

Then set "PermitRootLogin" to "without-password" in vm1's /etc/ssh/sshd_config

Fix VNC passwording

echo 'clusture' > /etc/ganeti/vnc-cluster-password
gnt-cluster modify -H kvm:vnc_password_file=/etc/ganeti/vnc-cluster-password

As vm0 was pretty loaded, make vm1 the master. So, on vm1, the new master, run

gnt-cluster master-failover

Load the ESXI Images

Mount the NFS system that has the guest VMs.

On vm1, add the following line to /etc/fstab

147.28.0.64:/data/nfs   /nfs-data     nfs     defaults          0     0

and then

mkdir /nfs-data
mount /nfs-data

Install Ganeti Instance Management

Install ganeti-instance-image

wget https://code.osuosl.org/attachments/download/2169/ganeti-instance-image_0.5.1-1_all.deb
dpkg -i ganeti-instance-image_0.5.1-1_all.deb

Install qemu utilities (though they likely came in with other installs)

apt-get install qemu-utils

And force the latest version of qemu-img

apt-get install qemu-utils/wheezy-backports

Aside: if you also want ganeti-instance-debootstrap then version 0.14 is now in wheezy-backports. You don't need to install from source. You'll only want ganeti-instance-debootstrap to create images from scratch where it installs Debian or a Debian-related OS automatically.

Create the Guest VM Instances

For each VM, run the following:

#!/bin/sh

# makeVM diskGB ramGB nameFQDN

DISK=$1
RAM=$2
NAME=$3
NODE=vm1.sea.rg.net

gnt-instance add \
     -t plain \
     -o image+default \
     -s ${DISK}G \
     -B minmem=${RAM}G,maxmem=$((${RAM}*2))G \
     -n $NODE \
     -H kvm:vnc_bind_address=0.0.0.0 \
     --no-install \
     --no-start \
     --no-ip-check \
     --no-name-check \
     ${NAME}

This produces

vm1.sea.rg.net:/root# ./do-add 200 4 <instance-name>
Tue Apr 22 23:15:35 2014 * disk 0, size 200.0G
Tue Apr 22 23:15:35 2014 * creating instance disks...
Tue Apr 22 23:15:38 2014 adding instance <instance-name> to cluster config
Tue Apr 22 23:15:38 2014  - INFO: Waiting for instance <instance-name> to sync disks
Tue Apr 22 23:15:39 2014  - INFO: Instance <instance-name>'s disks are in sync

If it is a FreeBSD VM, also do

gnt-instance modify -H disk_type=scsi <instance-name>

so that da devices still work at boot.
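
To avoid retyping sizes, the per-VM calls can be generated from the table recorded at the start. This is a sketch under assumptions: the table lives in a file with whitespace-separated Hostname, RAM, and Disk as the first three columns, the script above is saved as ./do-add (as in the transcript), and print_adds is a made-up name. It only prints the commands and runs nothing.

```shell
# print_adds FILE: emit one "./do-add diskGB ramGB nameFQDN" line per guest.
# Expects lines of the form: hostname RAM Disk ... (sizes like 2G, 100G).
# Lines without sizes in fields two and three (e.g. the header) are skipped.
print_adds() {
    awk '$2 ~ /^[0-9]+G$/ && $3 ~ /^[0-9]+G$/ {
             sub(/G$/, "", $2); sub(/G$/, "", $3)
             printf "./do-add %s %s %s\n", $3, $2, $1
         }' "$1"
}
```

Review the printed lines before running any of them. Note that the script sets maxmem to twice the RAM figure passed in.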

Get the UUIDs of all VMs, and fill out the table.

gnt-instance list -o name,disk.uuid/0

Load the Stored VM VMDK Files into the Ganeti Images

As root, mount the raid1 nfs filesystem

Convert the ESXI Images to Ganeti Guest Images

gnt-instance info --all | egrep 'Instance name|on primary'

This will show the primary device for each ganeti VM, e.g.

      on primary: /dev/xenvg/95d2bb8f-063f-498d-b98a-9c03acea991f.disk0 (252:2)

which we use as the output UUID

Check the type of image we have

qemu-img info <vmdk-filename>

If it is a Flat Raw Image

For -flat.vmdk files, you should be able to

dd bs=4096k if=<vmdk-filename> of=/dev/ganeti/<disk0 from gnt-instance info>

If it is a Real VMDK

For -s001.vmdk files, you should be able to run, for each VMDK,

qemu-img convert -f vmdk -O raw <input_file> <output UUID>
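
The choice between the two cases can be driven by what qemu-img info reports. The sketch below assumes its output contains a line of the form "file format: raw" or "file format: vmdk"; the function names are hypothetical, and load_cmd only prints the command so it can be checked before anything is written to a logical volume.

```shell
# load_cmd FORMAT INPUT OUTPUT: print the command that copies INPUT onto the
# logical volume OUTPUT, given the format reported by "qemu-img info".
load_cmd() {
    case "$1" in
        raw)  printf 'dd bs=4096k if=%s of=%s\n' "$2" "$3" ;;
        vmdk) printf 'qemu-img convert -f vmdk -O raw %s %s\n' "$2" "$3" ;;
        *)    echo "unhandled format: $1" >&2; return 1 ;;
    esac
}

# image_format FILE: extract the "file format:" value from qemu-img info.
image_format() {
    qemu-img info "$1" | awk -F': ' '$1 == "file format" { print $2 }'
}
```

Usage would be along the lines of load_cmd "$(image_format <vmdk-filename>)" <vmdk-filename> /dev/ganeti/<disk0 from gnt-instance info>.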

Try the VMs!

You can use the built-in console or come in over VNC (over ssh, of course).

Start the image

gnt-instance start <image name>

And come in over the text console or VNC

Direct Text Console

gnt-instance console <image name>

Over VNC for Graphics

gnt-instance list -o +network_port

To get

Instance     Hypervisor OS            Primary_node  Status  Memory Network_port
minibsd-test kvm        image+default deb64.psg.com running   256M 11001

Then tunnel to the base system using the port number from that report, e.g. (notice port 11001)

ssh -N -L 5900:127.0.0.1:11001 vm1.sea.rg.net

And get ready to start your VNC session (in this case, I would be using Chicken of the VNC to VNC display localhost:0, aka localhost port 5900).
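
VNC display N corresponds to TCP port 5900 + N, which is why localhost:0 and port 5900 are the same thing above. A small sketch that builds the tunnel command for any instance port and local display follows. The function name vnc_tunnel_cmd is made up for illustration, and it only prints the ssh command.

```shell
# vnc_tunnel_cmd HOST REMOTE_PORT [DISPLAY]: print the ssh command that forwards
# local VNC display DISPLAY (TCP port 5900 + DISPLAY, default display 0)
# to REMOTE_PORT on HOST.
vnc_tunnel_cmd() {
    host=$1 rport=$2 display=${3:-0}
    printf 'ssh -N -L %d:127.0.0.1:%d %s\n' "$((5900 + $display))" "$rport" "$host"
}
```

Using a different display per instance lets several tunnels stay open at once, e.g. display 1 for a second guest.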

To give each user a different password, do it at the instance level:

echo 'wombat' >/etc/ganeti/vnc-password-<username>
gnt-instance modify -H vnc_password_file=/etc/ganeti/vnc-password-<username> <instance-name>

Or make a directory /etc/ganeti/passwords and stash them there.

If FreeBSD Does Not Mount Root

If the system boots but does not mount the root filesystem, and leaves you at the mountroot prompt, it seems as if FreeBSD's

/dev/da0p2

may become

/dev/vtbd0p2

If you do the mountroot to

ufs:/dev/vtbd0p2

the root mounts and the system comes up.

sra reminds us that it is a good idea to do an fsck of / at single user, before enabling write to the / filesystem.

Of course, the filesystem will be image dependent.

Converting a FreeBSD Guest to Paravirtual I/O

FreeBSD systems will run better and be kinder to the underlying virtualization system if they run paravirtual I/O for both disk and network. To hack this,

Add to /boot/loader.conf.local

virtio_load=YES
virtio_pci_load=YES

As advised in http://freebsd.1045724.n5.nabble.com/kvm-vlan-virtio-problem-tp5757713p5757788.html, in /etc/sysctl.conf add

net.inet.tcp.tso=0

Hack config in /etc/rc.conf changing the interface name and disabling tso

ifconfig_vtnet0="147.28.0.8/24 -tso"
ifconfig_vtnet0_ipv6="inet6 2001:418:1::8/64"

And hack /etc/fstab to

# Device         Mountpoint      FStype  Options  Dump    Pass#   
/dev/vtbd0s1a    /               ufs     rw       1       1
/dev/vtbd0s1b    none            swap    sw       0       0
/dev/vtbd0s1d    /root           ufs     rw       2       2
/dev/vtbd0s1e    /var            ufs     rw       2       2
/dev/vtbd0s1f    /var/spool      ufs     rw       2       2
/dev/vtbd0s1g    /usr            ufs     rw       2       2

Then the VM admin has to

gnt-instance shutdown <guestname>
gnt-instance modify -H nic_type=paravirtual,disk_type=paravirtual <guestname>
gnt-instance start <guestname>

To revert, the VM Admin can

gnt-instance shutdown <guestname>
gnt-instance modify -H nic_type=e1000,disk_type=scsi <guestname>
gnt-instance start <guestname>

It would also be helpful to enable the 9600 baud serial console so that admins can see your VM boot.

Optionally Convert plain to drbd

For each instance

$ gnt-instance stop <instance name>
$ gnt-instance modify \
    -t drbd \
    --no-wait-for-sync \
    -n <name of node for replica> \
    <instance name>
$ gnt-instance start <instance name>

To watch the paint drying,

cat /proc/drbd
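
While a resync is running, /proc/drbd carries a progress line per device. The sketch below pulls out just the percentages. It assumes the DRBD 8.x style of output, where those lines contain text like "sync'ed:  0.4%"; the function name drbd_progress is hypothetical, and it reads stdin so it can be tried on saved output.

```shell
# drbd_progress: read /proc/drbd-style text on stdin and print the
# "sync'ed" percentage for each device that is still resyncing.
# Prints nothing when no device is resyncing.
drbd_progress() {
    grep -o "sync'ed: *[0-9.]*%" | awk '{ print $NF }'
}
```

For example, drbd_progress < /proc/drbd.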

Node           DTotal DFree MTotal MNode MFree Pinst Sinst
vm0.sea.rg.net   5.4T  3.2T  31.5G 24.3G  8.2G    14     0
vm1.sea.rg.net   5.9T  5.3T  70.9G 26.1G 64.9G     6     0

Instance                     Primary_node   ConfigMaxMem DiskUsage
adrilankha.hactrn.net        vm0.sea.rg.net         4.0G    260.0G
archive.psg.com              vm0.sea.rg.net         1.0G    100.0G
build-u.rpki.net             vm1.sea.rg.net         1.0G    100.0G
ca0.rpki.net                 vm1.sea.rg.net         2.0G    100.0G
cache0.sea.rpki.net          vm1.sea.rg.net         2.0G    100.0G
chezrandy.x0.dk              vm0.sea.rg.net         768M    100.0G
hans.rg.net                  vm0.sea.rg.net         3.0G    250.0G
hiroshima.bogus.com          vm0.sea.rg.net         4.0G    256.0G
linear.algebras.org          vm0.sea.rg.net         1.0G    100.0G
nagasaki.bogus.com           vm0.sea.rg.net         4.0G    258.0G
nic0.net.lb                  vm1.sea.rg.net         2.0G    100.0G
nlring.sea.rg.net            vm1.sea.rg.net         2.0G     32.0G
okui.psg.com                 vm0.sea.rg.net         1.0G    100.0G
proto0.sea.rpki.net          vm1.sea.rg.net         2.0G    100.0G
r1.securerouting.org         vm0.sea.rg.net         2.0G    100.0G
rip1.psg.com                 vm0.sea.rg.net         2.0G     36.0G
turing.worldpowersystems.com vm0.sea.rg.net         2.0G    256.0G
xmpp.rg.net                  vm0.sea.rg.net         2.0G    100.0G
zoe.dns.gh                   vm0.sea.rg.net         1.0G    200.0G
zzyzx.sigpipe.org            vm0.sea.rg.net         2.0G    100.0G
Last modified on Mar 20, 2015, 9:49:41 PM