KVM virtualization with Arch Linux as host system (qemu/virtio/hugepages/systemd)

Virtualizing machines using KVM and qemu is pretty darn easy.

This is a quick guide on how to set up KVM virtualization on a host system running Arch Linux, NOT using libvirt. Of course this applies to other distributions as well, but the commands covered will be Arch and/or systemd specific. To improve the performance of the guests, I will use some parameters from IBM's “Best practices for KVM” documents, such as VirtIO paravirtualized drivers and huge-page-backed memory.

First of all we need a recent kernel (>=2.6.20), a CPU which supports virtualization extensions like Intel VT or AMD-V, and enough free resources for the guest(s).
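A quick way to check the CPU requirement is to look for the vmx (Intel VT) or svm (AMD-V) flag in /proc/cpuinfo. The following is only a minimal sketch; the function name count_virt_cpus is made up for this example, and the optional file argument exists just so it can be tried on a copy of cpuinfo.

```shell
#!/bin/sh
# count_virt_cpus [CPUINFO]
# Prints the number of logical CPUs whose flags line contains
# vmx (Intel VT) or svm (AMD-V). Reads /proc/cpuinfo by default.
count_virt_cpus() {
    grep -Ec '^flags.*[[:space:]](vmx|svm)([[:space:]]|$)' "${1:-/proc/cpuinfo}"
}
```

On the host, run count_virt_cpus without arguments: a result of 0 means the kernel does not report the extensions. The running kernel version is shown by uname -r.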

I will only cover IPv4 addressing (a post on IPv6 will probably follow) and I assume you have one usable IP address for each guest (each with a separate MAC address – ask your provider or use 00:16:3E:XX:XX:XX) and one for the host.
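If your provider does not hand out MAC addresses, the XX:XX:XX part of the 00:16:3E prefix can be filled with random bytes. This is a small sketch of one way to do that; gen_guest_mac is a hypothetical helper name, and it assumes /dev/urandom is available.

```shell
#!/bin/sh
# gen_guest_mac
# Prints a MAC address with the fixed prefix 00:16:3E followed by
# three random bytes read from /dev/urandom.
gen_guest_mac() {
    od -An -N3 -tx1 /dev/urandom |
        awk '{ printf "00:16:3E:%s:%s:%s\n", toupper($1), toupper($2), toupper($3) }'
}

gen_guest_mac
```

Each guest needs its own address, so generate one per guest and make sure it is not already in use on your network.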

Install qemu & socat

# pacman -S qemu socat

qemu emulates our guest machines and makes use of KVM when the parameter ‘-enable-kvm’ is used. socat is needed by the qemu systemd service file.

Network setup

/etc/netctl/br0 Set up a network bridge, br0, with the host IP:

Description="Bridge for the guest machines"
Interface=br0
Connection=bridge
BindsToInterfaces='eth0'
IP=static
Address='ip_addr/net_mask'
Gateway='gw_ip'
DNS=('dns_ip_1' 'dns_ip2')
## Ignore (R)STP and immediately activate the bridge
SkipForwardingDelay=yes

Assuming you use the same IP for the host and the bridge, you can remove /etc/netctl/eth0, as netctl will bring up eth0 anyway and the bridge binds to it. Activate the newly configured bridge with netctl or reboot.

/etc/qemu/bridge.conf Allow qemu to access the bridge br0:

allow br0

Tip: To use the old udev device naming scheme (ethX instead of enp3sX) you need to do the following:

For systemd >=209:  # ln -s /dev/null /etc/udev/rules.d/80-net-setup-link.rules

For systemd <=208: # ln -s /dev/null /etc/udev/rules.d/80-net-name-slot.rules

Note that with multiple NICs the device naming can change when you reboot, so you probably only want to do this when you have only one NIC installed.

Setting up huge pages

First you need to know how much memory you will assign to all guests in total.

For example, you have 32GB RAM and want to run 3 guests with 10GB RAM each, leaving 2GB for the host system (don’t forget that the host needs RAM too ;). To play it safe, allocate a little more than 30GB as huge pages – say 30.2GB. A detailed description can be found here.

The standard huge page size is 2048kB, i.e. 2MB (some systems also support 1GB huge pages, but I don’t want to go into deep details here). So for 30.2GB of huge pages we calculate:

30.2GB * 1024 / 2 = 15462.4

Rounded up, we need 15463 huge pages.
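The same arithmetic can be scripted so you do not have to redo it by hand for other memory sizes. This is a sketch only; hugepages_needed is a made-up helper name, and the page size defaults to the 2048kB mentioned above (on a real system the actual value is on the Hugepagesize line of /proc/meminfo).

```shell
#!/bin/sh
# hugepages_needed GB [PAGE_KB]
# Prints how many huge pages are needed to back GB gigabytes of guest RAM,
# rounded up to a whole page. PAGE_KB defaults to 2048 (2MB pages).
hugepages_needed() {
    awk -v gb="$1" -v page_kb="${2:-2048}" 'BEGIN {
        pages = gb * 1024 * 1024 / page_kb   # GB -> kB, then kB -> pages
        n = int(pages)
        if (pages > n) n++                   # round up
        print n
    }'
}

hugepages_needed 30.2    # prints 15463, matching the calculation above
```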

Now we allocate the amount of needed huge pages:

# sysctl vm.nr_hugepages=15463

And make it permanent:

# echo "vm.nr_hugepages=15463" >> /etc/sysctl.d/10-kvm.conf

Verify that the correct amount of huge pages has been allocated:

# cat /proc/meminfo | grep HugePages_Total

If it shows less than the amount we tried to allocate, the kernel could not find enough free contiguous memory (it is already in use by other processes or fragmented), and you need to reboot your system.
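The comparison between the requested and the reserved number of pages can also be automated, for example before starting guests from a script. A minimal sketch, with hypothetical function names hugepages_total and check_hugepages; the optional file argument is only there so the check can be run against a saved copy of meminfo.

```shell
#!/bin/sh
# hugepages_total [MEMINFO]
# Prints the HugePages_Total value, reading /proc/meminfo by default.
hugepages_total() {
    awk '/^HugePages_Total:/ { print $2 }' "${1:-/proc/meminfo}"
}

# check_hugepages WANTED [MEMINFO]
# Succeeds if at least WANTED huge pages are reserved, fails otherwise.
check_hugepages() {
    wanted=$1
    have=$(hugepages_total "$2")
    if [ "${have:-0}" -ge "$wanted" ]; then
        echo "ok: $have huge pages reserved"
    else
        echo "short: only ${have:-0} of $wanted huge pages reserved" >&2
        return 1
    fi
}
```

With the numbers from this guide you would call check_hugepages 15463 on the host.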

Later, qemu needs to know where to find the hugetlbfs via the ‘-mem-path’ parameter. If qemu can’t allocate RAM, you probably reserved too few huge pages.

Arch Linux automatically mounts the hugetlbfs to /dev/hugepages – check with ‘mount’ if it is mounted.

Add a user for qemu

qemu should not run as root, so we need to create a user (and a primary group for the user) so that it can drop its privileges:

# groupadd kvm

# useradd -g kvm -s /usr/bin/nologin kvm

Preparing the disk images for the guest(s)

Among image files, RAW images provide the best performance; qcow2 is a little slower but required if you want to use snapshotting and/or overlays. Block devices are faster still, but for the sake of simplicity I won’t cover them.

Example to create a 10GB image in the RAW format:

# qemu-img create -f raw /vms/machine1.img 10G

qemu-img automatically creates the images as sparse files if your filesystem supports them.

I will not use native asynchronous I/O, as I came across this page (which is dead old, but better safe than sorry) – I/O is fast enough for me.

Start a guest, mount an ISO and install the OS via VNC

Let’s install a FreeBSD 10 guest:

# wget ftp://ftp.freebsd.org/pub/FreeBSD/releases/amd64/amd64/ISO-IMAGES/10.0/FreeBSD-10.0-RELEASE-amd64-bootonly.iso

# qemu-system-x86_64 -runas kvm -display none -vnc :1,password -monitor stdio -k de -m 10G -smp 4 -enable-kvm -balloon virtio -net nic,model=virtio,macaddr=<MAC> -net bridge,br=br0 -drive file=/vms/machine1.img,if=virtio -mem-path /dev/hugepages -cpu host -cdrom FreeBSD-10.0-RELEASE-amd64-bootonly.iso -boot menu=on

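This will start an x86_64 guest as the user ‘kvm’ (-runas kvm) with:

  • no local display output (-display none)
  • password-protected VNC enabled (-vnc :1,password; we set the password in the next step)
  • a console interface to qemu (-monitor stdio)
  • German keymap (-k de)
  • 10GB RAM (-m 10G)
  • 4 CPU threads (-smp 4)
  • KVM enabled (-enable-kvm)
  • VirtIO memory ballooning enabled (-balloon virtio)
  • a VirtIO network card (-net nic,model=virtio,macaddr=<MAC>; change <MAC> to the MAC address for the guest)
  • bridged networking (-net bridge,br=br0)
  • our image file with the VirtIO block device driver (-drive file=/vms/machine1.img,if=virtio)
  • the memory path where our hugepages are mounted (-mem-path /dev/hugepages)
  • the same CPU model as the host system with all CPU flags passed through (-cpu host)
  • the FreeBSD 10 ISO in the virtual CDROM drive (-cdrom FreeBSD-10.0-RELEASE-amd64-bootonly.iso)
  • and finally a boot menu (-boot menu=on)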

Now we have to set the VNC password:

(qemu) change vnc password

Use a VNC client to connect to the IP of the host system (VNC display :1, i.e. TCP port 5901), enter the password you have set and just install as you are used to.
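If you run several guests, each one needs its own VNC display number, and some clients or firewall rules want the TCP port instead. VNC display N listens on port 5900 + N; the tiny helper below (vnc_port is a made-up name) just spells that out.

```shell
#!/bin/sh
# vnc_port DISPLAY
# Prints the TCP port that belongs to a VNC display number (5900 + DISPLAY).
vnc_port() {
    echo $((5900 + $1))
}

vnc_port 1    # prints 5901, the port for '-vnc :1' used above
```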

FreeBSD 10+ supports all VirtIO devices out of the box and if your machine and network speed are good, the guest is installed in less than 5 minutes.

Debian 7.4.0 also supports everything out of the box.

Not all Windows versions support VirtIO. Always check the KVM guest support status page or the manufacturer’s/distribution’s documentation first.

If you install Arch Linux in your guest, make sure to add the following modules to /etc/mkinitcpio.conf:

MODULES="virtio_blk virtio_pci virtio_net"

before you run:

# mkinitcpio -p linux

Edit: This is not needed anymore – only virtio_net has to be included if you use network in early userspace.

When you have finished the installation, reboot and make sure everything works, then shut down the guest (because we want systemd to start and stop our VMs).

Start and stop guests via systemd

Install the qemu service file provided by toerb to /etc/systemd/system/qemu@.service

Adjust the file to suit your needs, i.e. change ‘qemu-system-x86_64’ to whatever you need.

Now grab the arguments you just started qemu with and put them in /etc/conf.d/qemu/<name_of_the_guest_machine>, leaving out the VNC, monitor, CDROM and boot menu stuff and prepending ‘args=’.

For our example FreeBSD 10 guest /etc/conf.d/qemu/machine1 will read:

args=-runas kvm -display none -k de -m 10G -smp 4 -enable-kvm -balloon virtio -net nic,model=virtio,macaddr=<MAC> -net bridge,br=br0 -drive file=/vms/machine1.img,if=virtio -mem-path /dev/hugepages -cpu host

From now on the guest can be started with:

systemctl start qemu@machine1.service

Stopping the guest with systemctl stop will actually shut down the guest machine. This is why we have installed socat at the beginning. The magic is done in the qemu service file.
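To give an idea of how such a clean shutdown can work, here is a sketch of sending a command to the qemu monitor with socat. It is not taken from the service file itself. It assumes qemu was started with an extra ‘-monitor unix:<socket>,server,nowait’ argument; the socket path and the function name qemu_powerdown are hypothetical.

```shell
#!/bin/sh
# qemu_powerdown SOCKET
# Sends "system_powerdown" (the ACPI power button) to the qemu monitor
# listening on the UNIX socket SOCKET, so the guest OS can shut down cleanly.
# Fails without calling socat if SOCKET is not an existing socket.
qemu_powerdown() {
    sock=$1
    if [ ! -S "$sock" ]; then
        echo "no monitor socket at $sock" >&2
        return 1
    fi
    echo system_powerdown | socat - "UNIX-CONNECT:$sock"
}

# Example call (hypothetical socket path):
#   qemu_powerdown /run/qemu/machine1.monitor
```

The guest only powers off if its OS reacts to the ACPI power button, so test this once per guest before relying on it.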

To start and stop the guest automatically when the host system boots or goes down, just enable the service:

systemctl enable qemu@machine1.service

Rinse and repeat for any additional guests.

General tips and recommendations

  • Leave at least one CPU core/thread to the host
  • Leave a reasonable amount of RAM for the host system
  • If you get problems in your guests (errors in dmesg, whatever), try ‘-cpu kvm64’ or ‘-cpu kvm32’, depending on your CPU bits, as passing ‘-cpu host’ does not have to but can lead to problems. Also check ‘-cpu help’, maybe you can find a good alternative between host and kvm*
  • Overcommitting resources can lead to performance improvements but is complicated, and if not done right it does more harm than good

Happy virtualization!