
HowTo: Xen, for the Everyday Microkernel

~~   Foreword   ~~

Most people think of Xen as only being applicable to large organizations like Amazon’s AWS, RackSpace and other clouds, and to various clustering applications.   Why is Xen such a good model for virtualization, clustering and security?   Because it’s the closest thing we have, for now, to a production microkernel architecture.

~~   The Microkernel Model   ~~

The microkernel operating system model is one which rethinks the very core of the way operating systems work.   With a microkernel, very few functions are actually handled by the core kernel in privileged mode, and the kernel itself is simple, compact, and fast.   The minimal functions handled by the microkernel are low-level address space management, thread management, and inter-process communication.   All other OS functions, including device drivers, protocol stacks, file systems, etc., are handled in user space.   If there is a buffer overflow or other vulnerability in a driver of a microkernel system, the best a cracker could do is get to the non-privileged user that driver is running as, inside the virtual machine it’s running in.

Compare this with a monolithic kernel like Windows or Linux, where all of these OS-related functions are handled by the kernel in privileged mode.   With a monolithic kernel, any flaw in any one of the drivers, exception handlers, etc. means the entire system can potentially be compromised to its core.

In the late 1980s Prof. Andrew Tanenbaum (lit. “Prof. Christmas Tree” in German) of the Vrije Universiteit created Minix, the most advanced microkernel OS of the time, and argued that Linus Torvalds should give Linux a microkernel architecture.   However Torvalds decided on a monolithic kernel.   There followed a major debate over the merits of microkernel vs. monolithic, but in the end Torvalds prevailed, not, I think, on technical merit, but through force of will and the limitations of hardware at the time. :/

Today there are some examples of microkernel OSes around, including GNU Hurd, Minix, and seL4, although the only one with any real traction these days is seL4, and it’s mainly targeted at embedded devices.   Taking the idea of microkernels to an extreme is the ‘exokernel’, although this puts too much demand on developer resources and so may only be viable for supercomputing.

~~   Xen’s Role   ~~

  • Small footprint and interface (~1MB in size) – Because Xen uses a microkernel design, with a small memory footprint and limited interface to the guest, it is more robust and secure than other hypervisors, like KVM.
  • OS-agnostic – Most installations run Linux as the main control stack (domain 0), although a number of other operating systems can be used, including NetBSD and OpenSolaris.
  • Driver Isolation – Xen allows the main device driver for a system to run inside a virtual machine.   If the driver crashes or is compromised, the VM containing the driver can be rebooted and the driver restarted without affecting the rest of the system.
  • Paravirtualization – Fully paravirtualized guests are optimized to run as a virtual machine.   This allows the guests to run much faster than with hardware extensions (HVM).   Additionally, Xen in this mode can run on hardware that doesn’t support virtualization extensions.
  • So we will first install standard Debian onto a standard system, although with limited drive space and resources.   This will be our “dom0” (‘domain 0’), which in x86 parlance executes in ‘ring 0’, meaning privileged mode.   dom0 will then control our virtual machines, which all execute outside ring 0 (‘user mode’) and are called “domU”, or domain-User.

    domU’s can be one of two flavors:

  • ParaVirtualized Machine (PVM) – is an efficient and lightweight virtualization which does not require virtualization extensions from the host CPU.   However, PV guests and control domains require kernel support and drivers, usually in Linux.   It is not possible to run Windows in a PVM.
  • Hardware Virtualized Machine (HVM) – CPUs that support virtualization enable the use of unmodified guests, including proprietary operating systems (such as MS Windows).   Both Intel and AMD have developed Xen modifications supporting their respective Intel VT-x and AMD-V architecture extensions.   HVM emulates devices using QEMU to provide I/O virtualization for virtual machines.   This approach emulates hardware with a patched QEMU “device manager” (qemu-dm) daemon running in dom0.   This means that the virtual machines see an emulated version of a fairly basic PC.   Needless to say, QEMU isn’t the swiftest in the world, nor is any emulator.
  • So a PVM virtual machine is fast, and HVM is compatible.   But there is actually a third mode:

  • PV-on-HVM – Where performance is critical, paravirtualized disk and network drivers are used during normal guest operation, so the emulated PC ‘hardware’ is mainly used for booting.   This allows HVM guests, with some minor modifications, to get the benefits of paravirtualised I/O where it matters: in disk and network access.
  • In summary, for virtual machines: for best performance, PVM.   If you must run Windows, or need direct access to certain hardware on the machine such as a video decoder, then HVM, and to improve HVM’s performance add PV extensions to speed up disk and network access.

    Another wonderful feature of Xen is that it can be configured such that dom0 and domUs will automatically fail over to other machines.   So if you have 1,000 blade servers, your Xen system can move around dynamically according to conditions.   And further, each domU can dynamically (and cooperatively) expand its memory and number of CPUs to as many as you have in your cluster!   This is the meaning of an “Elastic Compute Cloud”.   And finally you can add clustering extensions to your LVM filesystem, so each of the machines in your cluster can access one dataset.
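    The mechanism behind that movement is live migration, driven from dom0 with xl.   A minimal sketch, assuming a second dom0 (written here as the placeholder {otherHost}) that is reachable over SSH, configured to accept incoming migrations, and sharing storage for the guest’s disks:

    # xl migrate {domUname} {otherHost}

    Only the running memory state moves;   the guest’s disks must already be visible on both hosts.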

    ~~   Base Installation   ~~

    All the above, nobody will actually tell you.   Maybe because it’s a secret…   (I’ve asked, and gotten nothing, over and over)   Or maybe because they only vaguely understand it or have it at an instinctual level.   I mean, some of the raw information is out there, but no context or perspective;   plus Debian’s implementation is busted, so this article is for the 90% of you who couldn’t make Xen work on Debian.   Now for the good stuff.   This HowTo is Debian-centric.

    I always use the Network Install CD, so as to get the latest of everything.   And I always install 64bit Debian Testing.

  • Go to this out-of-the-way page and under “netinst (generally 150-280 MB) CD images” get the amd64 image.   Always check the image’s SHA or PGP signature, although unfortunately in this case they don’t make those available, lol.
  • Burn the image to a CD (with verify), pop it in your target machine, and boot.
  • I always choose Advanced Options|Graphical Expert Mode as you have more control, and it’s not hard.
  • When you get to disk partitioning, choose Manual, and if it asks what type of partition, msdos.   On your target disk set up four partitions:
    /boot 200 MB ext3 Primary
    / 30 GB jfs Primary
    swap 2 GB swap Primary
    /media/guests {the rest of the space} jfs Primary

    (I still don’t trust ext4 after the data-loss problems they had)
    If you’ll be using a ZFS or other array for your guests, of course don’t set up /media/guests here.

  • In the disk partitioner select Encrypt Partitions, and check partitions 2, 3, & 4.   Set partition 3 to Random Key instead of password and Finish encryption.
  • Now in the disk partitioner, for the encrypted partitions change partition 2 from ext4 to JFS and set it to /.   JFS is just better.
  • In the disk partitioner select Setup Logical Volume Manager (LVM), and choose partition sd?4_crypt.   Create a new volume group there called xen-vg.   It would be nice to also set up our HVM volumes here, but unfortunately Debian’s installer is busted in this way and wouldn’t let you complete the install.   Finish LVM.
  • Finish partitioning.
  • If you don’t like Gnome (and I don’t) when you get to the Package Selection stage in the install, select KDE, XFCE, or whatever.
  • When the installer asks whether you want a generic initrd or one specific to your system, choose generic, or else it won’t boot due to LUKS.
  • Complete the installation, and reboot into your new OS.
  • Now, the boot partition is sd?1 /boot, the root partition is sd?2 /, the swap partition is sd?3 swap, and the Xen LVM volume group is sd?4.   swap never needs a password, as it uses a different random key on every boot.   We have almost all the ingredients for our dom0.
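    Before moving on, a quick sanity check of that stack never hurts;   lsblk can show the whole layering at a glance:

    # lsblk -f

    You should see /boot as plain ext3, LUKS containers on partitions 2, 3 and 4, the JFS root and swap inside the first two containers, and the xen-vg volume group stacked on the fourth.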

    ~~   Configuring dom0   ~~

    The root partition (sd?2) and LVM partition (sd?4) will need a password on every boot, so let’s save a little trouble and set sd?4 to use a binary file as a key.   After this, on boot when you enter the password for /, it’s unlocked and makes available the binary file for sd?4 so that can then be unlocked automatically.   Decide on a binary file which won’t change with updates, maybe one of your own in /usr/local/bin (if you comply with Posix, which you should), or a photo or something that won’t ever move its location and that you won’t forget is your key.

    # cryptsetup luksAddKey /dev/sd?4 /usr/local/bin/{yourbinary}
    The partition will still unlock with your password, but now it will also open with the binary file as a key.

    Now edit /etc/crypttab and change:

    sd?4_crypt UUID={UUID} none luks

    to:

    sd?4_crypt UUID={UUID} /usr/local/bin/{yourbinary} luks

    So on boot, sd?4 will unlock automatically as soon as sd?2 is unlocked.   Leave your UUIDs as they are, if you know what’s good for you.
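    To be sure the new key slot really works before you rely on it at boot, here is a non-destructive test (both are standard cryptsetup invocations):

    # cryptsetup luksDump /dev/sd?4
    # cryptsetup luksOpen --test-passphrase --key-file /usr/local/bin/{yourbinary} /dev/sd?4

    luksDump should now show an additional key slot in use, and the second command returns silently if the binary file unlocks the partition.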

    Next we’re going to change Debian over from SysV init to systemd, since that’s the coming thing.

    # apt-get update
    # apt-get install systemd libpam-systemd systemd-sysv systemd-gui linux-headers-amd64 command-not-found

    Edit /etc/default/grub and set:
    GRUB_CMDLINE_LINUX_DEFAULT="ipv6.disable=1 init=/lib/systemd/systemd"
    # update-grub

    Now we Xenify:
    # apt-get install xen-linux-system-amd64 xen-tools xen-utils-4.4 bridge-utils libvirt-bin shorewall shorewall-init rkhunter firmware-linux-free liblog-message-perl synaptic markdown

    Edit /etc/default/grub and set:

    GRUB_DEFAULT=2
    GRUB_TIMEOUT=3

    This ensures that the Xen kernel is booted automatically, and that you don’t have to wait longer than necessary.

    # update-grub

    Reboot, and watch when Grub comes up to make sure that the Xen kernel is selected for boot.   It booted Ok, didn’t it?

    Congratulations, you’re now on microkernel!
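    A quick way to confirm that systemd is now PID 1, and that you really are running on top of the hypervisor:

    # ps -p 1 -o comm=
    # xl info | grep xen_version

    The first should print systemd;   the second should report the running Xen version (xl only works under the hypervisor).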

    Next we need to know whether your CPU supports Virtualization, IOW if you can do HVM:
    # egrep '(vmx|svm)' /proc/cpuinfo
    flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms
    Gives vmx for Intel, and svm for AMD.   You may get up to 8 of these stanzas, one for each logical processor (usually 2 per core).   If you don’t get vmx or svm, it may simply be turned off in BIOS.   Look in the Processor submenu;   this setting may be hidden in the Chipset, Advanced CPU Configuration or Northbridge sections.   In my circa 2016 Asus mobo the setting is in BIOS at AdvancedMode|Advanced|CPUConfig|IntelVirt.   If you don’t need HVM, don’t worry about it.   If you will be running HVM, also find IOMMU (aka VT-d) in the settings and turn it on, for PCI passthrough.   In my circa 2016 Asus mobo the setting is in BIOS at AdvancedMode|Advanced|SystemAgent|VT-d.   VT-d is only in more advanced processors like the Intel Xeon series.
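    To simply count the logical CPUs advertising the flag, rather than eyeballing the output:

    # egrep -c '(vmx|svm)' /proc/cpuinfo

    A result of 0 means no virtualization extensions are visible (possibly disabled in BIOS, as above).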

    Networking.   We have to de-fang Network Mangler, er ‘Manager‘, as dom0 is going to serve as a bridge for all its guest VMs, and needs special configuration.   Run Synaptic and deinstall everything NetworkManager, except those few libs which would also remove key parts of the OS or KDE.   Then edit /etc/network/interfaces and make it look thus:

    # The loopback network interface
    auto lo
    iface lo inet loopback
    
    # Set up interfaces manually, avoiding conflicts with, e.g., network manager
    	auto xenbr
    	iface xenbr inet static
    		address 192.168.1.3
    		netmask 255.255.255.0
    		network 192.168.1.0
    		broadcast 192.168.1.255
    		gateway 192.168.1.1
    		bridge_ports eth0
    		# Disable Spanning Tree Protocol
    		bridge_stp off       
    		# No delay before a port becomes available
    		bridge_waitport 0
    		# No forwarding delay
    		bridge_fd 0

    … of course adjusting to your LAN.   We must also set up Shorewall properly.   This will use Shorewall’s new configuration method (Format 2), so in /etc/shorewall you only need the conntrack, interfaces, policy, rules, shorewall.conf, and zones files.   conntrack is not modified.   shorewall.conf doesn’t need to be modified unless you want to disable AdminAbsentMinded or IPV6.   Set the others as such:

    interfaces:

    ###############################################################################
    ?FORMAT 2
    ###############################################################################
    #ZONE	INTERFACE	OPTIONS
    -	lo		ignore
    # (newnotsyn & routefilter are in shorewall.conf)
    net		eth0		blacklist,nosmurfs,tcpflags
    net		xenbr	blacklist,nosmurfs,tcpflags
    net		vif7.0		blacklist,nosmurfs,tcpflags

    policy:

    ###############################################################################
    #SOURCE	DEST	POLICY		LOG	LIMIT:		CONNLIMIT:
    #				LEVEL	BURST		MASK
    $FW	all	REJECT		info(uid)
    net	all	DROP		info(uid)
    #local	all	REJECT		info(uid)
    all	all	REJECT		info(uid)

    rules (of course set IPs appropriate to your LAN):

    ######################################################################################################################################################################################################
    #ACTION		SOURCE		DEST		PROTO	DEST	SOURCE		ORIGINAL	RATE		USER/	MARK	CONNLIMIT	TIME		HEADERS		SWITCH		HELPER
    #							PORT	PORT(S)		DEST		LIMIT		GROUP
    ?SECTION ALL
    ?SECTION ESTABLISHED
    ?SECTION RELATED
    ?SECTION INVALID
    ?SECTION UNTRACKED
    ?SECTION NEW
    
    # Silently drop FIN scans, etc.  Do not put :info here or it will log
    # late-arriving RSTs from connections which have already been closed.
    Invalid(DROP)	net	all	tcp
    Invalid(DROP)	net	all	udp
    
    ACCEPT  $FW             net             tcp     ftp,git,hkp,http,https,svn    -
    # https(443/udp) is for dnscrypt.
    ACCEPT  $FW             net             udp     domain,https,ntp -
    
    # SSH
    ACCEPT  net:192.168.1.0/28    $FW     tcp     ssh -
    ACCEPT  $FW     net    tcp     ssh -
    
    #	Pinging
    #
    ACCEPT  net:192.168.1.1       $FW             icmp    8
    ACCEPT  net:192.168.1.2       $FW             icmp    8
    ACCEPT  net:192.168.1.4       $FW             icmp    8
    ACCEPT          $FW             net             icmp    8
    ACCEPT          $FW             net             icmp    3

    zones:

    ###############################################################################
    #ZONE	TYPE		OPTIONS		IN			OUT
    #					OPTIONS			OPTIONS
    fw	firewall
    net	ipv4
    #local	ipv4
  • Bridges and peth devices are all set to NOARP, while the eth and vif devices are all set to ARP ON.   None of the nics, vifs, or peths have IPs.
  • vifX.Y is the default naming convention for domU interfaces where X is the domU and Y is the eth# in the domU.
  • pethX is the physical Ethernet interface as renamed by the Xen network script.
  • And last we must set our toolstack to xl:
    Edit /etc/default/xen   and set TOOLSTACK=xl .
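    Before that reboot it’s worth letting Shorewall validate what you just wrote;   shorewall check compiles the configuration without loading it:

    # shorewall check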

    So reboot dom0 and let’s check it.

    # systemctl status shorewall

    Active: active (exited) since Sat 2016-10-04 22:59:16 PDT; 1m ago

    # ifconfig
    xenbr Link encap:Ethernet HWaddr 74:d0:5b:e2:f1:2a
    inet addr:192.168.1.3 Bcast:192.168.1.255 Mask:255.255.255.0
    inet6 addr: fe80::75d0:13ff:fe79:be4a/64 Scope:Link
    UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
    RX packets:445 errors:0 dropped:0 overruns:0 frame:0
    TX packets:14 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0 txqueuelen:0
    RX bytes:26737 (26.1 KiB) TX bytes:900 (900.0 B)

    eth0 Link encap:Ethernet HWaddr 74:d0:5b:e2:f1:2a
    UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
    RX packets:452 errors:0 dropped:7 overruns:0 frame:0
    TX packets:14 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0 txqueuelen:1000
    RX bytes:33779 (32.9 KiB) TX bytes:900 (900.0 B)

    lo Link encap:Local Loopback
    inet addr:127.0.0.1 Mask:255.0.0.0
    inet6 addr: ::1/128 Scope:Host
    UP LOOPBACK RUNNING MTU:65536 Metric:1
    RX packets:40 errors:0 dropped:0 overruns:0 frame:0
    TX packets:40 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0 txqueuelen:0
    RX bytes:2000 (1.9 KiB) TX bytes:2000 (1.9 KiB)

    #
    # brctl show

    bridge name bridge id STP enabled interfaces
    xenbr 8000.78d32b76fe2a no eth0

    # ping google.com
    PING google.com (74.125.198.102) 56(84) bytes of data.
    64 bytes from og-in-f102.1e100.net (74.125.198.102): icmp_seq=1 ttl=42 time=147 ms

    Perfect.   So now dom0 can serve as a bridge for all VMs, and get internet access itself.   Is Xen running?
    # xl list
    Name ID Mem VCPUs State Time(s)
    Domain-0 0 7820 8 r----- 143.9

    Yep.

    ~~   Configuring PVMs   ~~

    We will not be locking down CPU cores for dom0 or domU’s (freezing allocation), so they will be automatically allocated. (“ballooned”)   Likewise we will not be dedicating a fixed amount of memory for dom0, but make your own call.

    Each paravirtualized subsystem in the hypervisor consists of 2 parts:
    1.   the “backend” which lives in dom0, and
    2.   the “frontend” driver, within the guest domain.

    The backend is a daemon which uses special ring-buffer-based interfaces to transfer data to guests, be it to provide a virtual hard-disk, ethernet adapter, or even a generic SCSI device.   The frontend driver then takes this stream of data and converts it back into a device, to present to the guest operating system.
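    Once you have a guest running (later in this HowTo), you can see those backend/frontend pairings from dom0 with the standard xl listing subcommands:

    # xl block-list {domUname}
    # xl network-list {domUname}

    Each row shows the backend domain (normally 0) for the corresponding frontend device in the guest.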

    /etc/xen-tools/xen-tools.conf contains the settings which determine how PVMs will be set up, so we must configure this file.   Change these especially:

    	lvm = xen-vg
    	install-method = debootstrap
    	size   = 50G       # Root disk, suffix (G, M, k) required
    	memory = 1G     # Suffix (G, M, k) required
    	swap   = 1G     # Suffix (G, M, k) required
    	fs     = ext3     # Default file system for any disk
    	dist   = `xt-guess-suite-and-mirror --suite`
                      # Default distribution is determined by Dom0's distribution
    	image  = sparse   # Specify sparse vs. full disk images (file based images only)
    	gateway    = 192.168.1.1
    	netmask    = 255.255.255.0
    	broadcast  = 192.168.1.255
    	bridge    = xenbr
    	passwd    = 1
    	arch      = amd64    # Must change arch = [x86|amd64]  to arch = amd64

    Find and fix all of these settings, and review the rest to make sure they meet your needs.   Remember, this will determine how every one of your PVMs will be set up.   And with respect to memory=1G, this is the minimum;   remember, it can balloon, cooperatively.

    In /etc/xen/xend-config.sxp comment out everything about networking, routing and vif except this line:
    (vif-script vif-bridge)

    If you’d like KDE for your PVMs instead of Gnome, then copy /etc/xen-tools/role.d/gdm to kde and change it like so:

    #!/bin/sh
    #
    #  Configure the new image to be a KDE X2Go server.
    #
    # Carl
    # --
    # https://carl.unofficial-tesla-tech.com/
    
    prefix=$1
    
    #  Source our common functions - this will let us install a Debian package.
    #
    if [ -e /usr/share/xen-tools/common.sh ]; then
        . /usr/share/xen-tools/common.sh
    else
        echo "Installation problem"
    fi
    
    cat << EOF >> ${prefix}/etc/apt/sources.list
    #*************************************************************************
    # To add package key, apt-key adv --keyserver keyserver.ubuntu.com --recv-keys {key#}
    
    # X2Go Repository
    deb https://packages.x2go.org/debian jessie heuler main
    # X2Go Repository (sources)
    deb-src https://packages.x2go.org/debian jessie main
    
    #*************************************************************************
    EOF
    
    chroot ${prefix} /usr/bin/apt-key adv --keyserver keyserver.ubuntu.com --recv-keys E1F958385BFE2B6E
    
    chroot ${prefix} /usr/bin/apt-get update
    
    #
    #  Install the packages
    #
    installDebianPackage ${prefix} xserver-xorg
    installDebianPackage ${prefix} xfonts-100dpi
    installDebianPackage ${prefix} xfonts-75dpi
    installDebianPackage ${prefix} xfonts-base
    installDebianPackage ${prefix} x2goserver
    installDebianPackage ${prefix} rxvt
    installDebianPackage ${prefix} kde-plasma-desktop
    
    #
    #  Fix busted Debian network setup
    /bin/cat << EOF >> ${prefix}/etc/network/interfaces
    gateway 192.168.111.5
    EOF
    

    And edit /usr/bin/xen-create-image and add, at line 714 (after the GDM entry):

    =item kde
    Install an X11 server, using KDE and X2Go.

    If you don’t do these things, you’ll be sorry…
    I’ve given up trying to get the hook files which cause these issues fixed upstream.
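    It also doesn’t hurt to make sure the new role script is executable (assuming you saved it as /etc/xen-tools/role.d/kde):

    # chmod +x /etc/xen-tools/role.d/kde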

    Now reboot dom0, for good measure.

    ~~   Create a PVM   ~~

    We are finally ready to create an actual PVM virtual machine.   We will name it, set its IP, and make other necessary settings, then the script will automatically create two logical volumes:   /dev/xen-vg/{domUname}-disk and /dev/xen-vg/{domUname}-swap.   The script will then download the PV operating system matching dom0, and it will install KDE, X2Go, etc, then will set up the PVM to boot and run.

    First, in dom0 make darned sure you have internet:
    # ping google.com
    PING google.com (74.125.198.102) 56(84) bytes of data.
    64 bytes from og-in-f102.1e100.net (74.125.198.102): icmp_seq=1 ttl=42 time=147 ms
    64 bytes from og-in-f102.1e100.net (74.125.198.102): icmp_seq=2 ttl=42 time=89.4 ms
    64 bytes from og-in-f102.1e100.net (74.125.198.102): icmp_seq=3 ttl=42 time=91.7 ms

    # man xen-create-image
    # xen-create-image --hostname {domUname} --ip {IP} --role builder,kde,puppet,tmpfs,udev --pygrub --verbose

    Give it some time, as there’s a lot to download and do;   it could take up to an hour.   You can watch as the script does its kung-fu, and hopefully you have some method of network monitoring in place to watch the downloads.

    Eventually the script will ask for the root password, and then thinks it’s done.   You should now have two LVM volumes:
    # lvdisplay
    --- Logical volume ---
    LV Path /dev/xen-vg/{domUname}-swap
    LV Name {domUname}-swap
    VG Name xen-vg
    LV UUID WWBflu-PsGf-YNd1-nPMH-Yx3N-TiYO-OlvoCz
    LV Write Access read/write
    LV Creation host, time {dom0name}, 2016-10-04 13:41:00 -0700
    LV Status NOT available
    LV Size 1.00 GiB
    Current LE 256
    Segments 1
    Allocation inherit
    Read ahead sectors auto

    --- Logical volume ---
    LV Path /dev/xen-vg/{domUname}-disk
    LV Name {domUname}-disk
    VG Name xen-vg
    LV UUID zfLTIe-3FJb-J3ZO-bJFN-2TjP-3vUh-zUcoxL
    LV Write Access read/write
    LV Creation host, time {dom0name}, 2016-10-04 13:41:00 -0700
    LV Status NOT available
    LV Size 50.00 GiB
    Current LE 12800
    Segments 1
    Allocation inherit
    Read ahead sectors auto

    # xen-list-images
    Name: {domUname}
    Memory: 1024 MB
    IP: {IP}
    Config: /etc/xen/{domUname}.cfg

    # xl list
    Name ID Mem VCPUs State Time(s)
    Domain-0 0 6671 8 r----- 175.5

    The xen-create-image script will leave a full log of the domain creation in /var/log/xen-tools/{domUname}.log.   You should go through this to look for problems, especially the first times.
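    A quick way to scan that log for anything that went sideways (the pattern is only a rough heuristic):

    # grep -iE 'error|fail|warn' /var/log/xen-tools/{domUname}.log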

    Made a mistake?
    # xen-delete-image {domUname}
    If you interrupt xen-create-image, it will automatically clean up after itself.

    But, um, is the PVM running?   No, we have to start it with xl.   You’ll notice that you now have a new file called /etc/xen/{domUname}.cfg.   This was created by the xen-create-image script, and is the configuration file for that particular PVM.   The PVM’s specific OS and user files reside at /dev/xen-vg/{domUname}-disk, and they won’t be erased when you ‘destroy’ (stop dead) a PVM.   To start that PVM:
    # xl create /etc/xen/{domUname}.cfg

    Now what?   Is it actually running?   Do we have a brand new bouncing baby PVM?
    # ping {domUIP}
    … and if it pings it certainly is running.
    # xl list
    … and you should see it in the list.

    Get a console in it with:
    # xl console {domUname}

    Before we go any further, we need to fix something.   Chances are, the domU crashed to Emergency Mode, so log in to it and edit /etc/fstab.   Remove the line that tries to mount /var/run, and then save fstab.   Mounting /var/run is not cool with systemd, which we are running in both dom0 and domU.   Exit out of domU with {ctrl} ] and restart it:
    # xl shutdown {domUname}
    # xl destroy {domUname}

    Normally you only need to shutdown and not destroy, but this first instantiation was broken due to no fault of our own…
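    Incidentally, if you’d rather script that fstab fix from the domU console instead of editing by hand, a one-liner like this removes any line mentioning /var/run (review the file afterwards):

    # sed -i '/\/var\/run/d' /etc/fstab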

    # xl create -c /etc/xen/{domUname}.cfg

    Notice that this time when creating the domU we added -c?   This takes you straight into domU’s console, so you can watch it boot.   If it boots, and clears the screen, then gives a login prompt, you’re golden.   This is a pure text console, so there’s no GUI here.   That’s next.

    ~~   Accessing domU with X2Go   ~~

    Many think that the primary way of displaying and interacting with Xen VMs is through VNC, but the fact is VNC is dog-slow when accessing a remote machine over The Internets.   So we are going to use X2Go, which is already installed and running in your PVM.   Running NoMachine’s NX is also a good option, but I don’t care for it because it is proprietary and closed, plus it requires that you open an extra port (4000+) for access.   X2Go works through the SSH daemon, so only port 22 needs to be open.

    These instructions also apply to accessing HVMs.

    In dom0, edit /etc/apt/sources.list and add:

    #*************************************************************************
    # To add package key, apt-key adv --keyserver keyserver.ubuntu.com --recv-keys {key#}
    
    # X2Go Repository
    deb https://packages.x2go.org/debian jessie heuler main
    # X2Go Repository (sources)
    deb-src https://packages.x2go.org/debian jessie main
    
    #*************************************************************************

    # apt-key adv --keyserver keyserver.ubuntu.com --recv-keys E1F958385BFE2B6E
    # apt-get update
    # apt-get install x2goclient

    Now X2GoClient should be in your dom0 menu under Internet.   So run it and set up a Session at the IP of your domU;   on the first setup page, set it to a KDE session.   Give it a whirl, and if wonders never cease you should see KDE unfolding before you.   Likely you’ll need to right-click the desktop and create a new panel if you don’t see one, but now you’re off to the races.   Make your new PVMs into a TOR Gateway, a LAN firewall, a pen/vuln testing suite, a backups server, or whatever you need.   The PVMs’ ethernet interface is bridged over dom0, so each can have completely different firewall rules, but the flip-side of that is you’d better set up a firewall in each PVM.
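    Since X2Go tunnels everything over SSH, the first thing to check if a session won’t start is plain SSH access to the guest (using whichever user you created in the PVM):

    # ssh {USER}@{domUIP}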

    ~~   Configuring HVMs   ~~

    For HVMs there is no handy create script, nor nebulous setup files.   It’s an entirely manual affair.   But setting up an HVM is actually a bit simpler than a PVM.   We first create the LVM disks, then create the /etc/xen/{domUname}.cfg file, and then start the HVM to install our guest OS into it.

    # lvcreate -n{domUname}-disk -L100G xen-vg
    # lvcreate -n{domUname}-swap -L4G xen-vg
    # lvdisplay

    Messed up?
    # lvremove xen-vg/{domUname}-swap

    Edit a text file at /etc/xen/{domUname}.cfg and put in it something like this:

    name = "{domUname}"
    	builder = 'hvm'
    	loader = 'hvmloader'
    	bootloader = 'pygrub'
    
    # Keep time in domU, from dom0
            localtime=1
    
    # Enable PVHVM
    #	xen_platform_pci=1
    
    # Start the guest with MBYTES megabytes of RAM.
    	memory = 2048
    
    #  Disk device(s) using LVM
    	root = '/dev/xvda1 ro'
    	disk = [
    		'phy:/dev/xen-vg/{domUname}-disk,xvda,w',
    		'phy:/dev/xen-vg/{domUname}-swap,xvdb,w',
    		'phy:/dev/sdb,sdb,w',
    		'file:/home/{USER}/dl/debian-jessie.iso,sdc:cdrom,r'
    		]
    	# hard disk (c), cd-rom (d) or network/PXE (n).
    	boot="dc"
    
    # Peripherals
    	vif = ['mac=00:16:5e:02:a3:53, bridge=xenbr, model=e1000']
    #	usb=1
    # tablet (Adomax -for mouse pointer), R5000, and Canon scanner:
    #	usbdevice = [
    #		'tablet',
    #		'host:04b4:8613',
    #		'host:04a9:1716',
    #		]
    
    #	gfx_passthru=BOOLEAN
    
    # Host PCI devices to passthrough to this guest
    #	pci = [ '06:00.0,permissive=1' ]
    #	pci = [ '00:03.0','00:1b.0','06:00.0' ]
    
    	vga         = 'stdvga'
    
    # The display should not be presented via an X window (using Simple DirectMedia Layer).
    	sdl=0
    
    #  Behavior
    	on_poweroff = 'destroy'
    	on_reboot   = 'restart'
    	on_crash    = 'restart'
  • A key thing to notice in this cfg file is that we are setting up an ISO file (the file: line) as an emulated CD, to serve as the source of the OS that will be installed into the domU.   And that we are booting first to that emulated ‘CD’, and then to the -disk.   Of course the -disk won’t boot yet as nothing’s been installed on it.
  • The commented-out entries are options you can enable as you refine this.
  • So save your {domUname}.cfg file and make sure you have the OS iso image in place, and we are ready to install.
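    Before booting it, you can also double-check from dom0 that the hypervisor really has HVM capability;   the caps line should include hvm entries:

    # xl info | grep caps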

    # xl create -c /etc/xen/{domUname}.cfg

    … and so the domU will boot to the ‘CD’ and should present you with an install screen.

    Want to install graphically?   Add this line into your {domUname}.cfg

    	vfb         = [ 'type=vnc,vncdisplay=00,vncpasswd=s3cr3t,keymap=en-us' ]

    Stop your domU and restart it, and now Xen’s VNC server is running.

    # apt-get install krdc
    … and Start|Internet|KRDC.   Point it at vnc 127.0.0.1:5900 and go.   Now you should have your graphical install screen for the OS, from the booted (emulated) ‘CD’.
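    If KRDC can’t connect, check from dom0 that the VNC server is actually listening;   display 0 corresponds to TCP port 5900:

    # ss -tlnp | grep 5900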

    Proceed thee to install the OS into domU, with alacrity.

    Once your new OS is installed into the HVM, edit the {domUname}.cfg and comment out:

    		'file:/home/{USER}/dl/debian-jessie.iso,sdc:cdrom,r'

    Now when you create that domU it should boot right to its -disk.

    ~~   Configuring PV on HVM   ~~

    You’ll remember that a PVM is effectively an (isolated) extension of the dom0 OS which runs in user mode, so it is very fast.   And that an HVM is a creature of QEMU, and so has the drawbacks of any emulator, but it offers the ability to install any OS into its domU.   We will now make a hybrid of the two with “Para-Virtualisation on HVM” to get many of the benefits of both.

  • First set up for a regular HVM, and install the full guest OS.
  • Without starting the domU, edit its /etc/xen/{domUname}.cfg file.   You should already have the disk and network devices ready for PVHVM with:
    		'phy:/dev/xen-vg/{domUname}-disk,xvda,w',
    		'phy:/dev/xen-vg/{domUname}-swap,xvdb,w',
    ...
    	vif = ['mac={someMAC}, bridge=xenbr, model=e1000']
  • So uncomment this line:
    	xen_platform_pci=1

    … and that’s all there is to it!

    But does it actually work?   To find out:
    # xl create -c /etc/xen/{domUname}.cfg

    In the domU edit /etc/default/grub and append to the kernel command line, loglevel=9 thus:

    GRUB_CMDLINE_LINUX_DEFAULT="ipv6.disable=1 init=/lib/systemd/systemd loglevel=9"

    Save the file and:
    # update-grub
    {ctrl} ]
          (exit the domU)

    # xl shutdown {domUname}
    # xl create -c /etc/xen/{domUname}.cfg

    You’re now back in the PVHVM domU.
    # dmesg

    In the output, look for lines like:

    blkfront: xvda: flush diskcache: enabled

    Xen HVM callback vector for event delivery is enabled
    Ok, PV for disk is working.

    # ethtool -i eth0
    driver: vif
    And PV for networking is working.

    Just to be sure, let’s check that all partitions are paravirt (xvda):
    # cat /proc/partitions
    major minor #blocks name
    202 0 31457280 xvda
    202 1 512000 xvda1
    202 2 4194304 xvda2
    202 3 26749952 xvda3

    Yep, we’re golden.

    Now reverse your grub loglevel setting, or else you’ll get too much info in the logs and dmesg.

    ~~   Configuring PCI Passthrough on an HVM Guest   ~~

    There are often cases where you need direct hardware support of some device in a guest machine, such as a hardware video decoder, sound card, or USB device.   In these cases you would use ‘PCI Passthrough’.   This can only be done with HVM domUs, and when you pass a PCI device through to a given domU it cannot be used by any other machine, including dom0.   You are dedicating that device to the domU.

    Check if IOMMU (aka VT-d) is activated in BIOS (on dom0):
    # xl dmesg |grep I/O
    I/O virtualisation enabled
    If it’s there, you’re good.   If not, go into BIOS and enable it (Northbridge section?) — in my circa 2016 Asus mobo the setting is in AdvancedMode|Advanced|SystemAgent|VT-d.   And if it’s not there upgrade your hardware.

    Load the pciback driver on boot.   Edit /etc/modules and add:

    xen-pciback

    Find out what driver is running the devices you want to pass through:
    # lspci -k

    00:03.0 Audio device: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor HD Audio Controller (rev 06)
    Subsystem: ASUSTeK Computer Inc. Device 8534
    Kernel driver in use: snd_hda_intel

    00:1b.0 Audio device: Intel Corporation 8 Series/C220 Series Chipset High Definition Audio Controller (rev 04)
    Subsystem: ASUSTeK Computer Inc. Device 8575
    Kernel driver in use: snd_hda_intel

    05:00.0 Multimedia video controller: Conexant Systems, Inc. CX23885 PCI Video and Audio Decoder (rev 03)
    Subsystem: Hauppauge computer works Inc. Device 7911
    Kernel driver in use: cx23885

    Now tell pciback to grab the devices’ driver.   Edit /etc/modprobe.d/{USER}-xen-pci-passthrough.conf   … ({USER} so that later it’s easy to find files you’ve made) …   and put in it:

    		# Audio Controller
    		install snd_hda_intel /sbin/modprobe xen-pciback ; /sbin/modprobe --first-time --ignore-install snd_hda_intel
    
    		# Hauppauge
    		install cx23885 /sbin/modprobe xen-pciback ; /sbin/modprobe --first-time --ignore-install cx23885
    
    		options xen-pciback hide=(0000:00:03.0)(0000:00:1b.0)(0000:05:00.0)

    (Of course adapting everything to your use-case)

    # depmod
    # update-initramfs -u -k all
    # update-grub

    Reboot dom0.

    Check to make sure pciback has your devices:
    # lspci -k

    00:03.0 Audio device: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor HD Audio Controller (rev 06)
    Subsystem: ASUSTeK Computer Inc. Device 8534
    Kernel driver in use: pciback

    00:1b.0 Audio device: Intel Corporation 8 Series/C220 Series Chipset High Definition Audio Controller (rev 04)
    Subsystem: ASUSTeK Computer Inc. Device 8575
    Kernel driver in use: pciback

    05:00.0 Multimedia video controller: Conexant Systems, Inc. CX23885 PCI Video and Audio Decoder (rev 03)
    Subsystem: Hauppauge computer works Inc. Device 7911
    Kernel driver in use: pciback
    # xl pci-assignable-list

    Good.   Now all that’s left is to tell the domU it owns these devices.   Edit /etc/xen/{domUname}.cfg and add:

    		pci = [ '00:03.0','00:1b.0','05:00.0' ]

    (Of course adapting everything to your use-case)

    Start your domU and you should now have those hardware devices dedicated to it.
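    For experimenting, xl can also seize and hand over a device at runtime instead of via modprobe.d and a reboot;   both are standard xl subcommands, and the address below is just this example’s Hauppauge card:

    # xl pci-assignable-add 05:00.0
    # xl pci-attach {domUname} 05:00.0

    The device must not be in use by dom0 when you make it assignable.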

    For a Windows guest and passthrough of a video card, the card -must- support Function Level Reset (FLR).   In dom0:
    # lspci -vv
    … and make sure DevCap: has FLReset+, or AFCap: has FLR+. Not FLReset-.
    Only the newest cards have it.

    Also see, for PCI Passthrough of video to HTPC guest:
    https://wiki.xen.org/wiki/Secondary_GPU_Passthrough
    https://wiki.xen.org/wiki/Xen_VGA_Passthrough
    https://wiki.debian.org/VGAPassthrough

    ~~   Useful Commands   ~~

    We are using the xl Xen management tool, which is the broadest and most current of the three or so choices.   Actually XAPI is more advanced, but currently it must be compiled.   Most xl commands require root privileges to run, due to the communications channels used to talk to the hypervisor.   Running as non-root will return an error.   For troubleshooting add -v, for verbose reporting.

    Info about the hypervisor and dom0 including version, free memory etc:
    # xl info

    List running domains, their IDs, memory, state and CPU time consumed:
    # xl list

    List all VMs:
    # xen-list-images

    Start a virtual machine domain, and go into it on the command-line:
    # xl create -c /etc/xen/{domUname}.cfg

    Leave the guest domain virtual console, by pressing   ctrl+]
    Re-enter a running domain:
    # xl console {domUname}

    Show running domains in real time, similar to top:
    # xl top

    Shut down the domain:
    # xl shutdown {domUname}

    Stop a virtual machine immediately without shutting it down;   as if you switch off the power button:
    # xl destroy {domUname}

    Delete a virtual machine domain:
    # xen-delete-image {domUname}

    List of all commands:
    # xl help

    To start guests automatically on dom0 boot, in dom0:
    # systemctl enable xendomains
    # mkdir /etc/xen/auto
    # ln -s /etc/xen/{domUname}.cfg /etc/xen/auto
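    You can confirm those pieces are in place without rebooting:

    # systemctl is-enabled xendomains
    # ls -l /etc/xen/auto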
