Using FAI to install a root Server

From FAIWiki

A root server is a dedicated piece of server hardware which is rented from a web hosting company. A root server is usually placed in some remote data centre that you do not have physical access to. The monthly rental fee covers the hardware rental, power, the Internet connection and a public IP address. You are given the root password and you are entirely responsible for setting up, maintaining and managing the server.

Many organizations hosting services on the Internet rent root servers rather than using colocation offers or even running their own data centre, because root servers are very cost effective: prices start at 49.00 EUR per month all inclusive, while just maintaining an SDSL line to your office would cost you at least ten times this amount.

Why use FAI for your root Server?

The server rental company usually offers a choice of images with popular Linux distros to stage the server you rented. Because you are provided with root privileges, you can install whatever piece of software you would like on top of that base image or manually remove pieces you don't want.

Nevertheless, there might be good reasons why you'd rather want to use FAI to install your root server(s):

  • You want a fully automated installation that leaves your server in a working state, rather than staging it with a default image and taking it manually from there, which can be quite a cumbersome task.
  • You may want to use a different kernel than the one contained in the server rental company's image (for example, a Xen kernel).
  • You may want to use your own harddisk partition layout which is different from the layout that comes with the standard image.
  • You want to be sure that you install only the software you want to use, not whatever someone felt might be useful for you.

What to keep in mind

You might ask what makes a root server different from any other server you might install with FAI. The differences are subtle, but each of them can get in your way, and in combination they can make it difficult to find the cause of a problem.

  • You may or may not have console access to your server. Some server rental companies provide a serial console via SSH, which can be very helpful, but some don't. If you don't even have a serial console, the whole process is kind of a blind flight. The good news is: there are techniques to diagnose installation problems even in that situation.
  • DHCP or BOOTP is either not available at all or does only half of the job for you. Because you cannot change parameters on the DHCP server in a foreign data centre, you may be able to use DHCP to configure your server's network settings, but you cannot use it to tell your server to boot the FAI installation kernel.

Instead, you will hopefully have two things you might not be used to having in your local infrastructure:

  • A rescue system that you can boot your server into.
  • A remote reset system.

Before starting any work on FAI on a root server, make sure you are familiar with the automatic remote reset and the rescue system. Also make sure that you are not charged for resets and that they work automatically. Otherwise you may not make friends with the support staff in the data centre.

The modified installation process

To install a root server using FAI you need at least a second server, reachable on the public Internet, that will become your FAI install server. It might be doable to use a server on your own premises for that with a combination of a DMZ and dynamic DNS. Keep in mind though that NFS does not work well through NAT. I have not taken that route, so your mileage may vary. I used a second server on the public Internet (in a different data centre) as an install server.

Note: When exporting filesystems on your install server via NFS onto the public Internet, keep an eye on security. Either restrict access with a firewall configuration you understand, or consider starting the NFS daemon only on demand, when you plan to do an installation. Also make sure that no sensitive information is located on your install server.

The installation process as such remains the same as usual when using FAI: you boot an install kernel which mounts a root filesystem via NFS from your FAI install server and executes installation scripts.

What's different is:

  • The technique you use to boot the install kernel on the install client.
  • The extra care you need to apply to network interface configuration and DNS name resolution on the install client in case you don't have DHCP.

DHCP or not

We are renting servers from five different companies and none of them uses or even offers DHCP at all, which comes as a surprise. In case the company where you rent your root server allows you to use DHCP for your server configuration, things will be a lot easier for you because you don't have to worry about two problems:

  • Providing the eth0 of your server with a valid IP address, netmask, broadcast and default gateway address.
  • Providing DNS servers to resolve hostnames.

As this seems to be a luxury you cannot expect to enjoy, we assume you don't have DHCP available at all.

What to prepare in your /srv/fai/nfsroot for non-DHCP install clients

You don't have DNS name resolution, but your install client needs to find the install server.

Therefore make sure whatever name resolution you need works through /etc/hosts in the NFS root filesystem.

There should be at least three entries there:

  • your install server
  • the Debian mirror to install from
  • the Debian security update mirror to get security updates from
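As a minimal sketch of the three entries listed above, the following Python helper renders hosts-file lines that you could write to /srv/fai/nfsroot/etc/hosts on the install server. All host names and addresses here are placeholders from the documentation address range, not values taken from this page, and the helper name render_hosts is made up for illustration.

```python
import ipaddress


def render_hosts(entries):
    """Return hosts-file text for a mapping of hostname -> IP address.

    Each address is validated, so a typo fails here rather than
    during an installation you cannot watch.
    """
    lines = ["127.0.0.1\tlocalhost"]
    for name, addr in entries.items():
        ipaddress.ip_address(addr)  # raises ValueError on a malformed address
        lines.append(f"{addr}\t{name}")
    return "\n".join(lines) + "\n"


# Hypothetical names and addresses; replace them with your own.
EXAMPLE_ENTRIES = {
    "faiserver.example.org": "192.0.2.10",        # your install server
    "debian-mirror.example.org": "192.0.2.20",    # the Debian mirror
    "security-mirror.example.org": "192.0.2.30",  # the security update mirror
}

if __name__ == "__main__":
    print(render_hosts(EXAMPLE_ENTRIES), end="")
```

The names must match exactly what your FAI configuration and your apt sources use, otherwise the lookup on the install client will still fail.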

How to boot the install kernel on your root server

Because you cannot set the relevant parameters in DHCP or BOOTP, you cannot boot the FAI installation kernel via TFTP. That leaves you with two options for booting the install kernel:

  1. Use kexec. (Haven't tried that yet.)
  2. Copy the install kernel manually onto the install client (while it still runs the default image), change the bootloader configuration (LILO or GRUB, whichever the image uses) to boot the install kernel, then reboot manually.
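For the kexec option, which this page has not tested, a rough and equally untested sketch might look like the following. It only assembles the two kexec invocations (load the kernel with `kexec -l`, then jump into it with `kexec -e`) and prints them unless you disable the dry run. The kernel and initrd paths and the command line string are placeholders, and whether your install kernel needs an initrd at all depends on your FAI setup.

```python
import shlex
import subprocess


def kexec_commands(kernel, cmdline, initrd=None):
    """Return the two kexec invocations: load the kernel, then execute it."""
    load = ["kexec", "-l", kernel, f"--append={cmdline}"]
    if initrd:
        load.append(f"--initrd={initrd}")
    return [load, ["kexec", "-e"]]


def run(commands, dry_run=True):
    """Print the commands, or really run them when dry_run is False."""
    for argv in commands:
        if dry_run:
            print(shlex.join(argv))
        else:
            # "kexec -e" jumps into the new kernel immediately,
            # without a clean shutdown of the running system.
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Hypothetical paths and parameters; adjust them to your installation.
    run(kexec_commands("/boot/vmlinuz-install",
                       "root=/dev/nfs",
                       initrd="/boot/initrd.img-install"))
```

Keep the dry run on until the printed commands look right; once `kexec -e` has run there is no way back except the remote reset.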

Because the install kernel will mount its root filesystem over NFS it will not mount the server's local harddisk. So you will not see any "/dev/hda1 is busy" kind of problems when FAI tries to re-partition the disk.

In order for the FAI install kernel to be able to mount the root filesystem via NFS, it needs to have a valid IP config on the ethernet interface. This cannot come from /etc/network/interfaces because that would be a chicken and egg problem. (In what filesystem is it going to look for /etc/network?)

The Linux kernel has an ip= parameter which lets you set the IP configuration statically. Use that and keep in mind that the kernel has no means of using DNS for name resolution, so your nfsroot= line needs to contain the IP address of your install server, not its DNS name.
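To make the parameter layout concrete, here is a small sketch that assembles the relevant part of the kernel command line. The field order of ip= (client, server, gateway, netmask, hostname, device, autoconf) follows the kernel's NFS-root documentation; all addresses and the host name below are placeholders, and the helper name kernel_cmdline is made up. Any FAI-specific parameters your setup needs are not covered here.

```python
import ipaddress


def kernel_cmdline(client_ip, server_ip, gateway, netmask, hostname,
                   device="eth0", nfsroot="/srv/fai/nfsroot"):
    """Build kernel parameters for a static-IP boot with an NFS root.

    ip= field order: client:server:gateway:netmask:hostname:device:autoconf
    """
    for addr in (client_ip, server_ip, gateway, netmask):
        ipaddress.ip_address(addr)  # raises ValueError on a malformed value
    # "off" disables autoconfiguration, so no DHCP or BOOTP is attempted.
    ip = ":".join([client_ip, server_ip, gateway, netmask,
                   hostname, device, "off"])
    # nfsroot= takes the server's IP address, since DNS is not available yet.
    return f"root=/dev/nfs nfsroot={server_ip}:{nfsroot} ip={ip}"


if __name__ == "__main__":
    # Hypothetical values; take the real ones from your hosting company.
    print(kernel_cmdline("198.51.100.7", "192.0.2.10",
                         "198.51.100.1", "255.255.255.0", "rootsrv"))
```

The resulting string goes into the bootloader entry for the install kernel (or into the kexec command line, if you go that way).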

How to monitor your installation and find problems

If you don't have a serial console, it will be hard to tell if the install client even talks to the install server at all.

To try and watch the installation from your install server, consider faimond. (man faimond)

But it might be that you don't see anything there, so you will need to dig a bit deeper.

Watch your nfs daemon log. You should see the root server mounting /srv/fai/nfsroot some seconds after it boots. In case you don't see that log entry, the FAI install kernel is probably not booted at all or does not have the proper kernel parameters, especially the ip= and root NFS settings.

During the installation, you should see two mounts:

  • /srv/fai/nfsroot -> the kernel mounting the root filesystem
  • /srv/fai/config -> the fai script mounting the config

In case you don't see the mount of /srv/fai/config, you probably have a DNS problem, because nfsroot is mounted using the IP address while config is mounted using the DNS name, which needs to be properly resolved through /etc/hosts (/srv/fai/nfsroot/etc/hosts when seen from your install server).
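The reasoning above can be turned into a small log check. This sketch assumes that your NFS mount daemon logs lines of the form "mount request from ADDRESS:PORT for PATH", as rpc.mountd usually does via syslog; the exact wording and the log file location vary between systems, so treat the regular expression as an assumption to adapt. The function names are made up for illustration and only IPv4 clients are handled.

```python
import re

# Assumed log format, e.g.:
#   rpc.mountd[812]: authenticated mount request from 198.51.100.7:835
#   for /srv/fai/nfsroot (/srv/fai/nfsroot)
MOUNT_RE = re.compile(
    r"mount request from (?P<ip>\d+\.\d+\.\d+\.\d+):\d+ for (?P<path>\S+)")


def mounts_seen(log_lines, client_ip):
    """Return the set of export paths the given client asked to mount."""
    seen = set()
    for line in log_lines:
        match = MOUNT_RE.search(line)
        if match and match.group("ip") == client_ip:
            seen.add(match.group("path"))
    return seen


def diagnose(seen):
    """Map the observed mounts to the likely cause described in the text."""
    if "/srv/fai/nfsroot" not in seen:
        return "nfsroot not mounted: install kernel not booted or wrong ip=/nfsroot= parameters"
    if "/srv/fai/config" not in seen:
        return "config not mounted: check name resolution in /srv/fai/nfsroot/etc/hosts"
    return "both mounts seen"
```

You would feed it the lines of your syslog file and the public IP address of the root server, for example `diagnose(mounts_seen(open(path), "198.51.100.7"))`, where `path` is wherever your system keeps the mount daemon log.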

How to recover from a failed installation attempt

If an attempt to install your server using FAI failed before FAI was able to erase the server's harddisk, use the remote reset facility to start the installation all over.

If your installation fails after that point, your server's disk is going to be in an unknown state and, most likely, it is not going to boot anymore. To fix this:

  • Use whatever technology your datacentre provides to stage the server with a new standard image.
  • Start from the very beginning: copy the install kernel onto the server again, then modify the boot loader or use kexec.

What might still go wrong

Just a list of potential problems to think about if anything does not work:

  • The FAI install kernel does not recognize your server hardware properly. Especially if it does not recognize your server's network interface, you will fail early. As you probably don't even know what make and model your server is, you can either try and hope for the best or use lspci while the server is still running the original image provided by the datacentre.
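If you go the lspci route, a small filter like the following can pull out just the network hardware so you can compare it against the drivers in your install kernel. It is only a sketch: it assumes the default lspci output format ("SLOT CLASS: DESCRIPTION" per line), and the function names are made up for illustration.

```python
import subprocess


def ethernet_devices(lspci_output):
    """Return (slot, description) pairs for all Ethernet controllers."""
    devices = []
    for line in lspci_output.splitlines():
        slot, _, rest = line.partition(" ")
        device_class, sep, description = rest.partition(": ")
        if sep and device_class == "Ethernet controller":
            devices.append((slot, description))
    return devices


def read_lspci():
    """Run lspci on the server while it still runs the original image."""
    result = subprocess.run(["lspci"], capture_output=True,
                            text=True, check=True)
    return result.stdout
```

Calling `ethernet_devices(read_lspci())` on the root server gives you the make and model of each network card before you overwrite the original image.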