Installation walkthrough
Latest revision as of 21:56, 15 March 2013
Introduction
Why 4.0? It comes with out-of-the-box support for multiple interfaces thanks to dracut.
Why this document? Installing FAI is very simple, but there are a couple of common pitfalls. So, before you attempt to use any of the fancy features (like splitting the FAI-server into multiple components, installing multiple operating systems, etc.), make sure you have got the basics right. This will help you get up to speed, and it will spare the mailing list from answering the same common questions over and over again. ;) This document intentionally describes a full bare-bones installation, not just the 'interesting' parts, in order to (hopefully) avoid any kind of trouble.
Important: This document assumes you have a nice little subnet all to yourself which is separated physically or via a VLAN from any other network. If you don't, then you MUST NOT configure the DHCP server as described, or else your colleagues and your network administrators WILL most definitely hate you.
This document assumes the following network configuration:
Type | Value |
---|---|
Network | 192.168.62.0 / 255.255.255.0 |
DNS-Server | 192.168.62.2 |
DefaultRouter | 192.168.62.2 |
Server (TFTP, NFS, FAI) | 192.168.62.10 |
ClientIP range | 192.168.62.128 - 192.168.62.239 |
operating system
initial install from CD-ROM
Debian Squeeze 6.0.7 (64bit)
- minimal installation
- sources: main + security (no contrib / non-free / volatile)
- no additional software collections
We only need a very minimal operating system. It is strongly suggested to take your first steps into the FAI world by installing a Debian FAI-Server and Debian clients. (Server and client can be mixed however you like, but for starters let's stick to the simplest and best supported case. Installing a 64bit server is preferred because it makes installing 32bit clients a bit easier than installing 64bit clients from a 32bit server.)
upgrade to wheezy (for fai 4.x)
edit /etc/apt/sources.list
- remove all sources relating to the installation CD/DVD
- replace all occurrences of 'squeeze' with 'wheezy'
    aptitude update
    aptitude dist-upgrade
    aptitude clean
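The two manual edits to /etc/apt/sources.list can also be scripted. The following is only a sketch: `rewrite_sources` is a made-up helper name, and it assumes that every CD/DVD entry contains the string `cdrom:`. Check the result by hand before running the upgrade.

```shell
# Sketch: read an APT sources list on stdin, write the adjusted list to stdout.
# rewrite_sources is a hypothetical helper, not part of any Debian package.
rewrite_sources() {
    # 1st expression: drop every line that refers to the installation CD/DVD
    #                 (assumption: those lines contain "cdrom:")
    # 2nd expression: replace the release name squeeze with wheezy
    sed -e '/cdrom:/d' \
        -e 's/squeeze/wheezy/g'
}

# Possible usage (keeps a backup of the original file):
#   cp /etc/apt/sources.list /etc/apt/sources.list.squeeze
#   rewrite_sources < /etc/apt/sources.list.squeeze > /etc/apt/sources.list
```

Working from a backup copy means the original squeeze list is still around if the upgrade has to be retried.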
post-install configuration
- edit /etc/network/interfaces and configure a static ip address (makes remote login easier)
    # This file describes the network interfaces available on your system
    # and how to activate them. For more information, see interfaces(5).

    # The loopback network interface
    auto lo
    iface lo inet loopback

    # The primary network interface
    allow-hotplug eth0
    iface eth0 inet static
        address 192.168.62.10
        netmask 255.255.255.0
        gateway 192.168.62.2
Install OpenSSH to access this machine remotely
aptitude install openssh-server
configure FAI server
install FAI software package
The current version as of 2013-02-26 is 4.0.6 (provided by debian repository and http://fai-project.org/download)
aptitude install fai-server fai-doc
configure DHCP daemon
    #
    # configuration file for ISC dhcpd for Debian
    #

    # MAKE SURE YOU ARE THE ONLY DHCP-SERVER OR USE YOUR SITE'S EXISTING DHCP-
    # SERVER. THIS CONFIGURATION ~WILL~ MAKE AN EXISTING DHCP-CONFIG GO BOOM
    # (Going boom early is good. It makes debugging easier.)

    ddns-update-style none;
    authoritative;

    # Use this to send dhcp log messages to a different log file (you also
    # have to hack syslog.conf to complete the redirection).
    log-facility local7;

    subnet 192.168.62.0 netmask 255.255.255.0 {
        # network settings
        option domain-name "installnet.invalid";
        option domain-name-servers 192.168.62.2;
        option routers 192.168.62.2;

        # client IP allocation
        range 192.168.62.128 192.168.62.239;
        default-lease-time 60;
        max-lease-time 720;

        # PXE boot server
        next-server 192.168.62.10;
        filename "fai/pxelinux.0";
    }
This config file is stripped down to a minimum. Simply allocate an ip address to whoever is asking. If you run into trouble check '/var/log/syslog'.
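When checking the syslog it helps to look only at the messages from the DHCP daemon. A minimal sketch, assuming the daemon logs under the program name `dhcpd` (with or without a PID in brackets); `dhcp_log_lines` is a hypothetical helper name:

```shell
# Sketch: print only the dhcpd messages from a syslog file.
# Assumption: the log tag is "dhcpd:" or "dhcpd[PID]:".
dhcp_log_lines() {
    # $1: path to the log file (this walkthrough points at /var/log/syslog)
    grep -E 'dhcpd(\[[0-9]+\])?: ' "$1"
}

# Possible usage:
#   dhcp_log_lines /var/log/syslog | tail -n 20
```

If a booting client produces no dhcpd lines at all, its requests probably never reach the server, which points at cabling or VLAN setup rather than at the configuration file.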
configure TFTP server
The installation default is for the tftp-daemon to run in 'standalone' mode. If the netstat output does not show 'in.tftpd' then you are using either a different tftp server or it is handled via inetd/xinetd. The server in itself will continue to work just fine, but you may have to check other config files for the configured tftp root directory.
netstat -anp|grep :69
udp 0 0 0.0.0.0:69 0.0.0.0:* 29269/in.tftpd
This tftp server is running in standalone mode. Now let's check its configuration for the root directory:
cat /etc/default/tftpd-hpa
    # /etc/default/tftpd-hpa
    TFTP_USERNAME="tftp"
    TFTP_DIRECTORY="/srv/tftp"
    TFTP_ADDRESS="0.0.0.0:69"
    TFTP_OPTIONS="--secure"
Exactly where I like it to be. Another possible value would be '/var/lib/tftpboot'. There's nothing wrong with that, but you have to make sure that FAI is configured for the proper directory or the PXE-boot process will fail.
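Since /etc/default/tftpd-hpa consists of plain shell variable assignments, the configured root directory can be read by sourcing the file. This is a sketch only; `tftp_root` is a hypothetical helper name, and it assumes the file contains nothing but assignments and comments as shown above.

```shell
# Sketch: print the TFTP root directory configured in a tftpd-hpa defaults file.
tftp_root() {
    # $1: path to the defaults file (here: /etc/default/tftpd-hpa)
    # The file is sourced in a subshell so its variables do not leak
    # into the calling shell.
    ( . "$1" && printf '%s\n' "$TFTP_DIRECTORY" )
}

# Possible usage: once FAI has written its PXE files, they should show up
# below the TFTP root:
#   ls "$(tftp_root /etc/default/tftpd-hpa)/fai"
```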
configure FAI itself
install GPG key for fai-project.org
gpg -a --recv-keys DC13E54EAB9B66FD; gpg -a --export DC13E54EAB9B66FD | apt-key add -
Please note: This is slightly different than the description on <http://fai-project.org/download/>.
create NFS directory layout, configure /etc/exports, setup log user
    fai-setup
    echo '/srv/fai/config 192.168.62.10/24(async,ro,no_subtree_check,no_root_squash)' >> /etc/exports
    /etc/init.d/nfs-kernel-server restart
(One could use 'exportfs -r' instead of the restart, but I want to make sure that the NFS server is truly started.)
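Running the `echo ... >> /etc/exports` line a second time adds a duplicate entry. A small sketch of an append that is safe to repeat; `add_export` is a hypothetical helper name, and it only detects lines that match exactly:

```shell
# Sketch: append a line to an exports file unless the identical line exists.
add_export() {
    # $1: exports file, $2: complete export line
    # grep -x matches whole lines only, -F treats the line as a fixed string.
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# Possible usage:
#   add_export /etc/exports \
#     '/srv/fai/config 192.168.62.10/24(async,ro,no_subtree_check,no_root_squash)'
```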
create minimal configspace directory layout
    for dir in class disk_config package_config scripts ; do
        mkdir -p /srv/fai/config/$dir
    done
(This is just enough to make the installation not fail due to missing directories - this will NOT produce a useable installation!)
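To verify the layout later on (for example after copying the config space to another server), a quick check can list whatever is missing. This is a sketch; `missing_config_dirs` is a hypothetical helper name and it only knows about the four directories used in this walkthrough.

```shell
# Sketch: print the names of the config space directories that do not exist.
missing_config_dirs() {
    # $1: config space root (this walkthrough uses /srv/fai/config)
    for dir in class disk_config package_config scripts; do
        [ -d "$1/$dir" ] || printf '%s\n' "$dir"
    done
}

# Possible usage (no output means all four directories are present):
#   missing_config_dirs /srv/fai/config
```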
generate a fai-client configuration for ALL clients
fai-chboot -B -I default
(This ~will~ attempt to reinstall any PXE-booting client!)
testing the installation
The following minimal client configuration is assumed:
- a CPU
- enough RAM to start the installation (at least about 200MB??)
- an empty disk containing at least 250MB of disk space
- a network interface connected to the FAI install network
- boot device order: disk, network
single network interface
the most basic of configurations
- create a client with a single network interface connected to the fai network
- boot the client
-> expected result: fai starts, but drops out into a 'root@(none):/#' prompt
possible problem sources: DHCP, TFTP, NFS
Please note: In previous 4.0 releases there was a problem where the FAI nfsroot did not properly configure NFSv4. The fai client mounted the nfsroot seemingly successfully but complained about missing file permissions. You may or may not run into this issue when running 4.0.6 or later. (The exact circumstances under which this issue triggers are unknown to me. ThomasNeumann 21:45, 15 March 2013 (UTC))
If you get an error similar to this one:
    [ 11.578824] dracut: Mounted root filesystem a.b.c.d:/srv/fai/wheezy/nfsroot-amd64
    [ 11.581213] aufs: module is from the staging directory, the quality is unknown, you have been warned
    [ 11.582117] aufs 3.2-20120827 warning: can't open /etc/fstab: No such file or directory
    [ 12.010629] aufs test_add:261:mount[366]: uid/gid/perm /live/image 65534/65534/0755, 0/0/01777
    [ 12.015811] type=1702 audit(1358959337.356:2): op=follow_link action=denied pid=371 comm="ls" [...]
    [ 12.015995] type=1702 audit(1358959337.356:2): op=follow_link action=denied pid=371 comm="ls" [...]
    [ 12.260523] dracut: Switching root pcbind: rpcbind terminating on signal. Restart with "rpcbind -u" [...]
    [ 12.262090] Kernel panic - not syncing: attempting to kill init!
(The most relevant message is in line 3: "warning: can't open /etc/fstab: No such file or directory". It hints at a permission problem caused by NFSv4 not properly mapping the user IDs.)
then try to manually change the 'nfsroot' setting in /srv/tftp/fai/pxelinux.cfg/default from
append initrd=initrd.img-3.2.0-4-amd64 ip=dhcp root=/dev/nfs nfsroot=/srv/fai/nfsroot aufs [...]
to
append initrd=initrd.img-3.2.0-4-amd64 ip=dhcp root=/dev/nfs nfsroot=192.168.62.10:/srv/fai/nfsroot:vers=3 aufs [...]
In other words change nfsroot directory from <directory> to <NFS-server IP>:<directory>:vers=3
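The same substitution can be sketched with sed. `fix_nfsroot` is a hypothetical helper name; it reads the PXE configuration on stdin and only touches `nfsroot=` values that start with a slash, so a line that already names a server is left alone. Treat it as an illustration of the pattern, not as a tested fix.

```shell
# Sketch: rewrite nfsroot=<directory> to nfsroot=<NFS-server IP>:<directory>:vers=3
fix_nfsroot() {
    # $1: IP address of the NFS server (192.168.62.10 in this walkthrough)
    # \(/[^ :]*\) captures a value that starts with "/" and runs up to the
    # next space or colon.
    sed -e "s|nfsroot=\(/[^ :]*\)|nfsroot=$1:\1:vers=3|"
}

# Possible usage (work on a copy first):
#   cfg=/srv/tftp/fai/pxelinux.cfg/default
#   cp "$cfg" "$cfg.orig"
#   fix_nfsroot 192.168.62.10 < "$cfg.orig" > "$cfg"
```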
related Debian Bug reports:
- http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=676883
multiple network interface
FAI 4.0 / dracut-network supports netbooting with multiple network interfaces, so we are going to test that thoroughly. The main goal is to prove that any given client can boot from the network regardless of whether the 'primary' or the 'secondary' network interface is connected to the FAI install network. (Common failure symptoms are 'installation did not start' and 'client hangs / times out during installation'.)
- add a second network interface to the client
- connect the second interface to somewhere different (It must NOT be able to reach the fai install network!)
- boot the client
-> expected result: fai starts after a long timeout, but drops out into a 'root@(none):/#' prompt
- switch configured network interfaces with each other
- boot the client
-> expected result: fai starts after a long timeout, but drops out into a 'root@(none):/#' prompt
- configure both network interfaces so they can't reach the fai install network
- boot the client
-> expected result: client does not PXE-boot
(Untested: Connect both interfaces to the FAI install network.)
accessing the config space
update the client config to use a config space
fai-chboot -B -I -u nfs://192.168.62.10/srv/fai/config default
- Create a client with a single network interface connected to the fai network
- boot the client
-> expected result: fai starts, and reports about actually starting the installation, but finally drops out into a 'root@(none):/#' prompt
possible problem sources: NFS
using a minimal disk layout
create and edit '/srv/fai/config/disk_config/DEFAULT'
    # an extremely simple layout
    #
    disk_config disk1 disklabel:msdos
    primary / 250 ext3 rw
- reuse previous client
- boot the client
-> expected result: fai does a complete installation and reboots the client afterwards
(Due to lack of configuration a dysfunctional client is installed. It must be manually terminated.)
modify '/srv/fai/config/disk_config/DEFAULT'
    # configure LVM
    #
    disk_config disk1 disklabel:msdos
    primary - 250 - -

    disk_config lvm
    vg vg_system disk1.1
    vg_system-root / 100 ext4 rw
    vg_system-var /var 100 ext4 rw
- create a new client or make sure the client performs a netboot
- boot the client
-> expected result: fai stops installation because tar runs out of diskspace
- reuse the previous client, make sure the client performs a netboot
- boot the client
-> expected result: dracut finds a volume group and does NOT mount nfsroot
Solution:
- modify client pxeboot configuration
fai-chboot -B -I -u nfs://192.168.62.10/srv/fai/config -k rd_NO_LVM default
- reuse the previous client, make sure the client performs a netboot
- boot the client
-> expected result: fai starts again, but stops with a 'FATAL ERROR'
Solution: No truly accepted solution. It's an unfixed bug in setup-storage. Unofficial patches against setup-storage are available.
related Debian Bug reports:
- http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=676882
install a bootable client
For this we need some more files. This is the bare minimum you need to install a Debian Wheezy client from a Debian Wheezy server. I intentionally skip anything that is not essential.
/srv/fai/config/class/DEFAULT.var
    # root password for the new installed linux system; md5 and crypt are possible
    # pw is "fai"
    ROOTPW='$1$kBnWcO.E$djxB128U7dMkrltJHPf6d1'
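You will want a different root password sooner or later. One way to produce an MD5-crypt hash (the `$1$<salt>$<hash>` format used above) is `openssl passwd -1`. This is a sketch that assumes the openssl command line tool is installed on the server; `md5_crypt` is a hypothetical wrapper name.

```shell
# Sketch: generate an MD5-crypt hash suitable for ROOTPW.
# Assumption: the openssl command line tool is available.
md5_crypt() {
    # $1: salt (up to 8 characters), $2: clear-text password
    openssl passwd -1 -salt "$1" "$2"
}

# Possible usage; paste the output into DEFAULT.var in single quotes so the
# shell does not try to expand the $ signs:
#   md5_crypt kBnWcO.E fai
```

Keep in mind that a password given on the command line ends up in the shell history, so this is best reserved for throwaway test passwords.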
/srv/fai/config/disk_config/DEFAULT
    # provide just a root partition
    #
    disk_config disk1 disklabel:msdos
    primary / 1024- ext3 rw
/srv/fai/config/package_config/DEFAULT
    # a very simplistic package collection
    # (just install the bare minimum amount of packages)
    PACKAGES aptitude
    initramfs-tools
    linux-image-amd64
    grub-pc

    # explicitly delete these bootloaders
    # (just in case the base tgz contains them)
    PACKAGES aptitude
    grub-legacy-
    lilo-
/srv/fai/config/scripts/DEFAULT/10-rootpw
    #!/bin/sh
    # set root password (quoted so the hash is passed as a single argument)
    $ROOTCMD usermod -p "$ROOTPW" root
/srv/fai/config/scripts/DEFAULT/11-grubpc
    #!/bin/bash
    # support for GRUB version 2 (1.98-1)

    error=0; trap 'error=$(($?>$error?$?:$error))' ERR # save maximum error code

    set -a

    # during softupdate use this file
    [ -r $LOGDIR/disk_var.sh ] && . $LOGDIR/disk_var.sh

    [ -z "$BOOT_DEVICE" ] && exit 701

    $ROOTCMD grub-mkdevicemap --no-floppy
    GROOT=$($ROOTCMD grub-probe -tdrive -d $BOOT_DEVICE)
    # see http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=606035
    GROOT=$(echo $GROOT | sed 's:md/:md:g')

    $ROOTCMD grub-install --no-floppy "$GROOT"
    echo "Grub installed on $BOOT_DEVICE = $GROOT"
    $ROOTCMD update-grub

    exit $error
go for it!
Start your test client and watch it go. It should proceed pretty fast; most of the time is spent downloading and installing packages. After about 2 minutes, more or less, the client reboots and presents you with a nice login prompt. At least it should. ;)
Conclusion
Congratulations. Your FAI-server is now fully functional and you can start your own experiments.