Setup-storage

From FAIWiki
Revision as of 19:52, 22 December 2007

Introduction

Because setup_harddisks depends on sfdisk, which makes it non-portable, and lacks support for RAID and LVM, it has been re-implemented from scratch. The new implementation is not yet integrated with FAI, however, and needs rigorous testing.

To test the current implementation, keep reading. Some TODOs remain, and more will probably arise as soon as people actually start testing it. Before you start, be aware that this is a very dangerous piece of software: it has not been sufficiently tested for any case in which you want to retain data, and you should assume that it will destroy all data on every disk in your system. You have been warned.

If you're still interested, the following steps are required next:

  • install libparse-recdescent-perl, liblinux-lvm-perl (as of 2007-11-18, this is only available in Debian unstable, download today's version here), parted, lvm2 within your NFSROOT and add lvm2 to one of your client classes as well (it must be available on the target system, if you define LVMs in your disk_config). In FAI >= 3.2, these packages are installed into the NFSROOT by adding them to /etc/fai/NFSROOT.
  • grab setup-storage from the SVN: [1] (get setup-storage and the entire lib/ directory)
  • for a first shot copy the files somewhere into your NFSROOT and adapt the @INC in setup-storage to the path of the lib/ directory. Alternatively, move the files to your config space to avoid copying again in case you run make-fai-nfsroot. For the time being, it is assumed that you checked out the files to <CONFIGSPACE>/store/, that is you have
michael@demo[16:31]:~$ ls -R /srv/fai/config/store/
/srv/fai/config/store/:
lib  setup-storage

/srv/fai/config/store/lib:
commands.pm  exec.pm  fstab.pm  init.pm  parser.pm  sizes.pm  volumes.pm

and in /srv/fai/config/store/setup-storage you need to have line 74 as follows

unshift @INC, "/var/lib/fai/config/store/lib";
  • create a disk_config for some class MY_TEST_CLASS based on the old one, with (at least) the following changes:
    • the filesystem is now given as the 4th column, the mount options are now in column 5
    • there is no ;-hack anymore -- anything given after the mount options is passed on to mkfs.<filesystem>
    • Partitions are marked as bootable using the bootable:<nr> option in the disk_config line
    • The other things should at least work like they did before, but there are some new things as well, like RAID support (which is surely incomplete), LVM support, mounting by label or UUID.
    • Please take a look at the attached examples to get an idea of the new format.
    • All details of the implemented syntax and examples are given below.
  • create a hook partition.MY_TEST_CLASS.source as follows (make sure you don't forget the .source). In case you checked out setup-storage to some other path, please adapt the paths below.
#!/bin/sh

# load the device mapper module for LVM support
modprobe dm_mod

debug=1 /var/lib/fai/config/store/setup-storage
# if you are really brave, you can get things written to disk instead
# by adding -X
# debug=1 /var/lib/fai/config/store/setup-storage -X
# now define the variables for the root and boot partitions and the boot device;
# this is necessary because we skip the original task
. $LOGDIR/disk_var.sh
# skip the original partitioning
skiptask partition

  • Give it a go (you have backed up all data, haven't you?). There will be lots of debug output; if it finishes by printing an fstab file that is to your liking, it should have succeeded.


Further information may also be found at [2].

New configuration file syntax

In the following, we present a complete EBNF description of a modified configuration file syntax, as well as some examples.

file ::= <lines> EOF 

lines ::= EOL 
          /* empty lines or whitespace only */
          | <comment> EOL 
          | <config> EOL 

comment ::= #.* 

config ::= disk_config lvm 
           | disk_config raid
           | disk_config end 
           | disk_config disk[[:digit:]]+( <option>)*
           | disk_config [^[:space:]]+( <option>)*
           /* fully qualified device-path or short form, like hda, whereby full
            * path is assumed to be /dev/hda */
           | <volume>

option ::= /* empty */
           | preserve_always:[[:digit:]]+(,[[:digit:]]+)*
           /* preserve partitions -- always */
           | preserve_reinstall:[[:digit:]]+(,[[:digit:]]+)*
           /* preserve partitions -- unless the system is installed for the 
           first time */
           | resize:[[:digit:]]+(,[[:digit:]]+)*
           /* attempt to resize partitions */
           | disklabel:(msdos|gpt)
           /* write a disklabel - default is msdos */
           | bootable:[[:digit:]]+
           /* mark a partition bootable, default is / */
           | virtual
           /* do not assume the disk to be a physical device, use with xen */
           | fstabkey:(device|label|uuid)
           /* when creating the fstab, the key used for defining the device
           may be the device (/dev/xxx), a label given using -L, or the uuid
           */  

volume ::= <type> <mountpoint> <size> <filesystem> <mount_options> <fs_options>
           | vg <name> <size>
           /* lvm vg */

type ::= primary
         /* for physical disks only */
         | logical
         /* for physical disks only */
         | raid[0156]
         /* raid level */
         | [^/[:space:]]+-[^/[:space:]]+
         /* lvm logical volume: vg name and lv name*/

mountpoint ::= -
               /* do not mount */
               | swap
               /* swap space */
               | /[^[:space:]]*
               /* fully qualified path */

name ::= [^/[:space:]]+
         /* lvm volume group name */

size ::= [[:digit:]]+[kMGTP%]?(-([[:digit:]]+[kMGTP%]?)?)?(:resize)?
         /* size in kilo, mega (default), giga, tera or petabytes or %,
          * possibly given as a range; physical
          * partitions or lvm logical volumes only; */
         | -[[:digit:]]+[kMGTP%]?(:resize)?
         /* size in kilo, mega (default), giga, tera or petabytes or %,
          * given as upper limit; physical partitions
          * or lvm logical volumes only */
         | [^,:[:space:]]+(:(spare|missing))*(,[^,:[:space:]]+(:(spare|missing))*)*
         /* devices and options for a raid or lvm vg */

mount_options ::= [^[:space:]]+

filesystem ::= -
               | swap
               | [^[:space:]]+
               /* mkfs.xxx must exist */

fs_options ::= .*
               /* options appended to mkfs.xxx call */
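
The first two alternatives of the size rule can be checked mechanically. The following is a minimal Python sketch, not part of setup-storage (which is written in Perl): parse_size is a hypothetical helper, the POSIX class [[:digit:]] is written as [0-9], and treating a single value as "lower bound equals upper bound" is an assumption made for illustration only.

```python
import re

# Transcription of the first two alternatives of the `size` rule.
_UNIT = r"[kMGTP%]?"
_RANGE = re.compile(
    rf"^(?P<lo>[0-9]+{_UNIT})(?P<dash>-(?P<hi>[0-9]+{_UNIT})?)?(?P<resize>:resize)?$"
)
_UPPER = re.compile(rf"^-(?P<hi>[0-9]+{_UNIT})(?P<resize>:resize)?$")


def parse_size(field):
    """Return (lower, upper, resize) for a partition or LV size field.

    lower and upper are the raw strings from the file; None means the
    bound is open. Raises ValueError if the field matches neither
    alternative (device lists for RAID/LVM are not handled here).
    """
    m = _RANGE.match(field)
    if m:
        lo = m.group("lo")
        if m.group("dash") is None:
            hi = lo  # assumption: a single value fixes both bounds
        else:
            hi = m.group("hi")  # None for an open range such as "10%-"
        return lo, hi, m.group("resize") is not None
    m = _UPPER.match(field)
    if m:
        return None, m.group("hi"), m.group("resize") is not None
    raise ValueError(f"not a valid size: {field!r}")
```

With this sketch, parse_size("3000-6000:resize") yields the two bounds and a true resize flag, while a field that the grammar does not cover is rejected with a ValueError.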

The major differences to the prior format:

  • the disk_config ... line allows for the keywords lvm and raid
  • options may need to be appended to the disk_config line
  • the ";" is not used anymore, the options that were given there have now been split up
    • the filesystem is now an explicit parameter; note that the order of filesystem/mount-options is the same as in /etc/fstab, as opposed to the previous format of disk_config
    • any options to mkfs.xxx may be given
    • the "preserveX" and "boot" options are now among the options given on the disk_config line
  • support for LVM and RAID is completely new :-)
  • resizing partitions is supported

Some examples


# Configure the device /dev/hda
disk_config hda   preserve_always:6,7   disklabel:msdos  bootable:3
# preserve the 6th and the 7th partition. The disklabel is msdos, which is the default
# for x86. Furthermore the 3rd partition is made bootable. 
primary /boot     20-100        ext3            rw
# create a primary partition /dev/hda1 with a size between 20 and 100 MB and mount it
# read-write as /boot; it is formatted using ext3 filesystem
primary swap      1000     swap       sw
# /dev/hda2 will be a swap space of 1000 MB
primary /         12000      ext3           rw        -b 2048
# /dev/hda3 should be formatted using ext3 filesystem; when calling mkfs.ext3
# the option "-b 2048" is appended.
logical /tmp      1000      ext3            rw,nosuid
# create the logical partition /dev/hda5
logical /usr      preserve6      ext3          rw
logical /var      10%-      ext3               rw
# make /dev/hda7 at least 10% of the disk size
logical /nobackup 0-        xfs                rw
# use mkfs.xfs to format the partition


# Configure the virtual device /dev/sda (for, e.g., Xen setups), all
# partitions are primary and sizes are ignored (so one could as well specify
# anything other than 0)
disk_config sda virtual
primary /boot     0            ext3                 rw
primary /         0            ext3                 rw
primary /tmp      0            ext3                 rw
primary /usr      0            ext3                 rw
primary /var      0            ext3                 rw

# resizing partitions is possible by appending :resize to any given size
disk_config /dev/scsi/host0/bus0/target1/lun0 
primary /      3000-6000:resize         ext3         rw
# resize to any value between 3000 and 6000 MB, as limited by the disk size,
# the desired sizes of the other partitions, and the data on the partition
primary /tmp   1000                     ext3         rw


# Create a softRAID
disk_config raid
raid1        /    sda1,sdd1  ext2        rw,errors=remount-ro
# create a RAID-1 on /dev/sda1 and /dev/sdd1, format using mkfs.ext2 and mount
# it as /
raid0        -    disk2.2,sdc1,sde1:spare:missing  ext2       default
# create a RAID-0 on the second partition of the second disk, /dev/sdc1, and
# /dev/sde1 as a spare partition, which may be missing
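
The device column of a RAID line, such as the one in the last example, can be taken apart as in the following minimal Python sketch. parse_devices is a hypothetical helper and not the code used by setup-storage; device names are kept exactly as written, so short forms like disk2.2 are not resolved to /dev paths here.

```python
def parse_devices(field):
    """Split a RAID device list into {device: {"spare": bool, "missing": bool}}.

    Follows the third alternative of the size rule: comma-separated
    devices, each optionally followed by :spare and/or :missing.
    """
    devices = {}
    for item in field.split(","):
        name, *flags = item.split(":")
        unknown = set(flags) - {"spare", "missing"}
        if not name or unknown:
            raise ValueError(f"bad device entry: {item!r}")
        devices[name] = {
            "spare": "spare" in flags,
            "missing": "missing" in flags,
        }
    return devices
```

For the list disk2.2,sdc1,sde1:spare:missing this marks only sde1 as spare and missing; the other two devices carry no flags.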


A pretty complex setup looks as follows:

disk_config sda

primary  -     256   -     -
primary  swap  1024  swap  sw
primary  -     0-    -     -

disk_config sdb
primary  -     0-    -     -

disk_config sdc
primary  -     0-    -     -

disk_config sdd
primary  -     256   -     -
primary  -     1024  -     -
primary  -     0-    -     -

disk_config sde
primary  -     0-    -     -

disk_config sdf
primary  -     0-    -     -

disk_config raid

raid1    /     sda1,sdd1  ext2  rw,errors=remount-ro
raid1    swap  sda2,sdd2  swap  rw
raid1    -     sda3,sdd3  ext2  default

raid0    -     sdb1,sde1  ext2  default
raid0    -     sdc1,sdf1  ext2  default

# config the LVM
disk_config lvm
vg  my_pv  md2,md3
my_pv-_usr  /usr           2048   ext3    rw         -O dir_index,resize_inode
my_pv-_var  /var           600    ext3    rw         -O dir_index,resize_inode
my_pv-_e_h  /export/home   10240  reiser  rw,notail
my_pv-_e_s  /export/sites  2048   reiser  rw,notail
my_pv-_v    /vservers      2048   ext3    rw
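
A file like the one above consists of sections, each opened by a disk_config line. The following minimal Python sketch groups the lines accordingly; split_sections and layer_of are hypothetical helpers for illustration, not functions of setup-storage, and "disk_config end" is not treated specially here.

```python
def split_sections(text):
    """Group a disk_config file into (target, [volume lines]) tuples.

    Blank lines and comment lines are skipped; every other line belongs
    to the most recent disk_config header.
    """
    sections = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        words = line.split()
        if words[0] == "disk_config":
            if len(words) < 2:
                raise ValueError("disk_config needs a target")
            sections.append((words[1], []))
        elif not sections:
            raise ValueError(f"line outside of any section: {line!r}")
        else:
            sections[-1][1].append(line)
    return sections


def layer_of(target):
    """Map a disk_config target to "raid", "lvm" or "disk"."""
    return target if target in ("raid", "lvm") else "disk"
```

Applied to the setup above, this would give six disk sections followed by one raid and one lvm section, which is the grouping the rest of the tool needs before any commands are generated.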

Implementation

The current implementation is found at [3] and still has some TODOs left:

Missing features and regressions

  • Missing features
    • lazyformat
    • some auto mode (something like auto:server, auto:desktop?) might be desirable
    • man page
    • crypto support would be nice to have; we could implement it by an additional (optional) argument to the mount point specification. A nice tutorial on crypt setups is found at [4]
    • there needs to be a proper order of events: LVM must be torn down first, then RAID, and finally partitions, and everything must be built up in the reverse order. Right now, things are simply processed in disk, RAID, LVM order, and teardown and buildup take place right after each other within each section.
    • it should be possible to preserve logical volumes, etc. using preserve instead of giving a size
    • once everything is done, some checks should be performed, e.g., test that all partitions were created
    • Disks must be selectable by their ID or the like [5]
    • Having /boot on a SW-RAID or LVM should be tested; I think it doesn't work
  • Implementation details
    • The code of create_volume_group for the case of an existing vg must be thoroughly tested and probably improved
    • mdadm --misc --zero-superblock /dev/hdx may be necessary
    • it is unclear whether the virtual option is completely implemented; to be checked
    • move most of the comments to the end of the documented code line (saves space)
    • INTERNAL ERROR should provide more info
    • preserve must retain the partition id and the flags (bootable, etc.)
    • resize should imply resizing the filesystem as well (unless this is done by parted already, needs to be checked)
    • LVMs definitely require resizing the filesystem
    • how to detect old-style config files? migration strategies?
    • implement disklabels other than msdos and gpt
    • the RAID commands are surely incomplete and lack any management of unanticipated situations
    • try to get libparted-swig-perl and use that one instead of the manual parsing
    • volume groups may need to be cleaned, which could require some magic, see partman [6], dm_wipe_lvm
    • create a proper perl namespace
    • chop the long functions into many smaller ones
    • more error messages must be caught by shdd2-exec, e.g.,
(CMD) parted -s /dev/sdc unit TiB print 1> /tmp/xkxZndMHZ7 2> /tmp/qaYSzz9BLr
(STDERR) No Implementation: Partition 1 isn't aligned to cylinder boundaries.  This is still unsupported.
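
The "proper order of events" item in the list above can be made concrete with a small Python sketch. It only compares the two orderings described there; the function names and the layer labels "disk", "raid" and "lvm" are hypothetical and do not correspond to code in setup-storage.

```python
# Layers in the order in which they have to be built up.
BUILD_ORDER = ["disk", "raid", "lvm"]


def current_order():
    """Ordering as described for the present code: per section,
    teardown and buildup directly after each other."""
    steps = []
    for layer in BUILD_ORDER:
        steps.append(("teardown", layer))
        steps.append(("build", layer))
    return steps


def proposed_order():
    """Ordering asked for in the TODO: tear everything down from the
    top layer to the bottom one, then build up in the reverse order."""
    teardown = [("teardown", layer) for layer in reversed(BUILD_ORDER)]
    build = [("build", layer) for layer in BUILD_ORDER]
    return teardown + build
```

The difference matters because, in the current ordering, partitions are rewritten while RAID arrays or volume groups that sit on top of them may still be active.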

Changes required for integration with main-line FAI

  • libparse-recdescent-perl and liblinux-lvm-perl must get installed in nfsroot (add Depends: to fai-nfsroot package)
  • closes #380629, #330915, #277045, #356862, #416633
  • no-bug: #364763

Documentation of internal data structures

The hash of all configurations specified in the disk_config file

PHY_<DEVICE>
  virtual (0|1)
  disklabel STRING
  bootable -1..n
  partitions
    <1..n>
      size
        extended (0|1)
        preserve (0|1)
        resize (0|1)
        range
        eff_size
      number 1..n
      maps_to_existing 1..n
      start_byte
      end_byte
      mountpoint
      mount_options
      filesystem
      fs_options
      label
VG_<NAME>
  devices
  estimated_size
  volumes
    <logical-volume-name>
      size
        preserve (0|1)
        resize (0|1)
        range
        eff_size
      mountpoint
      mount_options
      filesystem
      fs_options
      label
RAID
  volumes
    <0..n>
      mode
      devices
        /dev/<device-name>
          options
            spare (0|1)
            missing (0|1)
      mountpoint
      mount_options
      filesystem
      fs_options
      label
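
As an illustration of the outline above, the following Python sketch builds the PHY_ entry for the first three partitions of the hda example as a nested dictionary (the real structure is a Perl hash). Only the key names are taken from the outline. The exact form of the device key, storing the raw size string under "range", and leaving not-yet-computed values as None are assumptions made for this example.

```python
def new_partition(number, mountpoint, size, filesystem, mount_options,
                  fs_options=""):
    """Build one entry of PHY_<DEVICE>.partitions (hypothetical helper)."""
    return {
        "size": {
            "extended": 0,
            "preserve": 0,
            "resize": 0,
            "range": size,     # assumption: raw size field from the file
            "eff_size": None,  # assumption: set once sizes are computed
        },
        "number": number,
        "maps_to_existing": None,
        "start_byte": None,
        "end_byte": None,
        "mountpoint": mountpoint,
        "mount_options": mount_options,
        "filesystem": filesystem,
        "fs_options": fs_options,
        "label": None,
    }


config = {
    # assumption: the key is PHY_ followed by the full device path
    "PHY_/dev/hda": {
        "virtual": 0,
        "disklabel": "msdos",
        "bootable": 3,
        "partitions": {
            1: new_partition(1, "/boot", "20-100", "ext3", "rw"),
            2: new_partition(2, "swap", "1000", "swap", "sw"),
            3: new_partition(3, "/", "12000", "ext3", "rw", "-b 2048"),
        },
    },
}
```

Entries for VG_<NAME> and RAID would follow the same pattern with the keys listed in the outline.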

The current disk configuration

<DEVICE>
  bios_cylinders
  bios_heads
  bios_sectors_per_track
  sector_size
  disklabel
  begin_byte
  end_byte
  partitions
    <1..n>
      begin_byte
      end_byte
      count_byte
      is_extended
      filesystem

The current LVM configuration

<VG>
  physical_volumes
  size
  volumes
    <lv-name>
      size

The current RAID configuration

<0..n>
  devices
  mode

Other implementations for RAID and LVM

Meanwhile you might also want to look at