bilibop-lockfs
--------------

1. OVERVIEW
===========
Bilibop-lockfs is a collection of shell scripts whose main goal is to
make it possible to use the system without modifying it in any way:
high-level write access is disallowed by mounting filesystems as
read-only branches of aufs(5), forcing all changes to be written on the
temporary writable branches, i.e. in RAM; additionally, low-level write
access is also forbidden by setting the block devices (including the
whole disk itself) read-only, with the 'read_only_volume_list' setting
in lvm.conf(5) for Logical Volumes, and with blockdev(8) for all
others.

Here, 'without modifying it in any way' means no log files, no cookies,
no timestamp or data changes, but also no modification of the boot
sectors or partition table, no changes to the LUKS headers, LVM
metadata, and so on.

This package was initially designed to be installed on operating
systems embedded on removable and writable media, such as flash memory
sticks and external HDDs. Now, *bilibop-lockfs* can also be used on any
internal disk (HDD or SSD), provided the root filesystem is not hosted
on more than one disk (by using LVM or RAID). It can be used:

- to avoid decreasing the lifetime of the flash media (USB stick, SD
  card, Solid State Disk) the system is installed on (such media have
  a limited number of write cycles).

- to perform tests: anything that is not the initial ramdisk can be
  temporarily modified at system or user level and then tested without
  any risk of affecting the original configurations: these will be
  active again after the next boot.

- as a tool in anti-forensics strategies, as explained above.

NOTE that the bilibop-lockfs scripts depend on the bilibop-common
functions, which may need Linux kernel 2.6.37 or higher to work
properly. See the bilibop-common documentation for details.

2. CONFIGURATION
================
All available configuration options are described in the
bilibop.conf(5) manual page (a sample snippet is shown after the list
below). They make it possible to:

- Enable/disable 'lockfs' from the configuration file, from the boot
  command line, or by physically locking the drive with a switch. As a
  last resort, a heuristic is used to enable 'lockfs' on USB sticks.

- Apply a hard policy (as described above; this is the default), or a
  soft policy allowing the admin to manually modify both high-level
  and low-level data or metadata. If the drive is physically locked,
  the hard policy is automatically applied; in fact, in such a case
  there is no choice, but this avoids some errors by preventing write
  attempts on the drive by some low-level programs (e2fsck, for
  example).

- Disable 'lockfs' for specified mountpoints or filesystems only. This
  is a 'whitelist' based feature. Obviously, it is bypassed when the
  drive is physically locked.

- Apply a specific policy for swap filesystems: use them as they are
  set; enable them manually; don't use them at all; enable only
  encrypted swap devices; or even enable only swap devices encrypted
  with a random key. Here again, the settings are overridden if the
  drive is physically locked.

- Send a notification to the user about the 'lockfs' status. This can
  be done both during system boot and at desktop session startup. At
  boot time (through Plymouth), a message says that bilibop-lockfs is
  enabled (with the hard or soft policy) or not, or that an error
  occurred. At desktop session startup, a notification is sent to say
  whether filesystems are locked or not. More exactly, the
  notifications say whether changes under such or such directories
  will be kept or lost at shutdown. See the lockfs-notify(1) manpage
  for details.
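As an illustration of these options, here is a small, hypothetical
bilibop.conf(5) fragment (usually /etc/bilibop/bilibop.conf). The
values are examples only; the exact syntax of each variable is
described in the manual page:

        # Enable bilibop-lockfs with the permissive policy:
        BILIBOP_LOCKFS="true"
        BILIBOP_LOCKFS_POLICY="soft"
        # Never lock these mountpoints:
        BILIBOP_LOCKFS_WHITELIST="/home /var/spool/apt-mirror"
        # Allocate 40% of the RAM to the tmpfs used for /:
        BILIBOP_LOCKFS_SIZE="/=40%"
        # Only use encrypted swap devices:
        BILIBOP_LOCKFS_SWAP_POLICY="crypt"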
3. BOOT OPTIONS
===============
Several variables can be set or overridden from the boot command line,
with the following keywords/parameters (the last one given on the boot
command line overrides the previous ones):

- 'nolockfs'
        disable bilibop-lockfs features:
                BILIBOP_LOCKFS="false"

- 'lockfs'
        enable bilibop-lockfs features:
                BILIBOP_LOCKFS="true"

- 'lockfs=force'
        enable bilibop-lockfs features even when the system boots in
        single-user mode:
                BILIBOP_LOCKFS="true"

- 'lockfs=hard'
        enable bilibop-lockfs features with the restrictive policy:
                BILIBOP_LOCKFS="true"
                BILIBOP_LOCKFS_POLICY="hard"

- 'lockfs=soft'
        enable bilibop-lockfs features with the permissive policy:
                BILIBOP_LOCKFS="true"
                BILIBOP_LOCKFS_POLICY="soft"

- 'lockfs=SIZE'
        enable bilibop-lockfs features, and allocate SIZE of tmpfs for
        the root filesystem (/). SIZE must be digits (not beginning
        with zero) suffixed with 'k', 'K', 'm', 'M', 'g', 'G' or '%'.
        Default is 50% of the RAM:
                BILIBOP_LOCKFS="true"
                BILIBOP_LOCKFS_SIZE="/=SIZE"

- 'lockfs=all'
        enable bilibop-lockfs features and blank the list of devices
        not to lock:
                BILIBOP_LOCKFS="true"
                BILIBOP_LOCKFS_WHITELIST=""

- 'lockfs=default'
        enable bilibop-lockfs features with their default values; all
        settings of the configuration file will be overridden:
                BILIBOP_LOCKFS="true"
                BILIBOP_LOCKFS_POLICY="hard"
                BILIBOP_LOCKFS_WHITELIST=""
                BILIBOP_LOCKFS_SWAP_POLICY=""  (falls back to 'hard' or 'crypt')
                BILIBOP_LOCKFS_SIZE=""         (means '50%' for each tmpfs)
                BILIBOP_LOCKFS_NOTIFY_POLICY=""

- 'lockfs=-/foobar'
        enable bilibop-lockfs features and add /foobar to the list of
        whitelisted mountpoints:
                BILIBOP_LOCKFS="true"
                BILIBOP_LOCKFS_WHITELIST="${BILIBOP_LOCKFS_WHITELIST} /foobar"

- 'noswap'
        if bilibop-lockfs is enabled, apply a more restrictive policy
        than the checkroot.sh initscript does: comment out the lines
        about swap in /etc/fstab, and also in /etc/crypttab if
        necessary:
                BILIBOP_LOCKFS_SWAP_POLICY="hard"

Unknown keywords are silently ignored. Parameters can be combined,
separated by commas; for example:

        lockfs=soft,30%,all

will apply a soft policy to all mountpoints, allocating 30% of the RAM
to /, and

        lockfs=default,-/var/spool/apt-mirror

will reset all settings to their default values, and then whitelist
the /var/spool/apt-mirror mountpoint. My favorite.
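Where exactly these keywords are set depends on the bootloader. As a
sketch only: with GRUB 2 they would typically be appended to the
kernel command line in /etc/default/grub (the 'lockfs=soft,30%' value
below is just an example), followed by a run of update-grub:

        # /etc/default/grub (excerpt)
        GRUB_CMDLINE_LINUX_DEFAULT="quiet lockfs=soft,30%"

For a single boot, the same string can instead be added by editing the
kernel command line from the bootloader menu.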
4. HOW IT WORKS
===============
If:     one of the devices registered in fstab(5) is a Logical Volume,
and if: the keyword 'nolockfs' is not used on the boot command line,
or:     a drive is physically locked (this takes precedence over the
        'nolockfs' boot option),
then:   a first initramfs script is used to modify lvm.conf(5) inside
        the initramfs BEFORE the Logical Volumes are activated; they
        are then activated read-only (their metadata are not updated,
        and the block devices are set read-only).

After that, once the root of the system has been discovered and
mounted from the initramfs, another initramfs script (the main one) is
used to lock the root filesystem and all its parent block devices
(virtual or physical). Other mountpoints are managed by a mount helper
script.

Here is an example of a partition scheme on a 16GB USB stick (these
are the outputs of the drivemap(1) command, when bilibop-lockfs is
disabled):

        $ drivemap -m /
        /dev/sdb
            /dev/sdb1
                /dev/dm-0
                /dev/dm-1
                    /dev/dm-2
                        /dev/dm-3(*)
                        /dev/dm-4
            /dev/sdb2

        $ drivemap -pin /
        /dev/sdb [ usb-_Xporter_Memory_07B3100100182DD2-0:0 | 16GB ]
        /dev/sdb1 ............................... [ LVM2_member | 8GB ]
          /dev/mapper/xporter-boot ................... [ ext3 | 255MB ]  /boot
          /dev/mapper/xporter-luks ............ [ crypto_LUKS | 7751MB ]
            /dev/mapper/peevee .............. [ LVM2_member | 7750MB ]
              /dev/mapper/veegee-root ............ [ ext4 | 6996MB ]  /
              /dev/mapper/veegee-home ............ [ ext4 | 750MB ]  /home
        /dev/sdb2 ...................................... [ vfat | 8GB ]

The first primary partition (/dev/sdb1) is a Physical Volume used as a
member of the Volume Group 'xporter', which is divided into two
Logical Volumes: 'boot' and 'luks'. /dev/mapper/xporter-luks (or
/dev/xporter/luks) contains a Physical Volume 'peevee' used as a
member of the Volume Group 'veegee', which contains two Logical
Volumes: 'root' and 'home'. The second primary partition is intended
to be mountable, readable and writable on any computer: it contains a
vfat (FAT32) filesystem of 8 GB and is not automatically mounted (it
is not registered in /etc/fstab).

4.1. First stage
----------------
Once the device that is normally used as the root of the system has
been discovered and mounted read-only on a temporary mountpoint
(stored in the 'rootmnt' variable) in the initramfs environment, the
bilibop-lockfs script is executed.

a. It checks whether the 'lockfs' feature is enabled or not. If not,
   it exits. One of the checks verifies whether the drive is
   physically locked; if it is, all BILIBOP_LOCKFS_* variables are
   reset to values compatible with the physical lock, and stored in
   /run/bilibop/plocked.

b. It checks whether ${rootmnt} is already an aufs mountpoint. If yes,
   it exits. This is done to avoid conflicts with other programs such
   as 'fsprotect'.

c. It checks the policy to apply: 'hard' or 'soft'. If 'hard', the
   block device mounted on ${rootmnt} and the drive hosting this
   device are set read-only. Additionally, all parent block devices of
   the root device are set read-only too. With the partition scheme
   described above, this gives:

        sdb __ sdb1 __ dm-1 __ dm-2 __ dm-3 : RO (disk > PV > LV=LUKS > PV > LV=/)
         |      |               |____ dm-4 : rw (/home)
         |      |____ dm-0 : rw (/boot)
         |____ sdb2 : rw (FAT32)

   Now the root filesystem (/dev/dm-3 on ${rootmnt}) is fully
   protected: it is not possible to dd(1) or otherwise write to dm-3,
   dm-2 (which contains dm-3), dm-1 (which contains dm-2), sdb1 (which
   contains dm-1) nor sdb (which contains sdb1). At this step, only
   three block devices are not yet locked: /dev/sdb2, /dev/dm-0 and
   /dev/dm-4.

d. Several mount operations are performed, sometimes with the --bind
   or --move options, so that ${rootmnt} becomes an aufs mountpoint
   with dm-3 (/dev/mapper/veegee-root) mounted on ${rootmnt}/aufs/ro
   as the lower, read-only branch, and a tmpfs mounted on
   ${rootmnt}/aufs/rw as the upper, writable branch. If the global
   policy is 'soft', the lower branch is set 'ro'; otherwise it is set
   'rr' (real readonly). If, for any reason, something goes wrong, all
   that has been done before is undone (especially the blockdev
   commands) and the boot process continues as if bilibop-lockfs was
   disabled. An error message is sent to plymouth. Steps (c) and (d)
   are illustrated by the sketch at the end of this subsection.

e. Two files are created:
   - /run/bilibop/lock is a marker: if it doesn't exist, some
     bilibop-lockfs helper scripts will exit immediately. It is also
     used to store the list of files modified by bilibop-lockfs.
   - ${rootmnt}/fastboot is also a marker: if it exists, filesystem
     checks at startup are skipped.

f. ${rootmnt}/etc/fstab is modified:
   - The entry about the root filesystem is commented out, to forbid
     any further management of / by the initscripts.
   - Entries about swap devices are kept as is, commented out, or
     modified, depending on the policy to apply (soft, noauto, crypt,
     random, hard).
   - Entries about mountpoints that have not been whitelisted in
     bilibop.conf(5) are modified: the fstype (third field) is
     replaced by 'lockfs', and the options are also modified to
     remember the real fstype to use. So the original line:

        UUID=a82267c0-fe18-6c44-0acf-d11a5904d7ae /boot ext3 noatime,nodev,noexec,nosuid 0 2

     is commented out and replaced by:

        UUID=a82267c0-fe18-6c44-0acf-d11a5904d7ae /boot lockfs fstype=ext3,noatime,nodev,noexec,nosuid 0 0

   NOTE that because some filesystems may not exist at this time,
   filesystem metadata such as LABEL, UUID or TYPE cannot be queried
   to know whether a filesystem is whitelisted or not. Only
   mountpoints, devices and metadata matching the fstab entries are
   checked at this step. With the previous example, this means that if
   you don't want the /boot entry in fstab to be modified, you should
   whitelist '/boot' or 'UUID=a82267c0-fe18-6c44-0acf-d11a5904d7ae'.
   '/dev/mapper/xporter-boot' or 'LABEL=boot' will not work here, and
   'TYPE=ext3' is too generic and can match other mountpoints.

g. ${rootmnt}/etc/lvm/lvm.conf is modified (optional):
   Due to the power of the LVM commands, a last step can be necessary
   when BILIBOP_LOCKFS_POLICY is not set to 'soft'. Some commands such
   as 'vgchange -ay', which is run by the lvm2 initscript, can reset
   the 'ro' flag on Logical Volumes. This would break the lockfs
   'hard' policy. To avoid this infamous result, the lvm.conf(5) file
   is modified:
   - in the 'global' section:
        locking_type = 4
        metadata_read_only = 1
   - in the 'activation' section: the content of
     (initrd)/etc/lvm/bilibop is used to set 'read_only_volume_list'.
   - in the 'devices' section: the PV we want to protect from further
     LVM investigations are filtered out with the 'filter' option.

   The 'read_only_volume_list' variable applies to Logical Volumes;
   the 'filter' variable applies to Physical Volumes. Modifying both
   'read_only_volume_list' and 'filter' is a kind of defense in depth.
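The following is only a minimal sketch of what steps (c) and (d)
amount to, using the devices of the example layout above; the real
initramfs script computes the device list dynamically, honours
BILIBOP_LOCKFS_POLICY and BILIBOP_LOCKFS_SIZE, and undoes everything
on failure. The branch paths are simplified here:

        # Step (c): set the root device and all of its parents
        # read-only, from the leaf device up to the whole disk:
        for dev in /dev/dm-3 /dev/dm-2 /dev/dm-1 /dev/sdb1 /dev/sdb
        do
                blockdev --setro "$dev"
        done

        # Step (d): build the aufs union. The device already mounted
        # read-only on ${rootmnt} becomes the 'rr' lower branch, and a
        # tmpfs becomes the writable upper branch. (The real script
        # uses additional --bind/--move operations so that both
        # branches finally appear under ${rootmnt}/aufs/.)
        mkdir -p /aufs/ro /aufs/rw
        mount -t tmpfs -o mode=0755 tmpfs /aufs/rw
        mount --move "${rootmnt}" /aufs/ro
        mount -t aufs -o br:/aufs/rw=rw:/aufs/ro=rr aufs "${rootmnt}"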
4.2. Second stage
-----------------
Now /, the root of the system, is what was previously named
${rootmnt}. /sbin/init is running and the initscripts are executed.
Due to the changes in lvm.conf, the 'vgchange -ay' from the lvm
initscript has no effect on the protected devices: they are not even
seen. When 'mount -a' is called, it parses /etc/fstab and, for each
entry it encounters with a 'lockfs' filesystem type, it calls the
helper mount.lockfs(8) with the proper options and arguments.
/sbin/mount.lockfs does something very close to what the initramfs
script did for the root of the system. This mount helper script cannot
be used manually.

a. If the parent process of the script is not /bin/mount, it exits
   immediately.

b. It checks that:
   - / is an aufs mountpoint;
   - /run/bilibop/lock exists;
   - what has to be mounted is really a block device, or a regular
     file (that will be associated to a loop device);
   - the filesystem to mount is not whitelisted.

   If one of these tests fails, a normal mount is executed and the
   corresponding entry in /etc/fstab is replaced by something very
   close to the original one, to reflect the actual mount. We call
   this a 'mount_fallback'.

   'Very close'? Since mount(8) resolves the device name when it is
   called with a LABEL or UUID, the first argument given to the mount
   helper is always the device name (or a symlink to it), never
   LABEL=* or UUID=*, even if the fstab entry uses that format.
   Options are preserved. So, if the original line was:

        UUID=a82267c0-fe18-6c44-0acf-d11a5904d7ae /boot ext3 noatime,nodev,noexec,nosuid 0 2

   and was replaced by the initramfs script with:

        UUID=a82267c0-fe18-6c44-0acf-d11a5904d7ae /boot lockfs fstype=ext3,noatime,nodev,noexec,nosuid 0 0

   then in case of 'mount_fallback' the new line is:

        /dev/mapper/xporter-boot /boot ext3 noexec,nosuid,nodev,noatime 0 0

   This can happen if '/dev/mapper/xporter-boot' or 'LABEL=boot' has
   been whitelisted instead of
   'UUID=a82267c0-fe18-6c44-0acf-d11a5904d7ae' or simply '/boot': the
   bilibop-lockfs initramfs script does not understand that this
   device is whitelisted and modifies the corresponding fstab entry;
   afterwards the mount helper script, understanding that the device
   is whitelisted, restores the fstab entry. NOTE that the replacement
   of the last field (here '2') by '0' is less than minor: the
   /fastboot file created by the initramfs script already disables
   filesystem checks.

c. Now the script checks whether the global policy is 'hard' or
   'soft'. If it is 'hard', the block device is set read-only with
   blockdev(8). If '/usr/local' is the target mountpoint, the
   read-only branch is mounted on /aufs/ro/usr/local. '/aufs/ro' is
   the mountpoint of the read-only branch of the root filesystem and
   is used as prefix for all other mountpoints of read-only branches.
   If the mount fails, what has been done before is undone, and a
   'mount_fallback' is executed (see above).

d. The script checks whether a specific size has to be allocated to
   the writable branch, creates the mountpoint for the writable branch
   and mounts it with the proper options (nodev, noexec, nosuid and
   ro, if they were specified in the original fstab entry). If
   '/usr/local' is the target mountpoint, the writable branch is
   mounted on /aufs/rw/usr/local. '/aufs/rw' is the mountpoint of the
   writable branch of the root filesystem and is used as prefix for
   all other mountpoints of writable branches. If the mount fails,
   what has been done before is undone, and a 'mount_fallback' is
   executed. The ownership and permissions of the writable branch are
   modified if necessary, to match those of the read-only branch.

e. The aufs is mounted. If the global lockfs policy is 'hard', the
   read-only branch is set 'rr' instead of 'ro'. If the mount fails,
   what has been done before is undone, and a 'mount_fallback' is
   executed.

f. The last step is to modify /etc/fstab to make it match
   /proc/mounts: this can be important for clean unmounts at shutdown,
   in case a read-only filesystem has been remounted 'rw' during the
   session (this needs the global policy, BILIBOP_LOCKFS_POLICY, to be
   set to 'soft', or blockdev(8) to be run manually to set the block
   device writable again). The entry corresponding to the target
   mountpoint is replaced by a block of three lines: the read-only
   branch, the writable branch, and the aufs itself, as illustrated
   below.
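For the /boot example used throughout this document, that block of
three lines could look roughly like this (a purely hypothetical
illustration: the actual lines are derived from /proc/mounts, so the
real device names and option strings may differ):

        /dev/mapper/xporter-boot /aufs/ro/boot ext3  ro,noatime,nodev,noexec,nosuid 0 0
        tmpfs                    /aufs/rw/boot tmpfs rw,nodev,noexec,nosuid         0 0
        none                     /boot         aufs  rw                             0 0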
4.3. Results
------------
bilibop-lockfs is enabled with its default options (bilibop.conf is
empty):

        $ drivemap -i /
        /dev/sdb [ usb-_Xporter_Memory_07B3100100182DD2-0:0 | 16GB ]
        /dev/sdb1 ............................... [ LVM2_member | 8GB ]
          /dev/dm-0 .................................. [ ext3 | 255MB ]  /aufs/ro/boot
          /dev/dm-1 ........................... [ crypto_LUKS | 7751MB ]
            /dev/dm-2 ....................... [ LVM2_member | 7750MB ]
              /dev/dm-3 .......................... [ ext4 | 6996MB ]  /aufs/ro
              /dev/dm-4 .......................... [ ext4 | 750MB ]  /aufs/ro/home
        /dev/sdb2 ...................................... [ vfat | 8GB ]

        $ for i in /dev/sdb* /dev/dm-[0-4] ; do printf "$i\tro=" ; cat /sys/class/block/${i##*/}/ro ; done
        /dev/sdb        ro=1
        /dev/sdb1       ro=1
        /dev/sdb2       ro=0
        /dev/dm-0       ro=1
        /dev/dm-1       ro=1
        /dev/dm-2       ro=1
        /dev/dm-3       ro=1
        /dev/dm-4       ro=1

This last command says that /dev/sdb2 (vfat filesystem, not listed in
fstab) is writable, and that all the other block devices are
read-only.
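The read-only flag of a given device can also be queried with
blockdev(8), which usually needs root privileges; for instance, with
the state shown above one would expect:

        # blockdev --getro /dev/sdb
        1
        # blockdev --getro /dev/sdb2
        0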
 -- bilibop project  Tue, 17 Apr 2012 03:03:52 +0200

 -- bilibop project  Sun, 27 Oct 2013 04:48:23 +0000