1.3. System Prerequisites

LOCKSS requires a Linux host (either a physical machine or a virtual machine), with at least some locally-attached storage. This section discusses the host (CPU, memory, Linux, etc.) and storage prerequisites for LOCKSS 2.0-beta2 NOT YET RELEASED, which vary depending on the scope of your application.

1.3.1. Host Prerequisites

1.3.1.1. CPU Prerequisites

LOCKSS 2.0-beta2 NOT YET RELEASED runs on a 64-bit CPU with at least 4 CPU cores, preferably 8, depending on how many Stack Components you choose to run.
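
As a rough illustration of the CPU guidance above, the following sketch classifies a core count against the stated minimum (4) and preferred (8) values. The function name classify_cores is hypothetical and not part of LOCKSS. The core count is passed in as an argument so the logic can be tried with any value.

```shell
# Classify a CPU core count against the guidance above:
# at least 4 cores are required, 8 are preferred.
# classify_cores is a hypothetical helper, not a LOCKSS tool.
classify_cores() {
    if [ "$1" -ge 8 ]; then
        echo "preferred"
    elif [ "$1" -ge 4 ]; then
        echo "minimum"
    else
        echo "below minimum"
    fi
}

classify_cores 2   # prints: below minimum
classify_cores 4   # prints: minimum
classify_cores 8   # prints: preferred
```

On a Linux host with GNU coreutils, the actual core count can be supplied with classify_cores "$(nproc)".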

1.3.1.2. Memory Prerequisites

Memory requirements likewise depend on which Stack Components you choose to run. We recommend 32 GB of memory for typical applications, and more for hosts involved in compute-intensive applications like the Global LOCKSS Network (GLN) or CLOCKSS.

1.3.1.3. Operating System Prerequisites

LOCKSS requires a Linux distribution that is compatible with K3s, the Kubernetes distribution the LOCKSS stack runs on. More specifically, LOCKSS 2.0-beta2 NOT YET RELEASED uses K3s 1.31.

This prerequisite is met by operating systems listed in the Compatible Operating Systems appendix (with some adjustments for some versions), including AlmaLinux OS, Arch Linux, CentOS Stream, Debian, Fedora Linux, Linux Mint, OpenSUSE Leap, OpenSUSE Tumbleweed, Oracle Linux, Red Hat Enterprise Linux (RHEL), Rocky Linux, SUSE Linux Enterprise Server (SLES), and Ubuntu.

1.3.1.4. Linux Kernel Prerequisites

Most versions of Linux distributions in the Compatible Operating Systems appendix satisfy the Linux kernel prerequisites for LOCKSS 2.0-beta2 NOT YET RELEASED out of the box:

  1. Linux kernel 5.4 or later is required

    Linux kernel 5.4 or later is required. You can check the Linux kernel version by typing the following at the host's console:

    uname --kernel-release

    • If version 5.4 or later is output, then the host satisfies the Linux kernel version requirement.

    • If version 5.3 or earlier is output, then the host does not satisfy the Linux kernel version requirement.

    If the host does not satisfy the Linux kernel version requirement, see Upgrading to Linux Kernel 5.4 or Later.

    Tip

    Upgrading to Linux kernel 5.4 or later is expected to be necessary on some operating systems in the RHEL 8 family (AlmaLinux OS 8, Oracle Linux 8, Red Hat Enterprise Linux (RHEL) 8, and Rocky Linux 8).

  2. ip_tables loadable kernel module is required

    The ip_tables loadable kernel module is required, even if iptables or nftables are not installed on the host.

    You can check if the ip_tables loadable kernel module is installed by typing:

    modinfo ip_tables
    
    • If technical information about the module is output, then the host satisfies the ip_tables loadable kernel module requirement.

      Example
      filename:       /lib/modules/6.19.11-arch1-1/kernel/net/ipv4/netfilter/ip_tables.ko.zst
      description:    IPv4 packet filter
      author:         Netfilter Core Team <coreteam@netfilter.org>
      license:        GPL
      srcversion:     E76A147B434207E5FEF1739
      depends:        x_tables
      intree:         Y
      name:           ip_tables
      retpoline:      Y
      vermagic:       6.19.11-arch1-1 SMP preempt mod_unload 
      sig_id:         PKCS#7
      signer:         Build time autogenerated kernel key
      sig_key:        04:9E:10:5D:31:B6:6A:31:CC:0C:00:55:DC:F9:CB:CE:3A:CF:8D:E4
      sig_hashalgo:   sha512
      signature:      30:65:02:30:06:32:B4:5C:28:11:85:0F:16:A3:96:DB:27:85:B8:BA:
      		9E:AA:06:53:63:74:B4:EE:94:A9:6A:7F:7D:1E:76:6D:40:3E:F5:C4:
      		BE:55:54:1D:11:06:E4:71:12:27:25:C0:02:31:00:B0:51:4E:5D:F4:
      		59:7B:39:17:40:DA:F3:E6:2D:76:E2:44:AE:B8:6D:14:BA:5F:DD:FB:
      		48:A6:E4:5F:33:91:72:3E:BA:BC:67:01:C9:F7:4D:D7:C9:AC:F4:A3:
      		FC:6D:B3
      
    • If an error message similar to "modinfo: ERROR: Module ip_tables not found." is output, then the host does not satisfy the ip_tables loadable kernel module requirement.

    If the host does not satisfy the ip_tables loadable kernel module requirement, see Installing the ip_tables Loadable Kernel Module.

    Tip

    Installing the ip_tables loadable kernel module is expected to be necessary on some operating systems in the RHEL 10 family (AlmaLinux OS 10, Oracle Linux 10, Red Hat Enterprise Linux (RHEL) 10, and Rocky Linux 10).
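
The two kernel checks above can be combined into a small script. This is a minimal sketch, not an official LOCKSS tool: the function names kernel_status and ip_tables_status are hypothetical, only the leading "major.minor" part of the release string is compared, and the second sample release string is illustrative.

```shell
# Print "ok" if a kernel release string (as printed by
# `uname --kernel-release`) is at least 5.4, otherwise "too old".
# Hypothetical helper; only the leading "major.minor" is compared.
kernel_status() (
    release=$1
    major=${release%%.*}        # "6.19.11-arch1-1" -> "6"
    rest=${release#*.}          # "6.19.11-arch1-1" -> "19.11-arch1-1"
    minor=${rest%%[!0-9]*}      # "19.11-arch1-1"   -> "19"
    if [ "$major" -gt 5 ] || { [ "$major" -eq 5 ] && [ "$minor" -ge 4 ]; }; then
        echo "ok"
    else
        echo "too old"
    fi
)

# Print "ok" if modinfo can find the ip_tables module, otherwise "missing".
# Also prints "missing" if modinfo itself is not installed.
ip_tables_status() {
    if modinfo ip_tables >/dev/null 2>&1; then
        echo "ok"
    else
        echo "missing"
    fi
}

kernel_status "6.19.11-arch1-1"            # prints: ok
kernel_status "4.18.0-553.el8_10.x86_64"   # prints: too old
```

On the host itself, the checks would be run as kernel_status "$(uname --kernel-release)" and ip_tables_status.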

1.3.1.5. System Software Prerequisites

Both Installing LOCKSS and Upgrading LOCKSS use the LOCKSS Downloader, which has basic system software prerequisites that are met by almost any Linux distribution:

  1. Curl or Wget is required

    At least one of Curl or Wget is required. You can check by typing curl --version or wget --version at the host's command line.

    • If either outputs a valid version message, then the host satisfies the fetcher requirement.

    • If both output an error message, then the host does not satisfy the fetcher requirement. See Installing Curl or Installing Wget.

  2. Tar and Unzip are required

    Both Tar (tar or gtar) and Unzip (unzip) are required. You can check by typing tar --version (or gtar --version) and unzip --version at the host's command line.

    • If both output a valid version message, then the host satisfies the archiver requirement.

    • If either outputs an error message, then the host does not satisfy the archiver requirement. See Installing tar or Installing unzip.
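
The fetcher and archiver checks above can be scripted. The sketch below is an approximation under one stated assumption: it tests only whether each command is found on the PATH (using command -v), not whether its version message is valid. The helper names have_any and have_all are hypothetical.

```shell
# Succeed if at least one of the named commands is on the PATH.
have_any() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 && return 0
    done
    return 1
}

# Succeed only if every named command is on the PATH.
have_all() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || return 1
    done
    return 0
}

# Fetcher requirement: Curl or Wget.
if have_any curl wget; then
    echo "fetcher: ok"
else
    echo "fetcher: missing (install curl or wget)"
fi

# Archiver requirement: Tar (tar or gtar) and Unzip.
if have_any tar gtar && have_all unzip; then
    echo "archiver: ok"
else
    echo "archiver: missing (install tar and unzip)"
fi
```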

1.3.2. Storage Prerequisites

The LOCKSS system makes use of three kinds of storage:

Storage type       Primary use                                          Prerequisites section
-----------------  ---------------------------------------------------  -------------------------------------------------
System storage     Software and related assets                          Section 1.3.2.1 (System Storage Prerequisites)
Operating storage  Internal stack data (other than preserved content)   Section 1.3.2.2 (Operating Storage Prerequisites)
Content storage    Preserved content                                    Section 1.3.2.3 (Content Storage Prerequisites)

1.3.2.1. System Storage Prerequisites

The LOCKSS stack's system storage is the storage space needed for installed software, downloaded containers, data generated by K3s, etc.

The LOCKSS stack makes use of system storage in three ways:

System storage area           Primary use                                                         Configuration step
----------------------------  ------------------------------------------------------------------  ---------------------------------------------
K3s data directory            Downloaded containers, K3s configuration                            Section 2.3.8 (Installing K3s)
LOCKSS Installer Directory    LOCKSS Installer software, LOCKSS stack configuration               Section 2.2.3 (Running the LOCKSS Downloader)
Miscellaneous system storage  System software that may be installed as part of Installing LOCKSS  n/a

The most significant portion of system storage used by the LOCKSS stack is the K3s data directory.

System storage prerequisites are as follows:

  1. K3s data directory must be local

    The K3s data directory must be local; that is, it cannot be backed by NFS or other non-local filesystems.

  2. K3s data directory cannot be backed by legacy XFS with ftype=0

    The K3s data directory cannot be backed by a legacy XFS filesystem with ftype=0.

    This is expected to be an issue only for LOCKSS 1.x installations on XFS filesystems old enough to have ftype=0 that are moving to LOCKSS 2.0-beta2 NOT YET RELEASED through a same-host migration. Alternatives in this case include a new-host migration or a potential workaround; see Troubleshooting OverlayFS with XFS.

  3. K3s data directory size requirements

    The K3s data directory can grow large and requires at least 50 GB of space.
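
The three system storage prerequisites above can be checked with a short script. This is a sketch under several assumptions, each also marked in the comments: the helper names are hypothetical, the list of network filesystem types is illustrative rather than exhaustive, 1 GB is treated as 1024 × 1024 KiB, the 50 GB figure is compared against the filesystem's total size, and df is assumed to be the GNU coreutils version (for --output). The path /var/lib/rancher/k3s in the usage note is K3s's usual default data directory; your installation may use a different one.

```shell
# Succeed if a filesystem type name denotes a network filesystem.
# Assumption: this list is illustrative, not exhaustive.
is_remote_fstype() {
    case "$1" in
        nfs|nfs4|cifs|smb3|ceph|glusterfs|fuse.glusterfs|fuse.sshfs|9p) return 0 ;;
        *) return 1 ;;
    esac
}

# Succeed if SIZE_KB (1 KiB blocks) is at least MIN_GB.
# Assumption: 1 GB is treated as 1024 * 1024 KiB.
has_min_gb() {
    [ "$1" -ge $(( $2 * 1024 * 1024 )) ]
}

# Read xfs_info output on standard input and print the ftype value (0 or 1).
xfs_ftype() {
    sed -n 's/.*ftype=\([01]\).*/\1/p'
}

# Report on a candidate K3s data directory (requires GNU df).
check_k3s_data_dir() (
    dir=$1
    # Filesystem type and total size in 1 KiB blocks, header line dropped.
    set -- $(df -k --output=fstype,size "$dir" | tail -n 1)
    fstype=$1
    size_kb=$2
    if is_remote_fstype "$fstype"; then
        echo "not local ($fstype)"
    else
        echo "local ($fstype)"
    fi
    if has_min_gb "$size_kb" 50; then
        echo "size: ok"
    else
        echo "size: under 50 GB"
    fi
)
```

For example, check_k3s_data_dir /var/lib/rancher/k3s reports on the default location. On an XFS filesystem, the ftype value can be read by piping xfs_info for the relevant mount point into xfs_ftype.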

1.3.2.2. Operating Storage Prerequisites

The LOCKSS stack's operating storage is the storage space devoted to its internal operating needs, such as database data, state files, log files, temporary files, etc.

Operating storage consists of three storage areas:

Operating storage area   Primary use                             Configuration step
-----------------------  --------------------------------------  ------------------------------------------------
State data storage area  Database data, state files              Section 4.8.2 (State Data Storage Area Settings)
Log storage area         Log files                               Section 4.8.3 (Log Storage Area Settings)
Temporary storage area   Temporary files and other working data  Section 4.8.4 (Temporary Storage Area Settings)

Operating storage prerequisites are as follows:

  1. Local operating storage is strongly recommended

    Local storage is strongly recommended for all operating storage. NFS and other non-local filesystems are strongly discouraged, because remote filesystems for operating storage can negatively impact the performance of the LOCKSS stack.

  2. State data storage area size requirements

    State data storage usage is dominated by the PostgreSQL database, which can grow to very different sizes depending on the scope of your application, for example how many individual artifacts (data objects) are preserved in the LOCKSS Repository Service and to what extent the LOCKSS Metadata Service is used. We propose the following guidelines:

    • Small state data profile: A modest application with thousands of archival units (AUs), reasonable file sizes (for example, less than 2 GB), and a reasonable number of files per AU. For this profile, we recommend at least 50 GB of state data storage.

    • Medium state data profile: An application that is expected to exceed one of the parameters for a small state data profile, perhaps involving tens of thousands of AUs, some large files in the tens of gigabytes, or AUs with tens of thousands of files. For this profile, we recommend at least 150 GB of state data storage.

    • Large state data profile: An intensive application with extensive metadata extraction, perhaps like the Global LOCKSS Network (GLN) or CLOCKSS. For this profile, we recommend at least 500 GB of state data storage.

  3. Temporary storage area size requirements

    Depending on the characteristics of the preservation activities undertaken by the system, content processing may require a substantial amount of temporary space. As a guideline, we recommend either 50 GB of temporary storage, or about three times as much temporary storage as the largest individual file expected to be preserved, whichever is larger.
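
The sizing guidelines above reduce to simple arithmetic, sketched below. The function names state_storage_gb and temp_storage_gb are hypothetical, and sizes are whole gigabytes. The temporary storage figure is the larger of 50 GB and three times the largest expected file.

```shell
# Recommended minimum state data storage (GB) for a named profile.
state_storage_gb() {
    case "$1" in
        small)  echo 50 ;;
        medium) echo 150 ;;
        large)  echo 500 ;;
        *) echo "unknown profile: $1" >&2; return 1 ;;
    esac
}

# Recommended temporary storage (GB): the larger of 50 GB and
# three times the largest expected file size (in whole GB).
temp_storage_gb() {
    triple=$(( $1 * 3 ))
    if [ "$triple" -gt 50 ]; then
        echo "$triple"
    else
        echo 50
    fi
}

state_storage_gb medium   # prints: 150
temp_storage_gb 10        # prints: 50  (3 x 10 = 30, below the 50 GB floor)
temp_storage_gb 40        # prints: 120 (3 x 40 = 120)
```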

1.3.2.3. Content Storage Prerequisites

The LOCKSS stack's content storage is the storage space devoted to the content being preserved by the system, consisting of one or more content storage areas.

Content storage can be backed by NFS or other non-local filesystems, although locally-attached storage is more performant.

In total, the content storage areas need to be large enough to hold all the content to be preserved in the node.

The list of content storage areas is configurable in Section 4.8.1 (Content Storage Area Settings).

1.3.3. Miscellaneous Prerequisites

1.3.3.1. Networking Considerations

Internally, K3s uses the private subnets 10.42.0.0/16 and 10.43.0.0/16 (the IP addresses from 10.42.0.0 through 10.43.255.255) to allocate IP addresses to containers. The networking infrastructure at some institutions may include host-based firewall rules that can inadvertently block K3s' internal communication mechanisms. If your institution does this, you will need to exclude 10.42.0.0/16 and 10.43.0.0/16 (or more succinctly, 10.42.0.0/15) from these rules. What to do will vary depending on your individual situation.
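
To confirm that 10.42.0.0/15 covers exactly the two /16 subnets named above, the address range of a CIDR block can be computed with shell arithmetic. This is an illustrative sketch; the helper names to_dotted and cidr_range are hypothetical, and input validation is omitted.

```shell
# Convert a 32-bit integer to dotted-quad notation.
to_dotted() {
    echo "$(( ($1 >> 24) & 255 )).$(( ($1 >> 16) & 255 )).$(( ($1 >> 8) & 255 )).$(( $1 & 255 ))"
}

# Print the first and last IPv4 address covered by a CIDR block.
# Runs in a subshell so the IFS change does not leak to the caller.
cidr_range() (
    addr=${1%/*}
    bits=${1#*/}
    IFS=.
    set -- $addr      # split the address into its four octets
    ip=$(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
    mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
    first=$(( ip & mask ))
    last=$(( first | (~mask & 0xFFFFFFFF) ))
    echo "$(to_dotted "$first") $(to_dotted "$last")"
)

cidr_range 10.42.0.0/16   # prints: 10.42.0.0 10.42.255.255
cidr_range 10.43.0.0/16   # prints: 10.43.0.0 10.43.255.255
cidr_range 10.42.0.0/15   # prints: 10.42.0.0 10.43.255.255
```

The /15 range starts where the first /16 starts and ends where the second /16 ends, so a single exclusion for 10.42.0.0/15 is equivalent to excluding both /16 subnets.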

1.3.3.2. Configuration Management Considerations

Some firewall and iptables enforcement features of configuration management systems such as Puppet, Ansible, Chef, or Salt are incompatible with K3s. For example, Puppet's puppetlabs-firewall module sometimes leaves K3s non-functional after a reboot, so that the K3s systemd service has to be killed and restarted. What to do will vary depending on your individual situation.


What's the Minimum for Experimentation?

To review the installation instructions and test the installation of K3s on various operating systems, we routinely install and bring up a minimal LOCKSS 2.0-beta2 NOT YET RELEASED installation, with no metadata services or Web replay engines and with an empty embedded PostgreSQL database, in Vagrant virtual machines with VirtualBox, using 2 CPU cores and 3 GB of memory. These minimal VMs would not support a production load, but they can be a useful way to try out the installation instructions or evaluate the system.