4. Configuring LOCKSS

Note

Commands in this section are run as the lockss user.

After installing the LOCKSS stack, you will configure it with the LOCKSS Installer's configure-lockss script [1].

4.1. Gathering Configuration Information

You will need to gather information to answer configuration questions asked by configure-lockss, including:

  • The name (FQDN) of the host, the IP address of the host, and if behind NAT, the external IP address for NAT.

  • The mail relay host, and optionally mail credentials, for sending e-mail from the host, and the e-mail address for the administrator of the system.

  • The configuration URL and preservation group or groups corresponding to the LOCKSS network your system is joining.

  • The paths for one or more content storage areas, the state data storage area, the temporary storage area, and the log storage area. See the Storage Prerequisites section for important information about requirements for these storage areas.

    Caution

    Each of these paths needs to be writable by the lockss user. If this is not the case, set them up as root before running configure-lockss.

  • Username and password for the Web user interfaces.

  • A password for the PostgreSQL database.

    • Alternatively, if using an existing PostgreSQL database, the host name, port, schema, username and password for the external PostgreSQL database, as well as a prefix for database names.

  • Which Stack Components you wish to use.

Some notes about using configure-lockss:

  • When run the first time, some of the questions asked by the script will have a suggested or default value, displayed in square brackets; either type the desired value and then hit Enter, or hit Enter to accept the value in square brackets.

  • Any subsequent runs will use the previous values as the suggested value in square brackets; either type a new desired value and then hit Enter, or hit Enter to leave the value in square brackets unchanged.

    • Use configure-lockss --replay to prompt only for information not already found in the configuration file.

    • Password prompts will not display the previous value, but passwords can still be left unchanged with Enter.
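The square-bracket convention described above can be illustrated with a short shell sketch. This is not the code of configure-lockss; the function name ask and the exact prompt layout are assumptions made purely for illustration.

```shell
# Illustration only: ask() is a hypothetical helper, not part of the LOCKSS Installer.
# It shows a question with a suggested value in square brackets and
# falls back to that value when the answer is empty (just Enter).
ask() {
    question="$1"
    suggested="$2"
    printf '%s: [%s] ' "$question" "$suggested" >&2   # prompt goes to stderr
    IFS= read -r reply || true                        # tolerate end of input
    printf '%s\n' "${reply:-$suggested}"              # empty answer keeps the suggestion
}
```

For example, port=$(ask "LCAP port" 9729) stores 9729 if you just hit Enter, and stores whatever you typed otherwise.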

4.2. Invoking configure-lockss

To invoke configure-lockss, follow these steps:

  1. Establish a lockss shell session. Double-check that you are acting as lockss by typing:

    whoami

    and verifying that the output is lockss.

  2. Navigate to the LOCKSS Installer Directory, symbolically:

    cd <LOCKSS_INSTALLER_DIR>

  3. Run this command:

    scripts/configure-lockss

The script will begin with the first series of configuration questions from Section 4.3 (Kubernetes Settings).
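If you invoke configure-lockss often, the three steps above could be combined into a small guard. The following is a minimal sketch: the function name require_user is hypothetical, and LOCKSS_INSTALLER_DIR is a placeholder variable that you would have to set to your own LOCKSS Installer Directory. It uses id -un, which prints the same user name as whoami.

```shell
# Hypothetical guard: succeed only if the current user is the expected one.
require_user() {
    expected="$1"
    actual="$(id -un)"          # same user name that whoami prints
    if [ "$actual" != "$expected" ]; then
        echo "Expected to run as '$expected' but running as '$actual'" >&2
        return 1
    fi
}

# Example usage (LOCKSS_INSTALLER_DIR is a placeholder that you must set):
#   require_user lockss && cd "$LOCKSS_INSTALLER_DIR" && scripts/configure-lockss
```

Because the commands are chained with &&, the script is not started unless both the user check and the cd succeed.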

4.3. Kubernetes Settings

Prompt: Command to use to execute kubectl commands

Enter the fully-qualified command to invoke kubectl in your environment. If you are using K3s, the Kubernetes environment that ships with LOCKSS, the proposed value is already correct and you can simply hit Enter to accept the suggested value in square brackets.

Hint

Have you arrived at this spot in this manual because you are following the instructions in the LOCKSS 1.x to 2.x Migration Guide? If you are performing a new-host migration, be mindful that there will be a prompt here, namely Location of copied LOCKSS 1.x config.dat file, that does not occur when running configure-lockss normally. See the LOCKSS 1.x to 2.x Migration Guide for details.

4.4. Network Settings

4.4.1. Hostname

Prompt: Fully qualified hostname (FQDN) of this machine

Enter the machine's fully-qualified hostname (meaning with its domain name), for example locksstest.myuniversity.edu.
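On many Linux systems, hostname -f prints the machine's FQDN, which you can compare with the suggested value. The sketch below is a rough sanity check of the shape of a name, under the assumption that an FQDN has at least two dot-separated labels; the function name is_fqdn is hypothetical, and this is not the validation that configure-lockss itself performs.

```shell
# Hypothetical shape check: at least one "label." followed by an alphabetic
# top-level label. It does not check that the name resolves in DNS.
is_fqdn() {
    printf '%s\n' "$1" |
        grep -Eq '^([A-Za-z0-9]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?\.)+[A-Za-z]{2,}$'
}

# Example usage:
#   is_fqdn "$(hostname -f)" || echo "hostname -f did not return an FQDN" >&2
```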

4.4.2. IP Address

Prompt: IP address of this machine

  • If the host is on the Internet, enter its publicly routable IP address.

  • If the host is behind network address translation (NAT), enter its internal IP address. You will be asked for the external IP address later, in Section 4.4.5 (Network Address Translation).
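If you are not sure which of the two cases applies, one rough indicator is whether the machine's address falls in one of the private IPv4 ranges reserved by RFC 1918 (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16). A private address strongly suggests NAT, but your network administrator has the definitive answer. The function name below is hypothetical, and the sketch does not verify that each octet is at most 255.

```shell
# Hypothetical helper: return 0 if the IPv4 address is in an RFC 1918
# private range, 1 if it is not, and 2 if it does not look like IPv4.
is_private_ipv4() {
    printf '%s\n' "$1" | grep -Eq '^[0-9]{1,3}(\.[0-9]{1,3}){3}$' || return 2
    a=${1%%.*}          # first octet
    rest=${1#*.}
    b=${rest%%.*}       # second octet
    [ "$a" -eq 10 ] && return 0
    [ "$a" -eq 172 ] && [ "$b" -ge 16 ] && [ "$b" -le 31 ] && return 0
    [ "$a" -eq 192 ] && [ "$b" -eq 168 ] && return 0
    return 1
}
```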

4.4.3. Initial UI Subnet

Prompt: Initial subnet(s) for admin UI access, separated by ';'

Enter a semicolon-separated list of subnets in CIDR or mask notation that should initially have access to the LOCKSS Web user interfaces (UIs). The access list can be modified later via the LOCKSS Configuration Service UI.
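To make the expected format concrete, the sketch below splits a semicolon-separated answer into individual entries and checks whether an entry has CIDR shape. The function names and the example subnets are hypothetical; entries in mask notation are not handled by this sketch, and octet ranges are not verified.

```shell
# Hypothetical helpers for a semicolon-separated subnet list.

# Print one subnet per line.
split_subnets() {
    printf '%s\n' "$1" | tr ';' '\n'
}

# Return 0 if the argument looks like a.b.c.d/n with n between 0 and 32.
is_cidr() {
    printf '%s\n' "$1" |
        grep -Eq '^[0-9]{1,3}(\.[0-9]{1,3}){3}/([0-9]|[12][0-9]|3[0-2])$'
}

# Example usage (the subnets are made-up values):
#   split_subnets '192.168.1.0/24;10.0.0.0/8' | while IFS= read -r s; do
#       is_cidr "$s" || echo "not in CIDR notation: $s"
#   done
```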

4.4.4. LCAP Port

Prompt: LCAP port

Enter the port on the host that will be used to receive LCAP traffic. Historically, most LOCKSS nodes use 9729.
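A value entered here must be a valid TCP port number. As a minimal sketch of that constraint (the function name is hypothetical, and this is not necessarily the check that configure-lockss performs):

```shell
# Hypothetical helper: return 0 if the argument is an integer from 1 to 65535.
is_valid_port() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;    # empty or contains a non-digit
    esac
    [ "$1" -ge 1 ] && [ "$1" -le 65535 ]
}
```

For example, is_valid_port 9729 succeeds, while is_valid_port 70000 and is_valid_port lcap both fail.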

Hint

Have you arrived at this spot in this manual because you are following the instructions in the LOCKSS 1.x to 2.x Migration Guide? If you are performing a same-host migration, be mindful that there will be an additional prompt here, namely Temporary LOCKSS 2.x LCAP port, that does not occur when running configure-lockss normally. See the LOCKSS 1.x to 2.x Migration Guide for details.

4.4.5. Network Address Translation

  1. Prompt: Is this machine behind NAT?

    If the host is behind network address translation (NAT), enter Y for "yes"; otherwise enter N for "no".

  2. If you answered Y because the host is behind NAT, you will be asked an additional configuration question:

    Prompt: External IP address for NAT

    Enter the publicly routable IP address of the NAT router. This complements the internal IP address you entered in Section 4.4.2 (IP Address).

4.5. Mail Settings

4.5.1. Mail Relay

Prompt: Mail relay for this machine

Enter the hostname of the host's outgoing mail server, for example smtp.myuniversity.edu, or localhost if the host system is running a local mail daemon.

4.5.2. Mail Relay Credentials

  1. Prompt: Does the mail relay <mailhost> need a username and password?

    If the outgoing mail server requires password authentication, enter Y for "yes"; otherwise, enter N for "no".

  2. If you answered Y because the outgoing mail server requires password authentication, you will be asked additional configuration questions:

    1. Prompt: User for <mailhost>

      Enter the username for the mail server.

    2. Prompt: Password for <mailuser>@<mailhost>

      Enter the password for the username on the mail server.

    3. Prompt: Password for <mailuser>@<mailhost> (again)

      Re-enter the password for the username on the mail server. If the two passwords do not match, you will be asked for the password again.

4.5.3. Administrator Email

Prompt: E-mail address for administrator

Enter the e-mail address of the person or team who will administer the LOCKSS node.

4.6. Preservation Network Settings

4.6.1. Configuration URL

  1. Prompt: Configuration URL

    Enter the URL of your LOCKSS network's configuration file, for example https://admin.mynetwork.org/lockss.xml. Your LOCKSS network administrator will give you this URL; in the Global LOCKSS Network (GLN), it is https://props.lockss.org/daemon/lockss.xml.

  2. If the configuration URL begins with https:, you will be asked additional configuration questions:

    1. Prompt: Verify configuration server authenticity?

      Enter Y for "yes" if you would like to check the authenticity of the configuration server using a custom keystore; otherwise enter N for "no".

    2. If you answered Y because you would like to check the authenticity of the configuration server using a custom keystore, you will be asked an additional configuration question:

      Prompt: Server certificate keystore

      Enter the path of a Java keystore used to verify the authenticity of the configuration server.
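Before entering the configuration URL, you may wish to confirm that it is reachable from the host. The sketch below assumes that curl is installed; the function names are hypothetical. The first function mirrors the rule above that the additional questions only appear for URLs beginning with https:, and the second simply attempts to fetch the URL and discards the response.

```shell
# Hypothetical helper: return 0 if the URL begins with "https:".
uses_https() {
    case "$1" in
        https:*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: return 0 if the URL can be fetched successfully.
# Assumes curl is installed; --fail makes HTTP errors produce a non-zero status.
check_config_url() {
    curl --fail --silent --show-error --output /dev/null "$1"
}

# Example usage:
#   url=https://props.lockss.org/daemon/lockss.xml
#   check_config_url "$url" && echo "reachable"
#   uses_https "$url" && echo "expect the server authenticity question"
```

If you plan to supply a custom keystore, keytool -list -keystore followed by the keystore path lists its entries (keytool ships with Java and prompts for the keystore password).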

4.6.2. Configuration Proxy

Prompt: Configuration proxy (host:port)

If the configuration URL can be reached directly, hit Enter to leave this blank; otherwise, if a proxy server is required to reach the configuration URL, enter its host and port in host:port format (for example proxy.myuniversity.edu:8888).
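The host:port format can be illustrated with a short parsing sketch. The function and variable names are hypothetical, and the sketch assumes a host name or IPv4 address rather than a bracketed IPv6 literal.

```shell
# Hypothetical helper: split "host:port" into proxy_host and proxy_port.
# Returns 1 if there is no colon, the host is empty, or the port is not numeric.
parse_host_port() {
    case "$1" in
        *:*) ;;
        *) return 1 ;;
    esac
    proxy_host=${1%:*}      # everything before the last colon
    proxy_port=${1##*:}     # everything after the last colon
    [ -n "$proxy_host" ] || return 1
    case "$proxy_port" in
        ''|*[!0-9]*) return 1 ;;
    esac
}
```

After parse_host_port proxy.myuniversity.edu:8888 succeeds, proxy_host holds proxy.myuniversity.edu and proxy_port holds 8888.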

4.6.3. Preservation Groups

Prompt: Preservation group(s), separated by ';'

Enter a preservation group identifier (or a semicolon-separated list of preservation group identifiers). Your LOCKSS network administrator will give you this value; in the Global LOCKSS Network (GLN), it is prod.

4.7. Web User Interface Settings

  1. Prompt: User name for web UI administration

    Enter a username for the primary administrative user in LOCKSS Web user interfaces.

  2. Prompt: Password for web UI administration user <uiuser>

    Enter a password for the primary administrative user.

  3. Prompt: Password for web UI administration user <uiuser> (again)

    Re-enter the password for the primary administrative user. If the two passwords do not match, you will be asked for the password again.

4.7.1. Container Subnet

  1. If configure-lockss detects a discrepancy between a previously used subnet for inter-container communication in the system and the subnet it would choose now, you may either see the warning:

    Container subnet has changed from <former_subnet> to <new_subnet>

    or be asked the question:

    Container subnet was <former_subnet>, we think it should now be <new_subnet>. Do you want to change it?

    in which case you should enter Y for "yes" (recommended), or N for "no".

  2. Prompt: LOCKSS subnet for inter-service access control

    Enter the subnet used for inter-container communication. We recommend accepting the proposed value by hitting Enter.

4.8. Storage Area Settings

The LOCKSS stack needs several kinds of content storage and operating storage, as described in the Storage Prerequisites section. (See also important information about performance requirements for these storage areas in that section.)

Depending on your host system's layout, these storage areas may or may not be the same mount points or paths. Each path must be writable by the lockss user.
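Because each path must be writable by the lockss user, it can save time to check this before running configure-lockss. The following is a minimal sketch to be run as the lockss user; the function name and the example paths are hypothetical.

```shell
# Hypothetical pre-flight helper (not part of the LOCKSS Installer).
# Prints each directory that is missing or not writable by the current user
# and returns non-zero if any such directory is found.
check_writable_dirs() {
    rc=0
    for dir in "$@"; do
        if [ ! -d "$dir" ]; then
            echo "missing: $dir"
            rc=1
        elif [ ! -w "$dir" ]; then
            echo "not writable: $dir"
            rc=1
        fi
    done
    return "$rc"
}

# Example usage (made-up paths):
#   check_writable_dirs /srv/lockss/content /srv/lockss/state /srv/lockss/logs
```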

Subdirectories will be created in each storage area to fit the needs of each stack component; for example lockss-stack-cfg-data is the LOCKSS Configuration Service's state data directory in the state data storage area, and lockss-stack-repo-logs is the LOCKSS Repository Service's log directory in the log storage area.

4.8.1. Content Storage Area Settings

  1. Prompt: Paths of the content storage areas, separated by ';'

    Enter a semicolon-separated list of full paths of directories to be used as content storage areas.

  2. If the answer to this question differs from the one given in a previous configuration run, you will see the warning:

    Content storage areas have changed. Artifact reindexing may be needed; see manual.

    This message indicates that changing the configured content storage areas may trigger the need for artifacts to be reindexed internally. This process happens in the background while the stack runs but may take an extended period of time.

4.8.2. State Data Storage Area Settings

Prompt: Path of the state data storage area

Enter the desired path for the state data storage area, which by default is the same as the first content storage area (see Content Storage Area Settings).

4.8.3. Log Storage Area Settings

Prompt: Path of the log storage area

Enter the desired path for the log storage area, which by default is the same as the state data storage area (see State Data Storage Area Settings).

4.8.4. Temporary Storage Area Settings

Prompt: Path of the temporary storage area

Enter the desired path for the temporary storage area, which by default is the same as the state data storage area (see State Data Storage Area Settings).
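The way the suggested values cascade in Sections 4.8.2 through 4.8.4 can be summarized with shell parameter expansion. This sketch only illustrates the defaults described above; the function name, argument order, and example paths are assumptions, and it is not the implementation used by configure-lockss.

```shell
# Hypothetical illustration of the storage area defaults.
# Arguments: content list (';'-separated), state path, log path, temporary path.
# An empty argument stands for "accept the suggested value".
resolve_storage_defaults() {
    content_paths="$1"
    state_dir="$2"
    log_dir="$3"
    tmp_dir="$4"
    first_content=${content_paths%%;*}        # first content storage area
    state_dir=${state_dir:-$first_content}    # state defaults to first content area
    log_dir=${log_dir:-$state_dir}            # log defaults to state area
    tmp_dir=${tmp_dir:-$state_dir}            # temporary defaults to state area
    printf 'state=%s\nlog=%s\ntmp=%s\n' "$state_dir" "$log_dir" "$tmp_dir"
}
```

For example, resolve_storage_defaults '/data1;/data2' '' '' '/scratch' reports /data1 for the state data and log storage areas and /scratch for the temporary storage area.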

4.9. Database Settings

For the PostgreSQL database, you have two options:

  • You can choose to use the LOCKSS stack's embedded PostgreSQL database, meaning a PostgreSQL database container will be run and managed as part of the LOCKSS stack. This is the recommended option.

  • Alternatively, you can choose to use an external PostgreSQL database. Select this option if you wish to use an existing PostgreSQL database provisioned by your institution, or one that you run and manage yourself.

You will receive the following prompt:

Use embedded LOCKSS PostgreSQL DB Service?

Follow these steps if you entered Y to use the embedded PostgreSQL database:

  1. Prompt: Password for PostgreSQL database

    Enter the password for the embedded PostgreSQL database.

    Caution

    This prompt is used to record the PostgreSQL database password in the LOCKSS stack's configuration. If you change the value of the PostgreSQL database password here without actually changing the PostgreSQL database password, the LOCKSS system components will no longer be able to connect to the PostgreSQL database. See Working with PostgreSQL for details.

  2. Prompt: Password for PostgreSQL database (again)

    Re-enter the password for the embedded PostgreSQL database. If the two passwords do not match, you will be asked for the password again.

Follow these steps if you entered N to use an external PostgreSQL database:

  1. Prompt: Fully qualified hostname (FQDN) of PostgreSQL host

    Enter the hostname of the external PostgreSQL database, for example postgres.myuniversity.edu.

  2. Prompt: Port used by PostgreSQL host

    Enter the port where the external PostgreSQL database can be reached, for example 5432.

  3. Prompt: Schema for PostgreSQL database

    Enter the schema name to be used by the LOCKSS system. The schema name used in the embedded PostgreSQL database is LOCKSS, but your database administrator may assign a different schema name to you.

  4. Prompt: Database name prefix for PostgreSQL database

    Enter the prefix to use for any LOCKSS-related database names in the schema. The database name prefix in the embedded PostgreSQL database is Lockss (note the uppercase/lowercase), but your database administrator may assign a different database name prefix.

  5. Prompt: Login name for PostgreSQL database

    Enter the username for the external PostgreSQL database. The username in the embedded PostgreSQL database is LOCKSS, but your database administrator may assign a different username to you.

  6. Prompt: Password for PostgreSQL database

    Enter the password for the username in the external PostgreSQL database.

    Caution

    This prompt is used to record the PostgreSQL database password in the LOCKSS stack's configuration. If you change the value of the PostgreSQL database password here without actually changing the PostgreSQL database password, the LOCKSS system components will no longer be able to connect to the PostgreSQL database. Contact your PostgreSQL database administrator for details.

  7. Prompt: Password for PostgreSQL database (again)

    Re-enter the password for the username in the external PostgreSQL database. If the two passwords do not match, you will be asked for the password again.
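If you use an external PostgreSQL database, it can help to confirm that the server is reachable from the LOCKSS host before answering the questions above. The sketch below wraps pg_isready, a client utility distributed with PostgreSQL; it assumes that pg_isready is installed on the host, and the function name is hypothetical.

```shell
# Hypothetical helper: return 0 if the PostgreSQL server at host:port
# is accepting connections. Assumes the pg_isready client utility is installed.
check_postgres_reachable() {
    pg_host="$1"
    pg_port="${2:-5432}"    # 5432 is the conventional PostgreSQL port
    pg_isready -h "$pg_host" -p "$pg_port"
}

# Example usage:
#   check_postgres_reachable postgres.myuniversity.edu 5432
```

This only checks that the server accepts connections; it does not verify the schema, username, or password, which your database administrator can confirm.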

4.10. Stack Component Settings

4.10.1. Crawler Service Settings

  1. Prompt: Use LOCKSS Crawler Service?

    Enter Y for "yes" if you want the LOCKSS Crawler Service to be run as part of your LOCKSS stack; otherwise, enter N for "no". (The only situation where a crawler service is not needed is a LOCKSS network that exclusively uses direct deposit to store content; most LOCKSS networks need the crawler service.)

  2. If you answer Y to the previous question, you will see these additional questions:

    1. Prompt: Enable classic LOCKSS crawler?

      Enter Y for "yes" if you want to run the classic LOCKSS crawler; otherwise, enter N for "no". (Most LOCKSS networks using the crawler service use the classic LOCKSS crawler.)

    2. Prompt: Enable Wget crawler?

      Enter Y for "yes" if you want to enable the use of the external Wget crawler; otherwise, enter N for "no".

4.10.2. Metadata Service Settings

Prompt: Use LOCKSS Metadata Service?

Enter Y for "yes" if you want the LOCKSS Metadata Service to be run as part of your LOCKSS stack; otherwise, enter N for "no".

4.10.3. SOAP Compatibility Service Settings

Prompt: Use LOCKSS SOAP Compatibility Service?

Enter Y for "yes" if you want the LOCKSS SOAP Compatibility Service to be run as part of your LOCKSS stack; otherwise, enter N for "no". (This is only needed if you have external tools that use the legacy LOCKSS SOAP Web Services.)

4.11. Web Replay Settings

4.11.1. Pywb Settings

Prompt: Use LOCKSS Pywb Service?

Enter Y for "yes" to run Pywb as part of your LOCKSS stack; otherwise, enter N for "no".

4.11.2. OpenWayback Settings

Prompt: Use LOCKSS OpenWayback Service?

Enter Y for "yes" to run OpenWayback as part of your LOCKSS stack; otherwise, enter N for "no".

4.12. Final Steps of configure-lockss

  1. Prompt: OK to proceed?

    Enter Y for "yes" if the configuration values are to your liking; otherwise, enter N for "no" to make edits.

  2. If you answer Y to accept the configuration values, configure-lockss will perform the final configuration steps. You may be asked to confirm before directories are created for the first time:

    <directory> does not exist; do you want to create it?

    or before directory permissions are changed:

    <directory> is not writable; do you want to make it writable?

    In each case, enter Y for "yes" or N for "no".

    Error conditions and what to do about them

    If an error occurs while directories are being created or directory permissions are being changed, you may see error messages such as the following:

    <directory> not writable by user <user>. Please make it so (check parent dir execute bits). LOCKSS will not run properly without it.

    Please create <directory> and make it writable by user <user>; LOCKSS will not run properly without it.

    <directory> still not writable by user <user>. Please make it so (check parent dir execute bits). LOCKSS will not run properly without it.

    Please ensure that <directory> is writable by user <user>; LOCKSS will not run properly without it.

    The script will end with this warning:

    Storage directories have not been set up correctly. Either fix the ownership/permission problems then run scripts/configure-lockss -r, or re-run scripts/configure-lockss and specify different directories.

    You can either fix any ownership and permission issues encountered and run scripts/configure-lockss --replay (or scripts/configure-lockss -r for short), or alternatively you can re-run scripts/configure-lockss but specify different directories.
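Several of the error messages above suggest checking the execute bits of parent directories. As a minimal sketch of that check, the function below (a hypothetical name) walks from a storage directory up to the root and reports every parent directory that the current user cannot traverse. Run it as the lockss user.

```shell
# Hypothetical helper: list parent directories of the given path that the
# current user cannot traverse (no execute/search permission).
# Parents are reported from the deepest to the shallowest.
report_blocked_parents() {
    path="$1"
    while [ "$path" != "/" ] && [ "$path" != "." ]; do
        path=$(dirname "$path")
        if [ ! -x "$path" ]; then
            echo "no execute (search) permission: $path"
        fi
    done
}

# Example usage (made-up path):
#   report_blocked_parents /srv/lockss/content
```

No output means that every parent directory can be traversed. If several directories are reported, the last one printed is the closest to the root and is usually the place to start; once the permissions are fixed as root, run scripts/configure-lockss --replay as described above.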


Footnotes