4. Configuring LOCKSS
Note
Commands in this section are run as the lockss user.
After installing the LOCKSS stack, you will configure it with the LOCKSS Installer's configure-lockss script [1].
4.1. Gathering Configuration Information
You will need to gather information to answer configuration questions asked by configure-lockss, including:
The name (FQDN) of the host, the IP address of the host, and if behind NAT, the external IP address for NAT.
The mail relay host, and optionally mail credentials, for sending e-mail from the host, and the e-mail address for the administrator of the system.
The configuration URL and preservation group or groups corresponding to the LOCKSS network your system is joining.
The paths for one or more content storage areas, the state data storage area, the temporary storage area, and the log storage area. See the Storage Prerequisites sections for important information about requirements for these storage areas.
Caution
Each of these paths needs to be writable by the lockss user. If this is not the case, set them up as root before running configure-lockss.
Username and password for the Web user interfaces.
A password for the PostgreSQL database.
Alternatively, if using an existing PostgreSQL database, the host name, port, schema, username and password for the external PostgreSQL database, as well as a prefix for database names.
Which Stack Components you wish to use.
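Because every storage path must be writable by the lockss user, it can help to check the planned paths before running configure-lockss. The following is a minimal sketch, not part of the LOCKSS Installer; the function name check_storage_path and the example paths are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not part of the LOCKSS Installer): report whether a
# planned storage path exists and is writable by the user running this script.
check_storage_path() {
    dir=$1
    if [ ! -d "$dir" ]; then
        echo "MISSING: $dir"
        return 1
    fi
    if [ ! -w "$dir" ]; then
        echo "NOT WRITABLE: $dir"
        return 1
    fi
    echo "OK: $dir"
}

# Example (placeholder paths; run as the lockss user):
#   check_storage_path /lockss/content1
#   check_storage_path /lockss/state
```

Run as the lockss user, any path reported as MISSING or NOT WRITABLE would need to be set up as root first.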
Some notes about using configure-lockss:
When run the first time, some of the questions asked by the script will have a suggested or default value, displayed in square brackets; either type the desired value and then hit Enter, or hit Enter to accept the value in square brackets.
Any subsequent runs will use the previous values as the suggested value in square brackets; either type a new desired value and then hit Enter, or hit Enter to leave the value in square brackets unchanged.
Use configure-lockss --replay to prompt only for information not already found in the configuration file.
Password prompts will not display the previous value, but passwords can still be left unchanged with Enter.
4.2. Invoking configure-lockss
To invoke configure-lockss, follow these steps:
Establish a lockss shell session. Double-check that you are acting as lockss by typing:
whoami
and verifying that the output is lockss.
Navigate to the LOCKSS Installer Directory, symbolically:
cd <LOCKSS_INSTALLER_DIR>
Run this command:
scripts/configure-lockss
The script will begin with the first series of configuration questions from Section 4.3 (Kubernetes Settings).
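The invocation steps above can be wrapped in a small guard that refuses to run unless the current user is lockss. This is only a sketch: the function names are hypothetical, and LOCKSS_INSTALLER_DIR is a placeholder variable that you would set to your own LOCKSS Installer Directory.

```shell
#!/bin/sh
# Hypothetical wrapper around the steps above (not part of the LOCKSS Installer).
# LOCKSS_INSTALLER_DIR is a placeholder you must set yourself.

# Succeeds only if the current user name equals the expected one
# (id -un prints the same name as whoami).
running_as() {
    [ "$(id -un)" = "$1" ]
}

run_configure() {
    if ! running_as lockss; then
        echo "Please run this as the lockss user (currently: $(id -un))" >&2
        return 1
    fi
    cd "${LOCKSS_INSTALLER_DIR:?set LOCKSS_INSTALLER_DIR first}" || return 1
    scripts/configure-lockss "$@"
}

# Usage (after setting LOCKSS_INSTALLER_DIR):
#   run_configure            # full interactive run
#   run_configure --replay   # prompt only for values not yet configured
```

Note that run_configure changes the working directory of the calling shell, just as the manual cd step does.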
4.3. Kubernetes Settings
Prompt: Command to use to execute kubectl commands
Enter the fully-qualified command to invoke kubectl in your environment. If you are using K3s, the Kubernetes environment that ships with LOCKSS, the proposed value is already correct and you can simply hit Enter to accept the suggested value in square brackets.
Hint
Have you arrived at this spot in this manual because you are following the instructions in the LOCKSS 1.x to 2.x Migration Guide? If you are performing a new-host migration, be mindful that there will be a prompt here, namely Location of copied LOCKSS 1.x config.dat file, that does not occur when running configure-lockss normally. See the LOCKSS 1.x to 2.x Migration Guide for details.
4.4. Network Settings
4.4.1. Hostname
Prompt: Fully qualified hostname (FQDN) of this machine
Enter the machine's fully-qualified hostname (meaning with its domain name), for example locksstest.myuniversity.edu.
4.4.2. IP Address
Prompt: IP address of this machine
If the host is on the Internet, enter its publicly routable IP address.
If the host is behind network address translation (NAT), enter its internal IP address. You will be asked for the external IP address later, in Section 4.4.5 (Network Address Translation).
4.4.3. Initial UI Subnet
Prompt: Initial subnet(s) for admin UI access, separated by ';'
Enter a semicolon-separated list of subnets in CIDR or mask notation that should initially have access to the LOCKSS Web user interfaces (UIs). The access list can be modified later via the LOCKSS Configuration Service UI.
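To preview how a semicolon-separated answer breaks down into individual subnets, a sketch like the following can be used. Both functions are hypothetical helpers, the example subnets are placeholders, and the shape check is deliberately loose: it covers only IPv4 CIDR notation, not mask notation, and it is not the validation that configure-lockss itself performs.

```shell
#!/bin/sh
# Hypothetical helper: print each entry of a semicolon-separated answer on its
# own line, dropping empty entries.
split_list() {
    printf '%s\n' "$1" | tr ';' '\n' | sed '/^$/d'
}

# Loose shape check for IPv4 CIDR notation such as 10.0.0.0/8. It does not
# validate octet ranges or prefix length, and does not cover mask notation.
looks_like_cidr() {
    case $1 in
        *[!0-9./]*) return 1 ;;
        *.*.*.*/[0-9]|*.*.*.*/[0-9][0-9]) return 0 ;;
        *) return 1 ;;
    esac
}

# Example with placeholder subnets:
#   split_list '10.0.0.0/8;192.168.1.0/24' | while read -r subnet; do
#       looks_like_cidr "$subnet" || echo "check this entry: $subnet"
#   done
```

The same ';' separator is used by the preservation group and content storage area prompts, so split_list applies to those answers as well.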
4.4.4. LCAP Port
Prompt: LCAP port
Enter the port on the host that will be used to receive LCAP traffic. Historically, most LOCKSS nodes use 9729.
Hint
Have you arrived at this spot in this manual because you are following the instructions in the LOCKSS 1.x to 2.x Migration Guide? If you are performing a same-host migration, be mindful that there will be an additional prompt here, namely Temporary LOCKSS 2.x LCAP port, that does not occur when running configure-lockss normally. See the LOCKSS 1.x to 2.x Migration Guide for details.
4.4.5. Network Address Translation
Prompt: Is this machine behind NAT?
If the host is behind network address translation (NAT), enter Y for "yes"; otherwise enter N for "no".
If you answered Y because the host is behind NAT, you will be asked an additional configuration question:
Prompt: External IP address for NAT
Enter the publicly routable IP address of the NAT router. This complements the internal IP address you entered in Section 4.4.2 (IP Address).
4.5. Mail Settings
4.5.1. Mail Relay
Prompt: Mail relay for this machine
Enter the hostname of the host's outgoing mail server, for example smtp.myuniversity.edu, or localhost if the host system is running a local mail daemon.
4.5.2. Mail Relay Credentials
Prompt: Does the mail relay <mailhost> need a username and password?
If the outgoing mail server requires password authentication, enter Y for "yes"; otherwise, enter N for "no".
If you answered Y because the outgoing mail server requires password authentication, you will be asked additional configuration questions:
Prompt: User for <mailhost>
Enter the username for the mail server.
Prompt: Password for <mailuser>@<mailhost>
Enter the password for the username on the mail server.
Prompt: Password for <mailuser>@<mailhost> (again)
Re-enter the password for the username on the mail server. If the two passwords do not match, you will be asked for the password again.
4.5.3. Administrator Email
Prompt: E-mail address for administrator
Enter the e-mail address of the person or team who will administer the LOCKSS node.
4.6. Preservation Network Settings
4.6.1. Configuration URL
Prompt: Configuration URL
Enter the URL of your LOCKSS network's configuration file, for example https://admin.mynetwork.org/lockss.xml. This URL will be given to you by your LOCKSS network administrator. For example, in the Global LOCKSS Network (GLN), this is https://props.lockss.org/daemon/lockss.xml.
If the configuration URL begins with https:, you will be asked additional configuration questions:
Prompt: Verify configuration server authenticity?
Enter Y for "yes" if you would like to check the authenticity of the configuration server using a custom keystore; otherwise enter N for "no".
If you answered Y because you would like to check the authenticity of the configuration server using a custom keystore, you will be asked an additional configuration question:
Prompt: Server certificate keystore
Enter the path of a Java keystore used to verify the authenticity of the configuration server.
4.6.2. Configuration Proxy
Prompt: Configuration proxy (host:port)
If the configuration URL can be reached directly, hit Enter to leave this blank; otherwise, if a proxy server is required to reach the configuration URL, enter its host and port in host:port format (for example proxy.myuniversity.edu:8888).
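The host:port format can be checked before typing it at the prompt. The sketch below is a hypothetical helper, not the parsing done by configure-lockss; it splits on the last colon and accepts only a numeric port between 1 and 65535.

```shell
#!/bin/sh
# Hypothetical check of a proxy answer in host:port form.
# Prints "host port" on success; returns non-zero if the shape is wrong.
parse_host_port() {
    value=$1
    case $value in
        *:*) ;;
        *) return 1 ;;           # no colon at all
    esac
    host=${value%:*}             # everything before the last colon
    port=${value##*:}            # everything after the last colon
    [ -n "$host" ] || return 1
    case $port in
        ''|*[!0-9]*) return 1 ;; # empty or non-numeric port
    esac
    if [ "$port" -lt 1 ] || [ "$port" -gt 65535 ]; then
        return 1
    fi
    printf '%s %s\n' "$host" "$port"
}

parse_host_port proxy.myuniversity.edu:8888
# prints: proxy.myuniversity.edu 8888
```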
4.6.3. Preservation Groups
Prompt: Preservation group(s), separated by ';'
Enter a preservation group identifier (or a semicolon-separated list of preservation group identifiers). This will be given to you by your LOCKSS network administrator. For example in the Global LOCKSS Network (GLN), this is prod.
4.7. Web User Interface Settings
Prompt: User name for web UI administration
Enter a username for the primary administrative user in LOCKSS Web user interfaces.
Prompt: Password for web UI administration user <uiuser>
Enter a password for the primary administrative user.
Prompt: Password for web UI administration user <uiuser> (again)
Re-enter the password for the primary administrative user. If the two passwords do not match, you will be asked for the password again.
4.7.1. Container Subnet
If configure-lockss detects a discrepancy between a previously used subnet for inter-container communication in the system and the subnet it would choose now, you may either see the warning:
Container subnet has changed from <former_subnet> to <new_subnet>
or be asked the question:
Container subnet was <former_subnet>, we think it should now be <new_subnet>. Do you want to change it?
in which case you should enter Y for "yes" (recommended), or N for "no".
Prompt: LOCKSS subnet for inter-service access control
Enter the subnet used for inter-container communication. We recommend accepting the proposed value by hitting Enter.
4.8. Storage Area Settings
The LOCKSS stack needs several kinds of content storage and operating storage, as described in the Storage Prerequisites section. (See also important information about performance requirements for these storage areas in that section.)
Depending on your host system's layout, these storage areas may or may not be the same mount points or paths. Each path must be writable by the lockss user.
Subdirectories will be created in each storage area to fit the needs of each stack component; for example lockss-stack-cfg-data is the LOCKSS Configuration Service's state data directory in the state data storage area, and lockss-stack-repo-logs is the LOCKSS Repository Service's log directory in the log storage area.
4.8.1. Content Storage Area Settings
Prompt: Paths of the content storage areas, separated by ';'
Enter a semicolon-separated list of full paths of directories to be used as content storage areas.
If your answer differs from the one given in a previous configuration run, you will see the warning:
Content storage areas have changed. Artifact reindexing may be needed; see manual.
This message indicates that changing the configured content storage areas may trigger the need for artifacts to be reindexed internally. This process happens in the background while the stack runs but may take an extended period of time.
4.8.2. State Data Storage Area Settings
Prompt: Path of the state data storage area
Enter the desired path for the state data storage area, which by default is the same as the first content storage area (see Content Storage Area Settings).
4.8.3. Log Storage Area Settings
Prompt: Path of the log storage area
Enter the desired path for the log storage area, which by default is the same as the state data storage area (see State Data Storage Area Settings).
4.8.4. Temporary Storage Area Settings
Prompt: Path of the temporary storage area
Enter the desired path for the temporary storage area, which by default is the same as the state data storage area (see State Data Storage Area Settings).
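The default chain described in Sections 4.8.2 through 4.8.4 can be illustrated with shell parameter expansion. This is only an illustration: the function name, variable names, and paths are hypothetical, and configure-lockss manages its own settings internally.

```shell
#!/bin/sh
# Illustration of the storage area defaults. All names and paths here are
# hypothetical; an empty argument means "accept the default".
resolve_storage_defaults() {
    content_list=$1   # semicolon-separated content storage areas
    state=$2
    logs=$3
    tmp=$4

    first_content=${content_list%%;*}   # first entry of the list
    state=${state:-$first_content}      # default: first content storage area
    logs=${logs:-$state}                # default: state data storage area
    tmp=${tmp:-$state}                  # default: state data storage area

    printf 'state=%s\nlogs=%s\ntmp=%s\n' "$state" "$logs" "$tmp"
}

resolve_storage_defaults '/lockss/content1;/lockss/content2' '' '/var/log/lockss' ''
# prints:
#   state=/lockss/content1
#   logs=/var/log/lockss
#   tmp=/lockss/content1
```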
4.9. Database Settings
For the PostgreSQL database, you have two options:
You can choose to use the LOCKSS stack's embedded PostgreSQL database, meaning a PostgreSQL database container will be run and managed as part of the LOCKSS stack. This is the recommended option.
Alternatively, you can choose to use an external PostgreSQL database. Select this option if you wish to use an existing PostgreSQL database provisioned by your institution, or one that you run and manage yourself.
You will receive the following prompt:
Use embedded LOCKSS PostgreSQL DB Service?
To use the embedded PostgreSQL database, enter Y for "yes", then follow the steps in the Embedded PostgreSQL Database section below.
To use an external PostgreSQL database, enter N for "no", then follow the steps in the External PostgreSQL Database section below.
Follow these steps if you entered Y to use the embedded PostgreSQL database:
Prompt: Password for PostgreSQL database
Enter the password for the embedded PostgreSQL database.
Caution
This prompt is used to record the PostgreSQL database password in the LOCKSS stack's configuration. If you change the value of the PostgreSQL database password here without actually changing the PostgreSQL database password, the LOCKSS system components will no longer be able to connect to the PostgreSQL database. See Working with PostgreSQL for details.
Prompt: Password for PostgreSQL database (again)
Re-enter the password for the embedded PostgreSQL database. If the two passwords do not match, you will be asked for the password again.
Follow these steps if you entered N to use an external PostgreSQL database:
Prompt: Fully qualified hostname (FQDN) of PostgreSQL host
Enter the hostname of the external PostgreSQL database, for example postgres.myuniversity.edu.
Prompt: Port used by PostgreSQL host
Enter the port where the external PostgreSQL database can be reached, for example 5432.
Prompt: Schema for PostgreSQL database
Enter the schema name to be used by the LOCKSS system. The schema name used in the embedded PostgreSQL database is LOCKSS, but your database administrator may assign a different schema name to you.
Prompt: Database name prefix for PostgreSQL database
Enter the prefix to use for any LOCKSS-related database names in the schema. The database name prefix in the embedded PostgreSQL database is Lockss (note the uppercase/lowercase), but your database administrator may assign a different database name prefix.
Prompt: Login name for PostgreSQL database
Enter the username for the external PostgreSQL database. The username in the embedded PostgreSQL database is LOCKSS, but your database administrator may assign a different username to you.
Prompt: Password for PostgreSQL database
Enter the password for the username in the external PostgreSQL database.
Caution
This prompt is used to record the PostgreSQL database password in the LOCKSS stack's configuration. If you change the value of the PostgreSQL database password here without actually changing the PostgreSQL database password, the LOCKSS system components will no longer be able to connect to the PostgreSQL database. Contact your PostgreSQL database administrator for details.
Prompt: Password for PostgreSQL database (again)
Re-enter the password for the username in the external PostgreSQL database. If the two passwords do not match, you will be asked for the password again.
4.10. Stack Component Settings
4.10.1. Crawler Service Settings
Prompt: Use LOCKSS Crawler Service?
Enter Y for "yes" if you want the LOCKSS Crawler Service to be run as part of your LOCKSS stack, otherwise enter N for "no". (The only situation where a crawler service is not needed is LOCKSS networks that are exclusively using direct deposit to store content; most LOCKSS networks need the crawler service.)
If you answer Y to the previous question, you will see these additional questions:
Prompt: Enable classic LOCKSS crawler?
Enter Y for "yes" if you want to run the classic LOCKSS crawler, otherwise enter N for "no". (Most LOCKSS networks using the crawler service use the classic LOCKSS crawler.)
Prompt: Enable Wget crawler?
Enter Y for "yes" if you want to enable the usage of the external Wget crawler, otherwise enter N for "no".
4.10.2. Metadata Service Settings
Prompt: Use LOCKSS Metadata Service?
Enter Y for "yes" if you want the LOCKSS Metadata Service to be run as part of your LOCKSS stack, otherwise enter N for "no".
4.10.3. SOAP Compatibility Service Settings
Prompt: Use LOCKSS SOAP Compatibility Service?
Enter Y for "yes" if you want the LOCKSS SOAP Compatibility Service to be run as part of your LOCKSS stack, otherwise enter N for "no". (This is only needed if you have external tools that use the legacy LOCKSS SOAP Web Services.)
4.11. Web Replay Settings
4.11.1. Pywb Settings
Prompt: Use LOCKSS Pywb Service?
Enter Y for "yes" to run Pywb as part of your LOCKSS stack; otherwise, enter N for "no".
4.11.2. OpenWayback Settings
Prompt: Use LOCKSS OpenWayback Service?
Enter Y for "yes" to run OpenWayback as part of your LOCKSS stack; otherwise, enter N for "no".
4.12. Final Steps of configure-lockss
Prompt: OK to proceed?
Enter Y for "yes" if the configuration values are to your liking; otherwise, enter N for "no" to make edits.
If you answer Y to accept the configuration values, configure-lockss will perform the final configuration steps. You may be asked to confirm before directories are created for the first time:
<directory> does not exist; do you want to create it?
or before directory permissions are changed:
<directory> is not writable; do you want to make it writable?
In each case, enter Y for "yes" or N for "no".
Error conditions and what to do about them
During the process of creating directories or changing directory permissions, you may see the following types of error messages if an error occurs:
<directory> not writable by user <user>. Please make it so (check parent dir execute bits). LOCKSS will not run properly without it.
Please create <directory> and make it writable by user <user>; LOCKSS will not run properly without it.
<directory> still not writable by user <user>. Please make it so (check parent dir execute bits). LOCKSS will not run properly without it.
Please ensure that <directory> is writable by user <user>; LOCKSS will not run properly without it.
The script will end with this warning:
Storage directories have not been set up correctly. Either fix the ownership/permission problems then run scripts/configure-lockss -r, or re-run scripts/configure-lockss and specify different directories.
You can either fix any ownership and permission issues encountered and run scripts/configure-lockss --replay (or scripts/configure-lockss -r for short), or alternatively you can re-run scripts/configure-lockss but specify different directories.
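The error messages suggest checking the execute bits of parent directories. One way to do that is to walk from the storage directory up to the root and list every directory the current user cannot traverse. The function below is a hypothetical diagnostic sketch, not part of the LOCKSS Installer, and the example path is a placeholder.

```shell
#!/bin/sh
# Hypothetical diagnostic: walk from a directory up to the root and print the
# directory itself and every ancestor that is missing or that the current user
# cannot traverse (no execute/search permission).
list_untraversable_ancestors() {
    dir=$1
    while :; do
        if [ ! -x "$dir" ]; then
            printf '%s\n' "$dir"
        fi
        parent=$(dirname "$dir")
        if [ "$parent" = "$dir" ]; then
            break               # reached the top ("/" or ".")
        fi
        dir=$parent
    done
}

# Example (placeholder path; run as the lockss user):
#   list_untraversable_ancestors /lockss/content1
```

An empty result means every directory along the path can be traversed by the current user; any directory that is printed would need its permissions fixed, typically as root, before re-running scripts/configure-lockss --replay.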
Footnotes