(advanced-backups)=

# Backups

The Scrutinizer filesystem includes utilities that automate the process of creating or restoring system backups.

:::{NOTE}
- These utilities are recommended for most long-term backup scenarios, because they include all database configuration and historical data for a Scrutinizer instance. Native snapshots may still be used as a short-term recovery option when there is no need to store the data, e.g., when upgrading the instance.
- For Scrutinizer instances deployed on AWS, backups should be created and/or restored using native AWS functionality.
:::

These utilities support several types of backup and restore operations.

(backups-full)=

## Full backups

Full or comprehensive backups are disaster-recovery-grade images of a Scrutinizer instance and include the following elements of the filesystem:

- Application data and collected NetFlow in the PostgreSQL database
- Host index data in BadgerDB databases
- Scrutinizer's third-party encryption key - `/etc/plixer.key`
- Web Server TLS certificate and key

:::{IMPORTANT}
- The license key (if the instance is a primary reporter) and the TLS certificates and keys generated by Scrutinizer are **not** backed up and **cannot be restored**.

- Any files not included in full backups must be manually backed up and restored, including:

  - Custom threat lists created under `/home/plixer/scrutinizer/files/threats`
  - Custom notifications created under `/home/plixer/scrutinizer/files`
  - LDAP authentication certificates
:::

(full-save)=

### Creating full backups

The Scrutinizer filesystem includes the `backup.sh` utility, which automates the creation of full backups. This script is located under `/home/plixer/scrutinizer/files`.

:::{NOTE}
The default run mode of `backup.sh` {ref}`saves the backup file locally <full-local>`. Due to the size of full backup files, however, the remote method outlined below is recommended.
:::

The following instructions cover the process of creating and saving full Scrutinizer instance backups to a specified remote host:

````{dropdown} View instructions

1. SSH to the Scrutinizer server to be backed up and start a [tmux session](resource-terminal) to prevent timeouts:

   ```cfg
   tmux new -s backup
   ```

2. Allow others to use FUSE mounts:

   ```cfg
   sudo grep -Eq "^user_allow_other" /etc/fuse.conf || \
   sudo sed -i '$ a user_allow_other' /etc/fuse.conf
   ```

3. Create the backup directory locally and mount it to an empty directory on the remote host:

   ```cfg
   BACKUPDIR=/mnt/backup
   sudo mkdir -p $BACKUPDIR
   sudo chown plixer:plixer $BACKUPDIR
   sshfs -o allow_other -o reconnect REMOTE_USER@REMOTE_HOST:REMOTE_DIRECTORY $BACKUPDIR
   ```

   :::{IMPORTANT}
   Before running the backup script in the next step, verify that the remote directory to be used is empty and has sufficient storage available. For a rough estimate of the backup file size, run the following on the Scrutinizer instance:

   ```cfg
   df -h /var/db | awk '!/^Filesystem/ {print "Space Required: "$3}'
   ```
   :::

4. Run `backup.sh` as the `plixer` user, with the mounted remote directory set as the backup file location:

   ```cfg
   BACKUPDIR=/mnt/backup ~plixer/scrutinizer/files/backup.sh
   ```

5. Once the script confirms that the backup file has been saved, unmount the remote backup directory:

   ```cfg
   BACKUPDIR=/mnt/backup
   fusermount -u $BACKUPDIR
   sudo rmdir $BACKUPDIR
   ```

Full backup files are created as `scrutinizer-VERSION-backup-DATE.tar.gz` at the specified location and owned by the `plixer` user.

````
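
Once a backup completes, it can be worth confirming that the archive is readable before unmounting the remote directory. A minimal sketch, assuming the backup was saved under the mounted `/mnt/backup` directory:

```cfg
BACKUPDIR=/mnt/backup
# List each archive's contents without extracting; tar exits non-zero if a file is damaged
for f in "$BACKUPDIR"/scrutinizer-*-backup-*.tar.gz; do
  [ -e "$f" ] || continue   # skip when the glob matches nothing
  tar -tzf "$f" > /dev/null && echo "$f: OK"
done
```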

:::{NOTE}
- A {ref}`second Scrutinizer instance <full-second>` can be used as the remote backup host, provided it has sufficient disk space available and is running the same Scrutinizer version as the instance to be backed up. However, doing so is only recommended for redundancy.
- If no remote hosts are available, backups can be {ref}`saved locally on the same Scrutinizer instance <full-local>`. However, this will limit the amount of storage available for system functions and is not recommended.
:::

For further details or assistance with issues, contact {ref}`Plixer Technical Support <resource-technical-support>`.

#### Backing up additional files

When creating a full backup of a Scrutinizer server, any files not {ref}`covered by the script <backups-full>` must be manually backed up and should be stored on an external host/system.

These files should also be manually restored, after running the {ref}`restore script <full-restore>`.

(full-restore)=

### Restoring from a full backup

To restore a Scrutinizer instance from a full backup file, use the `restore.sh` utility located under `/home/plixer/scrutinizer/files`.

The script will fully restore {ref}`all backed up elements <backups-full>` of a Scrutinizer instance, provided the following conditions are met:

- A valid full backup file is accessible to the `plixer` user at the specified remote location (`$BACKUPDIR`).
- The Scrutinizer instance to be used for the restore has been freshly deployed.
- The version of the backup matches the version of the fresh Scrutinizer instance to restore *to* (e.g., a 19.3.0 backup can only be restored to a new 19.3.0 instance).

:::{IMPORTANT}
- A restore completely overwrites the state of the target instance and deletes the source backup file. It is highly recommended to always restore from a **copy** of a backup file.
- If the restore target is the primary reporter in a distributed cluster, contact {ref}`Plixer Technical Support <resource-technical-support>` for assistance.
:::
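
Because the source archive is deleted as part of a restore, keeping a working copy first is a cheap safeguard. A sketch, assuming the backup sits on the mounted remote directory (the filename is illustrative):

```cfg
BACKUPDIR=/mnt/backup
BACKUP=scrutinizer-19.3.0-backup-2024-01-01.tar.gz   # illustrative filename
# Restore from the copy so the original archive survives restore.sh
cp "$BACKUPDIR/$BACKUP" "$BACKUPDIR/${BACKUP%.tar.gz}-copy.tar.gz"
```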

The following instructions cover the process of restoring from a backup file on a remote host to a fresh Scrutinizer deployment:

````{dropdown} View instructions

1. SSH to the target Scrutinizer server for the restore, and start a [tmux session](resource-terminal) to prevent timeouts:

   ```cfg
   tmux new -s restore
   ```

2. Allow others to use FUSE mounts:

   ```cfg
   sudo grep -Eq "^user_allow_other" /etc/fuse.conf || \
   sudo sed -i '$ a user_allow_other' /etc/fuse.conf
   ```

3. Create the backup directory locally and mount the remote directory containing the backup file(s):

   ```cfg
   BACKUPDIR=/mnt/backup
   sudo mkdir -p $BACKUPDIR
   sudo chown plixer:plixer $BACKUPDIR
   sshfs -o allow_other -o reconnect REMOTE_USER@REMOTE_HOST:REMOTE_DIRECTORY $BACKUPDIR
   ```

4. Run `restore.sh` as the `plixer` user, with the remote directory set as the backup file location:

   ```cfg
   BACKUPDIR=/mnt/backup ~plixer/scrutinizer/files/restore.sh
   ```

5. When prompted, enter `yes` to use the listed backup file for the restore or `no` to have the script continue searching (if no backup file was specified beforehand).

   :::{HINT}
   To specify the file to use for the restore, use `BACKUPDIR=/mnt/backup BACKUP=restore_filename.tar.gz ~plixer/scrutinizer/files/restore.sh` at the previous step instead.
   :::

6. Once the script confirms that the restore has been completed, unmount the remote backup directory:

   ```cfg
   BACKUPDIR=/mnt/backup
   fusermount -u $BACKUPDIR
   sudo rmdir $BACKUPDIR
   ```

````

:::{IMPORTANT}
The `restore.sh` utility does not restart Scrutinizer services after it completes running.
:::

Based on the role of the Scrutinizer instance, proceed to finalize setup of the restored server:

- If the restored instance is a **standalone server**, run the following to restart all services and register it:

  ```cfg
  scrut_util --services --name all --switch restart
  scrut_util --set selfregister --reset
  ```
  
  These commands may take several minutes to complete.

- If the restored instance is a **remote collector** in a [distributed cluster](guides-distributed), run the following on the primary reporter to register it:

  ```cfg
  scrut_util --set registercollector --ip RESTORED_INSTANCE_IP
  ```

- If the restored instance is a **primary reporter** in a distributed cluster or a standalone server, and its Machine ID is different from that of the backup file, contact {ref}`Plixer Technical Support <resource-technical-support>` to obtain a new license key.

(full-alt)=

### Alternative backup methods

Because full backup files are extremely large and intended for use in disaster recovery scenarios, saving and storing backup files on remote hosts accessible over SSH is highly recommended.

In scenarios where this is not possible, the following alternative backup methods can be used:

(full-second)=

#### Backup to a second Scrutinizer instance

If a separate host is not available to save backups to, a second Scrutinizer instance can be used for backup file storage instead. The versions of the two instances must match.

:::{IMPORTANT}
Due to how Scrutinizer is designed to optimize the use of all available disk space, it will likely be necessary to add more storage and/or modify the {ref}`data retention settings <a-settings-history>` of the second instance. For assistance, contact {ref}`Plixer Technical Support <resource-technical-support>`.
:::
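
Before the first backup, the size estimate from the instance being backed up can be compared against the free space on the second instance. A sketch, assuming SSH access as the `plixer` user (the hostname is a placeholder):

```cfg
# On the instance being backed up: rough size of the backup file
df -h /var/db | awk '!/^Filesystem/ {print "Space Required: "$3}'
# On the second instance: storage currently available
ssh plixer@YOUR_REMOTE_SCRUTINIZER_INSTANCE \
  "df -h /var/db | awk '!/^Filesystem/ {print \"Space Available: \"\$4}'"
```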

The following instructions cover the additional steps required for creating backups on a second Scrutinizer instance (using the default location):

````{dropdown} View instructions

1. Set the location/directory to use for backup files:

   ```cfg
   BACKUPDIR=${BACKUPDIR:='/var/db/big/pgsql/restore'}
   REMOTE=YOUR_REMOTE_SCRUTINIZER_INSTANCE
   ```

2. Create the backup directory on both instances:

   ```cfg
   sudo mkdir -p $BACKUPDIR
   sudo chown plixer:plixer $BACKUPDIR
   ssh plixer@$REMOTE "sudo su -c 'mkdir -p $BACKUPDIR && chown plixer:plixer $BACKUPDIR'"
   ```

3. Allow other users to use FUSE mounts:

   ```cfg
   sudo grep -Eq "^user_allow_other" /etc/fuse.conf || \
   sudo sed -i '$ a user_allow_other' /etc/fuse.conf
   ```

4. Mount the remote instance's backup directory on the local instance:

   ```cfg
   sshfs -o allow_other -o reconnect plixer@$REMOTE:$BACKUPDIR $BACKUPDIR
   ```

5. Run `backup.sh` using the directory mounted from the remote instance as the backup file location:

   ```cfg
   BACKUPDIR=/var/db/big/pgsql/restore ~plixer/scrutinizer/files/backup.sh
   ```

   :::{IMPORTANT}
   Before running the backup utility, verify that the remote directory to be used is empty and has sufficient storage available. For a rough estimate of the backup file size, run the command `df -h /var/db | awk '!/^Filesystem/ {print "Space Required: "$3}'` on the Scrutinizer instance.
   :::

6. After the backup is complete, unmount the remote Scrutinizer directory:

   ```cfg
   BACKUPDIR=${BACKUPDIR:='/var/db/big/pgsql/restore'}
   fusermount -u $BACKUPDIR
   ```

````

(full-local)=

#### Local backups

By default, both `backup.sh` and `restore.sh` are set to use `/var/db/big/pgsql/restore` on the local Scrutinizer filesystem for full backup files. However, in most cases, the backup operation will likely fail unless additional disk space is allocated to or created on the Scrutinizer instance. Running the command `df -h /var/db | awk '!/^Filesystem/ {print "Space Required: "$3}'` will provide a rough estimate of the storage required for the backup.

To force a PostgreSQL checkpoint after the script has finished running, enter `psql plixer -c "CHECKPOINT"`.

:::{NOTE}
- In v19.2, the backup file path must be defined in the `backup.sh` and `restore.sh` scripts before they are run.
- Storing backup files locally will severely limit the storage Scrutinizer can use for its primary functions. As such, backup files saved to the instance should be transferred to a separate resource as soon as possible.
:::
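
As a sketch of transferring a locally saved backup off the instance (remote details are placeholders), any SSH-capable copy method works:

```cfg
BACKUPDIR=/var/db/big/pgsql/restore
# Copy the archive to a separate host, then delete the local copy to reclaim space
scp "$BACKUPDIR"/scrutinizer-*-backup-*.tar.gz REMOTE_USER@REMOTE_HOST:REMOTE_DIRECTORY/ && \
rm "$BACKUPDIR"/scrutinizer-*-backup-*.tar.gz
```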

(backups-config)=

## Configuration backups

For more "lightweight" backup and restore operations, the `scrut_conf_dump.sh` and `scrut_conf_restore.sh` scripts (both located in `/home/plixer/scrutinizer/database/utils`) can be used to target only the application/configuration data of a Scrutinizer instance, including:

- User-added maps
- Dashboards
- IP groups
- Saved reports
- 3rd-party integration settings

Configuration backups do not include any collected flow data.

:::{NOTE}
In [distributed clusters](guides-distributed), the primary reporter regularly syncs application/configuration data to remote collectors. Only the configuration backup of the primary reporter is needed to perform a restore for the cluster.
:::

`scrut_conf_dump.sh` and `scrut_conf_restore.sh` use PostgreSQL's [pg_dump](https://www.postgresql.org/docs/11/app-pgdump.html) and [pg_restore](https://www.postgresql.org/docs/11/app-pgrestore.html) utilities and respect the same set of environment variables:

| Variable | Description | Default |
|----------|-------------|---------|
| `DUMP` | Location of the backup file | `./conf.dump` |
| `PGHOST` | IP address or hostname of the PostgreSQL database | `localhost` |
| `PGUSER` | Role/user used to connect to PGHOST | `plixer` |
| `PGDATABASE` | The database to access at PGHOST | `plixer` |
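
Because the scripts restore with `pg_restore`, the dump is presumably written in `pg_dump`'s custom format, so its table of contents can be listed without performing a restore (the path shown matches the custom-location example used in the instructions that follow):

```cfg
# List the entries in a configuration dump without restoring anything
pg_restore -l /tmp/CONF_BACKUP_DIR/CONF_BACKUP.dump
```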

### Backing up configuration data

To create a backup of a Scrutinizer server's current configuration data, follow these steps:

````{dropdown} View instructions

1. Run the backup script.

   To save the backup file to the default location:

   ```cfg
   ~/scrutinizer/database/utils/scrut_conf_dump.sh
   ```

   To use a custom location/filename:

   ```cfg
   mkdir /tmp/CONF_BACKUP_DIR
   touch /tmp/CONF_BACKUP_DIR/CONF_BACKUP.dump
   DUMP=/tmp/CONF_BACKUP_DIR/CONF_BACKUP.dump ~/scrutinizer/database/utils/scrut_conf_dump.sh
   ```

2. Restart Scrutinizer services:

   ```cfg
   sudo systemctl restart scrutinizer
   ```

````

### Restoring configuration data

To restore configuration data to a Scrutinizer server from a backup file, follow these steps:

````{dropdown} View instructions

1. Stop the `plixer_webapp` and `plixer_collector` services:

   ```cfg
   sudo systemctl stop plixer_webapp
   sudo systemctl stop plixer_collector
   ```

2. Run the restore script.

   To restore from the default backup location/file:

   ```cfg
   ~/scrutinizer/database/utils/scrut_conf_restore.sh
   ```

   To restore from a specified location/file:

   ```cfg
   PGHOST=SCRUTINIZER_IP DUMP=/tmp/CONF_BACKUP_DIR/CONF_BACKUP.dump \
     ~/scrutinizer/database/utils/scrut_conf_restore.sh
   ```

3. Restart the stopped services:

   ```cfg
   sudo systemctl start plixer_webapp
   sudo systemctl start plixer_collector
   ```

4. Resync the `access` table's ID sequence:

   ```cfg
   psql -c "SELECT setval(pg_get_serial_sequence('plixer.access', 'access_id'), COALESCE(max(access_id) + 1, 1), false) FROM plixer.access;"
   ```

````

:::{NOTE}
`scrut_conf_restore.sh` should only be used for restoring configuration data for the same Scrutinizer server/appliance. To apply a configuration backup to a different server, follow the steps for {ref}`backup migrations <migrate-backup>` in the {ref}`migration guides <backups-migration>`.
:::

### Additional notes

- `pg_restore` errors typically only cause the restore to fail for the table associated with the error. Other tables should still be restored successfully.

- Errors associated with **duplicate keys** usually indicate a conflict between existing rows in the table and the rows being restored, as in this example:

  ```cfg
  pg_restore: [archiver (db)] Error from TOC entry 51348; 0 17943 TABLE DATA exporters plixer
  pg_restore: [archiver (db)] COPY failed for table "exporters": ERROR:  duplicate key value violates unique constraint "exporters_pkey"
  DETAIL:  Key (exporter_id)=(\x0a4d4d0a) already exists.
  ```

  The conflicting keys should be removed from the table before attempting to restore again.
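
  For the sample error above, a sketch of clearing the conflicting row before rerunning the restore (the table and key are taken from that error output; substitute the values actually reported):

  ```cfg
  psql plixer -c "DELETE FROM plixer.exporters WHERE exporter_id = '\x0a4d4d0a';"
  ```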

- When swapping IP addresses, the database keys should be rotated using `scrut_util --pgcerts --verbose`, because the backed-up keys will be associated with the old address.
