Plixer ML Engine#
After deploying Scrutinizer, complete the steps below to deploy the Plixer ML Engine (requires Plixer One Enterprise).
Pre-deployment preparations#
The following preparatory steps should be completed before starting the deployment procedure for any type of Plixer ML Engine appliance.
Deploying the ML VM#
Use the template obtained with the Plixer One Enterprise license to deploy the ML VM locally.
This VM will function as a separate deployment host and includes all prerequisite resources. The ML engine environment will be deployed and managed from this VM.
Note
For vSphere multi-node deployments, the template should be added to vSphere. See the vSphere multi-node cluster deployment guide for more information.
[PuTTY] The "Enable VT100 line drawing even in UTF-8 mode" option (under Settings > Window > Translation) must be enabled for the setup wizard to display correctly. Requested details can be pasted into the wizard's prompts/dialogs using Shift+Insert.
Registering the ML engine#
Before deploying an ML engine, the appliance must first be registered as follows:
1. In the Scrutinizer web interface, navigate to Admin > Resources > ML Engines.
2. Click the + button to add a new ML engine.
3. From the dropdown, select the type of ML engine to pair with your Scrutinizer environment.
4. Enter a name to assign to the engine, and then click Save.
5. After returning to the main view, click the name of the new ML engine and save/copy the primary reporter address and authentication token shown in the tray. These will be required during the ML engine deployment process.
Confirming SSH credentials#
To complete the appliance setup process, the ML engine must be able to establish an SSH session with the primary reporter/server of the Scrutinizer environment. The IP address of the primary reporter and the plixer user password will therefore need to be provided during setup.
If a private SSH key is required, verify that the public key is configured on the primary Scrutinizer reporter/server under /home/plixer/.ssh/authorized_keys. The private key should also be accessible from the machine hosting the ML engine virtual appliance, as the path to the key will need to be entered during the appliance setup process.
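As a quick sanity check before starting the setup wizard, key-based access can be verified from the machine that will host the ML engine. The sketch below is illustrative only; the key path and reporter IP are placeholders for your environment's values:

# All values below are placeholders -- substitute your environment's details.
REPORTER_IP=203.0.113.10            # primary Scrutinizer reporter/server
KEY_PATH=/home/plixer/.ssh/id_rsa   # private key path entered during setup

# Confirm the private key is readable from this host.
ls -l "$KEY_PATH"

# Confirm key-based login to the primary reporter works without a password
# prompt, which indicates the public key is present in authorized_keys.
ssh -i "$KEY_PATH" plixer@"$REPORTER_IP" true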
Deployment guides#
Note
These guides only apply to v19.5 of the Plixer ML Engine, which must be paired with Scrutinizer 19.6.0 and above. Deployment instructions for v19.4 of the engine can be found in the Scrutinizer 19.5.3 manual. Contact Plixer Technical Support for assistance.
Local (single node)
After completing the pre-deployment preparations, follow the instructions below to set up the necessary infrastructure and deploy a local, single-node Plixer ML Engine:
1. Log in to the ML VM image using plixer:plixer.
2. Accept the EULA, and then configure network settings for the host.
3. SSH to the ML VM image using the plixer credentials set in step 2, and then wait for the setup wizard/scripts to start automatically.
4. Enter the following when prompted:
   - Primary reporter IP address
   - Authentication token saved during registration
After the scripts complete running, navigate to Admin > Resources > ML Engines and wait for the engine to show as Deployed under its Deploy Status. Refresh the page if the status has not updated after a few minutes.
See this guide for configuration instructions and recommendations for the ML engine.
AWS (EKS)
After completing the pre-deployment preparations, follow the instructions below to set up the necessary infrastructure and deploy the Plixer ML Engine in AWS.
Additional prerequisites for AWS
AWS IAM user secret access key ID and secret access key
A VPC with two subnets for the deployment
Note
The ML VM (the deployment host) deployed as part of the pre-deployment preparations will have all software prerequisites (Docker, Terraform, etc.) preinstalled.
The setup scripts include an option to automatically set up a new VPC and will prompt the user to enter the necessary information.
For existing VPCs, the following requirements must be met:
The VPC must have a DHCP option set with the option to use AmazonProvidedDNS for its domain name servers.
The VPC must have two private subnets on separate Availability Zones (AZs).
If the subnets cannot access the Internet (no NAT gateway attached), set airgap_install in /home/plixer/common/kubernetes/aws.tfvars to TRUE.
For additional information on Amazon EKS VPC and subnet requirements and considerations, see this article.
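For example, an air-gapped install into an existing VPC might combine airgap_install with the optional endpoint flags described in the Terraform configuration table below. This is a sketch only; confirm the values for your environment:

# Illustrative air-gapped settings in aws.tfvars (booleans shown as HCL true;
# the manual refers to this value as TRUE).
airgap_install      = true
create_ec2_endpoint = true   # VPC endpoints stand in for the missing NAT/Internet route
create_s3_endpoint  = true
create_ecr_endpoint = true
create_ssm_endpoint = true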
Hybrid cloud deployments
When pairing an ML engine in AWS with an on-prem Scrutinizer environment, one of the following methods should be used to enable connectivity between the two before starting the deployment process.
AWS Site-to-Site VPN
Follow these instructions to create an AWS Site-to-Site VPN connection to allow communication between the two deployments.
Direct access via public IP
A public IP address can be used to allow external access to the on-prem Scrutinizer deployment. However, this will expose the Scrutinizer environment to the Internet via ports 5432, 22, and 443.
The public IP address must be entered when prompted by the setup scripts. The Internet gateway IP must also be manually added to the Scrutinizer pg_hba.conf file to allow access to Postgres.
After the file has been modified, run the following command on the Scrutinizer server to reload the configuration:
psql -c "SELECT pg_reload_conf()"
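For illustration, an entry allowing a hypothetical gateway IP of 203.0.113.10 might look like the line below; the pg_hba.conf location and authentication method depend on the Scrutinizer installation:

# Example pg_hba.conf entry (hypothetical address; adjust METHOD as needed)
# TYPE  DATABASE  USER  ADDRESS           METHOD
host    all       all   203.0.113.10/32   md5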
Deploying the ML engine
Follow these instructions to set up the necessary infrastructure and deploy the ML engine:
1. Log in to the ML VM image using plixer:plixer.
2. Accept the EULA, and then configure network settings for the host.
3. SSH to the ML VM image using the plixer credentials set in step 2, and then wait for the setup wizard/scripts to start automatically.
4. Enter the infrastructure deployment parameters as prompted.
   Note
   The requested details are automatically saved to /home/plixer/common/kubernetes/aws.tfvars, which also contains other default parameters for deploying the ML engine Kubernetes cluster. If there are issues with the infrastructure deployment, contact Plixer Technical Support for assistance before making changes to the file.
5. Wait as the Kubernetes cluster is deployed (this may take several minutes), and then enter the Scrutinizer SSH credentials when prompted.
After the scripts complete running, navigate to Admin > Resources > ML Engines and wait for the engine to show as Deployed under its Deploy Status. Refresh the page if the status has not updated after a few minutes.
See this guide for configuration instructions and recommendations for the ML engine.
Terraform configuration
The following table lists all required and optional variables in /home/plixer/common/kubernetes/aws.tfvars, which are used when deploying the Kubernetes infrastructure for the ML engine.
Note
Contact Plixer Technical Support before making changes to this file.
Field name | Description
--- | ---
cluster_name | REQUIRED: Name to identify the ML engine cluster/deployment; can only contain lowercase letters (a-z), digits (0-9), and hyphens (-).
creator | REQUIRED: Name of the person creating these AWS resources; used as a tag in AWS to track utilization.
cost_center | REQUIRED: Cost center to use for these AWS resources; used as a tag in AWS to track utilization.
aws_certificate_name | REQUIRED: Name of an existing SSH certificate configured in your AWS environment. These are listed in the AWS Console under EC2 > Network & Security > Key Pairs.
instance_type | REQUIRED: AWS instance type to create for EKS worker nodes (e.g., t2.large).
fargate | REQUIRED: Use Fargate instead of EKS nodes for applicable workloads. Setting this to TRUE allows a smaller instance_type to be used.
aws_region | REQUIRED: AWS region to deploy infrastructure in.
airgap_install | OPTIONAL: If this is an airgapped install (i.e., the vpc_private_subnets have no route to a NAT gateway), set this to TRUE.
create_ec2_endpoint | OPTIONAL: If airgap_install = TRUE, controls whether to create an EC2 endpoint in the VPC.
create_s3_endpoint | OPTIONAL: If airgap_install = TRUE, controls whether to create an S3 endpoint in the VPC.
create_ecr_endpoint | OPTIONAL: If airgap_install = TRUE, controls whether to create an ECR endpoint in the VPC.
create_ssm_endpoint | OPTIONAL: If airgap_install = TRUE, controls whether to create an SSM endpoint in the VPC.
new_vpc_cidr | OPTIONAL: To create a new VPC, specify its IP address range here.
new_vpc_public_cidr | OPTIONAL: To create a new VPC, specify the IP address range for its public subnet here.
new_vpc_private_cidr | OPTIONAL: To create a new VPC, specify the IP address range for its private subnet here.
azs | OPTIONAL: Availability Zones corresponding to the subnets created from new_vpc_public_cidr and new_vpc_private_cidr.
vpc_name | OPTIONAL: Name of an existing VPC to create the EKS resources in.
vpc_private_subnets | OPTIONAL: List of private subnet names to create the EKS resources in.
vpc_public_subnets | OPTIONAL: List of public subnet names to create the EKS resources in.
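As a reference, a minimal aws.tfvars targeting an existing VPC might look like the sketch below. Every value is a placeholder, and changes to this file should first be discussed with Plixer Technical Support as noted above:

# Minimal illustrative aws.tfvars -- all values are placeholders.
cluster_name         = "plixer-ml-prod"
creator              = "jdoe"
cost_center          = "netops"
aws_certificate_name = "ml-engine-keypair"   # existing EC2 key pair name
instance_type        = "t2.large"
fargate              = false
aws_region           = "us-east-1"
vpc_name             = "corp-vpc"            # existing VPC
vpc_private_subnets  = ["corp-private-a", "corp-private-b"]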
Azure (AKS)
After completing the pre-deployment preparations, follow the instructions below to set up the necessary infrastructure and deploy the Plixer ML Engine in Azure.
Additional prerequisites for Azure
Credentials for the Azure user account that will be used for deployment
A VNet with one subnet for the deployment
Note
The ML VM (the deployment host) deployed as part of the pre-deployment preparations will have all software prerequisites (Docker, Terraform, etc.) preinstalled.
The Azure user account must be assigned the owner role to allow a role to be assigned to the AKS cluster user.
VNet details for infrastructure deployment can be defined using the vnet_addresses and new_subnet_cidr fields in /home/plixer/common/kubernetes/azure.tfvars.
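For example, to let the scripts create a new VNet and subnet, the two fields could be set as follows. These values match the documented defaults; adjust them to avoid overlapping your existing address space:

# Illustrative VNet settings in azure.tfvars (documented defaults shown).
vnet_addresses  = "172.18.0.0/16"   # address space for the new VNet
new_subnet_cidr = "172.18.1.0/24"   # AKS subnet within that address space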
Hybrid cloud deployments
When pairing an ML engine in Azure with an on-prem Scrutinizer environment, one of the following methods should be used to enable connectivity between the two before starting the deployment process.
Azure site-to-site (S2S) VPN
Follow these instructions to create a site-to-site VPN connection to allow communication between the two deployments.
Direct access via public IP
A public IP address can be used to allow external access to the on-prem Scrutinizer deployment. However, this will expose the Scrutinizer environment to the Internet via ports 5432, 22, and 443.
The public IP address must be entered when prompted by the 01_azure_infrastructure.sh and setup.sh scripts. The Internet gateway IP must also be manually added to the Scrutinizer pg_hba.conf file to allow access to Postgres.
After the file has been modified, run the following command on the Scrutinizer server to reload the configuration:
psql -c "SELECT pg_reload_conf()"
Deploying the Kubernetes infrastructure
1. Log in to the ML VM image using plixer:plixer.
2. Accept the EULA, and then configure network settings for the host.
3. SSH to the ML VM image using the plixer credentials set in step 2.
4. Exit the automated setup wizard by pressing Ctrl + C.
5. Start the Azure CLI and run the following to set up the client and log in:
   az login
6. Define the infrastructure deployment parameters in /home/plixer/common/kubernetes/azure.tfvars (as described in the file).
   Note
   azure.tfvars may also include fields/variables with factory-defined values (e.g., kube_version) for deploying the ML engine Kubernetes cluster. Contact Plixer Technical Support for assistance before making changes to any default value.
7. Navigate to /home/plixer/common/kubernetes and run the Kubernetes cluster deployment script:
   ./01_azure_infrastructure.sh
8. Verify that the infrastructure was successfully deployed (this may take several minutes):
   kubectl get nodes
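If a blocking check is preferred over re-running kubectl get nodes, the wait can be scripted; this one-liner waits up to ten minutes for every node to report Ready:

# Block until all cluster nodes reach the Ready condition (10-minute timeout).
kubectl wait --for=condition=Ready nodes --all --timeout=600s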
After confirming the Kubernetes cluster has been correctly deployed, proceed to deploying the ML engine.
Deploying the ML engine
Once the Kubernetes cluster has been deployed, follow these steps to deploy the ML engine:
1. Navigate to the /home/plixer/ml directory on the deployment host.
2. Run the ML engine deployment script and follow the prompts to set up the appliance:
   ./setup.sh
3. When prompted, enter the following Scrutinizer environment details:
   - Primary reporter IP address
   - Authentication token saved during registration
After the script completes running, navigate to Admin > Resources > ML Engines and wait for the engine to show as Deployed under its Deploy Status. Refresh the page if the status has not updated after a few minutes.
See this guide for configuration instructions and recommendations for the ML engine.
Terraform configuration
The following table lists all required and optional variables in /home/plixer/common/kubernetes/azure.tfvars, which are used when deploying the Kubernetes infrastructure for the ML engine.
Field name | Description
--- | ---
cluster_name | REQUIRED: Name to identify the ML engine cluster/deployment; can only contain lowercase letters (a-z), digits (0-9), and hyphens (-).
vm_type | REQUIRED: Azure VM instance type to create for AKS worker nodes.
location | REQUIRED: Location to create the AKS worker nodes in (e.g., East US 2).
resource_group_name | OPTIONAL: Name of an existing resource group to use when deploying assets. If empty, a new resource group named ${var.cluster_name}-resource-group will be created. The resource group must also be in the specified location.
vnet_name | OPTIONAL: Name of an existing VNet to deploy AKS in.
vnet_subnet_name | OPTIONAL: Name of an existing subnet within vnet_name to deploy AKS in. Each subnet can only contain one AKS cluster.
vnet_addresses | OPTIONAL: If vnet_name is not specified, this address space is used when creating the new VNet to place AKS in. Defaults to 172.18.0.0/16.
new_subnet_cidr | OPTIONAL (required if vnet_subnet_name is not specified): Address space used when creating the new VNet subnet to place AKS in. Must be within the address space of the specified VNet. Defaults to 172.18.1.0/24.
public_node_ips | OPTIONAL: Whether to assign public IPs to AKS nodes. Defaults to FALSE.
service_cidr | OPTIONAL: Service CIDR space for internal Kubernetes services. Must not conflict with the address space of the VNet being deployed to. Defaults to 172.19.1.0/24.
dns_service_ip | OPTIONAL: Service IP to assign to the internal Kubernetes DNS service. Must be within the address space specified by service_cidr. Defaults to 172.19.1.5.
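As a reference, a minimal azure.tfvars that relies on the defaults above to create a new resource group, VNet, and subnet might look like this sketch; every value is a placeholder:

# Minimal illustrative azure.tfvars -- all values are placeholders.
cluster_name = "plixer-ml-prod"
vm_type      = "Standard_D4s_v3"   # example AKS worker node size
location     = "East US 2"
# Leaving resource_group_name, vnet_name, and vnet_subnet_name unset lets the
# scripts create new resources using the vnet_addresses/new_subnet_cidr defaults.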
vSphere multi-node cluster
After completing the pre-deployment preparations, follow the instructions below to set up the necessary infrastructure and deploy the Plixer ML Engine in vSphere.
Additional prerequisites for vSphere deployment
The ML engine template must be available in vSphere. Note the path to the template as it will need to be entered when deploying the engine.
The deployment process will require credentials for a vSphere user with permissions to create VMs and resource groups.
Note
The ML VM template includes all software prerequisites (Docker, Terraform, etc.).
Deploying the ML engine
Follow these instructions to set up the necessary infrastructure and deploy the ML engine:
1. Log in to the ML VM image using plixer:plixer.
2. Accept the EULA, and then configure network settings for the host.
3. SSH to the ML VM image using the plixer credentials set in step 2, and then wait for the setup wizard/scripts to start automatically.
4. Enter the infrastructure deployment parameters as prompted.
   Note
   The requested details are automatically saved to /home/plixer/common/kubernetes/vsphere.tfvars, which also contains other default parameters for deploying the ML engine Kubernetes cluster. If there are issues with the infrastructure deployment, contact Plixer Technical Support for assistance before making changes to the file.
5. Wait as the Kubernetes cluster is deployed (this may take several minutes), and then enter the Scrutinizer SSH credentials when prompted.
After the scripts complete running, navigate to Admin > Resources > ML Engines and wait for the engine to show as Deployed under its Deploy Status. Refresh the page if the status has not updated after a few minutes.
See this guide for configuration instructions and recommendations for the ML engine.
Terraform configuration
The following table lists all required and optional variables in /home/plixer/common/kubernetes/vsphere.tfvars, which are used when deploying the Kubernetes infrastructure for the ML engine.
Note
Contact Plixer Technical Support before making changes to this file.
Field name | Description
--- | ---
create_hosts | Whether to create vSphere hosts. If FALSE, the IPs in vm_master_ips should correspond to the VMs created using the VM template.
vm_master_ips | List of IPs to assign to Kubernetes nodes. Must contain 1 or 3 hosts (an even number of IPs is not allowed).
vm_haproxy_vip | Virtual IP address to assign to a VM running HAProxy.
vsphere_vcenter | IP address of the vCenter host to deploy on.
vsphere_user | vSphere user to connect with.
vsphere_datacenter | Datacenter in vSphere to deploy assets into.
vsphere_host | Host within the specified datacenter to deploy assets into.
vsphere_resource_pool | Resource pool to create for the VMs.
vm_folder | Folder name in vSphere to create the VMs in.
vm_datastore | Datastore name used to store the VMs' files.
vm_network | vSphere network name used by the VMs.
vm_gateway | Network gateway used by the VMs.
vm_dns | DNS server used by the VMs.
vm_domain | Domain name used by the VMs.
vm_template | vSphere template that the VMs are based on.
vsphere_unverified_ssl | Set to TRUE to bypass vSphere host certificate verification.
offline_install | If set to TRUE, the template used to create the VMs is assumed to already include all required assets, and asset downloads are skipped.
rke2_airgap_copy | If set to TRUE and offline_install is also TRUE, the script will attempt to proxy any downloads required for RKE2 Kubernetes setup through the host that
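For orientation, a hypothetical three-node vsphere.tfvars might look like the sketch below; every value is a placeholder for details from your own vSphere environment:

# Illustrative vsphere.tfvars for a three-node cluster -- placeholders only.
create_hosts           = true
vm_master_ips          = ["10.1.1.11", "10.1.1.12", "10.1.1.13"]   # 1 or 3 IPs only
vm_haproxy_vip         = "10.1.1.10"
vsphere_vcenter        = "10.1.1.2"
vsphere_user           = "deploy@vsphere.local"
vsphere_datacenter     = "DC1"
vsphere_host           = "esxi-01.example.com"
vsphere_resource_pool  = "plixer-ml"
vm_folder              = "plixer-ml"
vm_datastore           = "datastore1"
vm_network             = "VM Network"
vm_gateway             = "10.1.1.1"
vm_dns                 = "10.1.1.5"
vm_domain              = "example.com"
vm_template            = "templates/plixer-ml-engine"   # template path noted during prerequisites
vsphere_unverified_ssl = true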