AWS (EKS)

The Plixer ML Engine is deployed, monitored, and controlled from a separate host VM. All guide content and steps are run from that host VM, which is distinct from the actual deployment nodes and resources. Deploying the Plixer ML VM image as the deployment host is recommended because it comes with all of the required prerequisites installed.

The deployment host must have all required software prerequisites installed. Once they are in place, proceed with setting up the following:

AWS secrets

To connect to AWS, you will need to provide an access key pair for an IAM user (an Access Key ID and a Secret Access Key). You will be prompted for these by 01_aws_infrastructure.sh.
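
Before running the script, you can sanity-check that the key pair is valid using the AWS CLI (a minimal sketch, assuming the AWS CLI is installed on the deploy host; the placeholder values are hypothetical):

  export AWS_ACCESS_KEY_ID="AKIA................"          # hypothetical placeholder
  export AWS_SECRET_ACCESS_KEY="........................"  # hypothetical placeholder
  aws sts get-caller-identity   # should print the IAM user's account ID and ARN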

VPC and subnet names

You will need to know which VPC and subnets (two are required) you want to deploy to. You will be asked for these during 01_aws_infrastructure.sh, or you can configure them directly in aws.tfvars.
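
If you need to look up the available VPC and subnet names first, the AWS CLI can list them by Name tag (a sketch; the region is an assumed example):

  # List VPCs with their Name tags
  aws ec2 describe-vpcs --region us-east-1 \
    --query 'Vpcs[].[VpcId, Tags[?Key==`Name`]|[0].Value]' --output table

  # List subnets with their VPC, availability zone, and Name tag
  aws ec2 describe-subnets --region us-east-1 \
    --query 'Subnets[].[SubnetId, VpcId, AvailabilityZone, Tags[?Key==`Name`]|[0].Value]' --output table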

Creating a new VPC

01_aws_infrastructure.sh will ask you for new VPC details, or you can set new_vpc_cidr, new_vpc_public_cidr, and new_vpc_private_cidr manually in aws.tfvars.
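
As a sketch, the new-VPC fields in aws.tfvars might be filled in as follows (all CIDR ranges and availability zones are hypothetical, and whether the subnet CIDR fields take a single value or a list should be confirmed against your aws.tfvars; lists are shown here to span two availability zones):

  # aws.tfvars -- new VPC (example values only)
  new_vpc_cidr         = "10.20.0.0/16"
  new_vpc_public_cidr  = ["10.20.1.0/24", "10.20.2.0/24"]
  new_vpc_private_cidr = ["10.20.11.0/24", "10.20.12.0/24"]
  azs                  = ["us-east-1a", "us-east-1b"]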

Deploying to an existing VPC

If deploying to an existing VPC and existing subnets, the VPC must meet the following requirements (a sample aws.tfvars for this scenario is shown after the list):

  • A DHCP option set applied to the VPC with domain-name-servers: AmazonProvidedDNS

  • Two private subnets within the VPC, in two different AZs

  • If no NAT gateway is attached to the private subnets (i.e. the subnets have no internet access), set airgap_install to true

For more information, see the article on Amazon EKS VPC and subnet requirements and considerations.
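
For the existing-VPC scenario, the relevant aws.tfvars fields (see the reference table below) might be set as in this sketch; all names are hypothetical:

  # aws.tfvars -- existing VPC (example values only)
  vpc_name            = "prod-vpc"
  vpc_private_subnets = ["prod-private-a", "prod-private-b"]
  vpc_public_subnets  = ["prod-public-a", "prod-public-b"]
  airgap_install      = true   # only if the private subnets have no NAT gateway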

Setting up Plixer Kubernetes in AWS

Run the following steps on a cloned VM template that will act as the deploy host to manage the rest of the infrastructure.

  1. Log in as the plixer user, and then accept the EULA and set up networking.

  2. Navigate to the /home/plixer/common/kubernetes directory.

  3. Run 01_aws_infrastructure.sh. For a new install, the script will walk you through configuring the required values.

    Note

    • If you are reconfiguring an existing setup, run 01_aws_infrastructure.sh --configure, or manually edit the required values in aws.tfvars.

    • If you get an error that the tfstate S3 bucket already exists, and you are sure that the cluster name you set is not in use by another user, run 01_aws_infrastructure.sh --recover-tfstate and then repeat step 3.

  4. Run kubectl get nodes to make sure that the Kubernetes infrastructure has been deployed successfully.
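
    A healthy cluster reports each worker node as Ready. The output below is illustrative only; node names, ages, and versions will differ in your environment:

      $ kubectl get nodes
      NAME                          STATUS   ROLES    AGE   VERSION
      ip-10-20-11-23.ec2.internal   Ready    <none>   6m    v1.27.x
      ip-10-20-12-47.ec2.internal   Ready    <none>   6m    v1.27.x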

Setting up the ML engine in AWS

After the Kubernetes infrastructure has been deployed and is running successfully, you can proceed with deploying the ML engine in AWS.

  1. Run the following command: /home/plixer/ml/setup.sh

  2. Enter the path to your Plixer Scrutinizer private SSH key.

  3. Enter your Plixer Scrutinizer authentication token.

  4. Run kubectl get pods. If all pods show the Running status, the ML engine has been deployed successfully.
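
    If you prefer to block until the deployment settles rather than polling, standard kubectl can wait for readiness (a sketch; the timeout value and the assumption that the ML pods run in the current namespace are both hypothetical):

      # Wait up to 10 minutes for all pods in the current namespace to become Ready
      kubectl wait --for=condition=Ready pod --all --timeout=600s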

Note

If you encounter the Provider configuration not present error when deploying in AWS, run terraform -chdir=aws state rm module.eks.data.http.wait_for_cluster and then re-run 01_aws_infrastructure.sh.
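
Before removing the entry, you can confirm that the stale data source is actually present in the Terraform state (standard Terraform commands; the grep filter is only a convenience):

  # Check for the stale wait_for_cluster data source
  terraform -chdir=aws state list | grep wait_for_cluster

  # Remove it, then re-run 01_aws_infrastructure.sh
  terraform -chdir=aws state rm module.eks.data.http.wait_for_cluster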

aws.tfvars field reference table

creator
  REQUIRED: Name of the person creating these AWS resources; used as a tag in AWS to track utilization.

cost_center
  REQUIRED: Cost center to use for these AWS resources; used as a tag in AWS to track utilization.

aws_certificate_name
  REQUIRED: Name of an existing EC2 key pair (SSH key) configured in your AWS environment. You can see a list of these in the AWS Console by navigating to EC2 > Network & Security > Key Pairs.

instance_type
  REQUIRED: AWS instance type to create for EKS worker nodes (e.g., t2.large).

fargate
  REQUIRED: Use AWS Fargate instead of EKS worker nodes for applicable workloads. Setting this to true allows a smaller instance_type to be used.

aws_region
  REQUIRED: AWS region to deploy the infrastructure in.

public_load_balancer
  REQUIRED: Whether or not to deploy a public load balancer for external access.

airgap_install
  OPTIONAL: If this is an airgapped install (i.e. the vpc_private_subnets have no route to a NAT gateway), set this to true.

create_ec2_endpoint
  OPTIONAL: If airgap_install = true, this boolean controls whether or not to create an EC2 endpoint in the VPC.

create_s3_endpoint
  OPTIONAL: If airgap_install = true, this boolean controls whether or not to create an S3 endpoint in the VPC.

create_ecr_endpoint
  OPTIONAL: If airgap_install = true, this boolean controls whether or not to create an ECR endpoint in the VPC.

create_ssm_endpoint
  OPTIONAL: If airgap_install = true, this boolean controls whether or not to create an SSM endpoint in the VPC.

new_vpc_cidr
  OPTIONAL: To create a new VPC, specify its IP address range in this field.

new_vpc_public_cidr
  OPTIONAL: To create a new VPC, specify the IP address range for the public subnet in the new VPC.

new_vpc_private_cidr
  OPTIONAL: To create a new VPC, specify the IP address range for the private subnet in the new VPC.

azs
  OPTIONAL: Availability zones corresponding to the subnets created from new_vpc_public_cidr and new_vpc_private_cidr.

vpc_name
  OPTIONAL: Name of an existing VPC to create the EKS resources in.

vpc_private_subnets
  OPTIONAL: List of private subnet names to create the EKS resources in.

vpc_public_subnets
  OPTIONAL: List of public subnet names to create the EKS resources in.
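
Putting the required fields together, a minimal aws.tfvars might look like the following sketch (every value is a hypothetical example; the existing-VPC fields can be swapped for the new_vpc_* fields described above):

  # aws.tfvars -- minimal example (all values hypothetical)
  creator              = "jdoe"
  cost_center          = "engineering"
  aws_certificate_name = "jdoe-keypair"
  instance_type        = "t2.large"
  fargate              = false
  aws_region           = "us-east-1"
  public_load_balancer = false

  # Deploying into an existing VPC
  vpc_name             = "prod-vpc"
  vpc_private_subnets  = ["prod-private-a", "prod-private-b"]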