VNG Lab

Using nodeSelectors

Let’s deploy our first workload. In this example we will deploy a simple nginx Pod to demonstrate how Ocean scales up for an application directed towards a specific Virtual Node Group (VNG). Let’s navigate back to our Cloud9 IDE for this next section.

We will be deploying the YAML file listed below into our EKS cluster. The file was already cloned into your Cloud9 IDE, so we just need to apply it to the cluster.

From your Cloud9 IDE, run the following command:

kubectl apply -f /home/ec2-user/environment/spot-workshop-template/vng-example.yaml

The output should look like the example below:

example-1

apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-1
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 1
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx-dev
        image: nginx
        resources:
          requests:
            memory: "100Mi"
            cpu: "256m"
      nodeSelector:
        env: ocean-workshop
        example: "1"

Now let’s review the YAML file of what we just deployed: the K8s Deployment creates 1 replica with minimal CPU and Memory requests, but it has a nodeSelector that targets the Virtual Node Group we created earlier.
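
If you want to confirm the node selector directly from the cluster, an optional check is to read it back from the Deployment object with a JSONPath query (the expression below is just one convenient way to do it):

kubectl get deployment example-1 -o jsonpath='{.spec.template.spec.nodeSelector}'

This should print the env and example labels that the Pod requires on its node.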

To check the current status of our recently deployed pod and the nodes, run:

kubectl get pods,nodes

We should expect to see our example-1 pod in a Pending state while Ocean provisions a new node. Allow up to 2 minutes before running the next command.

example-1
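
If you would rather watch the Pod move from Pending to Running in real time instead of waiting, you can optionally use the watch flag (press Ctrl+C to stop watching):

kubectl get pods -w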

After waiting ~2 minutes, let’s run:

kubectl get pods,nodes -o wide

We will see that a new node has joined our cluster and that the Pod was scheduled onto it. The “Age” of that node should be much more recent than the infrastructure we pre-created when creating the EKS cluster.

example-1
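
As an optional check, you can list only the nodes that carry the labels configured on the example-1 VNG; the newly launched node should be the only one returned (the label keys and values below assume the VNG configuration used earlier in this workshop):

kubectl get nodes -l env=ocean-workshop,example=1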

Example 1 Recap

In this section, we saw how to target a specific workload to run on machines of the example-1 VNG. In the current state, however, Pods with no specific constraints (i.e. not directed toward a specific type of node) might still be placed on nodes from the example-1 VNG. To prevent that, we also need to assign a Taint to the VNG, and the Pods that we want to run on the tainted VNG must Tolerate that Taint. Follow along to the next section to see how we can achieve this.
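
For reference, a taint like the one we are about to configure at the VNG level could also be applied to a single node manually with kubectl. This is shown purely for illustration (<node-name> is a placeholder); in this workshop the taint is set in the VNG configuration so that Ocean applies it to every node launched for that VNG:

kubectl taint nodes <node-name> example=2:NoSchedule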


Using Taints/Tolerations

In order to demonstrate that, let’s add a new VNG. Follow the same process as before, but specify the following configuration:

In the Ocean cluster view, enter the Virtual Node Groups tab.

Press the “+ Create VNG” button.

In the pop-up screen, choose Configure Manually. This will derive all the configurations from the default VNG, with the ability to override any parameters you choose.


Specify example-2 as the VNG Name, and in the Node Selection section specify the following Node Label:

Key*: env
Value: ocean-workshop

Click “+ Add Node Label” to create a dropdown for a second Node Label:

Key*: example
Value: 2

Now let’s add a Taint in the “Node Taints” section:

Key*: example
Value: 2
Effect: NoSchedule

Your final result should look like this:

Click the Create button at the bottom of the page.
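
Once Ocean launches a node for this VNG later in this example, you can optionally verify that the taint was applied to it; the custom-columns expression below is just one convenient way to display node taints:

kubectl get nodes -o custom-columns=NAME:.metadata.name,TAINTS:.spec.taints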

We will now deploy the YAML file below into our EKS cluster. The YAML file contains two simple nginx deployments. One of them has a matching toleration for the “example-2” VNG that we created, and one doesn’t.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-2-1
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 1
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx-dev
        image: nginx
        resources:
          requests:
            memory: "100Mi"
            cpu: "256m"
      nodeSelector:
        env: ocean-workshop
        example: "2"
      tolerations:
      - key: "example"
        operator: "Equal"
        value: "2"
        effect: "NoSchedule"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-2-2
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 1
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx-dev
        image: nginx
        resources:
          requests:
            memory: "100Mi"
            cpu: "256m"
      nodeSelector:
        env: ocean-workshop
        example: "2"

Let’s run the command below to deploy this file.

kubectl apply -f /home/ec2-user/environment/spot-workshop-template/vng-example-2.yaml

The output should look like the example below:

example-1
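
To see the difference between the two Deployments directly from the cluster, you can optionally print their tolerations side by side; only example-2-1 should show the example toleration (custom-columns is just one way to display this):

kubectl get deploy example-2-1 example-2-2 -o custom-columns=NAME:.metadata.name,TOLERATIONS:.spec.template.spec.tolerations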

After both deployments have been created, let’s run the following command:

kubectl get pods,nodes

We should expect to see both example-2 Pods (example-2-1 and example-2-2) in a Pending state.

➜ kubectl get pods,nodes
NAME                               READY   STATUS    RESTARTS   AGE
pod/example-1-5c7574576f-ghnrt     1/1     Running   0          53m
pod/example-2-1-859889d4f-5xrtl    0/1     Pending   0          2m17s
pod/example-2-2-6c8c466f9d-8fszf   0/1     Pending   0          2m17s

NAME                                                STATUS   ROLES    AGE   VERSION
node/ip-192-168-26-69.us-east-2.compute.internal    Ready    <none>   97m   v1.29.3-eks-ae9a62a
node/ip-192-168-54-80.us-east-2.compute.internal    Ready    <none>   42m   v1.29.3-eks-ae9a62a
node/ip-192-168-89-107.us-east-2.compute.internal   Ready    <none>   97m   v1.29.3-eks-ae9a62a

A couple of minutes later, Ocean will scale up a new machine to satisfy the Pod that tolerates the taint. We will see that only one of the Pods (example-2-1) gets scheduled, while the other won’t, since it does not tolerate the node’s taint.
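
If you would like to watch the new node register instead of waiting, you can optionally run the following (press Ctrl+C to stop):

kubectl get nodes -w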

Run the following command after waiting ~2 minutes:

kubectl get pods,nodes -o wide

We should see that example-2-1 is Running, while example-2-2 is still Pending. Your output should look like the example below:

➜  kubectl get pods,nodes -o wide                                                                                                                                                                                                                  
NAME                               READY   STATUS    RESTARTS   AGE     IP               NODE                                           NOMINATED NODE   READINESS GATES
pod/example-1-7f8b5549bb-xnsfq     1/1     Running   0          50m     192.168.48.21    ip-192-168-46-105.us-west-2.compute.internal   <none>           <none>
pod/example-2-1-64d9d4877d-rv2pg   1/1     Running   0          2m26s   192.168.24.234   ip-192-168-24-193.us-west-2.compute.internal   <none>           <none>
pod/example-2-2-7d55b5b8cf-dfkpw   0/1     Pending   0          2m24s   <none>           <none>                                         <none>           <none>

NAME                                                STATUS   ROLES    AGE    VERSION              INTERNAL-IP      EXTERNAL-IP      OS-IMAGE         KERNEL-VERSION                CONTAINER-RUNTIME
node/ip-192-168-24-193.us-west-2.compute.internal   Ready    <none>   87s    v1.19.6-eks-49a6c0   192.168.24.193   54.149.39.228    Amazon Linux 2   5.4.117-58.216.amzn2.x86_64   docker://19.3.13
node/ip-192-168-46-105.us-west-2.compute.internal   Ready    <none>   134m   v1.19.6-eks-49a6c0   192.168.46.105   35.160.241.120   Amazon Linux 2   5.4.117-58.216.amzn2.x86_64   docker://19.3.13

Run kubectl describe pod <pod-name>. Under the Events section you’ll see that the reason the Pod was not scheduled is that there is a node with a taint that the Pod does not tolerate:

➜  kubectl describe pod/example-2-2-7d55b5b8cf-dfkpw                                                                                                                                                                                              
Name:           example-2-2-7d55b5b8cf-dfkpw
Namespace:      default
Priority:       0
Node:           <none>
Labels:         app=nginx
                pod-template-hash=7d55b5b8cf
Annotations:    kubernetes.io/psp: eks.privileged
Status:         Pending
IP:
IPs:            <none>
Controlled By:  ReplicaSet/example-2-2-7d55b5b8cf
Containers:
  nginx-dev:
    Image:      nginx
    Port:       <none>
    Host Port:  <none>
    Requests:
      cpu:        256m
      memory:     100Mi
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-wlpsv (ro)
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  default-token-wlpsv:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-wlpsv
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  env=ocean-workshop
                 example=2
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason            Age                    From               Message
  ----     ------            ----                   ----               -------
  Warning  FailedScheduling  5m59s (x4 over 6m46s)  default-scheduler  0/1 nodes are available: 1 node(s) didn't match node selector.
  Warning  FailedScheduling  5m47s (x2 over 5m47s)  default-scheduler  0/2 nodes are available: 2 node(s) didn't match node selector.
  Warning  FailedScheduling  31s (x7 over 5m37s)    default-scheduler  0/2 nodes are available: 1 node(s) didn't match node selector, 1 node(s) had taint {example: 2}, that the pod didn't tolerate.
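
As an optional follow-up (not part of the workshop steps), you could make example-2-2 schedulable onto the tainted node by giving it the same toleration as example-2-1, for instance with a patch like the one sketched below:

kubectl patch deployment example-2-2 -p '{"spec":{"template":{"spec":{"tolerations":[{"key":"example","operator":"Equal","value":"2","effect":"NoSchedule"}]}}}}'

After the patch, the Deployment rolls out a new Pod that tolerates the taint, and it should be scheduled onto the example-2 node (or onto a new node that Ocean launches for it).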