In this guide, we will talk about AWS EKS best practices covering security, tenancy, GitOps & CI/CD, cost/resource optimization, cluster version upgrades, and more.
Amazon EKS lets teams use Kubernetes to orchestrate their containers without having to set up and maintain the Kubernetes control plane or the hardware needed to support Kubernetes clusters.
Kubernetes is an open-source system that makes containerized apps simpler to run by automating their deployment and scaling.
It lets app developers organize applications into logical units and package them as containers, so each piece ships with its dependencies and runs the same way without a full operating system bundled into the app.
As a result, containerized applications are more streamlined and efficient. Multi-container applications run on clusters made up of control plane nodes and worker nodes.
But Amazon EKS makes it even simpler.
How does Amazon EKS accomplish that? It ensures high availability by running and scaling the Kubernetes control plane automatically across multiple Availability Zones.
The same automation handles load balancing, the replacement of unhealthy control plane instances, and control plane updates.
1. AWS EKS Security Best Practices
Securing your Amazon EKS clusters is a must to ensure your applications and data are safe.
Here are some practical tips.
EKS does not secure everything for you. AWS manages the control plane, but you still have to configure much of the security yourself and correct any insecure settings left after installation.
To help with that, AWS provides a free best practices guide, accessible on GitHub.
Some key highlights are listed below:
- Controlling access to the cluster with IAM roles
- Putting in place a firewall to safeguard each cluster
- Cluster traffic encryption with SSL/TLS
- Using a bastion host to access the cluster
- Using virtual private networks (VPNs) to connect to your clusters
- The use of a monitoring tool to find harmful activity
A. VPC Isolation
Keep your EKS clusters in a dedicated Virtual Private Cloud (VPC). This prevents unwanted or unauthorized network access that might expose a part of your configs, improving your cluster's security.
This isolation prevents other AWS resources from accidentally or maliciously accessing your EKS cluster's network. It's like having a private bubble for your cluster.
You can set this up during the EKS cluster creation process by specifying a VPC.
B. IAM Roles
Instead of embedding credentials directly into your applications, leverage AWS Identity and Access Management (IAM) roles for your EKS worker nodes.
Why?
IAM roles hand out short-lived credentials that are rotated automatically, so nodes receive permissions without any long-lived keys. This way, you eliminate the need (and hassle) of managing credentials within your code.
IAM carries out two crucial tasks here: authentication and authorization. Authentication verifies identity, while authorization controls which operations an identity can perform on AWS resources.
With IAM identity-based policies attached to the EKS IAM role of an app developer or manager, you can specify which actions and resources are allowed or denied, and the conditions under which each rule applies.
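The same idea can be applied per pod rather than per node. The manifest below is a minimal sketch of IAM Roles for Service Accounts (IRSA), assuming an IAM OIDC provider is already associated with the cluster and that the role's trust policy allows this service account. The names my-app and my-namespace and the role ARN are placeholders, not values from this guide.

```yaml
# Sketch: a ServiceAccount that lets its pods assume an IAM role (IRSA).
# Assumes the cluster has an IAM OIDC provider and the role trusts this account.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: my-app                # placeholder name
  namespace: my-namespace     # placeholder namespace
  annotations:
    # Placeholder ARN: replace with the role your workload should assume
    eks.amazonaws.com/role-arn: arn:aws:iam::<account-id>:role/<role-name>
```

Pods that set serviceAccountName: my-app receive temporary credentials for that role, so they do not need the broader permissions of the node role.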
C. Node Group Isolation
Divide your EKS worker nodes into separate node groups based on their responsibilities and permissions.
For instance, if you're running sensitive workloads, create a dedicated node group solely for them. This isolation ensures that a security breach in one node group doesn't automatically put the entire cluster at risk.
How can this be achieved?
Separate the control and data traffic for Kubernetes. If you don't, both end up flowing through the same pipe. This is bad news for EKS security since open access to the data plane implies open access to the control plane.
Route application traffic to the nodes through an ingress controller, and only permit connections from the control plane over the specific ports listed in the network access control list (ACL).
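One common way to keep sensitive workloads on their own node group is to taint those nodes and let only the sensitive pods tolerate the taint. The sketch below assumes the dedicated node group carries the label workload=sensitive and the taint dedicated=sensitive:NoSchedule. Both keys and the image name are illustrative choices, not EKS defaults.

```yaml
# Sketch: pin a pod to a dedicated, tainted node group.
# Assumes the nodes are labeled workload=sensitive and
# tainted dedicated=sensitive:NoSchedule (hypothetical keys).
apiVersion: v1
kind: Pod
metadata:
  name: sensitive-app
spec:
  nodeSelector:
    workload: sensitive        # schedule only onto the dedicated nodes
  tolerations:
    - key: dedicated           # tolerate the taint that keeps other pods away
      operator: Equal
      value: sensitive
      effect: NoSchedule
  containers:
    - name: sensitive-app
      image: myregistry/sensitive-app:v1   # placeholder image
```

The taint keeps other workloads off the dedicated nodes, while the nodeSelector keeps the sensitive pod from landing anywhere else.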
D. Network Policies
Kubernetes Network Policies act like a firewall for your cluster. They dictate which pods can communicate with each other and how.
Implementing these policies helps you define and enforce communication boundaries within your cluster. For example, you can create a policy that allows communication only between the specific pods that require it.
Suppose you have a frontend and a backend service. You can create a Network Policy that allows communication from the frontend pods to the backend pods but restricts any other unauthorized access.
Consider the below example:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-frontend
spec:
  # The policy applies to the backend pods
  podSelector:
    matchLabels:
      app: backend
  ingress:
    # Only frontend pods may open connections to them
    - from:
        - podSelector:
            matchLabels:
              app: frontend
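An allow rule like the one above is usually paired with a default-deny policy, so that pods not covered by any allow rule cannot be reached at all. The following is a minimal sketch with my-namespace as a placeholder. It only takes effect if the cluster runs a network plugin that enforces NetworkPolicy.

```yaml
# Sketch: deny all ingress to every pod in one namespace.
# Allow rules defined in other NetworkPolicies still apply on top of this.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: my-namespace      # placeholder namespace
spec:
  podSelector: {}              # empty selector matches all pods in the namespace
  policyTypes:
    - Ingress                  # no ingress rules listed, so all ingress is denied
```

Network policies are additive, so a pod selected by both this policy and an allow rule accepts only the traffic the allow rule permits.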
E. Use Secrets Manager
Keep sensitive data like API keys, tokens, and passwords in AWS Secrets Manager rather than hardcoding them in your applications.
Even .env files can expose variable values to parts of the app that do not need them.
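One way to get a Secrets Manager value into a pod without hardcoding it is the Secrets Store CSI Driver with the AWS provider. The sketch below assumes both are already installed in the cluster and that the pod's IAM role may read the secret. The secret name my-app/db-password is hypothetical.

```yaml
# Sketch: expose a Secrets Manager secret to pods as a mounted file.
# Assumes the Secrets Store CSI Driver and its AWS provider are installed.
apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
  name: my-app-secrets
spec:
  provider: aws
  parameters:
    # "objects" is a YAML string listing the secrets to fetch
    objects: |
      - objectName: "my-app/db-password"   # hypothetical secret name
        objectType: "secretsmanager"
```

A pod then mounts a csi volume that references my-app-secrets and reads the value from the mounted path instead of from its image or environment file.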
F. Regular Updates
Both the EKS control plane and worker node Amazon Machine Images (AMIs) need to be kept up to date with the latest security patches. These patches often address vulnerabilities and security loopholes. EKS tracks the upstream Kubernetes release cycle, which produces a new minor version roughly every four months.
EKS keeps only the most recent handful of Kubernetes versions in support, and clusters still running a version after its support ends will be compelled to upgrade.
G. Native AWS Security Features
Enhance the security posture of the EKS cluster by using native AWS features. This includes limiting IAM access, turning on EBS volume encryption, using the most recent worker node AMIs, switching to SSM instead of SSH, and turning on VPC flow logs, among other things.
2. AWS EKS Best Practices for GitOps and CI/CD
Implementing GitOps and Continuous Integration/Continuous Deployment (CI/CD) practices in your Amazon EKS environment can greatly enhance your development workflow and maintain security.
A. GitOps with Flux
What is GitOps? GitOps is a methodology that promotes using git repositories as the source of truth for your Kubernetes cluster's configuration.
And where does Flux fit in? Flux is a popular tool that helps you apply this methodology in your EKS clusters. Here's how you can set it up:
1. Install Flux
Start by installing the Flux CLI on your local machine:
brew install fluxcd/tap/flux
2. Bootstrap Your Cluster
Use Flux to bootstrap your cluster by connecting it to your Git repository. This example uses a GitHub repository:
flux bootstrap github \
--owner=<github-user> \
--repository=<github-repo> \
--path=./clusters/my-cluster \
--personal
3. Deploy Manifests
Store your Kubernetes manifests (YAML files) in your git repository. Flux will continuously monitor the repository and apply changes to your cluster.
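# YAML manifest for a Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3
  # selector is required and must match the pod template labels
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: myregistry/my-app:v1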
B. ArgoCD
In a GitOps architecture, Git stores all EKS cluster artifacts, and tools like ArgoCD monitor the Git repository and automatically deploy its updates.
You can install ArgoCD in an EKS cluster and have it poll a remote Git repository for changes.
Managers can then deploy updates to several EKS clusters simply by pushing application source code, Kubernetes manifests, or infrastructure configuration to a Git repository. ArgoCD will automatically update the clusters with the changes.
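The link between a Git path and a cluster namespace is declared in an ArgoCD Application resource. The following is a minimal sketch, assuming ArgoCD is installed in the argocd namespace. The repository URL, the path apps/my-app, and the target namespace are placeholders.

```yaml
# Sketch: an ArgoCD Application that keeps one namespace in sync with a Git path.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: my-app
  namespace: argocd            # assumes ArgoCD runs in this namespace
spec:
  project: default
  source:
    repoURL: https://github.com/<github-user>/<github-repo>.git
    targetRevision: main       # branch, tag, or commit to track
    path: apps/my-app          # placeholder path holding the manifests
  destination:
    server: https://kubernetes.default.svc   # the cluster ArgoCD runs in
    namespace: my-app
  syncPolicy:
    automated:
      prune: true              # delete resources that were removed from Git
      selfHeal: true           # revert manual changes made in the cluster
    syncOptions:
      - CreateNamespace=true
```

With automated sync enabled, a merged commit is enough to roll out a change. Without it, ArgoCD only reports the drift and waits for a manual sync.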
C. CI/CD Integration
Integrating CI/CD pipelines into your EKS environment helps automate the process of building, testing, and deploying your applications.
Here's how you can achieve this:
1. Use a CI/CD Tool
Choose a CI/CD tool like Jenkins, GitLab CI/CD, or AWS CodePipeline. Configure your pipeline to fetch your code from your Git repository.
2. Build and Push Images
In your pipeline, build your application's Docker image and push it to a secure image registry like Amazon ECR.
# Example pipeline script (using AWS CLI)
aws ecr get-login-password --region <region> | docker login --username AWS --password-stdin <account-id>.dkr.ecr.<region>.amazonaws.com
docker build -t my-app .
docker tag my-app:latest <account-id>.dkr.ecr.<region>.amazonaws.com/my-app:latest
docker push <account-id>.dkr.ecr.<region>.amazonaws.com/my-app:latest
3. Apply Changes with Flux
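After pushing your Docker image to ECR, commit the new image tag to your Git repository. Flux applies it on its next sync. To apply it right away, trigger a reconciliation from your CI/CD pipeline (flux-system is the default name created by flux bootstrap):
flux reconcile kustomization flux-system --with-source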
4. Testing and Validation
Incorporate tests and validation steps in your pipeline to ensure that the deployed application functions as expected in the EKS environment.
D. Secure Image Registry
Using a secure container image registry, such as Amazon ECR, helps ensure your application images are protected and accessible only to authorized users and services. You can control access using IAM policies.
E. Infrastructure as Code
Define your EKS infrastructure using IaC tools like AWS CloudFormation, Terraform, or Pulumi. This way, your cluster setup is easily repeatable and auditable.
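As an illustration of the CloudFormation route, the template below is a minimal sketch of a cluster with one managed node group. It assumes the IAM roles and subnets already exist and are passed in as parameters. The cluster name and the sizes are placeholders.

```yaml
# Sketch: EKS cluster plus one managed node group in CloudFormation.
# Assumes the IAM roles and subnets are created elsewhere.
AWSTemplateFormatVersion: "2010-09-09"
Description: Minimal EKS cluster with one managed node group (illustrative sketch)
Parameters:
  ClusterRoleArn:
    Type: String
  NodeRoleArn:
    Type: String
  SubnetIds:
    Type: List<AWS::EC2::Subnet::Id>
Resources:
  EksCluster:
    Type: AWS::EKS::Cluster
    Properties:
      Name: my-cluster                 # placeholder cluster name
      RoleArn: !Ref ClusterRoleArn
      ResourcesVpcConfig:
        SubnetIds: !Ref SubnetIds
  DefaultNodeGroup:
    Type: AWS::EKS::Nodegroup
    Properties:
      ClusterName: !Ref EksCluster     # Ref returns the cluster name
      NodeRole: !Ref NodeRoleArn
      Subnets: !Ref SubnetIds
      ScalingConfig:
        MinSize: 1
        DesiredSize: 2
        MaxSize: 5
```

Keeping a template like this in version control means every change to the cluster goes through review and can be recreated in another account or region.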
F. Automated Scans
Integrate image vulnerability scanning into your CI/CD pipeline. Tools like Clair or Trivy can help you identify security issues in your container images.
3. AWS EKS Best Practices for Cluster Version Upgrades
Regularly upgrading your EKS cluster's version is essential to keep up with security patches and new features.
Follow these steps for smooth upgrades.
A. Backup and Testing
Before upgrading, back up your cluster's data and test the upgrade on a non-production environment to catch any potential issues.
B. Upgrade Plan
Create a well-defined plan for each upgrade, including rollbacks if things go wrong. Also, check for compatibility with your applications and add-ons.
C. Node Drainage
During node upgrades, gracefully drain the nodes to ensure that running pods are moved to healthy nodes without disruptions.
Let's look at an example.
kubectl drain <node-name> --ignore-daemonsets
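Draining evicts pods through the eviction API, which honors PodDisruptionBudgets. The sketch below assumes an application labeled app: my-app running at least three replicas. It keeps two of them available while nodes are drained one after another.

```yaml
# Sketch: keep at least two my-app pods running during voluntary disruptions
# such as kubectl drain. Assumes the pods carry the label app: my-app.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: my-app
```

If an eviction would drop the application below the budget, the drain waits and retries until replacement pods are ready elsewhere.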
D. Control Plane Upgrades
AWS manages control plane upgrades, but you should monitor and be ready for any required actions during the process.
E. Validate and Monitor
After upgrading, validate that your applications are functioning correctly and that there are no issues. Monitor the cluster closely for any unexpected behavior, and have a rollback plan ready in case you encounter severe problems.
4. AWS EKS Tenancy Best Practices
Tenants are the stakeholders who use a cluster, such as developers who deploy applications to it. When numerous tenants share an EKS cluster, put a tenancy model in place so that each tenant is appropriately governed.
Cluster tenancy patterns separate and isolate workloads so that the cluster stays manageable and developers get a high-quality experience.
Tenancy models affect your EKS clusters' isolation and security. Here are some tips:
A. Shared Tenancy
For cost savings, you can run multiple workloads in a single cluster using Kubernetes namespaces, but ensure strong isolation between them using Kubernetes RBAC.
EKS uses IAM only for authentication, so authorization is your responsibility. Kubernetes handles it through role-based access control (RBAC), a security technique that controls access to resources based on the roles of specific users within your organization.
One commonly cited RBAC recommendation is to "avoid adding users to the system:masters group."
Members of that group bypass all RBAC checks, which puts the security of the whole cluster at risk.
Give each account the least amount of rights and access to clusters and nodes that it needs.
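A namespace-scoped Role and RoleBinding is the usual way to express least privilege for a tenant. The sketch below assumes a namespace team-a and a group team-a-developers that your IAM-to-Kubernetes mapping assigns to the tenant's users. Both names are hypothetical.

```yaml
# Sketch: let one team manage common workload resources in its own namespace only.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: app-developer
  namespace: team-a                    # hypothetical tenant namespace
rules:
  - apiGroups: ["", "apps"]            # core API group and apps
    resources: ["pods", "services", "configmaps", "deployments"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: app-developer-binding
  namespace: team-a
subjects:
  - kind: Group
    name: team-a-developers            # hypothetical group mapped from IAM
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: app-developer
  apiGroup: rbac.authorization.k8s.io
```

Because both objects are namespaced, members of the group get no access to other tenants' namespaces or to cluster-wide resources.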
B. Isolated Tenancy
For security, consider running sensitive workloads in separate clusters. This minimizes the risk of cross-contamination.
C. Resource Limits
Implement resource quotas and limits to prevent runaway workloads from consuming excessive resources, which could lead to cluster instability.
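In Kubernetes, this is typically done with a ResourceQuota that caps a namespace's total consumption and a LimitRange that supplies defaults for containers that declare nothing. The numbers and the team-a namespace below are illustrative only.

```yaml
# Sketch: cap what one tenant namespace can consume in total.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-a-quota
  namespace: team-a                    # hypothetical tenant namespace
spec:
  hard:
    requests.cpu: "4"
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
    pods: "20"
---
# Sketch: defaults applied to containers that omit requests or limits.
apiVersion: v1
kind: LimitRange
metadata:
  name: team-a-defaults
  namespace: team-a
spec:
  limits:
    - type: Container
      defaultRequest:
        cpu: 100m
        memory: 128Mi
      default:
        cpu: 500m
        memory: 256Mi
```

Once a quota on CPU or memory exists, pods without requests and limits are rejected in that namespace, which is why the LimitRange defaults are worth defining alongside it.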
5. AWS EKS Cost and Resource Optimization Best Practices
Cost optimization is a significant part of AWS EKS best practices, since EKS clusters can account for a large share of an AWS bill.
A. Right-sizing Instances
Choose the right instance types for your nodes based on your application's resource requirements. Using larger instances than necessary can drive up costs.
You can check instance utilization using CloudWatch or Kubernetes monitoring tools.
kubectl top nodes
B. Auto Scaling
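Implement EKS cluster auto-scaling to automatically adjust the number of nodes based on resource utilization.
This prevents over-provisioning during low-traffic periods. Start by setting the minimum, maximum, and desired size of each managed node group. An autoscaler such as the Cluster Autoscaler then adjusts the node count within those bounds.
aws eks update-nodegroup-config --cluster-name <cluster-name> --nodegroup-name <nodegroup-name> --scaling-config minSize=<min>,maxSize=<max>,desiredSize=<desired>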
C. Spot Instances
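Consider using EC2 Spot Instances for non-critical workloads. Spot Instances can significantly reduce costs, but AWS can reclaim them at short notice when it needs the capacity back. For managed node groups, the capacity type is chosen when the node group is created:
aws eks create-nodegroup --cluster-name <cluster-name> --nodegroup-name <nodegroup-name> --capacity-type SPOT --instance-types <instance-type-1> <instance-type-2> --subnets <subnet-id-1> <subnet-id-2> --node-role <node-role-arn>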
D. Reserved Instances
Utilize EC2 Reserved Instances for stable workloads with predictable resource requirements. This can lead to substantial cost savings compared to on-demand instances.
E. Use Managed Node Groups
EKS Managed Node Groups simplify the scaling and management of worker nodes. They automatically manage the underlying EC2 instances, including scaling and updates.
F. Pod Scheduling
Optimize pod placement using node affinity and anti-affinity rules. This ensures that pods are scheduled on appropriate nodes, reducing resource wastage.
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: <label-key>
              operator: In
              values:
                - <label-value>
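Anti-affinity works the other way around: it keeps replicas of the same application apart. The fragment below is a sketch for a pod spec, assuming the pods are labeled app: my-app. It asks the scheduler to prefer spreading them across Availability Zones without making that a hard requirement.

```yaml
# Sketch: prefer placing my-app replicas in different zones.
# Goes under spec.template.spec of a Deployment; the label is a placeholder.
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchLabels:
              app: my-app
          topologyKey: topology.kubernetes.io/zone
```

Using the preferred form rather than the required form means pods still schedule when a zone is full, at the cost of a weaker spreading guarantee.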
G. Horizontal Pod Autoscaling
Implement Kubernetes Horizontal Pod Autoscaling to automatically adjust the number of pod replicas based on CPU or custom metrics.
# autoscaling/v2 is the stable API; v2beta2 was removed in Kubernetes 1.26
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: <hpa-name>
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: <deployment-name>
  minReplicas: <min-replicas>
  maxReplicas: <max-replicas>
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: <target-utilization>
H. Resource Requests and Limits
Set resource requests and limits for containers in your pods. This helps Kubernetes make intelligent scheduling decisions and prevents resource contention.
resources:
  requests:
    cpu: "0.2"
    memory: "256Mi"
  limits:
    cpu: "1"
    memory: "512Mi"
I. Cleanup Unused Resources
Regularly identify and delete unused resources like old pods, deployments, and services.
Tools like kubectl or Kubernetes Dashboard can help with this.
kubectl delete deployment <deployment-name>
6. AWS EKS Best Practices for Cluster Autoscaling
Karpenter and K8s Cluster Autoscaler help you manage your Kubernetes cluster's resources efficiently, so your applications run smoothly without overprovisioning.
A. Use Karpenter for Advanced Autoscaling
Karpenter is an open-source node provisioner for Kubernetes. Instead of scaling pods on CPU, memory, or custom metrics, it watches for pods that cannot be scheduled and launches right-sized EC2 instances for them, then removes nodes that are no longer needed.
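Installation:
You can install Karpenter using Helm. It also needs IAM roles for the controller and for the nodes it launches, which the Karpenter getting-started guide walks through. Here's how:
helm upgrade --install karpenter oci://public.ecr.aws/karpenter/karpenter --version <karpenter-version> --namespace kube-system --set settings.clusterName=<cluster-name>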
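Usage:
You describe which nodes Karpenter may create in a NodePool resource. Your Deployments and StatefulSets need no Karpenter-specific fields, because Karpenter reacts to their pending pods.
Let's look at an example. It uses the karpenter.sh/v1 API of recent releases (check the version you installed) and assumes an EC2NodeClass named default already exists.
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
      requirements:
        # Allow both Spot and On-Demand capacity
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
  # Upper bound on the total CPU this NodePool may provision
  limits:
    cpu: "100"
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 1m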
B. Cluster Autoscaler for Node Scaling
AWS EKS supports the Kubernetes Cluster Autoscaler, which manages the number of nodes in your cluster based on pending pods.
Installation:
The Cluster Autoscaler runs as a Deployment inside your cluster and resizes the Auto Scaling groups behind your node groups. You can deploy it with the Helm chart or manifest from the Kubernetes autoscaler project. It only scales within the minimum and maximum sizes set on each node group, so set those bounds first.
Use the AWS CLI:
aws eks update-nodegroup-config --cluster-name <cluster-name> --nodegroup-name <nodegroup-name> --scaling-config file://scaling-config.json
And the scaling-config.json should look something like this:
{
  "minSize": 1,
  "maxSize": 5,
  "desiredSize": 2
}
C. Pod Resource Requests and Limits
Configure accurate resource requests and limits for your pods. This helps Kubernetes make informed decisions about scaling.
Let's look at an example.
In your Deployment or Pod definition, specify resource requests and limits as such.
apiVersion: v1
kind: Pod
metadata:
  name: my-pod
spec:
  containers:
    - name: my-container
      image: my-image
      resources:
        requests:
          cpu: 100m
          memory: 256Mi
        limits:
          cpu: 500m
          memory: 512Mi
TL;DR - AWS EKS Best Practices
To conclude, this guide covered the main best practices for running AWS EKS.
Security comes first: VPC isolation, tightly scoped IAM roles, and network policies protect your applications and data.
GitOps and CI/CD, with tools like Flux and ArgoCD, streamline your workflows and let you get the most out of Infrastructure as Code.
Lastly, planned version upgrades, a clear tenancy model, cost optimization, and cluster autoscaling keep your EKS deployment stable and affordable as demand changes.
I hope that this guide helps you in mastering EKS and using it to its potential to transform your cloud infrastructure.