Configure Prometheus and Grafana on EKS for Observability
Deploying Prometheus in EKS with CSI Driver: A Comprehensive Guide
Introduction:
Amazon Elastic Kubernetes Service (EKS) is AWS's managed Kubernetes service. This guide assumes you have a basic understanding of EKS, including deploying pods. Here, we address a common issue: Prometheus pods (or other pods that require persistent volumes) stay stuck in Pending status because the cluster cannot provision their volumes.
Prerequisites:
- EKS Cluster:
- Create an EKS cluster using the following command:
eksctl create cluster --name <name> --region <region> --node-type t2.large --managed
This provisions an EKS cluster with t2.large EC2 instances. Note: This incurs costs.
- kubectl Installed:
- Install the kubectl command-line tool for configuring the EKS cluster. Official Documentation
- Helm Package Manager:
- Install Helm for Kubernetes. Installation Guide
- Knowledge of OIDC:
- Understand OIDC (OpenID Connect) and its integration with AWS. AWS OIDC Documentation
Principles of the CSI Driver in the EKS Cluster:
In a Kubernetes cluster, a ServiceAccount is the identity a pod uses when it talks to the Kubernetes API and, on EKS, when it accesses resources outside the cluster (e.g., provisioning EBS volumes in AWS). Initially, the Kubernetes core shipped in-tree volume plugins that handled all volume-related actions, such as:
- Deleting volumes
- Provisioning new volumes
- Increasing volume sizes
However, keeping this code in the Kubernetes core meant every storage vendor's plugin had to be maintained and released together with Kubernetes itself. To address this, Kubernetes adopted the CSI (Container Storage Interface), which moves these actions out of the core and into separately deployed drivers. On EKS, the EBS CSI driver runs in the kube-system namespace alongside other cluster add-ons (e.g., CoreDNS, kube-proxy).
The AWS EBS CSI driver is deployed as two workloads: a controller Deployment (ebs-csi-controller) and a node DaemonSet (ebs-csi-node). The controller calls the AWS API to create, delete, resize, and attach volumes, while the node pods mount attached volumes on each worker node. Note that on EKS the control plane (API server, controller manager, etcd) is managed by AWS and does not run as pods in kube-system. This architecture lets Kubernetes remain modular while relying on CSI drivers to manage storage operations.
When a Kubernetes workload requests persistent storage, the process begins with Kubernetes checking for a suitable volume in the cluster. If none exists, Kubernetes uses the CSI driver to dynamically provision the required volume, ensuring seamless integration with external storage systems. The sequence of events during this process is outlined below.
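As a concrete illustration of such a storage request, the sketch below writes a StorageClass and a PersistentVolumeClaim to a temporary file. The names ebs-gp3-example and demo-claim, the gp3 volume type, and the 8Gi size are assumptions chosen for illustration; only the provisioner name ebs.csi.aws.com is fixed by the EBS CSI driver. The kubectl command is left commented out so the snippet can be run without a cluster.

```shell
#!/bin/sh
# Write an example StorageClass + PVC manifest to a temporary file.
manifest="$(mktemp)"

cat > "$manifest" <<'EOF'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ebs-gp3-example          # hypothetical name
provisioner: ebs.csi.aws.com      # handled by the AWS EBS CSI driver
parameters:
  type: gp3                       # assumed EBS volume type
volumeBindingMode: WaitForFirstConsumer
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-claim               # hypothetical name
  namespace: default
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: ebs-gp3-example
  resources:
    requests:
      storage: 8Gi                # assumed size
EOF

echo "Manifest written to $manifest"

# To try it against a real cluster:
#   kubectl apply -f "$manifest"
#   kubectl get pvc demo-claim
```

With WaitForFirstConsumer, the claim stays Pending until a pod that uses it is scheduled; only then does the driver create the EBS volume, in the same Availability Zone as that pod's node.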
Sequence of Events:
When Kubernetes needs to provision a volume, the following sequence occurs:
- Service Account Requests Token:
- The kubelet mounts a projected service account token into the CSI controller pod. The token is issued by the cluster's OIDC issuer and is later exchanged for AWS credentials.
- OIDC Issues Signed JWT:
- The token consists of:
- Header: Encodes token type and signing algorithm.
- Payload: Contains claims.
- Signature: Ensures the token’s integrity.
- AWS STS Validates the Token:
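- When the pod calls sts:AssumeRoleWithWebIdentity, STS verifies the token's signature against the IAM OIDC provider registered for the cluster and checks that the IAM role's trust policy allows this service account.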
- STS Issues Credentials:
- STS returns temporary credentials for the IAM role to the pod running under the service account.
- Service Account Authenticates with AWS:
- Uses credentials to provision an EBS volume (or access other resources specified in the IAM role).
- CSI Driver Links Volume:
- The driver creates a Persistent Volume (PV) for the new EBS volume, and Kubernetes binds it to the Persistent Volume Claim (PVC).
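The three-part token structure described above can be inspected with standard shell tools. The sketch below builds a sample token from hypothetical claims (the real token is signed by the cluster and, with IAM roles for service accounts, is typically mounted in the pod at /var/run/secrets/eks.amazonaws.com/serviceaccount/token), then splits it on the dots and decodes the header and payload. It assumes a base64 command that accepts -d, as in GNU coreutils.

```shell
#!/bin/sh
# Decode one base64url-encoded JWT segment to plain text.
b64url_decode() {
  seg=$(printf '%s' "$1" | tr '_-' '/+')    # base64url alphabet -> base64
  while [ $(( ${#seg} % 4 )) -ne 0 ]; do    # restore stripped '=' padding
    seg="${seg}="
  done
  printf '%s' "$seg" | base64 -d
}

# Base64url-encode stdin (only needed to build the sample token).
b64url_encode() {
  base64 | tr -d '=\n' | tr '/+' '_-'
}

# Hypothetical claims for illustration; a real token carries more fields.
header='{"alg":"RS256","typ":"JWT"}'
payload='{"aud":["sts.amazonaws.com"],"sub":"system:serviceaccount:kube-system:ebs-csi-controller-sa"}'
token="$(printf '%s' "$header" | b64url_encode).$(printf '%s' "$payload" | b64url_encode).placeholder-signature"

# Split header.payload.signature on the dots.
jwt_header="${token%%.*}"
rest="${token#*.}"
jwt_payload="${rest%%.*}"
jwt_signature="${rest#*.}"

decoded_header=$(b64url_decode "$jwt_header")
decoded_payload=$(b64url_decode "$jwt_payload")

echo "Header:  $decoded_header"
echo "Payload: $decoded_payload"
```

The sub claim is what an IAM trust policy matches on, so decoding a real token this way is a quick check that the pod is running under the service account you expect. The signature segment is not verified here; STS does that.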
Project Steps:
Step 1: Link OIDC with the EKS Cluster
- List OIDC Providers:
aws iam list-open-id-connect-providers
- Associate OIDC with EKS Cluster:
eksctl utils associate-iam-oidc-provider --region <region> --cluster <name> --approve
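To check whether the cluster's OIDC provider is already registered in IAM, you can compare the ID at the end of the cluster's issuer URL with the provider ARNs that IAM lists. The sketch below shows the string handling on sample values; the issuer URL, account ID, and ARN are hypothetical placeholders, and the real AWS calls appear only as comments.

```shell
#!/bin/sh
# Hypothetical issuer URL, in the format printed by:
#   aws eks describe-cluster --name <name> --region <region> \
#     --query "cluster.identity.oidc.issuer" --output text
issuer_url="https://oidc.eks.us-east-1.amazonaws.com/id/EXAMPLED539D4633E53DE1B71EXAMPLE"

# The OIDC ID is the last path segment of the issuer URL.
oidc_id="${issuer_url##*/}"
echo "OIDC ID: $oidc_id"

# Hypothetical provider ARN, standing in for the output of:
#   aws iam list-open-id-connect-providers
sample_arns="arn:aws:iam::111122223333:oidc-provider/oidc.eks.us-east-1.amazonaws.com/id/EXAMPLED539D4633E53DE1B71EXAMPLE"

if printf '%s\n' "$sample_arns" | grep -q "$oidc_id"; then
  provider_status="associated"
else
  provider_status="missing"
fi
echo "Provider status: $provider_status"
```

Against a real account, piping aws iam list-open-id-connect-providers into grep "$oidc_id" performs the same check; an empty result means the associate-iam-oidc-provider command still needs to be run.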
Step 2: Create Namespace for Prometheus
- Create a prometheus namespace:
kubectl create namespace prometheus
Step 3: Add Prometheus Helm Chart
- Add the Prometheus Helm repository:
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
- Update Helm repositories:
helm repo update
Step 4: Deploy Prometheus
- Deploy Prometheus using Helm:
helm install prometheus prometheus-community/prometheus --namespace prometheus
- Pods such as prometheus-server and prometheus-alertmanager will remain Pending if their PVCs cannot be provisioned.
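A quick way to spot the affected pods is to filter the kubectl table output on its STATUS column. The helper below, pending_pods, is a hypothetical name introduced for illustration, and the sample output is made up so the snippet runs without a cluster; it assumes the default kubectl get pods column layout (NAME, READY, STATUS, ...).

```shell
#!/bin/sh
# Print the names of pods whose STATUS column is Pending.
# Reads `kubectl get pods` table output on stdin.
pending_pods() {
  awk 'NR > 1 && $3 == "Pending" { print $1 }'
}

# Hypothetical sample output, for illustration only.
sample_output='NAME                                  READY   STATUS    RESTARTS   AGE
prometheus-alertmanager-0             0/1     Pending   0          2m
prometheus-kube-state-metrics-abc12   1/1     Running   0          2m
prometheus-server-def34               0/2     Pending   0          2m'

pending=$(printf '%s\n' "$sample_output" | pending_pods)
echo "$pending"

# Against a real cluster:
#   kubectl get pods -n prometheus | pending_pods
#   kubectl get pvc -n prometheus
#   kubectl describe pvc -n prometheus   # the Events section shows why provisioning is stuck
```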
Step 5: Install CSI Driver
- Install the AWS EBS CSI driver as an EKS addon:
eksctl create addon --name aws-ebs-csi-driver --cluster <name> --force
- Verify the addon installation:
eksctl get addon --name aws-ebs-csi-driver --cluster <name>
Step 6: Create an IAM Role for an EBS CSI Driver
- The EBS CSI driver needs an IAM role that allows it to provision EBS volumes, with a trust policy that lets the driver's service account assume the role through web identity federation. The trust policy must reference the cluster's OIDC issuer URL. If you prefer to create the role manually, retrieve the OIDC URL with:
aws eks describe-cluster --name <name> --region <region> --query "cluster.identity.oidc.issuer" --output text
- If you create the role manually, make sure the trust policy references this URL exactly.
- Otherwise, let eksctl do all of this in one step. The following command creates the IAM role, adds the OIDC provider to its trust policy, and annotates the service account with the role ARN. Because the add-on installed in Step 5 already created the ebs-csi-controller-sa service account, --override-existing-serviceaccounts tells eksctl to annotate it in place. Restart the ebs-csi-controller deployment afterwards so its pods pick up the new role.
eksctl create iamserviceaccount \
  --name ebs-csi-controller-sa \
  --cluster <name> \
  --namespace kube-system \
  --role-name AmazonEKS_EBS_CSI_DriverRole \
  --attach-policy-arn arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy \
  --override-existing-serviceaccounts \
  --approve
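For the manual route, the trust policy might look like the one generated below. This is a sketch under stated assumptions: the account ID and issuer URL are hypothetical placeholders, and the follow-up AWS and kubectl commands are shown only as comments so the snippet runs without credentials.

```shell
#!/bin/sh
# Hypothetical inputs; substitute your own account ID and issuer URL.
account_id="111122223333"
issuer_url="https://oidc.eks.us-east-1.amazonaws.com/id/EXAMPLED539D4633E53DE1B71EXAMPLE"

# IAM identifies the provider by the issuer URL without the scheme.
oidc_provider="${issuer_url#https://}"

trust_policy=$(cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::${account_id}:oidc-provider/${oidc_provider}"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "${oidc_provider}:aud": "sts.amazonaws.com",
          "${oidc_provider}:sub": "system:serviceaccount:kube-system:ebs-csi-controller-sa"
        }
      }
    }
  ]
}
EOF
)

printf '%s\n' "$trust_policy"

# After saving the output as trust-policy.json:
#   aws iam create-role --role-name AmazonEKS_EBS_CSI_DriverRole \
#     --assume-role-policy-document file://trust-policy.json
#   aws iam attach-role-policy --role-name AmazonEKS_EBS_CSI_DriverRole \
#     --policy-arn arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy
#   kubectl annotate serviceaccount ebs-csi-controller-sa -n kube-system \
#     eks.amazonaws.com/role-arn=arn:aws:iam::<account-id>:role/AmazonEKS_EBS_CSI_DriverRole
```

The sub condition restricts the role to the ebs-csi-controller-sa service account in kube-system, so other pods in the cluster cannot assume it even though they share the same OIDC provider.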
Outcome:
After completing these steps, the Prometheus pods should transition to a Running state with Persistent Volumes (PVs) successfully provisioned. You can verify the status of your pods using:
kubectl get pods -n prometheus
Conclusion:
This guide outlines the integration of the AWS EBS CSI driver with an EKS cluster to resolve volume-related issues for pods such as Prometheus. By leveraging OIDC and IAM roles, the Kubernetes service account can seamlessly interact with AWS resources, ensuring a robust and scalable deployment.