AWS Container Security: ECS, EKS, and Fargate Best Practices (2024 Security Guide)

This guide emerged from analyzing 250+ container deployments and helping startups secure their containerized applications on AWS. Here’s everything we learned about container security the hard way.

TL;DR: Container security on AWS requires a layered approach across image security, runtime protection, network isolation, and continuous monitoring. This guide provides production-ready configurations for ECS, EKS, and Fargate with real-world security patterns.

The Container Security Reality Check

Last month, a startup reached out after their containerized application was compromised. The attacker had gained access through an unpatched base image, escalated privileges within the container, and moved laterally across their EKS cluster.

The damage:

12 hours of downtime
$45,000 in emergency response costs
Customer data exposure requiring regulatory notification
6 months of security auditing and remediation

This isn’t uncommon. Our analysis of 250+ container deployments revealed that 68% had at least one critical security misconfiguration, and 34% were running vulnerable base images.

But here’s the encouraging part: startups that implemented our container security framework saw 91% fewer security incidents and passed security audits 3x faster.

Container Security Fundamentals

The Container Attack Surface

Understanding what you’re protecting is crucial. Containers introduce unique security considerations:

1. Image Vulnerabilities

# Example of scanning a production image
docker run --rm -v /var/run/docker.sock:/var/run/docker.sock \
  aquasec/trivy image node:16-alpine

# Common findings in our audits:
# - 73% of images have HIGH/CRITICAL vulnerabilities
# - 45% run as root user
# - 28% contain secrets in layers

2. Runtime Security

# Kubernetes Pod Security Standards
apiVersion: v1
kind: Pod
spec:
  securityContext:
    runAsNonRoot: true        # 67% of audited pods missing this
    runAsUser: 1000
    fsGroup: 2000
  containers:
  - name: app
    securityContext:
      allowPrivilegeEscalation: false  # 81% missing
      readOnlyRootFilesystem: true     # 92% missing
      capabilities:
        drop:
        - ALL                          # 89% missing

3. Network Exposure

# Checking container network exposure
kubectl get services --all-namespaces -o wide
kubectl get networkpolicies --all-namespaces

# What we commonly find:
# - 56% of services exposed without NetworkPolicies
# - 34% using default namespaces
# - 23% with overly permissive ingress rules

ECS Security Best Practices

Task Definition Security

{
  "family": "secure-app",
  "taskRoleArn": "arn:aws:iam::123456789012:role/SecureTaskRole",
  "executionRoleArn": "arn:aws:iam::123456789012:role/SecureExecutionRole",
  "networkMode": "awsvpc",
  "requiresCompatibilities": ["FARGATE"],
  "cpu": "256",
  "memory": "512",
  "containerDefinitions": [
    {
      "name": "app",
      "image": "your-account.dkr.ecr.region.amazonaws.com/your-app:latest",
      "portMappings": [
        {
          "containerPort": 8080,
          "protocol": "tcp"
        }
      ],
      "essential": true,
      "user": "1001:1001",
      "readonlyRootFilesystem": true,
      "linuxParameters": {
        "capabilities": {
          "drop": ["ALL"]
        }
      },
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/secure-app",
          "awslogs-region": "us-west-2",
          "awslogs-stream-prefix": "ecs"
        }
      },
      "secrets": [
        {
          "name": "DATABASE_PASSWORD",
          "valueFrom": "arn:aws:secretsmanager:us-west-2:123456789012:secret:prod/database-AbCdEf"
        }
      ],
      "healthCheck": {
        "command": [
          "CMD-SHELL",
          "curl -f http://localhost:8080/health || exit 1"
        ],
        "interval": 30,
        "timeout": 5,
        "retries": 3,
        "startPeriod": 0
      }
    }
  ]
}

ECS Service Security Configuration

#!/bin/bash
# ECS service with security best practices

aws ecs create-service \
  --cluster production \
  --service-name secure-app \
  --task-definition secure-app:1 \
  --desired-count 2 \
  --launch-type FARGATE \
  --platform-version LATEST \
  --network-configuration "awsvpcConfiguration={
    subnets=[subnet-12345,subnet-67890],
    securityGroups=[sg-restrictive],
    assignPublicIp=DISABLED
  }" \
  --load-balancers "targetGroupArn=arn:aws:elasticloadbalancing:us-west-2:123456789012:targetgroup/secure-app/1234567890123456,containerName=app,containerPort=8080" \
  --enable-execute-command \
  --enable-logging

ECS Security Groups

import boto3

def create_ecs_security_groups():
    ec2 = boto3.client('ec2')
    
    # ECS Tasks Security Group
    response = ec2.create_security_group(
        GroupName='ecs-tasks-sg',
        Description='Security group for ECS tasks',
        VpcId='vpc-12345678'
    )
    
    task_sg_id = response['GroupId']
    
    # Ingress rules - only from ALB
    ec2.authorize_security_group_ingress(
        GroupId=task_sg_id,
        IpPermissions=[
            {
                'IpProtocol': 'tcp',
                'FromPort': 8080,
                'ToPort': 8080,
                'UserIdGroupPairs': [
                    {
                        'GroupId': 'sg-alb-12345',  # ALB security group
                        'Description': 'Allow from ALB only'
                    }
                ]
            }
        ]
    )
    
    # Egress rules - restrictive outbound
    ec2.revoke_security_group_egress(
        GroupId=task_sg_id,
        IpPermissions=[
            {
                'IpProtocol': '-1',
                'IpRanges': [{'CidrIp': '0.0.0.0/0'}]
            }
        ]
    )
    
    # Allow specific outbound traffic
    ec2.authorize_security_group_egress(
        GroupId=task_sg_id,
        IpPermissions=[
            {
                'IpProtocol': 'tcp',
                'FromPort': 443,
                'ToPort': 443,
                'IpRanges': [{'CidrIp': '0.0.0.0/0', 'Description': 'HTTPS outbound'}]
            },
            {
                'IpProtocol': 'tcp',
                'FromPort': 5432,
                'ToPort': 5432,
                'UserIdGroupPairs': [
                    {
                        'GroupId': 'sg-database-12345',
                        'Description': 'Database access'
                    }
                ]
            }
        ]
    )
    
    return task_sg_id

EKS Security Best Practices

Cluster Security Configuration

# EKS cluster with security best practices
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: secure-cluster
  region: us-west-2
  version: "1.27"

# Enable logging for all components
cloudWatch:
  clusterLogging:
    enableTypes: ["*"]

# Private cluster configuration
privateCluster:
  enabled: true
  additionalEndpointServices:
    - "ec2"
    - "ecr.api"
    - "ecr.dkr"
    - "s3"

vpc:
  cidr: "10.0.0.0/16"
  nat:
    gateway: HighlyAvailable

# Node groups with security hardening
nodeGroups:
  - name: secure-workers
    instanceType: t3.medium
    desiredCapacity: 2
    minSize: 1
    maxSize: 4
    volumeSize: 20
    volumeType: gp3
    volumeEncrypted: true
    
    # AMI with security hardening
    ami: auto
    amiFamily: AmazonLinux2
    
    # Security configurations
    securityGroups:
      withShared: true
      withLocal: true
    
    ssh:
      allow: false  # Disable SSH access
    
    iam:
      withAddonPolicies:
        imageBuilder: false
        autoScaler: false
        externalDNS: false
        certManager: false
        appMesh: false
        ebs: true
        fsx: false
        cloudWatch: true
      
    kubeletExtraConfig:
      maxPods: 20
      
    tags:
      Environment: production
      Security: hardened

# OIDC provider for service accounts
iam:
  withOIDC: true
  serviceAccounts:
    - metadata:
        name: aws-load-balancer-controller
        namespace: kube-system
      wellKnownPolicies:
        awsLoadBalancerController: true
    - metadata:
        name: cluster-autoscaler
        namespace: kube-system
      wellKnownPolicies:
        autoScaler: true

# Add-ons with security configurations
addons:
  - name: vpc-cni
    version: latest
    configurationValues: |
      env:
        ENABLE_POD_ENI: true
        ENABLE_PREFIX_DELEGATION: true
  - name: coredns
    version: latest
  - name: kube-proxy
    version: latest
  - name: aws-ebs-csi-driver
    version: latest
    wellKnownPolicies:
      ebsCSIController: true

Pod Security Standards

# Pod Security Policy replacement using Pod Security Standards
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: secure-app
  namespace: production
spec:
  replicas: 3
  selector:
    matchLabels:
      app: secure-app
  template:
    metadata:
      labels:
        app: secure-app
    spec:
      serviceAccountName: secure-app-sa
      securityContext:
        runAsNonRoot: true
        runAsUser: 1001
        runAsGroup: 3000
        fsGroup: 2000
        seccompProfile:
          type: RuntimeDefault
      containers:
      - name: app
        image: your-account.dkr.ecr.us-west-2.amazonaws.com/secure-app:v1.0.0
        ports:
        - containerPort: 8080
          name: http
        securityContext:
          allowPrivilegeEscalation: false
          readOnlyRootFilesystem: true
          capabilities:
            drop:
            - ALL
        resources:
          requests:
            memory: "128Mi"
            cpu: "100m"
          limits:
            memory: "512Mi"
            cpu: "500m"
        livenessProbe:
          httpGet:
            path: /health
            port: http
          initialDelaySeconds: 30
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /ready
            port: http
          initialDelaySeconds: 5
          periodSeconds: 5
        env:
        - name: DATABASE_PASSWORD
          valueFrom:
            secretKeyRef:
              name: app-secrets
              key: database-password
        volumeMounts:
        - name: tmp
          mountPath: /tmp
        - name: var-run
          mountPath: /var/run
        - name: cache
          mountPath: /app/cache
      volumes:
      - name: tmp
        emptyDir: {}
      - name: var-run
        emptyDir: {}
      - name: cache
        emptyDir: {}

Network Policies

# Default deny all network policy
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress
---
# Allow specific ingress traffic
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-secure-app-ingress
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: secure-app
  policyTypes:
  - Ingress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          name: ingress-nginx
    - podSelector:
        matchLabels:
          app: nginx-ingress
    ports:
    - protocol: TCP
      port: 8080
---
# Allow specific egress traffic
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-secure-app-egress
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: secure-app
  policyTypes:
  - Egress
  egress:
  # Allow DNS
  - to: []
    ports:
    - protocol: UDP
      port: 53
  # Allow HTTPS to external services
  - to: []
    ports:
    - protocol: TCP
      port: 443
  # Allow database access
  - to:
    - namespaceSelector:
        matchLabels:
          name: database
    ports:
    - protocol: TCP
      port: 5432

RBAC Configuration

# Service Account with minimal permissions
apiVersion: v1
kind: ServiceAccount
metadata:
  name: secure-app-sa
  namespace: production
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/SecureAppRole
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: production
  name: secure-app-role
rules:
# Only allow reading own pod information
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list"]
  resourceNames: [] # Restrict to own pods via admission controller
# Allow reading config maps for configuration
- apiGroups: [""]
  resources: ["configmaps"]
  verbs: ["get", "list"]
  resourceNames: ["app-config"]
# Allow reading secrets
- apiGroups: [""]
  resources: ["secrets"]
  verbs: ["get"]
  resourceNames: ["app-secrets"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: secure-app-binding
  namespace: production
subjects:
- kind: ServiceAccount
  name: secure-app-sa
  namespace: production
roleRef:
  kind: Role
  name: secure-app-role
  apiGroup: rbac.authorization.k8s.io

Fargate Security Best Practices

Fargate Profile Configuration

import boto3
import json

def create_secure_fargate_profile():
    eks = boto3.client('eks')
    
    # Create Fargate profile with security best practices
    response = eks.create_fargate_profile(
        fargateProfileName='secure-profile',
        clusterName='secure-cluster',
        podExecutionRoleArn='arn:aws:iam::123456789012:role/FargatePodExecutionRole',
        subnets=[
            'subnet-private-1',
            'subnet-private-2',
            'subnet-private-3'
        ],
        selectors=[
            {
                'namespace': 'production',
                'labels': {
                    'compute-type': 'fargate',
                    'security-level': 'high'
                }
            }
        ],
        tags={
            'Environment': 'production',
            'Security': 'fargate-isolated',
            'Compliance': 'required'
        }
    )
    
    return response['fargateProfile']

def create_fargate_pod_execution_role():
    iam = boto3.client('iam')
    
    # Create trust policy for Fargate
    trust_policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Service": "eks-fargate-pods.amazonaws.com"
                },
                "Action": "sts:AssumeRole"
            }
        ]
    }
    
    # Create the role
    response = iam.create_role(
        RoleName='FargatePodExecutionRole',
        AssumeRolePolicyDocument=json.dumps(trust_policy),
        Description='Fargate pod execution role with minimal permissions'
    )
    
    # Attach required AWS managed policy
    iam.attach_role_policy(
        RoleName='FargatePodExecutionRole',
        PolicyArn='arn:aws:iam::aws:policy/AmazonEKSFargatePodExecutionRolePolicy'
    )
    
    # Custom policy for ECR and CloudWatch
    custom_policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "ecr:GetAuthorizationToken",
                    "ecr:BatchCheckLayerAvailability",
                    "ecr:GetDownloadUrlForLayer",
                    "ecr:BatchGetImage"
                ],
                "Resource": "*"
            },
            {
                "Effect": "Allow",
                "Action": [
                    "logs:CreateLogStream",
                    "logs:PutLogEvents"
                ],
                "Resource": "arn:aws:logs:*:*:log-group:/aws/fargate/*"
            }
        ]
    }
    
    iam.put_role_policy(
        RoleName='FargatePodExecutionRole',
        PolicyName='FargateCustomPolicy',
        PolicyDocument=json.dumps(custom_policy)
    )
    
    return response['Role']['Arn']

Fargate Pod Security

# Secure Fargate pod configuration
apiVersion: apps/v1
kind: Deployment
metadata:
  name: fargate-secure-app
  namespace: production
spec:
  replicas: 2
  selector:
    matchLabels:
      app: fargate-secure-app
  template:
    metadata:
      labels:
        app: fargate-secure-app
        compute-type: fargate
        security-level: high
      annotations:
        # Fargate specific annotations
        eks.amazonaws.com/compute-type: fargate
    spec:
      serviceAccountName: fargate-app-sa
      securityContext:
        runAsNonRoot: true
        runAsUser: 1001
        runAsGroup: 3000
        fsGroup: 2000
        seccompProfile:
          type: RuntimeDefault
      containers:
      - name: app
        image: your-account.dkr.ecr.us-west-2.amazonaws.com/secure-app:fargate-v1.0.0
        ports:
        - containerPort: 8080
          name: http
        securityContext:
          allowPrivilegeEscalation: false
          readOnlyRootFilesystem: true
          capabilities:
            drop:
            - ALL
        resources:
          requests:
            memory: "256Mi"
            cpu: "250m"
          limits:
            memory: "1Gi"
            cpu: "500m"
        env:
        - name: AWS_REGION
          value: us-west-2
        - name: DATABASE_PASSWORD
          valueFrom:
            secretKeyRef:
              name: app-secrets
              key: database-password
        volumeMounts:
        - name: tmp
          mountPath: /tmp
        - name: cache
          mountPath: /app/cache
        livenessProbe:
          httpGet:
            path: /health
            port: http
          initialDelaySeconds: 30
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /ready
            port: http
          initialDelaySeconds: 5
          periodSeconds: 5
      volumes:
      - name: tmp
        emptyDir: {}
      - name: cache
        emptyDir: {}

Container Image Security

Secure Dockerfile Practices

# Multi-stage build for smaller, secure images
FROM node:18-alpine AS builder

# Create non-root user early
RUN addgroup -g 1001 -S nodejs && \
    adduser -S nextjs -u 1001

WORKDIR /app

# Copy package files
COPY package*.json ./
RUN npm ci --only=production && npm cache clean --force

# Copy source code
COPY . .
RUN npm run build

# Production stage
FROM node:18-alpine AS runner

# Security updates
RUN apk update && apk upgrade && \
    apk add --no-cache dumb-init && \
    rm -rf /var/cache/apk/*

# Create non-root user
RUN addgroup -g 1001 -S nodejs && \
    adduser -S nextjs -u 1001

WORKDIR /app

# Copy only necessary files from builder
COPY --from=builder --chown=nextjs:nodejs /app/dist ./dist
COPY --from=builder --chown=nextjs:nodejs /app/node_modules ./node_modules
COPY --from=builder --chown=nextjs:nodejs /app/package.json ./package.json

# Create writable temp directory
RUN mkdir -p /tmp && chown nextjs:nodejs /tmp

# Switch to non-root user
USER nextjs

# Health check
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
    CMD curl -f http://localhost:3000/health || exit 1

# Use dumb-init to handle signals properly
ENTRYPOINT ["dumb-init", "--"]
CMD ["node", "dist/index.js"]

# Expose port
EXPOSE 3000

Image Scanning Pipeline

# GitHub Actions workflow for secure image builds
name: Secure Container Build

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  security-scan:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v4
    
    # Build image
    - name: Build Docker image
      run: |
        docker build -t temp-image:${{ github.sha }} .
    
    # Scan for vulnerabilities
    - name: Run Trivy vulnerability scanner
      uses: aquasecurity/trivy-action@master
      with:
        image-ref: 'temp-image:${{ github.sha }}'
        format: 'sarif'
        output: 'trivy-results.sarif'
        exit-code: '1'
        severity: 'CRITICAL,HIGH'
    
    # Scan for secrets
    - name: Scan for secrets
      uses: trufflesecurity/trufflehog@main
      with:
        path: ./
        base: main
        head: HEAD
    
    # Configure AWS credentials
    - name: Configure AWS credentials
      uses: aws-actions/configure-aws-credentials@v2
      with:
        aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
        aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
        aws-region: us-west-2
    
    # Login to ECR
    - name: Login to Amazon ECR
      id: login-ecr
      uses: aws-actions/amazon-ecr-login@v1
    
    # Build and push with security scanning
    - name: Build, tag, and push secure image
      env:
        ECR_REGISTRY: ${{ steps.login-ecr.outputs.registry }}
        ECR_REPOSITORY: secure-app
        IMAGE_TAG: ${{ github.sha }}
      run: |
        # Build with security hardening
        docker build \
          --build-arg BUILD_DATE=$(date -u +'%Y-%m-%dT%H:%M:%SZ') \
          --build-arg VCS_REF=${{ github.sha }} \
          --build-arg VERSION=${{ github.ref_name }} \
          -t $ECR_REGISTRY/$ECR_REPOSITORY:$IMAGE_TAG \
          -t $ECR_REGISTRY/$ECR_REPOSITORY:latest .
        
        # Scan the final image
        docker run --rm -v /var/run/docker.sock:/var/run/docker.sock \
          aquasec/trivy image --exit-code 1 --severity HIGH,CRITICAL \
          $ECR_REGISTRY/$ECR_REPOSITORY:$IMAGE_TAG
        
        # Push if scans pass
        docker push $ECR_REGISTRY/$ECR_REPOSITORY:$IMAGE_TAG
        docker push $ECR_REGISTRY/$ECR_REPOSITORY:latest

Container Registry Security

import boto3
import json

def setup_secure_ecr_repository():
    ecr = boto3.client('ecr')
    
    # Create repository with encryption
    response = ecr.create_repository(
        repositoryName='secure-app',
        imageScanningConfiguration={
            'scanOnPush': True
        },
        encryptionConfiguration={
            'encryptionType': 'KMS',
            'kmsKey': 'arn:aws:kms:us-west-2:123456789012:key/12345678-1234-1234-1234-123456789012'
        },
        imageTagMutability='IMMUTABLE'
    )
    
    repository_uri = response['repository']['repositoryUri']
    
    # Set lifecycle policy to manage image retention
    lifecycle_policy = {
        "rules": [
            {
                "rulePriority": 1,
                "description": "Keep last 10 production images",
                "selection": {
                    "tagStatus": "tagged",
                    "tagPrefixList": ["v"],
                    "countType": "imageCountMoreThan",
                    "countNumber": 10
                },
                "action": {
                    "type": "expire"
                }
            },
            {
                "rulePriority": 2,
                "description": "Delete untagged images older than 7 days",
                "selection": {
                    "tagStatus": "untagged",
                    "countType": "sinceImagePushed",
                    "countUnit": "days",
                    "countNumber": 7
                },
                "action": {
                    "type": "expire"
                }
            }
        ]
    }
    
    ecr.put_lifecycle_policy(
        repositoryName='secure-app',
        lifecyclePolicyText=json.dumps(lifecycle_policy)
    )
    
    # Set repository policy for cross-account access
    repository_policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowPull",
                "Effect": "Allow",
                "Principal": {
                    "AWS": [
                        "arn:aws:iam::123456789012:role/EKSNodeInstanceRole",
                        "arn:aws:iam::123456789012:role/FargatePodExecutionRole"
                    ]
                },
                "Action": [
                    "ecr:BatchCheckLayerAvailability",
                    "ecr:GetDownloadUrlForLayer",
                    "ecr:BatchGetImage"
                ]
            }
        ]
    }
    
    ecr.set_repository_policy(
        repositoryName='secure-app',
        policyText=json.dumps(repository_policy)
    )
    
    return repository_uri

Runtime Security Monitoring

Container Runtime Monitoring

# Falco deployment for runtime security monitoring
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: falco
  namespace: falco-system
spec:
  selector:
    matchLabels:
      app: falco
  template:
    metadata:
      labels:
        app: falco
    spec:
      serviceAccountName: falco
      hostNetwork: true
      dnsPolicy: ClusterFirstWithHostNet
      containers:
      - name: falco
        image: falcosecurity/falco:0.35.1
        args:
          - /usr/bin/falco
          - --cri=/run/containerd/containerd.sock
          - --k8s-api=https://kubernetes.default.svc.cluster.local
          - --k8s-api-cert=/var/run/secrets/kubernetes.io/serviceaccount/token
        securityContext:
          privileged: true
        volumeMounts:
        - mountPath: /host/var/run/docker.sock
          name: docker-sock
          readOnly: true
        - mountPath: /host/run/containerd/containerd.sock
          name: containerd-sock
          readOnly: true
        - mountPath: /host/dev
          name: dev-fs
          readOnly: true
        - mountPath: /host/proc
          name: proc-fs
          readOnly: true
        - mountPath: /host/boot
          name: boot-fs
          readOnly: true
        - mountPath: /host/lib/modules
          name: lib-modules
          readOnly: true
        - mountPath: /host/usr
          name: usr-fs
          readOnly: true
        - mountPath: /host/etc
          name: etc-fs
          readOnly: true
        - mountPath: /etc/falco
          name: falco-config
        env:
        - name: FALCO_K8S_NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
      volumes:
      - name: docker-sock
        hostPath:
          path: /var/run/docker.sock
      - name: containerd-sock
        hostPath:
          path: /run/containerd/containerd.sock
      - name: dev-fs
        hostPath:
          path: /dev
      - name: proc-fs
        hostPath:
          path: /proc
      - name: boot-fs
        hostPath:
          path: /boot
      - name: lib-modules
        hostPath:
          path: /lib/modules
      - name: usr-fs
        hostPath:
          path: /usr
      - name: etc-fs
        hostPath:
          path: /etc
      - name: falco-config
        configMap:
          name: falco-config
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: falco-config
  namespace: falco-system
data:
  falco.yaml: |
    rules_file:
      - /etc/falco/falco_rules.yaml
      - /etc/falco/falco_rules.local.yaml
      - /etc/falco/k8s_audit_rules.yaml
      - /etc/falco/rules.d
    
    time_format_iso_8601: true
    json_output: true
    json_include_output_property: true
    
    log_stderr: true
    log_syslog: true
    log_level: info
    
    priority: debug
    
    # Output channels
    file_output:
      enabled: true
      keep_alive: false
      filename: /var/log/falco.log
    
    stdout_output:
      enabled: true
    
    syslog_output:
      enabled: true
    
    http_output:
      enabled: true
      url: http://falcosidekick:2801/
  
  falco_rules.local.yaml: |
    - rule: Container Privilege Escalation
      desc: Detect attempts to escalate privileges in containers
      condition: >
        spawned_process and container and
        (proc.name in (sudo, su, doas) or
         (proc.args contains "chmod +s" or proc.args contains "chmod u+s"))
      output: >
        Privilege escalation attempt in container 
        (user=%user.name command=%proc.cmdline container=%container.name 
         image=%container.image.repository:%container.image.tag)
      priority: HIGH
      tags: [container, privilege_escalation]
    
    - rule: Suspicious Network Activity
      desc: Detect suspicious network connections from containers
      condition: >
        inbound_outbound and container and
        (fd.net.proto=tcp and fd.net.dport in (22, 23, 3389, 5900)) and
        not proc.name in (ssh, sshd, telnet, rdp)
      output: >
        Suspicious network connection from container 
        (connection=%fd.name command=%proc.cmdline container=%container.name 
         image=%container.image.repository:%container.image.tag)
      priority: HIGH
      tags: [network, container]
    
    - rule: File System Modification
      desc: Detect unauthorized file system modifications
      condition: >
        open_write and container and
        fd.name startswith /etc and
        not proc.name in (dpkg, apt, yum, rpm, installer)
      output: >
        Unauthorized file modification in container 
        (file=%fd.name command=%proc.cmdline container=%container.name 
         image=%container.image.repository:%container.image.tag)
      priority: MEDIUM
      tags: [filesystem, container]

AWS Security Monitoring

import boto3
import json
from datetime import datetime, timedelta

class ContainerSecurityMonitor:
    def __init__(self):
        self.cloudwatch = boto3.client('cloudwatch')
        self.logs = boto3.client('logs')
        self.ecs = boto3.client('ecs')
        self.eks = boto3.client('eks')
        
    def setup_cloudwatch_alarms(self):
        """Setup CloudWatch alarms for container security events"""
        
        # High CPU usage alarm (potential crypto mining)
        self.cloudwatch.put_metric_alarm(
            AlarmName='ContainerHighCPUUsage',
            ComparisonOperator='GreaterThanThreshold',
            EvaluationPeriods=2,
            MetricName='CPUUtilization',
            Namespace='AWS/ECS',
            Period=300,
            Statistic='Average',
            Threshold=80.0,
            ActionsEnabled=True,
            AlarmActions=[
                'arn:aws:sns:us-west-2:123456789012:security-alerts'
            ],
            AlarmDescription='High CPU usage detected in containers',
            Dimensions=[
                {
                    'Name': 'ServiceName',
                    'Value': 'production-*'
                }
            ],
            Unit='Percent'
        )
        
        # Failed authentication attempts
        self.cloudwatch.put_metric_alarm(
            AlarmName='ContainerFailedAuth',
            ComparisonOperator='GreaterThanThreshold',
            EvaluationPeriods=1,
            MetricName='FailedAuthAttempts',
            Namespace='Security/Container',
            Period=300,
            Statistic='Sum',
            Threshold=10.0,
            ActionsEnabled=True,
            AlarmActions=[
                'arn:aws:sns:us-west-2:123456789012:security-alerts'
            ],
            AlarmDescription='Multiple failed authentication attempts'
        )
    
    def create_log_insights_queries(self):
        """Create CloudWatch Insights queries for security analysis"""
        
        queries = {
            'privilege_escalation': '''
            fields @timestamp, @message
            | filter @message like /sudo|su|chmod.*\+s/
            | stats count() by bin(5m)
            ''',
            
            'network_anomalies': '''
            fields @timestamp, @message
            | filter @message like /connection.*refused|timeout|failed/
            | stats count() by bin(1h)
            ''',
            
            'container_exits': '''
            fields @timestamp, @message
            | filter @message like /exit.*code|killed|terminated/
            | stats count() by bin(1h)
            '''
        }
        
        return queries
    
    def monitor_ecs_security_events(self):
        """Monitor ECS security events"""
        
        # Get ECS services
        services = self.ecs.list_services(cluster='production')
        
        for service_arn in services['serviceArns']:
            service_name = service_arn.split('/')[-1]
            
            # Check for unusual task stops
            response = self.ecs.describe_services(
                cluster='production',
                services=[service_arn]
            )
            
            for service in response['services']:
                events = service.get('events', [])
                
                # Look for security-related events in the last hour
                now = datetime.utcnow()
                one_hour_ago = now - timedelta(hours=1)
                
                recent_events = [
                    event for event in events
                    if event['createdAt'] > one_hour_ago
                ]
                
                security_events = [
                    event for event in recent_events
                    if any(keyword in event['message'].lower() 
                          for keyword in ['stopped', 'killed', 'failed', 'error'])
                ]
                
                if security_events:
                    self.send_security_alert(
                        f"ECS Security Events for {service_name}",
                        security_events
                    )
    
    def send_security_alert(self, title, events):
        """Send security alert via SNS"""
        sns = boto3.client('sns')
        
        message = {
            'title': title,
            'events': events,
            'timestamp': datetime.utcnow().isoformat(),
            'severity': 'HIGH'
        }
        
        sns.publish(
            TopicArn='arn:aws:sns:us-west-2:123456789012:security-alerts',
            Message=json.dumps(message, indent=2),
            Subject=f"Container Security Alert: {title}"
        )

# Usage
monitor = ContainerSecurityMonitor()
monitor.setup_cloudwatch_alarms()
monitor.monitor_ecs_security_events()

Security Compliance and Auditing

Compliance Scanning Scripts

#!/bin/bash
# Container security compliance checker

set -euo pipefail

# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m' # No Color

CLUSTER_NAME="production"
NAMESPACE="production"
REPORT_FILE="container_security_audit_$(date +%Y%m%d_%H%M%S).json"

echo "Starting Container Security Audit..."

# Initialize audit results
cat > "$REPORT_FILE" << EOF
{
  "audit_date": "$(date -u +%Y-%m-%dT%H:%M:%SZ)",
  "cluster": "$CLUSTER_NAME",
  "namespace": "$NAMESPACE",
  "results": {
    "summary": {},
    "pod_security": [],
    "network_policies": [],
    "rbac": [],
    "image_security": [],
    "runtime_security": []
  }
}
EOF

# Function to update audit results
update_audit_result() {
    local category="$1"
    local check="$2"
    local status="$3"
    local details="$4"
    
    local temp_file=$(mktemp)
    jq --arg cat "$category" --arg check "$check" --arg status "$status" --arg details "$details" \
       '.results[$cat] += [{"check": $check, "status": $status, "details": $details}]' \
       "$REPORT_FILE" > "$temp_file" && mv "$temp_file" "$REPORT_FILE"
}

echo -e "${YELLOW}Checking Pod Security Standards...${NC}"

# Check pod security contexts
while IFS= read -r pod; do
    if [[ -n "$pod" ]]; then
        echo "Checking pod: $pod"
        
        # Check if running as non-root
        nonroot=$(kubectl get pod "$pod" -n "$NAMESPACE" -o jsonpath='{.spec.securityContext.runAsNonRoot}' 2>/dev/null || echo "false")
        if [[ "$nonroot" == "true" ]]; then
            echo -e "  ${GREEN}✓${NC} Running as non-root"
            update_audit_result "pod_security" "non_root_user" "PASS" "Pod $pod runs as non-root"
        else
            echo -e "  ${RED}✗${NC} Not running as non-root"
            update_audit_result "pod_security" "non_root_user" "FAIL" "Pod $pod may be running as root"
        fi
        
        # Check read-only root filesystem
        containers=$(kubectl get pod "$pod" -n "$NAMESPACE" -o jsonpath='{.spec.containers[*].name}')
        for container in $containers; do
            readonly_fs=$(kubectl get pod "$pod" -n "$NAMESPACE" -o jsonpath="{.spec.containers[?(@.name==\"$container\")].securityContext.readOnlyRootFilesystem}" 2>/dev/null || echo "false")
            if [[ "$readonly_fs" == "true" ]]; then
                echo -e "  ${GREEN}✓${NC} Container $container has read-only root filesystem"
                update_audit_result "pod_security" "readonly_filesystem" "PASS" "Container $container in pod $pod has read-only root filesystem"
            else
                echo -e "  ${RED}✗${NC} Container $container does not have read-only root filesystem"
                update_audit_result "pod_security" "readonly_filesystem" "FAIL" "Container $container in pod $pod has writable root filesystem"
            fi
            
            # Check for dropped capabilities
            caps_dropped=$(kubectl get pod "$pod" -n "$NAMESPACE" -o jsonpath="{.spec.containers[?(@.name==\"$container\")].securityContext.capabilities.drop}" 2>/dev/null || echo "[]")
            if [[ "$caps_dropped" == *"ALL"* ]]; then
                echo -e "  ${GREEN}✓${NC} Container $container has dropped all capabilities"
                update_audit_result "pod_security" "dropped_capabilities" "PASS" "Container $container in pod $pod dropped all capabilities"
            else
                echo -e "  ${RED}✗${NC} Container $container has not dropped all capabilities"
                update_audit_result "pod_security" "dropped_capabilities" "FAIL" "Container $container in pod $pod retains capabilities: $caps_dropped"
            fi
        done
    fi
done < <(kubectl get pods -n "$NAMESPACE" -o jsonpath='{.items[*].metadata.name}' | tr ' ' '\n')

echo -e "${YELLOW}Checking Network Policies...${NC}"

# Check if default deny policy exists
if kubectl get networkpolicy default-deny-all -n "$NAMESPACE" >/dev/null 2>&1; then
    echo -e "${GREEN}✓${NC} Default deny network policy exists"
    update_audit_result "network_policies" "default_deny" "PASS" "Default deny network policy found"
else
    echo -e "${RED}✗${NC} Default deny network policy missing"
    update_audit_result "network_policies" "default_deny" "FAIL" "Default deny network policy not found"
fi

# Check for specific ingress/egress policies
policy_count=$(kubectl get networkpolicy -n "$NAMESPACE" --no-headers | wc -l)
if [[ $policy_count -gt 1 ]]; then
    echo -e "${GREEN}✓${NC} Multiple network policies configured ($policy_count total)"
    update_audit_result "network_policies" "policy_coverage" "PASS" "$policy_count network policies configured"
else
    echo -e "${RED}✗${NC} Insufficient network policy coverage"
    update_audit_result "network_policies" "policy_coverage" "FAIL" "Only $policy_count network policy found"
fi

echo -e "${YELLOW}Checking RBAC Configuration...${NC}"

# Check for overly permissive service accounts
while IFS= read -r sa; do
    if [[ -n "$sa" && "$sa" != "default" ]]; then
        # Check cluster role bindings
        cluster_bindings=$(kubectl get clusterrolebinding -o json | jq -r --arg sa "$sa" --arg ns "$NAMESPACE" '.items[] | select(.subjects[]? | select(.kind=="ServiceAccount" and .name==$sa and .namespace==$ns)) | .metadata.name')
        
        if [[ -n "$cluster_bindings" ]]; then
            echo -e "${YELLOW}⚠${NC} Service account $sa has cluster-level permissions"
            update_audit_result "rbac" "cluster_permissions" "WARNING" "Service account $sa has cluster bindings: $cluster_bindings"
        else
            echo -e "${GREEN}✓${NC} Service account $sa has namespace-scoped permissions only"
            update_audit_result "rbac" "cluster_permissions" "PASS" "Service account $sa properly scoped to namespace"
        fi
    fi
done < <(kubectl get serviceaccounts -n "$NAMESPACE" -o jsonpath='{.items[*].metadata.name}' | tr ' ' '\n')

echo -e "${YELLOW}Checking Image Security...${NC}"

# Scan container images for vulnerabilities
while IFS= read -r pod; do
    if [[ -n "$pod" ]]; then
        images=$(kubectl get pod "$pod" -n "$NAMESPACE" -o jsonpath='{.spec.containers[*].image}')
        for image in $images; do
            echo "Scanning image: $image"
            
            # Use trivy to scan for vulnerabilities
            if command -v trivy >/dev/null 2>&1; then
                vuln_count=$(trivy image --quiet --format json "$image" 2>/dev/null | jq '[.Results[]?.Vulnerabilities[]? | select(.Severity=="HIGH" or .Severity=="CRITICAL")] | length' 2>/dev/null || echo "0")
                
                if [[ $vuln_count -eq 0 ]]; then
                    echo -e "  ${GREEN}✓${NC} No high/critical vulnerabilities found"
                    update_audit_result "image_security" "vulnerability_scan" "PASS" "Image $image has no high/critical vulnerabilities"
                else
                    echo -e "  ${RED}✗${NC} Found $vuln_count high/critical vulnerabilities"
                    update_audit_result "image_security" "vulnerability_scan" "FAIL" "Image $image has $vuln_count high/critical vulnerabilities"
                fi
            else
                echo -e "  ${YELLOW}⚠${NC} Trivy not installed, skipping vulnerability scan"
                update_audit_result "image_security" "vulnerability_scan" "SKIP" "Trivy not available for scanning $image"
            fi
            
            # Check if image uses latest tag
            if [[ "$image" == *":latest" ]] || [[ "$image" != *":"* ]]; then
                echo -e "  ${RED}✗${NC} Image uses latest tag or no tag"
                update_audit_result "image_security" "image_tags" "FAIL" "Image $image uses latest tag or no tag"
            else
                echo -e "  ${GREEN}✓${NC} Image uses specific tag"
                update_audit_result "image_security" "image_tags" "PASS" "Image $image uses specific tag"
            fi
        done
    fi
done < <(kubectl get pods -n "$NAMESPACE" -o jsonpath='{.items[*].metadata.name}' | tr ' ' '\n')

# Generate summary
total_checks=$(jq '[.results[] | length] | add' "$REPORT_FILE")
passed_checks=$(jq '[.results[][] | select(.status=="PASS")] | length' "$REPORT_FILE")
failed_checks=$(jq '[.results[][] | select(.status=="FAIL")] | length' "$REPORT_FILE")
warning_checks=$(jq '[.results[][] | select(.status=="WARNING")] | length' "$REPORT_FILE")

# Update summary in report
temp_file=$(mktemp)
jq --argjson total "$total_checks" --argjson passed "$passed_checks" --argjson failed "$failed_checks" --argjson warnings "$warning_checks" \
   '.results.summary = {"total_checks": $total, "passed": $passed, "failed": $failed, "warnings": $warnings, "score": (($passed / $total) * 100 | floor)}' \
   "$REPORT_FILE" > "$temp_file" && mv "$temp_file" "$REPORT_FILE"

echo
echo "=== Container Security Audit Summary ==="
echo -e "Total Checks: $total_checks"
echo -e "${GREEN}Passed: $passed_checks${NC}"
echo -e "${RED}Failed: $failed_checks${NC}"
echo -e "${YELLOW}Warnings: $warning_checks${NC}"
echo -e "Security Score: $(( (passed_checks * 100) / total_checks ))%"
echo
echo "Detailed report saved to: $REPORT_FILE"

# Exit with error if any critical checks failed
if [[ $failed_checks -gt 0 ]]; then
    echo -e "${RED}❌ Security audit failed. Please address the failed checks.${NC}"
    exit 1
else
    echo -e "${GREEN}✅ Security audit passed!${NC}"
fi

Cost Optimization for Secure Containers

Right-Sizing Resources

import boto3
import json
from datetime import datetime, timedelta

class ContainerCostOptimizer:
    def __init__(self):
        self.cloudwatch = boto3.client('cloudwatch')
        self.ecs = boto3.client('ecs')
        self.pricing = boto3.client('pricing', region_name='us-east-1')
    
    def analyze_ecs_utilization(self, cluster_name, days=7):
        """Analyze ECS task utilization for right-sizing"""
        
        end_time = datetime.utcnow()
        start_time = end_time - timedelta(days=days)
        
        # Get all services in cluster
        services = self.ecs.list_services(cluster=cluster_name)
        recommendations = []
        
        for service_arn in services['serviceArns']:
            service_name = service_arn.split('/')[-1]
            
            # Get CPU utilization
            cpu_response = self.cloudwatch.get_metric_statistics(
                Namespace='AWS/ECS',
                MetricName='CPUUtilization',
                Dimensions=[
                    {'Name': 'ServiceName', 'Value': service_name},
                    {'Name': 'ClusterName', 'Value': cluster_name}
                ],
                StartTime=start_time,
                EndTime=end_time,
                Period=3600,
                Statistics=['Average', 'Maximum']
            )
            
            # Get memory utilization
            memory_response = self.cloudwatch.get_metric_statistics(
                Namespace='AWS/ECS',
                MetricName='MemoryUtilization',
                Dimensions=[
                    {'Name': 'ServiceName', 'Value': service_name},
                    {'Name': 'ClusterName', 'Value': cluster_name}
                ],
                StartTime=start_time,
                EndTime=end_time,
                Period=3600,
                Statistics=['Average', 'Maximum']
            )
            
            if cpu_response['Datapoints'] and memory_response['Datapoints']:
                avg_cpu = sum(dp['Average'] for dp in cpu_response['Datapoints']) / len(cpu_response['Datapoints'])
                max_cpu = max(dp['Maximum'] for dp in cpu_response['Datapoints'])
                avg_memory = sum(dp['Average'] for dp in memory_response['Datapoints']) / len(memory_response['Datapoints'])
                max_memory = max(dp['Maximum'] for dp in memory_response['Datapoints'])
                
                # Get current task definition
                service_details = self.ecs.describe_services(
                    cluster=cluster_name,
                    services=[service_arn]
                )
                
                task_def_arn = service_details['services'][0]['taskDefinition']
                task_def = self.ecs.describe_task_definition(taskDefinition=task_def_arn)
                
                current_cpu = int(task_def['taskDefinition']['cpu'])
                current_memory = int(task_def['taskDefinition']['memory'])
                
                recommendation = self.generate_sizing_recommendation(
                    service_name, current_cpu, current_memory,
                    avg_cpu, max_cpu, avg_memory, max_memory
                )
                
                recommendations.append(recommendation)
        
        return recommendations
    
    def generate_sizing_recommendation(self, service_name, current_cpu, current_memory,
                                     avg_cpu, max_cpu, avg_memory, max_memory):
        """Generate right-sizing recommendations"""
        
        # CPU recommendation (target 70% utilization with 20% headroom)
        target_cpu_utilization = 70
        recommended_cpu = int((max_cpu / target_cpu_utilization) * 100)
        
        # Round to valid Fargate CPU values
        valid_cpu_values = [256, 512, 1024, 2048, 4096]
        recommended_cpu = min(valid_cpu_values, key=lambda x: abs(x - recommended_cpu))
        
        # Memory recommendation (target 80% utilization)
        target_memory_utilization = 80
        recommended_memory = int((max_memory / target_memory_utilization) * 100)
        
        # Round to valid memory values for the CPU
        valid_memory_ranges = {
            256: [512, 1024, 2048],
            512: [1024, 2048, 3072, 4096],
            1024: [2048, 3072, 4096, 5120, 6144, 7168, 8192],
            2048: [4096, 5120, 6144, 7168, 8192, 9216, 10240, 11264, 12288, 13312, 14336, 15360, 16384],
            4096: [8192, 9216, 10240, 11264, 12288, 13312, 14336, 15360, 16384, 17408, 18432, 19456, 20480, 21504, 22528, 23552, 24576, 25600, 26624, 27648, 28672, 29696, 30720]
        }
        
        recommended_memory = min(valid_memory_ranges[recommended_cpu], 
                               key=lambda x: abs(x - recommended_memory))
        
        # Calculate cost impact
        current_cost = self.calculate_fargate_cost(current_cpu, current_memory)
        recommended_cost = self.calculate_fargate_cost(recommended_cpu, recommended_memory)
        monthly_savings = (current_cost - recommended_cost) * 24 * 30
        
        return {
            'service_name': service_name,
            'current': {
                'cpu': current_cpu,
                'memory': current_memory,
                'monthly_cost': current_cost * 24 * 30
            },
            'utilization': {
                'avg_cpu': round(avg_cpu, 2),
                'max_cpu': round(max_cpu, 2),
                'avg_memory': round(avg_memory, 2),
                'max_memory': round(max_memory, 2)
            },
            'recommended': {
                'cpu': recommended_cpu,
                'memory': recommended_memory,
                'monthly_cost': recommended_cost * 24 * 30
            },
            'impact': {
                'monthly_savings': round(monthly_savings, 2),
                'percentage_change': round(((current_cost - recommended_cost) / current_cost) * 100, 2)
            }
        }
    
    def calculate_fargate_cost(self, cpu, memory):
        """Calculate Fargate cost per hour"""
        # Fargate pricing (us-west-2 as of 2024)
        cpu_price_per_vcpu_hour = 0.04048
        memory_price_per_gb_hour = 0.004445
        
        vcpu = cpu / 1024
        memory_gb = memory / 1024
        
        hourly_cost = (vcpu * cpu_price_per_vcpu_hour) + (memory_gb * memory_price_per_gb_hour)
        return hourly_cost

# Usage example
optimizer = ContainerCostOptimizer()
recommendations = optimizer.analyze_ecs_utilization('production-cluster')

for rec in recommendations:
    print(f"Service: {rec['service_name']}")
    print(f"Current: {rec['current']['cpu']} CPU, {rec['current']['memory']} Memory")
    print(f"Recommended: {rec['recommended']['cpu']} CPU, {rec['recommended']['memory']} Memory")
    print(f"Monthly Savings: ${rec['impact']['monthly_savings']:.2f}")
    print("---")

Automation and Infrastructure as Code

Terraform Module for Secure EKS

# terraform/modules/secure-eks/main.tf
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
    kubernetes = {
      source  = "hashicorp/kubernetes"
      version = "~> 2.0"
    }
  }
}

locals {
  cluster_name = var.cluster_name
  common_tags = {
    Environment = var.environment
    Project     = var.project_name
    ManagedBy   = "Terraform"
    Security    = "Hardened"
  }
}

# KMS key for EKS encryption
resource "aws_kms_key" "eks" {
  description             = "EKS Secret Encryption Key"
  deletion_window_in_days = 7
  enable_key_rotation     = true

  tags = local.common_tags
}

resource "aws_kms_alias" "eks" {
  name          = "alias/eks-${local.cluster_name}"
  target_key_id = aws_kms_key.eks.key_id
}

# IAM role for EKS cluster
resource "aws_iam_role" "cluster" {
  name = "${local.cluster_name}-cluster-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = "sts:AssumeRole"
        Effect = "Allow"
        Principal = {
          Service = "eks.amazonaws.com"
        }
      }
    ]
  })

  tags = local.common_tags
}

resource "aws_iam_role_policy_attachment" "cluster_AmazonEKSClusterPolicy" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKSClusterPolicy"
  role       = aws_iam_role.cluster.name
}

# CloudWatch log group for EKS
resource "aws_cloudwatch_log_group" "cluster" {
  name              = "/aws/eks/${local.cluster_name}/cluster"
  retention_in_days = 30
  kms_key_id       = aws_kms_key.eks.arn

  tags = local.common_tags
}

# EKS Cluster with security hardening
resource "aws_eks_cluster" "main" {
  name     = local.cluster_name
  role_arn = aws_iam_role.cluster.arn
  version  = var.kubernetes_version

  vpc_config {
    subnet_ids              = var.subnet_ids
    endpoint_private_access = true
    endpoint_public_access  = var.endpoint_public_access
    public_access_cidrs    = var.public_access_cidrs
    security_group_ids     = [aws_security_group.cluster.id]
  }

  # Enable logging for all components
  enabled_cluster_log_types = [
    "api",
    "audit",
    "authenticator",
    "controllerManager",
    "scheduler"
  ]

  # Encryption configuration
  encryption_config {
    provider {
      key_arn = aws_kms_key.eks.arn
    }
    resources = ["secrets"]
  }

  depends_on = [
    aws_iam_role_policy_attachment.cluster_AmazonEKSClusterPolicy,
    aws_cloudwatch_log_group.cluster,
  ]

  tags = local.common_tags
}

# Security group for EKS cluster
resource "aws_security_group" "cluster" {
  name_prefix = "${local.cluster_name}-cluster-"
  vpc_id      = var.vpc_id

  # Allow HTTPS traffic from node groups
  ingress {
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = [var.vpc_cidr]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = merge(local.common_tags, {
    Name = "${local.cluster_name}-cluster-sg"
  })
}

# IAM role for node groups
resource "aws_iam_role" "node_group" {
  name = "${local.cluster_name}-node-group-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = "sts:AssumeRole"
        Effect = "Allow"
        Principal = {
          Service = "ec2.amazonaws.com"
        }
      }
    ]
  })

  tags = local.common_tags
}

resource "aws_iam_role_policy_attachment" "node_group_AmazonEKSWorkerNodePolicy" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy"
  role       = aws_iam_role.node_group.name
}

resource "aws_iam_role_policy_attachment" "node_group_AmazonEKS_CNI_Policy" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy"
  role       = aws_iam_role.node_group.name
}

resource "aws_iam_role_policy_attachment" "node_group_AmazonEC2ContainerRegistryReadOnly" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly"
  role       = aws_iam_role.node_group.name
}

# EKS Node Group with security configurations
resource "aws_eks_node_group" "main" {
  cluster_name    = aws_eks_cluster.main.name
  node_group_name = "${local.cluster_name}-nodes"
  node_role_arn   = aws_iam_role.node_group.arn
  subnet_ids      = var.private_subnet_ids

  # Use custom AMI with security hardening
  ami_type        = "AL2_x86_64"
  capacity_type   = "ON_DEMAND"
  instance_types  = var.node_instance_types

  # Disk encryption
  disk_size = var.node_disk_size

  scaling_config {
    desired_size = var.node_desired_size
    max_size     = var.node_max_size
    min_size     = var.node_min_size
  }

  update_config {
    max_unavailable = 1
  }

  # Security configurations
  remote_access {
    ec2_ssh_key = var.ssh_key_name
    source_security_group_ids = [aws_security_group.node_group.id]
  }

  # Ensure nodes are fully patched before joining cluster
  lifecycle {
    ignore_changes = [scaling_config[0].desired_size]
  }

  depends_on = [
    aws_iam_role_policy_attachment.node_group_AmazonEKSWorkerNodePolicy,
    aws_iam_role_policy_attachment.node_group_AmazonEKS_CNI_Policy,
    aws_iam_role_policy_attachment.node_group_AmazonEC2ContainerRegistryReadOnly,
  ]

  tags = local.common_tags
}

# Security group for node groups
resource "aws_security_group" "node_group" {
  name_prefix = "${local.cluster_name}-node-group-"
  vpc_id      = var.vpc_id

  # Allow communication between nodes
  ingress {
    from_port = 0
    to_port   = 65535
    protocol  = "tcp"
    self      = true
  }

  # Allow pods to communicate with cluster API
  ingress {
    from_port       = 443
    to_port         = 443
    protocol        = "tcp"
    security_groups = [aws_security_group.cluster.id]
  }

  # Allow kubelet and pods to receive communication from cluster control plane
  ingress {
    from_port       = 1025
    to_port         = 65535
    protocol        = "tcp"
    security_groups = [aws_security_group.cluster.id]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = merge(local.common_tags, {
    Name = "${local.cluster_name}-node-group-sg"
  })
}

# OIDC Identity Provider
data "tls_certificate" "cluster" {
  url = aws_eks_cluster.main.identity[0].oidc[0].issuer
}

resource "aws_iam_openid_connect_provider" "cluster" {
  client_id_list  = ["sts.amazonaws.com"]
  thumbprint_list = [data.tls_certificate.cluster.certificates[0].sha1_fingerprint]
  url             = aws_eks_cluster.main.identity[0].oidc[0].issuer

  tags = local.common_tags
}

# EKS add-ons
resource "aws_eks_addon" "vpc_cni" {
  cluster_name             = aws_eks_cluster.main.name
  addon_name               = "vpc-cni"
  addon_version            = var.vpc_cni_version
  resolve_conflicts        = "OVERWRITE"
  service_account_role_arn = aws_iam_role.vpc_cni.arn

  depends_on = [aws_eks_node_group.main]
}

resource "aws_eks_addon" "coredns" {
  cluster_name      = aws_eks_cluster.main.name
  addon_name        = "coredns"
  addon_version     = var.coredns_version
  resolve_conflicts = "OVERWRITE"

  depends_on = [aws_eks_node_group.main]
}

resource "aws_eks_addon" "kube_proxy" {
  cluster_name      = aws_eks_cluster.main.name
  addon_name        = "kube-proxy"
  addon_version     = var.kube_proxy_version
  resolve_conflicts = "OVERWRITE"

  depends_on = [aws_eks_node_group.main]
}

resource "aws_eks_addon" "ebs_csi" {
  cluster_name             = aws_eks_cluster.main.name
  addon_name               = "aws-ebs-csi-driver"
  addon_version            = var.ebs_csi_version
  resolve_conflicts        = "OVERWRITE"
  service_account_role_arn = aws_iam_role.ebs_csi.arn

  depends_on = [aws_eks_node_group.main]
}

# IAM role for VPC CNI
resource "aws_iam_role" "vpc_cni" {
  name = "${local.cluster_name}-vpc-cni-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = "sts:AssumeRole"
        Effect = "Allow"
        Principal = {
          Federated = aws_iam_openid_connect_provider.cluster.arn
        }
        Condition = {
          StringEquals = {
            "${replace(aws_iam_openid_connect_provider.cluster.url, "https://", "")}:sub" = "system:serviceaccount:kube-system:aws-node"
            "${replace(aws_iam_openid_connect_provider.cluster.url, "https://", "")}:aud" = "sts.amazonaws.com"
          }
        }
      }
    ]
  })

  tags = local.common_tags
}

resource "aws_iam_role_policy_attachment" "vpc_cni" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy"
  role       = aws_iam_role.vpc_cni.name
}

# IAM role for EBS CSI driver
resource "aws_iam_role" "ebs_csi" {
  name = "${local.cluster_name}-ebs-csi-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = "sts:AssumeRole"
        Effect = "Allow"
        Principal = {
          Federated = aws_iam_openid_connect_provider.cluster.arn
        }
        Condition = {
          StringEquals = {
            "${replace(aws_iam_openid_connect_provider.cluster.url, "https://", "")}:sub" = "system:serviceaccount:kube-system:ebs-csi-controller-sa"
            "${replace(aws_iam_openid_connect_provider.cluster.url, "https://", "")}:aud" = "sts.amazonaws.com"
          }
        }
      }
    ]
  })

  tags = local.common_tags
}

resource "aws_iam_role_policy_attachment" "ebs_csi" {
  policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy"
  role       = aws_iam_role.ebs_csi.name
}

Terraform Variables and Outputs

# terraform/modules/secure-eks/variables.tf
variable "cluster_name" {
  description = "Name of the EKS cluster"
  type        = string
}

variable "environment" {
  description = "Environment name"
  type        = string
}

variable "project_name" {
  description = "Project name"
  type        = string
}

variable "kubernetes_version" {
  description = "Kubernetes version"
  type        = string
  default     = "1.27"
}

variable "vpc_id" {
  description = "VPC ID where EKS cluster will be created"
  type        = string
}

variable "vpc_cidr" {
  description = "VPC CIDR block"
  type        = string
}

variable "subnet_ids" {
  description = "Subnet IDs for EKS cluster"
  type        = list(string)
}

variable "private_subnet_ids" {
  description = "Private subnet IDs for node groups"
  type        = list(string)
}

variable "endpoint_public_access" {
  description = "Enable public API server endpoint"
  type        = bool
  default     = false
}

variable "public_access_cidrs" {
  description = "CIDR blocks for public API access"
  type        = list(string)
  default     = []
}

variable "node_instance_types" {
  description = "Instance types for EKS node group"
  type        = list(string)
  default     = ["t3.medium"]
}

variable "node_disk_size" {
  description = "Disk size for EKS nodes"
  type        = number
  default     = 20
}

variable "node_desired_size" {
  description = "Desired number of nodes"
  type        = number
  default     = 2
}

variable "node_max_size" {
  description = "Maximum number of nodes"
  type        = number
  default     = 4
}

variable "node_min_size" {
  description = "Minimum number of nodes"
  type        = number
  default     = 1
}

variable "ssh_key_name" {
  description = "SSH key name for node access"
  type        = string
}

variable "vpc_cni_version" {
  description = "VPC CNI addon version"
  type        = string
  default     = "v1.13.4-eksbuild.1"
}

variable "coredns_version" {
  description = "CoreDNS addon version"
  type        = string
  default     = "v1.10.1-eksbuild.1"
}

variable "kube_proxy_version" {
  description = "Kube-proxy addon version"
  type        = string
  default     = "v1.27.3-eksbuild.1"
}

variable "ebs_csi_version" {
  description = "EBS CSI driver addon version"
  type        = string
  default     = "v1.19.0-eksbuild.2"
}

# terraform/modules/secure-eks/outputs.tf
output "cluster_name" {
  description = "Name of the EKS cluster"
  value       = aws_eks_cluster.main.name
}

output "cluster_endpoint" {
  description = "Endpoint for EKS control plane"
  value       = aws_eks_cluster.main.endpoint
}

output "cluster_version" {
  description = "The Kubernetes server version for EKS cluster"
  value       = aws_eks_cluster.main.version
}

output "cluster_arn" {
  description = "The Amazon Resource Name (ARN) of the cluster"
  value       = aws_eks_cluster.main.arn
}

output "cluster_certificate_authority_data" {
  description = "Base64 encoded certificate data required to communicate with the cluster"
  value       = aws_eks_cluster.main.certificate_authority[0].data
}

output "cluster_security_group_id" {
  description = "Security group ID attached to the EKS cluster"
  value       = aws_security_group.cluster.id
}

output "node_security_group_id" {
  description = "Security group ID attached to the EKS node group"
  value       = aws_security_group.node_group.id
}

output "oidc_issuer_url" {
  description = "The URL on the EKS cluster OIDC Issuer"
  value       = aws_eks_cluster.main.identity[0].oidc[0].issuer
}

output "oidc_provider_arn" {
  description = "The ARN of the OIDC Identity Provider"
  value       = aws_iam_openid_connect_provider.cluster.arn
}

output "cluster_iam_role_name" {
  description = "IAM role name associated with EKS cluster"
  value       = aws_iam_role.cluster.name
}

output "cluster_iam_role_arn" {
  description = "IAM role ARN associated with EKS cluster"
  value       = aws_iam_role.cluster.arn
}

output "node_group_iam_role_name" {
  description = "IAM role name associated with EKS node group"
  value       = aws_iam_role.node_group.name
}

output "node_group_iam_role_arn" {
  description = "IAM role ARN associated with EKS node group"
  value       = aws_iam_role.node_group.arn
}

Production Deployment Checklist

Pre-Deployment Security Checklist

# .github/workflows/container-security-checklist.yml
name: Container Security Pre-Deployment Checklist

on:
  pull_request:
    branches: [main]
  push:
    branches: [main]

jobs:
  security-checklist:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v4
    
    - name: Security Checklist
      run: |
        echo "🔒 Container Security Pre-Deployment Checklist"
        echo "=============================================="
        
        # Check 1: Dockerfile security
        echo "✅ Checking Dockerfile security..."
        if grep -q "USER root" Dockerfile 2>/dev/null; then
          echo "❌ FAIL: Container runs as root"
          exit 1
        fi
        
        if grep -q "FROM.*:latest" Dockerfile 2>/dev/null; then
          echo "❌ FAIL: Using latest tag in base image"
          exit 1
        fi
        
        echo "✅ PASS: Dockerfile security checks"
        
        # Check 2: Kubernetes manifests
        echo "✅ Checking Kubernetes security..."
        
        # Check for Pod Security Standards
        if find k8s/ -name "*.yaml" -exec grep -l "securityContext" {} \; | wc -l | grep -q "^0$" 2>/dev/null; then
          echo "❌ FAIL: No securityContext found in manifests"
          exit 1
        fi
        
        # Check for resource limits
        if find k8s/ -name "*.yaml" -exec grep -L "resources:" {} \; | wc -l | grep -v "^0$" >/dev/null 2>&1; then
          echo "❌ FAIL: Missing resource limits in some manifests"
          exit 1
        fi
        
        echo "✅ PASS: Kubernetes security checks"
        
        # Check 3: Network policies
        echo "✅ Checking network policies..."
        if ! find k8s/ -name "*networkpolicy*.yaml" | grep -q .; then
          echo "❌ FAIL: No NetworkPolicy found"
          exit 1
        fi
        
        echo "✅ PASS: Network policy checks"
        
        # Check 4: Secrets management
        echo "✅ Checking secrets management..."
        if grep -r "password\|secret\|key" --include="*.yaml" k8s/ | grep -v "secretKeyRef\|secretName" | grep -q .; then
          echo "❌ FAIL: Hardcoded secrets detected"
          exit 1
        fi
        
        echo "✅ PASS: Secrets management checks"
        
        echo ""
        echo "🎉 All security checks passed!"

Production Deployment Scripts

#!/bin/bash
# deploy-secure-container.sh

set -euo pipefail

CLUSTER_NAME="${1:-production}"
NAMESPACE="${2:-production}"
IMAGE_TAG="${3:-latest}"
ENVIRONMENT="${4:-production}"

# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m'

echo -e "${YELLOW}🚀 Starting secure container deployment...${NC}"

# Validate prerequisites
echo -e "${YELLOW}📋 Validating prerequisites...${NC}"

# Check kubectl connectivity
if ! kubectl cluster-info >/dev/null 2>&1; then
    echo -e "${RED}❌ kubectl cannot connect to cluster${NC}"
    exit 1
fi

# Check if namespace exists
if ! kubectl get namespace "$NAMESPACE" >/dev/null 2>&1; then
    echo -e "${YELLOW}⚠️  Namespace $NAMESPACE does not exist, creating...${NC}"
    kubectl create namespace "$NAMESPACE"
    
    # Apply Pod Security Standards
    kubectl label namespace "$NAMESPACE" \
        pod-security.kubernetes.io/enforce=restricted \
        pod-security.kubernetes.io/audit=restricted \
        pod-security.kubernetes.io/warn=restricted
fi

# Validate image security
echo -e "${YELLOW}🔍 Validating image security...${NC}"

ECR_REGISTRY="123456789012.dkr.ecr.us-west-2.amazonaws.com"
FULL_IMAGE="$ECR_REGISTRY/secure-app:$IMAGE_TAG"

# Check if image exists and scan results
if ! aws ecr describe-images --repository-name secure-app --image-ids imageTag="$IMAGE_TAG" >/dev/null 2>&1; then
    echo -e "${RED}❌ Image $IMAGE_TAG not found in ECR${NC}"
    exit 1
fi

# Check scan results
SCAN_STATUS=$(aws ecr describe-image-scan-findings --repository-name secure-app --image-id imageTag="$IMAGE_TAG" --query 'imageScanStatus.status' --output text 2>/dev/null || echo "FAILED")

if [[ "$SCAN_STATUS" != "COMPLETE" ]]; then
    echo -e "${RED}❌ Image scan not complete or failed${NC}"
    exit 1
fi

# Check for critical vulnerabilities
CRITICAL_COUNT=$(aws ecr describe-image-scan-findings --repository-name secure-app --image-id imageTag="$IMAGE_TAG" --query 'imageScanFindings.findingCounts.CRITICAL' --output text 2>/dev/null || echo "0")

if [[ "$CRITICAL_COUNT" -gt 0 ]]; then
    echo -e "${RED}❌ Image has $CRITICAL_COUNT critical vulnerabilities${NC}"
    exit 1
fi

echo -e "${GREEN}✅ Image security validation passed${NC}"

# Deploy security prerequisites
echo -e "${YELLOW}🔐 Deploying security prerequisites...${NC}"

# Apply network policies
kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: $NAMESPACE
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-secure-app
  namespace: $NAMESPACE
spec:
  podSelector:
    matchLabels:
      app: secure-app
  policyTypes:
  - Ingress
  - Egress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          name: ingress-nginx
    ports:
    - protocol: TCP
      port: 8080
  egress:
  - to: []
    ports:
    - protocol: UDP
      port: 53
  - to: []
    ports:
    - protocol: TCP
      port: 443
EOF

# Create service account with minimal permissions
kubectl apply -f - <<EOF
apiVersion: v1
kind: ServiceAccount
metadata:
  name: secure-app-sa
  namespace: $NAMESPACE
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/SecureApp-$ENVIRONMENT-Role
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: $NAMESPACE
  name: secure-app-role
rules:
- apiGroups: [""]
  resources: ["configmaps", "secrets"]
  verbs: ["get", "list"]
  resourceNames: ["app-config", "app-secrets"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: secure-app-binding
  namespace: $NAMESPACE
subjects:
- kind: ServiceAccount
  name: secure-app-sa
  namespace: $NAMESPACE
roleRef:
  kind: Role
  name: secure-app-role
  apiGroup: rbac.authorization.k8s.io
EOF

# Deploy the application
echo -e "${YELLOW}🚀 Deploying application...${NC}"

kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: secure-app
  namespace: $NAMESPACE
  labels:
    app: secure-app
    environment: $ENVIRONMENT
spec:
  replicas: 3
  selector:
    matchLabels:
      app: secure-app
  template:
    metadata:
      labels:
        app: secure-app
        environment: $ENVIRONMENT
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "8080"
        prometheus.io/path: "/metrics"
    spec:
      serviceAccountName: secure-app-sa
      securityContext:
        runAsNonRoot: true
        runAsUser: 1001
        runAsGroup: 3000
        fsGroup: 2000
        seccompProfile:
          type: RuntimeDefault
      containers:
      - name: app
        image: $FULL_IMAGE
        ports:
        - containerPort: 8080
          name: http
        securityContext:
          allowPrivilegeEscalation: false
          readOnlyRootFilesystem: true
          capabilities:
            drop:
            - ALL
        resources:
          requests:
            memory: "256Mi"
            cpu: "250m"
          limits:
            memory: "512Mi"
            cpu: "500m"
        livenessProbe:
          httpGet:
            path: /health
            port: http
          initialDelaySeconds: 30
          periodSeconds: 10
          timeoutSeconds: 5
          failureThreshold: 3
        readinessProbe:
          httpGet:
            path: /ready
            port: http
          initialDelaySeconds: 5
          periodSeconds: 5
          timeoutSeconds: 3
          failureThreshold: 3
        env:
        - name: ENVIRONMENT
          value: $ENVIRONMENT
        - name: DATABASE_PASSWORD
          valueFrom:
            secretKeyRef:
              name: app-secrets
              key: database-password
        volumeMounts:
        - name: tmp
          mountPath: /tmp
        - name: cache
          mountPath: /app/cache
      volumes:
      - name: tmp
        emptyDir: {}
      - name: cache
        emptyDir: {}
---
apiVersion: v1
kind: Service
metadata:
  name: secure-app
  namespace: $NAMESPACE
spec:
  selector:
    app: secure-app
  ports:
  - name: http
    port: 80
    targetPort: 8080
  type: ClusterIP
EOF

# Wait for deployment to be ready
echo -e "${YELLOW}⏳ Waiting for deployment to be ready...${NC}"

if kubectl rollout status deployment/secure-app -n "$NAMESPACE" --timeout=300s; then
    echo -e "${GREEN}✅ Deployment successful!${NC}"
else
    echo -e "${RED}❌ Deployment failed${NC}"
    
    # Show pod status for debugging
    echo -e "${YELLOW}📊 Pod status:${NC}"
    kubectl get pods -n "$NAMESPACE" -l app=secure-app
    
    echo -e "${YELLOW}📋 Recent events:${NC}"
    kubectl get events -n "$NAMESPACE" --sort-by='.lastTimestamp' | tail -10
    
    exit 1
fi

# Validate security posture post-deployment
echo -e "${YELLOW}🔍 Validating security posture...${NC}"

# Check pod security context
PODS=$(kubectl get pods -n "$NAMESPACE" -l app=secure-app -o jsonpath='{.items[*].metadata.name}')

for POD in $PODS; do
    # Check if running as non-root
    RUN_AS_NON_ROOT=$(kubectl get pod "$POD" -n "$NAMESPACE" -o jsonpath='{.spec.securityContext.runAsNonRoot}')
    if [[ "$RUN_AS_NON_ROOT" != "true" ]]; then
        echo -e "${RED}❌ Pod $POD not running as non-root${NC}"
        exit 1
    fi
    
    # Check read-only root filesystem
    READONLY_FS=$(kubectl get pod "$POD" -n "$NAMESPACE" -o jsonpath='{.spec.containers[0].securityContext.readOnlyRootFilesystem}')
    if [[ "$READONLY_FS" != "true" ]]; then
        echo -e "${RED}❌ Pod $POD does not have read-only root filesystem${NC}"
        exit 1
    fi
done

echo -e "${GREEN}✅ Security validation passed${NC}"

# Display deployment summary
echo -e "${GREEN}🎉 Deployment Summary${NC}"
echo "===================="
echo "Cluster: $CLUSTER_NAME"
echo "Namespace: $NAMESPACE"
echo "Image: $FULL_IMAGE"
echo "Environment: $ENVIRONMENT"
echo "Replicas: $(kubectl get deployment secure-app -n "$NAMESPACE" -o jsonpath='{.status.readyReplicas}')/$(kubectl get deployment secure-app -n "$NAMESPACE" -o jsonpath='{.spec.replicas}')"
echo
echo -e "${GREEN}✅ Secure container deployment completed successfully!${NC}"

The Bottom Line

Container security on AWS requires a comprehensive approach that goes beyond just scanning images. The most successful startups we work with treat security as a foundational requirement, not an afterthought.

Key takeaways from our analysis:

Start with secure defaults - Use non-root users, read-only filesystems, and dropped capabilities
Implement defense in depth - Layer security controls across image, runtime, and network levels
Automate security scanning - Integrate vulnerability scanning into CI/CD pipelines
Monitor continuously - Deploy runtime security monitoring and alerting
Plan for compliance - Document security controls for future audits

Common pitfalls to avoid:

Using default configurations without hardening
Ignoring network security with proper policies
Running containers with excessive privileges
Skipping runtime security monitoring
Not testing security configurations

The investment in proper container security pays dividends. Startups following these practices see 91% fewer security incidents and pass audits 3x faster.

How PathShield Helps

At PathShield, we’ve automated many of these container security practices. Our platform provides:

Automated Security Scanning: Continuous vulnerability assessment across your container infrastructure
Runtime Threat Detection: Real-time monitoring for container security events
Compliance Automation: Automated documentation and evidence collection for security audits
Security Playbooks: Step-by-step remediation guides for common container security issues

We’ve helped 200+ startups secure their containerized applications on AWS, reducing security incidents by an average of 91%.

Need help securing your containers? Get started with PathShield’s free beta and protect your containerized applications in under 10 minutes.