Runtime Security

Image hardening controls what goes into an image. Runtime security controls what the container is allowed to do after it starts.

Default posture

Start with this posture for ordinary application containers:

  • Run as a non-root user.

  • Drop Linux capabilities.

  • Prevent privilege escalation.

  • Use a read-only root filesystem.

  • Write only to declared volumes or tmpfs mounts.

  • Keep Docker’s default seccomp profile enabled.

  • Avoid privileged containers.

  • Avoid mounting the Docker socket.

Compose example

services:
  api:
    image: registry.example.com/team/api:latest
    user: "10001:10001"
    read_only: true
    tmpfs:
      - /tmp
    cap_drop:
      - ALL
    security_opt:
      - no-new-privileges:true

Kubernetes example

apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
spec:
  # A Deployment needs a selector that matches the pod template labels.
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      securityContext:
        runAsNonRoot: true
        runAsUser: 10001
        runAsGroup: 10001
        fsGroup: 10001
        # Use the container runtime's default seccomp profile.
        seccompProfile:
          type: RuntimeDefault
      containers:
        - name: api
          image: registry.example.com/team/api:latest
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            capabilities:
              drop:
                - ALL

Capabilities

Linux capabilities split root privileges into smaller permissions. Most application containers need none of them.

Drop all capabilities by default, then add back only what is required and documented.

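# app:local is a placeholder for an image that runs as a non-root user.
# Images that start as root (such as the stock nginx image) usually need
# some capabilities added back before they will start.
docker run --rm \
    --cap-drop ALL \
    --security-opt no-new-privileges:true \
    app:local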
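To check which capabilities actually remain, one option is to read the capability masks that Linux exposes in /proc/<pid>/status. The sketch below is a minimal illustration, not a complete audit: the function name no_caps_remaining is hypothetical, and it only looks at the permitted, effective, and bounding masks.

```shell
# Succeed only if the permitted, effective, and bounding capability masks
# in a /proc/<pid>/status file are all zero.
# With no argument, inspects the current process (Linux only).
no_caps_remaining() {
    awk '
        /^Cap(Prm|Eff|Bnd):/ {
            seen = 1
            if ($2 !~ /^0+$/) bad = 1
        }
        END { exit ((seen && !bad) ? 0 : 1) }
    ' "${1:-/proc/self/status}"
}

# Usage inside a running container:
#   if no_caps_remaining; then echo "no capabilities"; else echo "capabilities remain"; fi
```

A non-zero bounding mask (CapBnd) is reported as "capabilities remain" even when the effective mask is empty, because the bounding set limits what the process could still acquire.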

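If a workload asks for --privileged, treat the request as a trigger for a design review. --privileged grants every capability, exposes host devices to the container, and turns off seccomp and AppArmor confinement, so the container has close to host-level access.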

seccomp

Docker’s default seccomp profile blocks many sensitive system calls while keeping broad application compatibility. Keep it enabled unless there is a specific, reviewed reason to override it.

Use a custom seccomp profile only when the application needs a specific syscall that the default profile blocks, and only when the profile itself can be reviewed.

docker run --rm \
    --security-opt seccomp=/path/to/seccomp/profile.json \
    app:local

Avoid seccomp=unconfined in production.
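A quick way to confirm that a seccomp filter is active is to read the Seccomp field in /proc/<pid>/status, which Linux reports as 0 (disabled), 1 (strict), or 2 (filter). The helper below is a small sketch; the name seccomp_mode is hypothetical, and it shows only that some filter is loaded, not which profile.

```shell
# Print the seccomp mode of a process as a word.
# With no argument, inspects the current process (Linux only).
seccomp_mode() {
    mode=$(awk '/^Seccomp:/ { print $2 }' "${1:-/proc/self/status}")
    case "$mode" in
        0) echo disabled ;;
        1) echo strict ;;
        2) echo filter ;;
        *) echo unknown ;;
    esac
}

# Usage inside a running container:
#   seccomp_mode
```

A container started with Docker's default profile should report filter, while one started with seccomp=unconfined should report disabled.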

AppArmor and SELinux

AppArmor and SELinux are Linux security modules that can enforce host-level access policies around containers. Use the platform’s default container policy where available, then add custom policy only when the workload requires it.

These policies can restrict, for example, which filesystem paths a container may access, what its processes may do, and some kinds of network access.
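To see which of the two modules a host has enabled, one option is to read the kernel's list of active security modules. This sketch assumes securityfs is mounted and that the kernel exposes /sys/kernel/security/lsm as a comma-separated list; the function name active_container_lsm is hypothetical. A module appearing in the list does not by itself prove that the container runtime applies a profile.

```shell
# Print "apparmor", "selinux", or "none" based on the kernel's LSM list.
# With no argument, reads /sys/kernel/security/lsm (Linux only).
active_container_lsm() {
    lsm_list=$(cat "${1:-/sys/kernel/security/lsm}" 2>/dev/null) || return 1
    case ",$lsm_list," in
        *,apparmor,*) echo apparmor ;;
        *,selinux,*)  echo selinux ;;
        *)            echo none ;;
    esac
}

# Usage on the container host:
#   active_container_lsm
```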

User namespaces and rootless Docker

User namespaces map users inside the container to different, unprivileged users on the host, so root in the container is not root on the host. Rootless Docker runs the Docker daemon and containers without root privileges on the host. Both reduce the impact of daemon and runtime vulnerabilities.

They do not remove the need for application-level least privilege. A container should still run as a non-root user (for example, through the Dockerfile USER instruction), drop capabilities, and avoid privileged mounts.
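Whether a process is running under a remapped user namespace can be checked from /proc/<pid>/uid_map, where each line lists an ID inside the namespace, the ID it maps to outside, and the length of the range. The sketch below treats the identity mapping "0 0 4294967295" (what the initial user namespace reports) as "not remapped"; the function name uid_is_remapped is hypothetical.

```shell
# Succeed if the uid_map file shows a mapping other than the identity map.
# With no argument, inspects the current process (Linux only).
uid_is_remapped() {
    awk '
        $1 == 0 && $2 == 0 && $3 == 4294967295 { identity = 1 }
        END { exit ((NR > 0 && !identity) ? 0 : 1) }
    ' "${1:-/proc/self/uid_map}"
}

# Usage inside a running container:
#   if uid_is_remapped; then echo "user namespace remapping active"; fi
```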

Docker socket

Mounting the Docker socket gives the container control over the Docker daemon.

docker run -v /var/run/docker.sock:/var/run/docker.sock tool:local

Treat this as host-level access. Prefer purpose-built APIs, narrow CI permissions, or remote builders instead of giving a workload the daemon socket.
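A review or CI step can flag configuration files that mention the socket. The sketch below is a deliberately crude text search, not a YAML parser, so it can over-report (for example on a commented-out line); the function name mentions_docker_socket and the file name in the usage line are hypothetical.

```shell
# Succeed if the given file mentions the Docker socket anywhere.
mentions_docker_socket() {
    grep -q 'docker\.sock' "$1"
}

# Usage:
#   if mentions_docker_socket compose.yaml; then echo "Docker socket mount: review required"; fi
```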

Decision checklist

Before a container runs in production, answer these:

  • What UID and GID does it run as?

  • Which directories are writable?

  • Which Linux capabilities remain?

  • Does it need privilege escalation?

  • Does it use Docker’s default seccomp profile?

  • Does it mount the Docker socket?

  • Does it need host networking, host PID, or privileged mode?

  • How will operators debug it without adding tools to the runtime image?
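Several of these questions can be answered from a container's recorded configuration with docker inspect and a Go template. The sketch below only builds the template; the function name posture_template and the container name api in the usage line are placeholders, and the output still needs a human to judge it.

```shell
# Print a Go template for `docker inspect`.
# Each line answers one question from the checklist.
posture_template() {
    cat <<'EOF'
user={{.Config.User}}
read_only={{.HostConfig.ReadonlyRootfs}}
tmpfs={{json .HostConfig.Tmpfs}}
mounts={{json .Mounts}}
cap_add={{.HostConfig.CapAdd}}
cap_drop={{.HostConfig.CapDrop}}
security_opt={{.HostConfig.SecurityOpt}}
privileged={{.HostConfig.Privileged}}
network_mode={{.HostConfig.NetworkMode}}
pid_mode={{.HostConfig.PidMode}}
EOF
}

# Usage (requires Docker and an existing container named "api"):
#   docker inspect --format "$(posture_template)" api
```

An empty user value means the container runs as the image default, which is root unless the image sets USER. A Docker socket mount, if present, shows up in the mounts line.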

References