Resources and Performance

Containers share host CPU, memory, disk, and network resources. Without limits, a container can consume as much as the host scheduler allows. Production containers should have measured resource requests and limits.

Memory limits

Set a hard memory limit for workloads that can spike or leak.

docker run --rm \
    --memory 512m \
    --memory-swap 512m \
    api:local

When --memory and --memory-swap are equal, the container cannot use swap. This is useful for latency-sensitive workloads where swapping is worse than a clean failure.
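The arithmetic behind the two flags can be sketched as a small helper (illustrative code, not part of Docker): --memory-swap is the total of memory plus swap, so the usable swap is the difference between the two values.

```python
def allowed_swap_bytes(memory: int, memory_swap: int) -> int:
    """Swap available to the container: --memory-swap is the total of
    memory plus swap, so the usable swap is the difference."""
    if memory_swap == -1:
        raise ValueError("swap is unlimited when --memory-swap is -1")
    return memory_swap - memory

MIB = 1024 * 1024
# --memory 512m --memory-swap 512m: no swap at all.
assert allowed_swap_bytes(512 * MIB, 512 * MIB) == 0
# --memory 512m --memory-swap 1g: up to 512 MiB of swap.
assert allowed_swap_bytes(512 * MIB, 1024 * MIB) == 512 * MIB
```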

Avoid disabling the OOM killer (--oom-kill-disable) unless a hard memory limit is also set. Without a limit, the kernel cannot reclaim memory by killing the container, and under host memory pressure it may kill important host processes instead.

CPU limits

Limit CPU with --cpus.

docker run --rm --cpus 1.5 api:local

For relative scheduling weight under contention, use CPU shares (the default weight is 1024).

docker run --rm --cpu-shares 512 worker:local
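Shares only matter when containers compete for CPU: each busy container receives roughly its weight divided by the sum of all weights. A rough model of that split (illustrative helper, assuming the default weight of 1024 for unconfigured containers):

```python
def cpu_fraction(shares: dict[str, int], name: str) -> float:
    """Approximate CPU fraction a container receives when every
    container is busy: its shares over the total shares."""
    return shares[name] / sum(shares.values())

# A 512-share worker next to a default (1024-share) container
# gets about one third of the CPU under full contention.
weights = {"worker": 512, "api": 1024}
assert round(cpu_fraction(weights, "worker"), 2) == 0.33
```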

Compose resources

Compose can describe resource expectations. The exact enforcement depends on the runtime mode and platform.

services:
  api:
    image: api:local
    deploy:
      resources:
        limits:
          cpus: "1.0"
          memory: 512M
        reservations:
          cpus: "0.25"
          memory: 128M
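Compose expresses memory as byte-value strings such as 512M. A small parser makes the values comparable in scripts and tests (illustrative helper; it assumes binary, 1024-based multiples for the k/m/g suffixes):

```python
def parse_memory(value: str) -> int:
    """Convert a Compose-style byte value such as '512M' or '1G'
    into bytes, assuming binary (1024-based) multiples."""
    units = {"b": 1, "k": 1024, "m": 1024**2, "g": 1024**3}
    suffix = value[-1].lower()
    if suffix in units:
        return int(value[:-1]) * units[suffix]
    return int(value)

assert parse_memory("512M") == 512 * 1024 * 1024
# Sanity check: the reservation should sit below the limit.
assert parse_memory("128M") < parse_memory("512M")
```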

Kubernetes resources

Kubernetes separates requests from limits. Requests affect scheduling. Limits cap usage.

resources:
  requests:
    cpu: 250m
    memory: 128Mi
  limits:
    cpu: 1000m
    memory: 512Mi
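Kubernetes quantities use their own notation: 250m is 250 millicores (a quarter of a core), and Mi/Gi are binary multiples. A sketch of the conversions, plus the invariant that requests must not exceed limits (illustrative helpers, not the Kubernetes parser):

```python
def parse_cpu(q: str) -> float:
    """Kubernetes CPU quantity: '250m' is 0.25 cores, '1' is one core."""
    return int(q[:-1]) / 1000 if q.endswith("m") else float(q)

def parse_mem(q: str) -> int:
    """Kubernetes memory quantity with a binary suffix such as Mi or Gi."""
    units = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3}
    for suffix, mult in units.items():
        if q.endswith(suffix):
            return int(q[: -len(suffix)]) * mult
    return int(q)

# Requests affect scheduling, limits cap usage; requests <= limits.
assert parse_cpu("250m") <= parse_cpu("1000m")
assert parse_mem("128Mi") <= parse_mem("512Mi")
```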

Measure first

Use runtime metrics before guessing limits.

docker stats
docker stats api

Inside the container, remember that some tools read /proc and report host-level totals rather than the container's cgroup limits. Validate behavior with load tests and runtime metrics from the platform.
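When scripting around docker stats, the memory column prints usage as a "used / limit" pair. A sketch of turning that into a utilization fraction (the exact column format is an assumption here; check your Docker version's output):

```python
def mem_utilization(usage: str) -> float:
    """Parse a 'used / limit' pair such as '256MiB / 512MiB', as printed
    in the MEM USAGE / LIMIT column of docker stats, into a fraction."""
    units = {"B": 1, "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3}

    def to_bytes(s: str) -> float:
        # Check longer suffixes first so 'MiB' is not matched as 'B'.
        for suffix, mult in sorted(units.items(), key=lambda u: -len(u[0])):
            if s.endswith(suffix):
                return float(s[: -len(suffix)]) * mult
        return float(s)

    used, limit = (to_bytes(part.strip()) for part in usage.split("/"))
    return used / limit

assert mem_utilization("256MiB / 512MiB") == 0.5
```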

Image performance

Image size affects pull time, deployment speed, storage use, and vulnerability noise.

Improve image performance with:

  • A tight .dockerignore file.

  • Multi-stage builds.

  • Cache mounts for dependency managers.

  • Smaller runtime bases.

  • Avoiding package-manager caches in final layers.

  • Grouping commands so cleanup happens in the same layer.

  • Pushing multi-platform images only for platforms that are actually deployed.
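Several of these points combine naturally in one build file. A sketch (illustrative image and file names; the cache-mount syntax requires BuildKit):

```dockerfile
# syntax=docker/dockerfile:1
# Build stage: dependencies resolved at build time, with a cache mount
# so the package manager's cache persists across builds without
# landing in any image layer.
FROM python:3.12-slim AS build
WORKDIR /app
COPY requirements.txt .
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install --prefix=/install -r requirements.txt

# Runtime stage: smaller surface, no build tools, no package caches.
FROM python:3.12-slim
COPY --from=build /install /usr/local
WORKDIR /app
COPY . .
CMD ["python", "-m", "app"]
```

A tight .dockerignore keeps the build context small, which speeds up the COPY steps as well.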

Startup performance

Slow container startup usually comes from one of these:

  • The image pull is large.

  • The entrypoint runs migrations or dependency installation.

  • The application waits on an unavailable dependency.

  • Health checks have long grace periods.

  • The process performs cold compilation or model loading.

Keep containers ready to run. Install dependencies during the image build, not at container startup.
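When waiting on a dependency is unavoidable, bound the wait so an unavailable service produces a fast, clean failure instead of a hung container. A minimal sketch (host and port are illustrative):

```python
import socket
import time


def wait_for(host: str, port: int, timeout: float = 30.0) -> bool:
    """Poll a TCP dependency until it accepts connections or the
    deadline passes. Returns False on timeout rather than hanging."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(0.5)
    return False
```

An entrypoint can call this once, exit nonzero on failure, and let the platform's restart policy handle retries.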

Cost control

Resource limits are also cost controls.

  • Right-size memory to avoid over-provisioned nodes.

  • Use ARM64 images when the workload performs well on cheaper ARM capacity.

  • Keep build caches separate from release images and expire them.

  • Use native builders for expensive cross-architecture compilation.

  • Remove stale volumes, images, and caches on shared hosts.
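On a shared host, the last point might look like the following (destructive commands; review what they will remove before running them):

```shell
# Remove stopped containers, unused networks, dangling images,
# and unused build cache.
docker system prune --force
# Expire build cache entries older than 24 hours.
docker builder prune --force --filter "until=24h"
# Remove volumes no longer referenced by any container.
docker volume prune --force
```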

Practical checklist

  • Load test before setting production limits.

  • Set memory limits for services.

  • Use CPU limits or scheduling policy for noisy workloads.

  • Keep dependency installation out of startup.

  • Keep final images small.

  • Watch pull time, startup time, memory high-water marks, and restart counts.

References