
Kubernetes Autoscaler Threshold Calculator

Calculate the right HPA (Horizontal Pod Autoscaler) CPU and memory thresholds for your Kubernetes workloads. Get absolute target values in millicores and MiB.

Kubernetes HPA Configuration Guide

The Horizontal Pod Autoscaler (HPA) automatically scales Deployment replicas based on CPU, memory, or custom metrics. Correct threshold configuration is critical — wrong values cause either constant scale-flapping or under-utilized pods.

HPA v2 Spec Example

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: AverageValue
        averageValue: 350m   # 70% of 500m limit
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
```

Scaling Algorithm

desired_replicas = ceil(current_replicas × (current_metric / target_metric))

If 3 pods average 700m CPU with a 350m target: ceil(3 × (700/350)) = 6 replicas.
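The formula above can be checked with a few lines of Python. This is a minimal sketch, not the controller's implementation: the function name `desired_replicas` and its parameters are illustrative, the `tolerance` of 0.1 is assumed to mirror the controller's default 10% tolerance band, and the clamping is assumed to follow `minReplicas` and `maxReplicas` from the spec.

```python
import math

def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=None, tolerance=0.1):
    """Sketch of the HPA replica formula (illustrative, not the real controller)."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    ratio = current_metric / target_metric
    # Assumption: a ratio within the tolerance band leaves the replica count unchanged.
    if abs(ratio - 1.0) <= tolerance:
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    # Clamp to the configured bounds.
    desired = max(desired, min_replicas)
    if max_replicas is not None:
        desired = min(desired, max_replicas)
    return desired

# The worked example from the text: 3 pods averaging 700m against a 350m target.
print(desired_replicas(3, 700, 350, min_replicas=2, max_replicas=10))
```

With `maxReplicas: 10`, a ratio that would call for more than 10 pods is capped at 10, which is the condition described under "Max replicas too low".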

Common Mistakes

1. Target on limits, not requests: HPA `type: Utilization` targets a percentage of _requests_, not limits. If your requests are lower than your limits, you'll scale out sooner than expected.
2. No stabilization window: Scale-down uses a 300-second stabilization window by default. If it has been shortened or set to 0, scale-down becomes immediate; configure a window to prevent yo-yo scaling.
3. CPU limits too low: If the CPU limit sits at or below the HPA target, throttling caps usage before it can exceed the target, so HPA never scales out and the app just runs slow instead.
4. Max replicas too low: If maxReplicas is too small, HPA can't scale to meet demand. Monitor `kube_horizontalpodautoscaler_status_current_replicas == kube_horizontalpodautoscaler_spec_max_replicas` for this condition.
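The first mistake is easier to see with numbers. The sketch below assumes a hypothetical pod with a 250m CPU request and a 500m limit; the helper names `utilization_percent` and `scale_out_threshold` are illustrative and not part of any Kubernetes API.

```python
def utilization_percent(usage_millicores, request_millicores):
    """CPU utilization as a Utilization target sees it: usage relative to the request."""
    return 100.0 * usage_millicores / request_millicores

def scale_out_threshold(request_millicores, target_percent):
    """Absolute usage (millicores) at which a Utilization target is reached."""
    return request_millicores * target_percent / 100.0

request_m, limit_m = 250, 500  # hypothetical pod resources

# Threshold measured against the request, which is what HPA actually uses.
print(scale_out_threshold(request_m, 70))
# Threshold you would expect if the limit were the base (it is not).
print(scale_out_threshold(limit_m, 70))
# 350m of usage expressed as utilization of the request.
print(utilization_percent(350, request_m))
```

Under these assumptions a 70% target is reached at 175m of usage, half of what a limit-based reading would suggest, and 350m of usage already counts as 140% utilization.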

Key Terms


kubeadm

A tool for bootstrapping Kubernetes clusters. It automates the setup of control plane components and joining worker nodes, following Kubernetes best practices.

etcd

A distributed key-value store used by Kubernetes to store all cluster state and configuration. etcd is the single source of truth for the entire cluster.

cert-manager

A Kubernetes controller for automating TLS certificate management. cert-manager can issue certificates from Let's Encrypt, Vault, or internal CAs, and automatically renews them.

Helm

A package manager for Kubernetes. Helm charts bundle Kubernetes manifests into reusable packages with configurable values, versioned and published to chart repositories.

Frequently Asked Questions

What is the right CPU target for HPA?

70% is a common production default. It leaves roughly 30% headroom to absorb load while new pods start, which helps avoid constant flapping. If your app has very spiky load, use 60%. If load is predictable and gradual, 80% is acceptable.

Why does HPA use absolute millicores instead of percentage?

HPA v2 supports both: `type: Utilization` (percentage of request, not limit) and `type: AverageValue` (absolute millicores). The calculator outputs absolute values because they're more predictable — `type: Utilization` depends on your requests being set accurately.

Why is memory-based autoscaling less reliable than CPU-based?

Memory is not compressible: a pod using 400 MiB of a 512 MiB limit isn't necessarily under load, it may just be holding memory. Many apps never release memory to the OS even when idle, so usage can stay above the target after traffic drops. HPA then scales out unnecessarily and may not scale back in. Use memory-based HPA only for apps with predictable memory growth patterns.

How do I prevent HPA scale flapping?

Set a stabilization window: `behavior.scaleDown.stabilizationWindowSeconds: 300` (5 minutes). This prevents rapid scale-in after a brief traffic drop. Also check that the kube-controller-manager flags `--horizontal-pod-autoscaler-sync-period` (default 15s) and `--horizontal-pod-autoscaler-downscale-stabilization` (default 5m) are appropriate for your cluster.
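To see why the window damps flapping, here is a minimal sketch of scale-down stabilization. It assumes the autoscaler keeps its recent replica recommendations and acts on the highest one still inside the window; the class `ScaleDownStabilizer` and its method names are hypothetical, and the real controller also applies scaling policies and a tolerance that are left out here.

```python
from collections import deque

class ScaleDownStabilizer:
    """Illustrative model of a scale-down stabilization window."""

    def __init__(self, window_seconds=300):
        self.window_seconds = window_seconds
        self._history = deque()  # (timestamp_seconds, recommended_replicas)

    def stabilize(self, now, recommended):
        """Record a recommendation and return the highest one within the window."""
        self._history.append((now, recommended))
        # Drop recommendations that have aged out of the window.
        while self._history and self._history[0][0] < now - self.window_seconds:
            self._history.popleft()
        return max(replicas for _, replicas in self._history)

stabilizer = ScaleDownStabilizer(window_seconds=300)
# A brief traffic drop: the raw recommendation falls from 6 to 3 after 15 seconds.
for t, rec in [(0, 6), (15, 3), (30, 3), (300, 3), (315, 3)]:
    print(t, rec, stabilizer.stabilize(t, rec))
```

In this model the replica count stays at 6 until the last recommendation of 6 has left the window, so a short dip in traffic does not trigger an immediate scale-in.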

