Self-Hosted n8n on AWS ECS Fargate: Load Test Results

İlker Ulusoy · 2026-02-28 · 10 min read

How much traffic can a $35/month n8n setup handle? We deployed n8n on AWS ECS Fargate with the cheapest possible configuration and ran webhook load tests to find out.

The Setup

| Component | Spec | Monthly Cost |
|---|---|---|
| ECS Fargate | 0.25 vCPU, 512MB RAM, ARM64 | ~$9 |
| RDS PostgreSQL 17 | db.t4g.micro, 20GB GP3 | ~$12 |
| Application Load Balancer | internet-facing, HTTPS | ~$8 |
| Other (ECR, Secrets, Logs) | minimal | ~$1 |
| Total | | ~$31-35 |

Architecture: Single Fargate task in public subnet, ALB with ACM certificate, RDS with SSL, auto-scaling 1-3 tasks at 70% CPU/Memory threshold.

Test Tool

hey — HTTP load generator. All tests hit a webhook endpoint:

```
POST https://workflow.example.com/webhook/...
Content-Type: application/json

{"test": true}
```
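For readers who want to fire a single probe without installing hey, here is a minimal Python sketch of the same request. The URL is a placeholder, exactly as in the article:

```python
import json
import time
import urllib.request

WEBHOOK_URL = "https://workflow.example.com/webhook/..."  # placeholder

def build_request(url: str) -> urllib.request.Request:
    # Same payload the hey tests send: {"test": true} as JSON.
    return urllib.request.Request(
        url,
        data=json.dumps({"test": True}).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def timed_call(url: str) -> float:
    # One webhook round-trip; returns wall-clock latency in seconds.
    start = time.perf_counter()
    with urllib.request.urlopen(build_request(url)) as resp:
        resp.read()
    return time.perf_counter() - start
```

Calling `timed_call(WEBHOOK_URL)` in a loop gives you the same raw latency samples that hey aggregates into its summary.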

Test 1: Smoke Test (1 concurrent)

```
hey -n 10 -c 1
```

| Metric | Value |
|---|---|
| Requests | 10 |
| Avg Latency | 185ms |
| p50 | 145ms |
| p90 | 550ms |
| Fastest | 143ms |
| Slowest | 550ms |
| Error Rate | 0% |

Verdict: Baseline response time ~145ms. The 550ms outlier is likely the first cold request hitting SSL handshake overhead.

Test 2: Light Load (5 concurrent, 30 seconds)

```
hey -n 100 -c 5 -z 30s
```

| Metric | Value |
|---|---|
| Total Requests | 367 |
| Requests/sec | 12.2 |
| Avg Latency | 410ms |
| p50 | 315ms |
| p90 | 501ms |
| p95 | 1.32s |
| p99 | 1.71s |
| Error Rate | 0% |

Verdict: Handles 12 req/s with zero errors. p50 holds at 315ms, which is excellent for a 0.25 vCPU instance. The p95 spike to 1.3s suggests occasional garbage-collection pauses or Node.js event-loop contention, expected behavior for this tier.

Latency Distribution

```
Light Load Distribution
  0.15s - 0.31s  ████████████████████████████████████████  167 (46%)
  0.31s - 0.46s  ██████████                                 37 (10%)
  0.62s - 0.78s  █                                           5 (1%)
  0.78s - 1.71s  ██                                         22 (6%)
```

83% of requests complete under 460ms. The long tail (6% above 780ms) is where n8n's workflow execution engine is likely running database migrations or internal bookkeeping.
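The buckets above come straight from hey's output, but the same view can be rebuilt from any list of latency samples. A rough sketch using equal-width bins (an approximation; hey picks its own bucket boundaries):

```python
def ascii_histogram(samples, bins=5, width=40):
    # Equal-width latency buckets rendered like hey's distribution output.
    lo, hi = min(samples), max(samples)
    step = (hi - lo) / bins or 1.0  # avoid zero-width bins
    counts = [0] * bins
    for s in samples:
        idx = min(int((s - lo) / step), bins - 1)
        counts[idx] += 1
    peak = max(counts)
    lines = []
    for i, count in enumerate(counts):
        bar = "█" * round(width * count / peak)
        share = 100 * count / len(samples)
        lines.append(
            f"{lo + i * step:5.2f}s - {lo + (i + 1) * step:5.2f}s  "
            f"{bar:<{width}} {count} ({share:.0f}%)"
        )
    return "\n".join(lines)
```

Feed it the samples collected from any load run to spot long tails and bimodal patterns at a glance.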

Test 3: Moderate Load (20 concurrent, 60 seconds)

```
hey -n 500 -c 20 -z 60s
```

| Metric | Value |
|---|---|
| Total Requests | 865 |
| Requests/sec | 14.0 |
| Avg Latency | 1.43s |
| p50 | 1.19s |
| p90 | 2.39s |
| p95 | 2.46s |
| p99 | 2.66s |
| Error Rate | 0% |

Verdict: Still zero errors at 20 concurrent users. Throughput increased to 14 req/s but latency jumped significantly — avg went from 410ms to 1.43s. The 0.25 vCPU is now clearly saturated, with a bimodal distribution showing two clusters: one around 1.1-1.3s (56% of requests) and another around 2.3-2.5s (19% of requests).

Latency Distribution

```
Moderate Load Distribution (0.25 vCPU)
  0.36s - 0.85s  ████                                        5 (1%)
  0.85s - 1.33s  ██████████████████████████████████████████ 612 (71%)
  1.33s - 1.81s  ████                                       50 (6%)
  1.81s - 2.29s  ██                                         32 (4%)
  2.29s - 2.77s  █████████████                             166 (19%)
```

The bimodal pattern suggests the Node.js event loop is queuing requests when all worker threads are busy. The second cluster (2.3-2.5s) represents requests that had to wait a full cycle before processing.
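The saturation reading can be sanity-checked with Little's Law, L = λW: the average number of in-flight requests equals throughput times average latency. A quick sketch using the article's measured numbers:

```python
def inflight(throughput_rps: float, avg_latency_s: float) -> float:
    # Little's Law: L = lambda * W, the average number of requests
    # in flight at a given throughput and latency.
    return throughput_rps * avg_latency_s

# Measured numbers from this article's moderate-load tests:
before = inflight(14.0, 1.43)    # 0.25 vCPU
after = inflight(35.1, 0.568)    # 0.5 vCPU
```

Both work out to roughly 20, matching hey's `-c 20` setting: a closed-loop generator keeps concurrency pinned, so a saturated server trades latency for throughput instead of shedding load, which is consistent with the 0% error rate.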

Test 4: After Upgrade (20 concurrent, 0.5 vCPU)

After upgrading from 0.25 vCPU / 512MB to 0.5 vCPU / 1GB, we reran the same moderate load test.

```
hey -n 500 -c 20 -z 60s
```

| Metric | Value |
|---|---|
| Total Requests | 2117 |
| Requests/sec | 35.1 |
| Avg Latency | 568ms |
| p50 | 514ms |
| p90 | 705ms |
| p95 | 1.07s |
| p99 | 1.32s |
| Error Rate | 0% |

Doubling the CPU delivered a 2.5x throughput increase and 60% latency reduction (1.43s to 568ms avg). The bimodal distribution is gone — 83% of requests now complete under 622ms.

Latency Distribution

```
Moderate Load Distribution (0.5 vCPU)
  0.29s - 0.46s  ████████████████                               485 (23%)
  0.46s - 0.62s  ████████████████████████████████████████████  1274 (60%)
  0.62s - 0.79s  ██████                                         196 (9%)
  0.79s - 0.95s  █                                                27 (1%)
  0.95s - 1.28s  ███                                              99 (5%)
  1.28s - 1.94s  █                                                35 (2%)
```

The distribution is now unimodal — no more queuing effects. The Node.js event loop has enough CPU headroom to process all requests without backing up.

Before vs After Upgrade

0.25 vCPU vs 0.5 vCPU at 20 Concurrent Users

| Metric | 0.25 vCPU | 0.5 vCPU | Improvement |
|---|---|---|---|
| Requests/sec | 14.0 | 35.1 | +150% |
| Total Requests (60s) | 865 | 2117 | +145% |
| Avg Latency | 1.43s | 568ms | -60% |
| p50 | 1.19s | 514ms | -57% |
| p90 | 2.39s | 705ms | -70% |
| p99 | 2.66s | 1.32s | -50% |
| Error Rate | 0% | 0% | same |
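The improvement column is plain percentage change. A quick check of the arithmetic (the throughput gain computes to +150.7%, which the table rounds to +150%):

```python
def pct_change(before: float, after: float) -> float:
    # Signed percentage change from `before` to `after`.
    return (after - before) / before * 100.0

throughput_gain = pct_change(14.0, 35.1)   # ~ +150.7%
latency_drop = pct_change(1.43, 0.568)     # ~ -60.3%
```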

Key Findings

Capacity by Tier

| Concurrent Users | Throughput | Avg Latency | p90 | p99 | Status |
|---|---|---|---|---|---|
| 1 (0.25 vCPU) | 5.4 req/s | 185ms | 550ms | n/a | Comfortable |
| 5 (0.25 vCPU) | 12.2 req/s | 410ms | 501ms | 1.71s | Comfortable |
| 20 (0.25 vCPU) | 14.0 req/s | 1.43s | 2.39s | 2.66s | CPU saturated |
| 20 (0.5 vCPU) | 35.1 req/s | 568ms | 705ms | 1.32s | Comfortable |

Auto-Scaling Behavior

The ECS service is configured to scale from 1 to 3 tasks based on CPU utilization > 70% and Memory utilization > 70%.

With the 0.5 vCPU tier and 3 tasks running, theoretical capacity reaches ~105 req/s sustained.
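The scaling policy described above maps to a few lines of AWS CDK (the stack in this article is built with CDK). A sketch in CDK's Python flavor; `service` stands in for the already-defined Fargate service, so this is a fragment, not a runnable stack:

```python
# Sketch only: assumes `service` is an existing ecs.FargateService
# (e.g. taken from an ApplicationLoadBalancedFargateService pattern).
scaling = service.auto_scale_task_count(
    min_capacity=1,
    max_capacity=3,
)
scaling.scale_on_cpu_utilization(
    "CpuScaling",
    target_utilization_percent=70,
)
scaling.scale_on_memory_utilization(
    "MemoryScaling",
    target_utilization_percent=70,
)
```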

Queue Mode for Even Better Performance

These tests were run with n8n in its default single-process mode. If you enable queue mode, n8n separates webhook handling from workflow execution using a Redis-backed queue. A main instance handles the API and webhook ingestion while dedicated worker instances process executions asynchronously. This architecture would significantly improve throughput and latency under load — the webhook response returns immediately after queuing, and workers can scale independently. For production deployments expecting sustained traffic, queue mode is the recommended configuration.
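A minimal sketch of the queue-mode settings, based on n8n's documented environment variables (the Redis hostname is a placeholder):

```shell
# Main instance: keeps serving the UI, API and webhook ingestion
EXECUTIONS_MODE=queue
QUEUE_BULL_REDIS_HOST=redis.internal.example.com   # placeholder
QUEUE_BULL_REDIS_PORT=6379

# Workers: same image and env vars, but started with:
#   n8n worker --concurrency=10
```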

Cost vs Performance

Measured Cost-Performance Tiers

| Monthly Cost | Config | Capacity | Latency (p90) |
|---|---|---|---|
| ~$35 | 1x 0.25 vCPU, 512MB | 14 req/s (measured) | 2.39s at 20 concurrent |
| ~$40 | 1x 0.5 vCPU, 1GB | 35 req/s (measured) | 705ms at 20 concurrent |
| ~$40-55 | 1-3x 0.5 vCPU (auto-scale) | ~105 req/s (projected) | <1s |

The 0.5 vCPU tier costs only ~$5 more per month but delivers 2.5x the throughput with 70% lower p90 latency. This is the best value tier for self-hosted n8n.

When to Scale Up

  • p95 latency consistently above 2 seconds
  • Error rate above 1%
  • ECS auto-scaling frequently at max (3 tasks)
  • RDS CPU credits depleting (check CloudWatch CPUCreditBalance)
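The RDS credit check in the last bullet can be scripted. A sketch that builds the CloudWatch query parameters; the instance identifier `n8n-db` is hypothetical, and the resulting dict is meant to be passed to `boto3.client("cloudwatch").get_metric_statistics(**params)`:

```python
from datetime import datetime, timedelta, timezone

def cpu_credit_query(db_instance_id: str) -> dict:
    # Parameters for the CloudWatch GetMetricStatistics API call
    # that fetches the RDS instance's CPUCreditBalance metric.
    now = datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/RDS",
        "MetricName": "CPUCreditBalance",
        "Dimensions": [
            {"Name": "DBInstanceIdentifier", "Value": db_instance_id}
        ],
        "StartTime": now - timedelta(hours=3),
        "EndTime": now,
        "Period": 300,              # 5-minute datapoints
        "Statistics": ["Average"],
    }

params = cpu_credit_query("n8n-db")  # "n8n-db" is a hypothetical identifier
```

A steadily declining average is the signal to move off the burstable db.t4g.micro tier.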

Conclusion

A $40/month n8n setup on ECS Fargate (0.5 vCPU, 1GB) handles 35 req/s sustained with sub-700ms p90 latency at 20 concurrent users — and zero errors. Upgrading from the minimal $35 tier (0.25 vCPU) to 0.5 vCPU delivered a 2.5x throughput increase for just $5/month more.

The key insight: this setup never drops requests. It queues them and responds slower under extreme load, but maintains 0% error rate across all test levels. With auto-scaling to 3 tasks, burst capacity reaches ~105 req/s.

For teams processing fewer than 2,000 webhook calls per minute, the 0.5 vCPU tier is the sweet spot between cost and reliability.

  • 35 req/s sustained throughput (0.5 vCPU, 20 concurrent)
  • 0% error rate across all test levels
  • 705ms p90 latency at 20 concurrent users
  • $40/mo total cost (ECS + RDS + ALB)

Stack: n8n 2.9.4 | ECS Fargate (ARM64/Graviton, 0.5 vCPU, 1GB) | RDS PostgreSQL 17 | ALB + ACM | AWS CDK

