Calculate stream capacity for your edge AI hardware

Plan multi-stream inference capacity for edge AI deployments. Select hardware, model architecture, and resolution to estimate per-stream FPS, maximum camera streams, and total system throughput — including pipeline overhead from capture, pre-processing, and NMS decode. Supports NVIDIA Jetson (5 modules), Google Coral TPU, and Hailo-8 / 8L.

Configuration
Task Type
Security / Surveillance
Robotics
Traffic / ITS
Retail / Analytics
Custom
Platform Family
Platform
// Select a platform first
Runtime
Model Family
Model Variant
// Select a model family first
Precision
Resolution
224×224
320×320
416×416
640×640
1280×720
1920×1080
Target FPS
30 fps
15 fps
10 fps
5 fps
Streams
1
2
4
8
16
32
Advanced
Pipeline Overhead
Include (4ms pre+post)
Inference only
Batch Size
1
2
4
8
// Select hardware, runtime, and model to continue
Stream Capacity
Max streams @ 30fps
Max streams @ 15fps
Max streams @ 10fps
Baseline FPS (1 stream)
Effective FPS (with overhead)
Feasibility Assessment
Streams
Required total FPS
Available FPS
Accelerator utilization
Planning Notes
Configure inputs to see planning recommendations.
Assumptions
Configure the system to see detailed assumptions.
// RELATED TOOLS
→ Tool 08: Inference Throughput Estimator
→ Tool 09: Memory Estimator

Edge AI Stream Capacity — Planning Guide

How this stream capacity calculator works

The calculator estimates multi-stream inference capacity by dividing the hardware's effective FPS — after pipeline overhead — by the target FPS per stream. Pipeline overhead accounts for image capture, resize/normalize preprocessing, and NMS postprocessing per frame (typically 3–8ms in real deployments). Select your hardware, model, resolution, and target FPS to see how many camera streams your deployment can sustain simultaneously.
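The capacity rule above can be sketched as a short function. The FPS figures here are hypothetical, chosen only to illustrate the division:

```python
def max_streams(effective_fps: float, target_fps: float) -> int:
    """Streams the accelerator can sustain at the target per-stream rate.

    Capacity is the hardware's effective FPS (after pipeline overhead)
    divided by the FPS each stream demands, rounded down.
    """
    return int(effective_fps // target_fps)

# Hypothetical accelerator delivering 118 effective FPS:
for target in (30, 15, 10):
    print(f"{target} fps target -> {max_streams(118, target)} streams")
# 30 fps target -> 3 streams
# 15 fps target -> 7 streams
# 10 fps target -> 11 streams
```

The floor division matters: a partial stream is not a usable stream, so 118 FPS buys three 30 fps cameras, not four.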

What affects multi-stream inference capacity

Edge AI stream capacity is determined by several compounding factors: model complexity (parameter count and layer depth), input resolution (FPS scales roughly inversely with pixel count), inference runtime (TensorRT INT8 vs FP16 vs ONNX), GPU or accelerator utilization, and pipeline preprocessing cost. On Jetson Orin modules, unified memory shared between CPU and GPU further constrains how many concurrent AI video analytics streams can run before memory becomes the bottleneck rather than compute.

Related tools

Inference Throughput Estimator — estimate single-stream FPS and latency by model and hardware.
Memory Estimator — calculate VRAM and RAM requirements before sizing stream count.
Module Power Calculator — size PSU and thermal budget for multi-stream deployments.
Full Deployment Planner — combine stream capacity, memory, and power into an end-to-end edge AI BOM.

FAQ
What does "feasible" mean?

Feasible means the hardware's effective FPS (after pipeline overhead) is ≥ (desired streams × target fps). A 4-stream @ 30fps requirement needs 120 fps minimum from the inference engine.
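The feasibility test reduces to one comparison, shown here with the 4-stream example from the paragraph above (the 118 and 125 FPS figures are illustrative):

```python
def is_feasible(effective_fps: float, streams: int, target_fps: float) -> bool:
    """Feasible when effective FPS covers the total demand: streams x target."""
    return effective_fps >= streams * target_fps

# 4 streams at 30 fps require 120 fps total.
print(is_feasible(118, 4, 30))  # False: 118 < 120
print(is_feasible(125, 4, 30))  # True: 125 >= 120
```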

What is pipeline overhead?

In a real deployment, each frame requires pre-processing (resize, normalize), inference, and post-processing (NMS decode). The calculator's default 4 ms of overhead per frame covers the pre- and post-processing steps; real deployments often see 3–8 ms. Disable it in the Advanced panel to compare pure inference throughput.
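One way to model this, assuming the overhead runs serialized with inference rather than pipelined (a conservative simplification), is to add the overhead to the per-frame wall time. The 6 ms inference latency below is a made-up example:

```python
def effective_fps(inference_ms: float, overhead_ms: float = 4.0) -> float:
    """Per-frame wall time = inference + pre/post overhead, both in ms.

    Assumes the overhead is serialized with inference (not overlapped),
    which gives a conservative capacity estimate.
    """
    return 1000.0 / (inference_ms + overhead_ms)

# Hypothetical model with 6 ms inference: ~166.7 FPS in isolation,
# but 100 FPS once the 4 ms pipeline overhead is included.
print(effective_fps(6.0))               # 100.0
print(effective_fps(6.0, overhead_ms=0))  # ~166.7
```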

Why does resolution matter so much?

FPS scales roughly inversely with pixel count. Running at 1920×1080 instead of 640×640 means ~5× more pixels per frame (2,073,600 vs 409,600), so FPS drops by roughly the same factor for pixel-bound operations. TensorRT with INT8 is less sensitive, but the effect is still significant.
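The inverse-with-pixels heuristic can be written out directly; the 100 FPS baseline is a placeholder value, and this is a rough planning estimate rather than a measured scaling law:

```python
def scaled_fps(baseline_fps: float,
               base_res: tuple[int, int],
               new_res: tuple[int, int]) -> float:
    """Project FPS at a new resolution, assuming FPS ~ 1 / pixel count."""
    base_px = base_res[0] * base_res[1]
    new_px = new_res[0] * new_res[1]
    return baseline_fps * base_px / new_px

# Hypothetical 100 FPS at 640x640 projects to ~19.8 FPS at 1920x1080
# under this pixel-bound assumption (a ~5x pixel increase).
print(round(scaled_fps(100, (640, 640), (1920, 1080)), 1))
```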

How accurate are stream counts?

Benchmark-backed estimates are accurate to within about ±15%. Interpolated estimates are rougher, on the order of ±35%. Theoretical estimates are for planning only (±30–50%). Resolution scaling adds a further ±25% of uncertainty on top. Always validate on device with representative workloads.