Estimate inference throughput for your edge AI workload

Estimate real-world inference throughput for vision models on edge AI hardware. Configure runtime, precision, batch size, and concurrent streams to compare FPS, per-image latency, and maximum stream capacity. Supports NVIDIA Jetson, Google Coral, and Hailo platforms.

Configuration
Platform Family
Platform
Power Mode
Runtime
Model Family
Model Variant
Precision
Resolution
224×224
320×320
416×416
640×640
1280×720
Advanced
Batch Size
1
2
4
8
16
Streams
1
2
4
8
16
32
Quick Results
Estimated FPS
Latency / batch
Latency / image
Accelerator util.
Multi-Stream Capacity
FPS per stream
Max streams @ 30fps
Max streams @ 15fps
Total FPS (all streams)
Planning Notes
Configure inputs to see planning recommendations.
Assumptions
Configure the system to see detailed assumptions.
// RELATED TOOLS
→ Tool 07: Module Power Calculator
→ Tool 06: Full Deployment Planner
FAQ
What inference engines does this tool support?

The estimator supports NVIDIA TensorRT, PyTorch, ONNX Runtime, Google Coral Edge TPU SDK, and Hailo Runtime. Runtime availability depends on the selected hardware platform — unsupported runtimes are disabled in the selector.

What is estimated FPS?

Estimated frames per second — how many inference passes the hardware can complete per second for the selected model, precision, and runtime. Higher FPS is better for real-time inference.

What is the difference between latency/batch and latency/image?

Latency/batch is the time to process a full batch of frames. Latency/image divides that by batch size — the per-frame processing time. For real-time streaming, latency/image is the relevant metric.
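The arithmetic behind these metrics is straightforward; a minimal sketch (the 40 ms batch-of-8 numbers are illustrative, not measured benchmarks):

```python
def per_image_latency_ms(batch_latency_ms: float, batch_size: int) -> float:
    """Per-frame processing time: batch latency divided by batch size."""
    return batch_latency_ms / batch_size

def throughput_fps(batch_latency_ms: float, batch_size: int) -> float:
    """Frames completed per second for this batch configuration."""
    return batch_size * 1000.0 / batch_latency_ms

def max_streams(total_fps: float, per_stream_fps: float = 30.0) -> int:
    """How many concurrent streams fit at a target per-stream frame rate."""
    return int(total_fps // per_stream_fps)

# Illustrative: a batch of 8 frames processed in 40 ms.
print(per_image_latency_ms(40.0, 8))  # 5.0 ms per image
print(throughput_fps(40.0, 8))        # 200.0 FPS
print(max_streams(200.0))             # 6 streams at 30 fps each
```

This mirrors the Quick Results and Multi-Stream Capacity panels: per-image latency drives real-time feasibility, while total FPS divided by the target per-stream rate bounds how many streams the device can carry.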

What does the confidence score mean?

High (90%): exact published vendor benchmark. Medium (65%): interpolated from GFLOPs across known variants. Low (40%): theoretical TOPS heuristic with no benchmark data. Always validate Low-confidence estimates on device.
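One way the Medium-confidence path can work is linear interpolation of throughput against model compute cost between benchmarked variants; a hypothetical sketch (the GFLOPs/FPS points are invented for illustration, not vendor data):

```python
def interpolate_fps(gflops: float, known: list[tuple[float, float]]) -> float:
    """Estimate FPS for a variant from (gflops, fps) pairs of benchmarked variants.

    Throughput falls roughly linearly as compute cost rises, so we
    interpolate between the two nearest benchmarked points.
    """
    pts = sorted(known)
    for (g0, f0), (g1, f1) in zip(pts, pts[1:]):
        if g0 <= gflops <= g1:
            t = (gflops - g0) / (g1 - g0)
            return f0 + t * (f1 - f0)
    raise ValueError("outside benchmarked range; fall back to TOPS heuristic")

# e.g. variants benchmarked at 10 GFLOPs -> 120 FPS and 30 GFLOPs -> 60 FPS:
print(interpolate_fps(20.0, [(10.0, 120.0), (30.0, 60.0)]))  # 90.0
```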

Why is TensorRT so much faster than PyTorch?

TensorRT performs layer fusion, precision calibration, and kernel auto-tuning at build time — extracting 1.5–2.5× more throughput than vanilla PyTorch inference on Jetson hardware. The build step (trtexec) takes minutes but runs once.
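As a rough planning aid, the 1.5–2.5× range quoted above can bound what a TensorRT engine might reach from a measured PyTorch baseline (the multipliers are that published range, not a guarantee for any specific model):

```python
def tensorrt_fps_range(pytorch_fps: float) -> tuple[float, float]:
    """Bound expected TensorRT throughput from a PyTorch baseline,
    using the 1.5-2.5x speedup range typical on Jetson hardware."""
    return pytorch_fps * 1.5, pytorch_fps * 2.5

# e.g. a model measured at 40 FPS under vanilla PyTorch inference:
print(tensorrt_fps_range(40.0))  # (60.0, 100.0)
```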

What is DLA (Deep Learning Accelerator)?

DLA is a fixed-function neural network processor on Jetson Orin NX and AGX Orin (2 DLA cores each). It runs supported layers alongside the GPU, freeing GPU headroom for other tasks. Not all YOLO11 ops are DLA-compatible; unsupported layers fall back to GPU automatically.

How accurate are these estimates?

Benchmark-backed estimates typically land within ±10–15% of real measured throughput under similar conditions. GFLOPs-interpolated estimates carry medium confidence and wider error bars; theoretical-TOPS estimates are for planning only (±30–50%). Always measure on target hardware before finalising a deployment design.