// Reference Architecture

Smart City Edge AI System: 16 Camera Traffic Architecture

Last updated: July 2026 — added outdoor enclosure and thermal engineering, UPS runtime sizing, backhaul bandwidth math, operations and failure detection, privacy notes, and multi-intersection scaling.

A high-density edge AI deployment pattern for intersections, traffic corridors, and municipal video analytics using 16 cameras, two Jetson AGX Orin 64GB nodes (~8 cameras per node), local storage, and event-first cloud synchronization.

16x traffic cameras

2× AGX Orin 64GB

~320-500W system load

Mixed 1080p/4K

Verdict

16-camera traffic analytics at 1080p 30 FPS typically requires a multi-node architecture. Recommended baseline: 2 × Jetson AGX Orin 64GB, mapped as ~8 cameras per node. A single AGX Orin can carry up to ~13 streams at 1080p 30 FPS INT8 detection, making it suitable for smaller intersections but not sustained 16-camera corridors. The workload simultaneously stresses decode, inference, storage, networking, and thermals — multi-node design is the default, not an escalation.

Try this in System Designer before finalizing node architecture.

Architecture Overview

The edge node should process live video locally, store recent video and events near the intersection, and send only metadata, alerts, violation clips, and health telemetry upstream.

16x Traffic Cameras: Mixed 1080p/4K at intersections and corridor lanes

PoE/Fiber Aggregation: Industrial switch, camera VLAN, fiber backhaul

2× Jetson AGX Orin 64GB: Two-node compute at ~8 cams/node for multi-stream decode, detection, tracking, event logic

NVMe/RAID Storage: Local event clips, short rolling retention, evidence cache

Local Control Plane: Signal integration, health checks, policy enforcement

City Dashboard: Traffic metrics, alerts, incident review, fleet ops

Deployment Summary

Use case	Intersection analytics, vehicle and pedestrian detection, traffic counts
Cameras	16 fixed or PTZ cameras
Compute	2 × Jetson AGX Orin 64GB (~8 cameras per node)
Resolution	Mixed 1080p and 4K by lane and zone requirements
Frame rate	15-30 FPS depending on event type
Latency target	~60-300 ms detection + tracking; ~300-800 ms when escalating to cloud
Retention	3-14 days local, longer for event clips

Recommended Stack

Compute	2 × NVIDIA Jetson AGX Orin 64GB (multi-node baseline; scale to 3+ for redundancy or 4K-heavy mix)
Network	Industrial PoE/fiber aggregation with VLAN segmentation; one camera VLAN per node where possible
Storage	4-16 TB high-endurance NVMe or RAID-backed storage depending on resolution mix and retention. For example, 16 cameras at ~4 Mbps average aggregate to 64 Mbps, or about 691 GB/day, so 14 days requires roughly 9.7 TB before overhead.
Camera codec	H.265 preferred for high camera density
Cloud pattern	Metadata, incidents, selected clips, model and policy sync
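
The retention arithmetic behind the storage row can be checked with a few lines of Python. This is a minimal sketch, assuming a constant average bitrate per camera and decimal units (1 TB = 10^12 bytes); the function name is illustrative and the result excludes filesystem, RAID, and index overhead.

```python
def retention_storage_tb(cameras: int, mbps_per_camera: float, days: float) -> float:
    """Raw recording volume in decimal terabytes, before any overhead."""
    bytes_per_second = cameras * mbps_per_camera * 1_000_000 / 8
    total_bytes = bytes_per_second * 86_400 * days
    return total_bytes / 1e12


# Retention windows from the deployment summary (3-14 days) at ~4 Mbps per camera.
for days in (3, 7, 14):
    print(f"{days:>2} days: {retention_storage_tb(16, 4.0, days):.1f} TB")
```

Swap in the measured average bitrate of the actual camera mix; 4K channels typically run well above the 4 Mbps placeholder.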

Camera Layer

Cover lane approaches, pedestrian crossings, turn lanes, and conflict points. Use mixed resolution strategically so compute is spent where decision quality matters most.

Alert and Control Layer

Integrate local event handling with traffic operations and safety workflows. Priority alerts should not wait on cloud round-trips during peak traffic conditions.

Compute Layer

Two AGX Orin 64GB nodes are the default for sustained 16-camera traffic analytics. Use single-node only when the deployment is ≤13 cameras at 1080p 30 FPS INT8 detection. Scale to 3+ nodes for heavy 4K mix, ensemble models, detection + pose pipelines, or N+1 redundancy.

Camera-to-Node Mapping

Two-node baseline distributes ~8 cameras per node. Group by lane, direction, or intersection zone where possible so a single node failure removes a coherent zone rather than scattering coverage gaps across the corridor.

Node A — cameras 1–8 (e.g. north and east approaches, pedestrian crossings on those legs)
Node B — cameras 9–16 (e.g. south and west approaches, conflict points on those legs)
Keep VLAN, PoE switch, and storage paths aligned with node ownership so a node outage does not cascade across other zones
Reserve a small management/spare port budget on each switch for diagnostics and node swap
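
One way to keep the zone-aligned mapping honest is to write it down as data and ask what a node outage removes. The sketch below is illustrative only: the node names, zone names, and the split of cameras within each node are assumptions, not part of the reference design.

```python
# Hypothetical mapping: node -> zone -> camera numbers.
NODE_ZONES = {
    "node-a": {"north": [1, 2, 3, 4], "east": [5, 6, 7, 8]},
    "node-b": {"south": [9, 10, 11, 12], "west": [13, 14, 15, 16]},
}


def outage_impact(failed_node: str) -> dict:
    """Report which zones and cameras go dark if one node fails."""
    lost = NODE_ZONES[failed_node]
    surviving = sorted(
        zone
        for node, zones in NODE_ZONES.items()
        if node != failed_node
        for zone in zones
    )
    return {
        "lost_zones": sorted(lost),
        "lost_cameras": sorted(cam for cams in lost.values() for cam in cams),
        "surviving_zones": surviving,
    }


print(outage_impact("node-a"))
```

If the lost zones for any node are not a coherent set of approaches, the mapping is scattering coverage gaps and should be regrouped.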

Power and Performance

Component	Estimate
16 cameras x ~10-15W	~160-240W
Switching / aggregation overhead	~30-60W
Jetson AGX Orin 64GB (per node)	~30-60W
2-node compute total	~60-120W
Storage / enclosure / cooling	~20-60W per node
Full system estimate	~320-500W including cameras, compute, PoE switching, storage, and cooling

Expected Performance

Metric	Expected range
System stream capacity	~26 streams at 1080p 30 FPS INT8 detection (2 × ~13 streams/node)
Per-node capacity	~12-13 streams sustained
Recommended operating point	16 requested streams across 2 nodes (~8 streams per node) — ~1.6× headroom
GPU utilization	~85-95% per node near the 13-stream ceiling; ~55-70% per node in the recommended 2-node 16-camera deployment
Detection-only latency	~60-150 ms end-to-end
Detection + tracking + event logic	~150-300 ms
Cloud escalation	~300-800 ms depending on network path
Thermal load	High; industrial cooling and monitoring required

Benchmark note: Capacity guidance is based on benchmark-backed stream planning assumptions for 1080p 30 FPS INT8 detection and should be revalidated against the selected model, codec, camera bitrate, and JetPack/runtime version. Use the System Designer with your actual workload mix before finalizing node count.
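
The headroom figure in the table is simple arithmetic on the per-node ceiling. A minimal planning sketch, assuming the ~13 streams/node figure quoted above holds for the chosen model and codec (it should be revalidated); the function name and the min_headroom parameter are illustrative.

```python
import math

STREAMS_PER_NODE = 13  # planning ceiling at 1080p 30 FPS INT8; revalidate per workload


def plan_nodes(cameras: int, streams_per_node: int = STREAMS_PER_NODE,
               min_headroom: float = 1.0) -> tuple[int, float]:
    """Return (node_count, resulting_headroom) for a requested camera count."""
    nodes = max(1, math.ceil(cameras * min_headroom / streams_per_node))
    headroom = nodes * streams_per_node / cameras
    return nodes, headroom


nodes, headroom = plan_nodes(16)
print(f"16 cameras -> {nodes} nodes, {headroom:.2f}x headroom")
```

Raising min_headroom (for example to cover a heavier model mix) pushes the plan to a third node before the two-node ceiling is reached.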

Outdoor Enclosure and Thermal Engineering

Traffic deployments put compute in pole-mounted boxes or roadside cabinets, not climate-controlled rooms. Sun-heated cabinet exteriors can run 20–50°C above ambient air temperature, and interior temperatures can exceed 70–80°C even with ventilation. Thermal design must target the sun-heated cabinet interior, not the weather-report air temperature. If the worst-case ambient is unknown, assume 45°C and validate in the field after installation.

Junction temperature follows Tj = T_ambient + (TDP × Θja). Jetson modules throttle when Tj crosses roughly 85–95°C, so every degree of cabinet heat soak directly consumes thermal margin. AGX Orin requires active cooling — its 15–60W envelope is well past the ~10–15W practical limit for passive cooling in sealed enclosures. Plan for:

Enclosure rating: select an IP/NEMA-rated enclosure matched to site exposure (dust, rain, washdown), and verify the vendor-rated operating temperature range against local climate extremes for both summer heat soak and winter cold starts. Ventilation openings trade against ingress rating — sealed enclosures need a heat path (heat pipes to external fins, or filtered forced airflow).
Forced airflow: a small 12V fan draws 1–3W and can remove 50W+ of heat with a proper intake-to-exhaust path across the heatsink. In a cabinet, avoid recirculating exhaust air — stratified hot air at the top of a tall cabinet can run 10–20°C hotter than intake level.
Validation: run the sustained 16-camera pipeline (decode + inference + recording) in the target enclosure for at least 30 minutes while logging junction temperature via tegrastats, at the highest expected ambient. Compute actual Θja as (Tj_stable − T_ambient) ÷ TDP and repeat with a dust-fouled heatsink to simulate 6–12 months of field operation.
Cabling: PoE runs are limited to 100 m; use outdoor-rated shielded Cat6 with the shield grounded at one end only, and conduit where cable is exposed.

Design rule: if TDP × worst-case Θja exceeds the throttle threshold minus worst-case cabinet ambient, passive cooling will not hold — plan active cooling from the start rather than retrofitting after summer throttling appears.
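
The junction-temperature formula and the validation step above can be combined into a quick margin check. This is a sketch under stated assumptions: the 85°C throttle default is the low end of the range quoted above, and the bench readings in the example are hypothetical.

```python
def junction_temp_c(ambient_c: float, power_w: float, theta_ja: float) -> float:
    """Tj = T_ambient + (TDP x theta_ja), with theta_ja in degrees C per watt."""
    return ambient_c + power_w * theta_ja


def measured_theta_ja(tj_stable_c: float, ambient_c: float, power_w: float) -> float:
    """Back out theta_ja from a stabilized soak test: (Tj_stable - T_ambient) / TDP."""
    return (tj_stable_c - ambient_c) / power_w


def thermal_margin_c(ambient_c: float, power_w: float, theta_ja: float,
                     throttle_c: float = 85.0) -> float:
    """Degrees of margin below the throttle threshold; negative means throttling."""
    return throttle_c - junction_temp_c(ambient_c, power_w, theta_ja)


# Hypothetical bench run: Tj settles at 78 C in 30 C air while drawing 40 W.
theta = measured_theta_ja(78.0, 30.0, 40.0)
# Project the same enclosure into a 45 C sun-soaked cabinet.
margin = thermal_margin_c(45.0, 40.0, theta)
print(f"theta_ja = {theta:.2f} C/W, margin at 45 C = {margin:.1f} C")
```

In this example the enclosure that looked fine on the bench ends up below zero margin in the cabinet, which is the situation the design rule is meant to catch before installation.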

Power Resilience and UPS Sizing

The PoE budget is the first constraint. Outdoor fixed cameras draw 10–12W each as a conservative planning figure (5–15W range depending on IR illuminators, PTZ motors, and onboard heaters). Sixteen cameras at 12W is 192W of camera load before switch overhead (~10–15W per switch). Cameras with heaters or PTZ mechanisms should be verified against PoE+ (802.3at, 30W/port); basic fixed cameras fit 802.3af (15.4W/port). Check the switch's total PoE budget, not just per-port class — split across two switches aligned to node ownership, each switch should carry ~8 cameras (~96W) with margin above that for winter heater load and future camera swaps.
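
The per-port and total-budget checks described above can be sketched as follows. The per-port wattages are the figures quoted in the text; the 25% margin, the switch budget, and the function name are assumptions for illustration, and comparing camera draw directly to the port figure ignores cable loss.

```python
# Per-port power figures as quoted above (watts).
PORT_CLASS_W = {"802.3af": 15.4, "802.3at": 30.0}


def check_switch(camera_watts: list[float], port_class: str,
                 switch_budget_w: float, margin: float = 0.25) -> dict:
    """Check one switch against both its per-port class and its total PoE budget."""
    port_limit = PORT_CLASS_W[port_class]
    load = sum(camera_watts)
    required = load * (1 + margin)  # margin for heater load and camera swaps
    return {
        "load_w": load,
        "required_w": required,
        "ports_ok": all(w <= port_limit for w in camera_watts),
        "budget_ok": required <= switch_budget_w,
    }


# Eight fixed cameras at the 12 W planning figure on a hypothetical 150 W switch.
print(check_switch([12.0] * 8, "802.3af", 150.0))
# Same switch with two heater-equipped cameras at a hypothetical 22 W each.
print(check_switch([12.0] * 6 + [22.0] * 2, "802.3af", 150.0))
```

The second case passes the total budget but fails the per-port check, which is why both need to be verified.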

Note that the AGX Orin nodes themselves should not be PoE-powered: at 40–60W typical and 75W peak, they are far beyond 802.3af/at, and the peak exceeds the roughly 71W that even an 802.3bt Type 4 port delivers at the device. Give them a direct DC or AC supply in the cabinet.

For battery backup, size the UPS from the full system load (~320–500W for this architecture) using:

Required Wh = Peak Power (W) × Runtime (hours) × 1.25

500W, 15 min shutdown window	500 × 0.25 × 1.25 ≈ 156 Wh minimum; select a unit rated above that, typically the 200 Wh class
500W, 20 min window	500 × 0.33 × 1.25 ≈ 208 Wh minimum; select the 250 Wh class
Thermal derating	Lithium-ion loses ~10–15% capacity per 10°C above 25°C; in a 45°C cabinet expect 20–30% runtime loss
Real-world runtime	Typically 70–80% of rated, after aging and inverter losses

Two traffic-specific rules: the UPS must also carry the PoE switches and backhaul equipment, or the system loses cameras and cannot report its own outage upstream; and load shedding on battery (dropping recording and telemetry while keeping detection and alerting) can extend runtime 30–50%. Verify the UPS power rating (W/VA) exceeds peak draw — Wh capacity alone is not sufficient. Full derivation in UPS sizing for edge AI.
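
The sizing formula and the thermal derating row can be expressed as a short calculation. A minimal sketch, assuming the 1.25 factor from the formula above and a linear capacity loss of 12.5% per 10°C above 25°C (the midpoint of the range quoted in the table); function names are illustrative.

```python
def required_wh(peak_w: float, runtime_min: float, safety: float = 1.25) -> float:
    """Required Wh = Peak Power (W) x Runtime (hours) x 1.25."""
    return peak_w * (runtime_min / 60) * safety


def derated_runtime_min(rated_wh: float, load_w: float, cabinet_c: float,
                        loss_per_10c: float = 0.125) -> float:
    """Runtime after heat derating; assumes linear capacity loss above 25 C."""
    capacity_fraction = 1 - loss_per_10c * max(0.0, (cabinet_c - 25.0) / 10.0)
    return rated_wh * capacity_fraction / load_w * 60


print(f"15 min at 500 W: {required_wh(500, 15):.0f} Wh")
print(f"20 min at 500 W: {required_wh(500, 20):.0f} Wh")
print(f"200 Wh pack, 500 W, 45 C cabinet: {derated_runtime_min(200, 500, 45):.0f} min")
```

Running the derated figure against the shutdown window shows how much of the 1.25 allowance a hot cabinet consumes before aging and inverter losses are counted.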

Backhaul Options and Bandwidth Math

Camera-to-node traffic stays local on the intersection switch; backhaul only carries what leaves the site. The math splits into two regimes:

Full-stream backhaul: at ~4 Mbps per camera (H.265), 16 cameras aggregate to ~64 Mbps sustained; at 8 Mbps (H.264), ~128 Mbps. This is continuous, 24/7 load.
Event-first backhaul (this architecture): only metadata, alerts, selected clips, and health telemetry go upstream — typically orders of magnitude less than full streams. Size it by measuring clip volume per day at your event rate rather than assuming a figure.
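
The two regimes can be compared numerically once clip volume has been measured. In the sketch below, the clip count, clip length, and clip bitrate are placeholders to be replaced with measured values, not figures from this architecture; the function names are illustrative.

```python
def full_stream_mbps(cameras: int, mbps_per_camera: float) -> float:
    """Sustained uplink if every camera stream leaves the site."""
    return cameras * mbps_per_camera


def event_first_avg_mbps(clips_per_day: float, clip_seconds: float,
                         clip_mbps: float, metadata_kbps: float = 0.0) -> float:
    """Average uplink when only clips and metadata leave the site."""
    clip_megabits_per_day = clips_per_day * clip_seconds * clip_mbps
    return clip_megabits_per_day / 86_400 + metadata_kbps / 1000


print(f"Full stream, H.265: {full_stream_mbps(16, 4.0):.0f} Mbps")
print(f"Full stream, H.264: {full_stream_mbps(16, 8.0):.0f} Mbps")
# Placeholder event rate: 200 clips/day, 20 s each, at 4 Mbps, plus 50 kbps metadata.
print(f"Event first: {event_first_avg_mbps(200, 20, 4.0, 50):.2f} Mbps average")
```

The event-first result is a daily average; bursty clip uploads after an incident still need enough peak throughput to clear in an acceptable time.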

Backhaul	Fit
Fiber	Preferred where municipal conduit exists; carries event traffic trivially and leaves headroom for full-stream evidence pulls or VMS integration later
Cellular	Viable only for event-first operation; verify sustained uplink throughput and data caps against measured daily clip volume, and expect variable latency on the cloud escalation path
Point-to-point wireless	Middle option for corridors without fiber; budget link capacity against the aggregate bitrate regime you actually run, with headroom for retransmission and weather fade

Latency: analytics vs recording. The detection and alert loop (~60–300 ms in the performance table above) runs entirely on the local nodes, and recording writes to local NVMe — neither depends on backhaul latency. Backhaul quality affects only the cloud escalation path (~300–800 ms) and dashboard freshness. That means backhaul selection is a throughput, cost, and reliability decision, not a detection-latency decision — a constrained cellular link degrades reporting speed, not intersection safety logic.

Bottlenecks and Failure Modes

Primary risk: assuming 16 cameras is just double an 8-camera design. At this scale, decode, storage writes, thermal envelope, and network isolation all become first-class architecture decisions.

Failure mode	What causes it	Symptom	Mitigation
Decode pressure	High 4K share or elevated FPS	Frame drops or instability	H.265, lower FPS, split nodes, AGX class platform
Inference saturation	Heavy model mix across many streams	Latency spikes and missed events	Model tiering, ROI inference, batching, multi-node split
Storage write pressure	Continuous high-bitrate recording	Write stalls and incomplete clips	High-endurance NVMe/RAID, event-focused retention
Thermal throttling	Outdoor cabinets with poor ventilation	Performance drops over time	Industrial enclosure, active cooling, thermal monitoring
Network exposure	Flat networks and weak segmentation	Security and manageability risk	Camera VLAN, management VLAN, firewall segmentation

Operations and Failure Detection

A 16-camera roadside system fails quietly unless it is instrumented to report its own degradation. Build these into day-one operations rather than after the first silent outage:

Camera outage detection: build stream reconnection into the pipeline (DeepStream supports it natively) and design it to keep processing surviving streams when one drops rather than halting. Alert on any stream down beyond a threshold — an intersection running 14 of 16 cameras for weeks is a common undetected state.
Time synchronization: event correlation across 16 streams requires synchronized timestamps. Run NTP on both nodes and point all cameras at the same NTP source.
Thermal soak monitoring: log junction temperature continuously (tegrastats) and alert on a sustained upward trend at constant load — that pattern indicates dust-fouled heatsinks or failing cabinet airflow before throttling starts, which matters most during the first summer.
Storage rotation: run recording as a ring buffer that overwrites oldest segments. At ~4 Mbps per camera, 16 cameras write roughly 691 GB/day (2× the ~346 GB/day of an 8-camera node at the same bitrate), a write load that exhausts consumer-drive TBW ratings quickly, so monitor SSD wear indicators and use high-endurance NVMe. Pin evidence clips outside the ring buffer so rotation cannot delete flagged material.
Security hygiene: change default camera credentials before deployment and keep the camera VLAN without internet access — roadside cabinets are physically accessible attack surfaces.
Battery health: check UPS battery condition quarterly and alert when estimated runtime drops below the shutdown window.
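
The storage rotation rule above (overwrite oldest, never delete pinned evidence) can be sketched as a small pure function. This is an illustration, not a recording implementation: the Segment type and its fields are hypothetical, and a real system would act on files rather than in-memory records.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    name: str
    size_bytes: int
    pinned: bool = False  # True for flagged evidence clips


def rotate(segments: list[Segment], capacity_bytes: int):
    """Drop the oldest unpinned segments until usage fits; input is oldest first."""
    total = sum(s.size_bytes for s in segments)
    kept, deleted = [], []
    for seg in segments:
        if total > capacity_bytes and not seg.pinned:
            deleted.append(seg)
            total -= seg.size_bytes
        else:
            kept.append(seg)
    return kept, deleted


history = [
    Segment("cam01_0001", 100),
    Segment("cam01_0002", 100, pinned=True),
    Segment("cam01_0003", 100),
    Segment("cam01_0004", 100),
]
kept, deleted = rotate(history, capacity_bytes=250)
print([s.name for s in kept], [s.name for s in deleted])
```

In this sketch pinned clips still count against capacity, so a growing evidence set shrinks the rolling window; that is worth alerting on before pinned material alone fills the drive.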

Redundancy and Failover

A 2-node design provides capacity, not full redundancy. If one node fails, roughly half the camera coverage is degraded or offline. For mission-critical corridors, use 3 nodes for N+1 redundancy or implement reduced-FPS failover routing where the surviving node temporarily ingests neighbouring cameras at lower frame rate or detection cadence. Design alert paths so a single node outage does not silently disable upstream incident reporting.
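
For reduced-FPS failover, a rough per-camera frame rate can be estimated from the per-node stream ceiling. This sketch assumes capacity scales linearly with frame rate (13 streams × 30 FPS as a frame budget) and caps utilization at 85%; both are planning assumptions to validate on hardware, since decode and CPU limits may not scale linearly.

```python
def failover_fps(total_cameras: int, surviving_nodes: int,
                 streams_per_node: int = 13, full_fps: float = 30.0,
                 utilization_cap: float = 0.85) -> float:
    """Per-camera frame rate the surviving nodes can sustain across all cameras."""
    if surviving_nodes < 1:
        raise ValueError("at least one surviving node is required")
    frame_budget = surviving_nodes * streams_per_node * full_fps * utilization_cap
    return min(full_fps, frame_budget / total_cameras)


print(f"Both nodes up: {failover_fps(16, 2):.1f} FPS per camera")
print(f"One node down: {failover_fps(16, 1):.1f} FPS per camera")
```

Under these assumptions a single surviving node lands inside the 15-30 FPS range used elsewhere in this design, which suggests failover routing is plausible for detection workloads that tolerate a lower cadence.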

NVENC and Transcoding Notes

If the system re-encodes streams, NVENC becomes a separate constraint and can saturate long before inference does. Avoid unnecessary H.265 to H.264 transcoding; write evidence clips in the camera's native codec where possible. When transcoding is required (e.g. for legacy cloud ingest), split encoding load across nodes and reserve NVENC headroom by pinning recording to a subset of cameras per node rather than every stream on every node.

Privacy and Public-Space Notes

Public-space traffic analytics is regulated in most jurisdictions, and the rules vary enough that legal review is part of deployment, not an afterthought. The edge-first pattern in this architecture helps: raw video stays at the intersection, retention is short (3–14 days here), and only metadata, alerts, and selected evidence clips leave the site. Practical checklist:

Confirm local requirements for signage or public notification of camera analytics.
Keep retention at the minimum the operational need supports; document the retention policy per camera class (continuous footage vs evidence clips).
Restrict and audit access to evidence clips — export from the edge node should be logged and role-gated.
Prefer aggregate outputs (counts, speeds, occupancy) over identity-linked data upstream; where plate or face data is in scope, treat it as a separate regulated data flow with its own review.

Scaling Decisions

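4-8 cameras: Orin NX can be sufficient for 1080p detection workloads unless models are heavy or multi-model. A single Orin NX caps near ~7 streams at 30 FPS, so an 8-camera site needs a reduced frame rate or a second node.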
10-13 cameras: use 1 × AGX Orin or 2 × Orin NX. Single Orin NX is no longer sufficient at 30 FPS 1080p detection.
14-16 cameras: use 2 × AGX Orin, mapped at roughly 8 cameras per node. Single AGX Orin is only viable at or below ~13 streams at 30 FPS 1080p.
Mostly 4K cameras: favor AGX-class capacity or split zones across multiple nodes.
Mission-critical alerting: design for redundancy, monitoring, failover, and thermal stability — not FPS alone.

Scaling to Multiple Intersections

A corridor or city-wide rollout repeats this intersection unit rather than centralizing video. The division of labor that keeps it tractable:

Stays at the edge (per intersection): decode, inference, tracking, event logic, recording, and time-critical alerting. These are latency- and bandwidth-bound to the intersection and do not benefit from centralization.
Centralizes: fleet health and thermal telemetry, dashboards and cross-intersection analytics computed on metadata (counts, travel times, incident patterns), model and policy distribution, and long-term evidence archive for clips that must outlive local retention.
Per-site backhaul stays small: because only metadata and clips travel, adding intersections scales backhaul roughly with event volume, not camera count — the aggregation point never ingests 64+ Mbps of raw video per site.
Standardize the unit: identical node images, camera configs, and VLAN layout per intersection turn a 20-site corridor into fleet operations instead of 20 bespoke systems, and make node swap the standard repair action.

Validate This Architecture With EdgeAIStack

System Designer — recommendation, headroom, risks, and alternatives.
Network Bandwidth — mixed-resolution stream and uplink sizing.
Storage Endurance — retention and write-endurance sizing.
Power Budget — PoE, enclosure cooling, and compute power planning.

FAQ

Can Jetson Orin NX handle 16 traffic cameras?

Not at 1080p 30 FPS INT8 detection. A single Orin NX caps near ~7 streams. A 10-13 camera deployment at 30 FPS 1080p needs either 1 × AGX Orin or 2 × Orin NX; a 16-camera deployment needs 2 × AGX Orin.

Is a single AGX Orin enough for 16 cameras?

No — not for sustained 16-camera 1080p 30 FPS INT8 detection. A single AGX Orin 64GB caps at ~13 streams (CPU and NVDEC ceilings). Use 2 × AGX Orin (~8 cams per node) as the baseline. Single AGX Orin is only appropriate at or below ~13 streams.

Why two nodes instead of one bigger box?

At 1080p 30 FPS, a single AGX Orin is bottlenecked by its CPU pipeline and NVDEC decoders before inference compute is the limit. Two AGX Orin nodes split decode and inference load cleanly, give ~1.6× stream headroom (26 streams of capacity for 16 requested), and provide a natural zonal partition for camera ownership.

Should smart city analytics stream everything to cloud?

Typically no. Process locally and send metadata, alerts, selected clips, and node telemetry upstream.

What is the biggest deployment risk?

Sustained operation under real-world heat, storage writes, network isolation constraints, and model complexity is usually the top risk, not peak lab benchmark numbers.