Jetson Deployment Checklist: Production Setup Steps That Prevent Failure
Last updated: March 2026
A production Jetson node needs more than a working demo. This checklist covers the setup steps that prevent storage overflow, security gaps, reboot failures, thermal issues, and silent pipeline outages in the field.
Quick Answer
Production readiness depends on completing all five phases in sequence: platform setup (JetPack, packages, GPU verification), storage and power (NVMe, Docker root, nvpmodel), security hardening (SSH keys, firewall, camera passwords), pipeline deployment (Docker, TensorRT, systemd service), and monitoring (SMART, thermal, pipeline metrics). Each phase builds on the previous; skipping earlier phases creates downstream failures that are harder to diagnose remotely.
Most deployment failures come not from the inference model itself but from skipped hardening, missing monitoring, or non-persistent runtime setup. A node that works perfectly during setup often fails within weeks of field deployment because configuration is lost on reboot, security gaps go undetected, or storage fills silently. This checklist ensures the node survives reboot, persists configuration, detects problems early, and is hardened before network exposure.
The most common deployment mistake is treating first boot as production-ready. Real production readiness requires persistence across reboot (systemd services, /etc/fstab), observability (SMART monitoring, thermal alerts, pipeline metrics), storage durability (NVMe with write endurance), and hardening (SSH keys, firewall, no default credentials) before the node ever touches a customer network. If the node cannot survive an unexpected reboot and recover automatically with full pipeline functionality, it is not production-ready.
Who This Page Is For
- Teams preparing the first Jetson node for production deployment
- Engineers converting a lab demo into a field deployment
- Operators standardizing repeatable rollout steps
- Teams trying to avoid reboot failures and silent outages
- Anyone documenting a deployment baseline for future node replication
Production Readiness at a Glance
- JetPack: Pinned version flashed, CUDA/TensorRT verified, packages updated
- Storage: NVMe mounted with noatime, Docker root on NVMe, ring buffer configured
- Security: SSH key-only auth, UFW enabled, default credentials changed on node and cameras
- Pipeline: systemd service enabled, auto-start tested after reboot, TensorRT engine built on target hardware
- Monitoring: SMART alerts at 80% TBW, thermal alerts at 90°C, pipeline FPS/latency metrics active
Rule: All five areas must be complete before handing a node off to operations. Partial deployments create time-bomb failure modes.
Why this matters: Edge nodes deployed under time pressure almost always have gaps in security hardening and monitoring. Security gaps surface 6–18 months later as network intrusions or data exfiltration. Missing monitoring means drive failures, thermal throttling, and pipeline stalls go undetected until a site visit confirms what logs would have shown weeks earlier.
Engineering Summary
- JetPack version determines your entire software stack: CUDA, cuDNN, TensorRT, and driver versions are all JetPack-pinned. Document the exact JetPack build—mismatches cause inference failures that are hard to debug remotely.
- NVMe is mandatory for production: eMMC and SD card storage fills rapidly under Docker layers, logs, and video ring buffers, and lacks the write endurance for sustained workloads. Move Docker root and all write-heavy paths to NVMe on day one.
- Security defaults are all wrong out of the box: JetPack ships with password-based SSH, a known default account, and unnecessary services running. All three must be corrected before a node is network-connected at a customer site.
- TensorRT engines are device-specific: An engine built on an Orin NX will not run on an Orin Nano. Build and cache engines on the exact target hardware; never copy engines between different Jetson SKUs.
- Monitoring must be active before deployment, not after first failure: SMART drive health, thermal zone readings, and pipeline liveness checks must be confirmed operational before the node ships. The first sign of a problem should be an alert, not a service call.
Checklist sequencing: The checklist is sequential; skipping earlier phases usually creates downstream failures that are harder to diagnose later. For example, skipping Phase 1 package updates can break Phase 4 TensorRT builds with CUDA incompatibility errors, while skipping Phase 2 Docker root migration often surfaces later as disk pressure and monitoring alerts.
Note: The checklist assumes a production inference node with NVMe storage, persistent network connectivity, and remote management requirements. Headless lab nodes or rapid prototyping deployments may skip some hardening steps temporarily, but production nodes need all five phases complete before handoff to operations.
Estimated Setup Time by Phase
Strategic summary: The first node takes longest because it establishes the baseline. Later nodes become faster only if setup is scripted and documented—a manual repeat of the first node's steps will take just as long.
| Phase | First Node | With Provisioning Script | Primary Time Driver |
|---|---|---|---|
| 1. Initial Setup | 60–90 min | 30–45 min | JetPack flash + package updates |
| 2. Storage & Power | 30–60 min | 10–15 min | Partition layout + fstab verification |
| 3. Security Hardening | 45–60 min | 10–15 min | SSH config + firewall rules + VLAN verification |
| 4. Pipeline Deployment | 90–180 min | 30–60 min | TensorRT engine build (5–30 min) + end-to-end validation |
| 5. Monitoring | 30–60 min | 10–20 min | Alert configuration + documentation |
Note: Estimates assume single-node setup with hardware on hand. First-time TensorRT engine builds for large models (YOLOv8l INT8) add 15–30 minutes to Phase 4.
Recommendation: If you expect to deploy more than one node, convert every manual step in this page into provisioning code after the first successful build. The time investment pays for itself on the second node and becomes a repeatable baseline for all future nodes.
Phase 1: Initial Setup
This phase establishes a known-good software baseline and confirms the node is usable before configuration work begins.
[ ] Flash JetPack via SDK Manager or sdkmanager CLI
Use the current JetPack release from the NVIDIA developer portal. Flash from a Linux
host using SDK Manager, or use the pre-flashed SD card image for developer kits.
Verify the JetPack version matches your target software stack.
[ ] Complete initial OS setup (first boot)
Set hostname, timezone, and locale during first-boot setup. Use a hostname that identifies
the node's physical location (e.g., edgenode-warehouse-01). Avoid generic
hostnames like "jetson" that create conflicts when multiple nodes appear on the same network.
[ ] Update all packages
Run sudo apt update && sudo apt upgrade -y immediately after first boot.
JetPack images are rarely current with upstream security patches.
[ ] Verify CUDA, cuDNN, and TensorRT versions
Run dpkg -l | grep -E "cuda|cudnn|tensorrt" to confirm versions installed.
Document these versions — they determine which model export and conversion paths are
supported. Mismatched runtime versions are a common cause of inference failures.
[ ] Test GPU availability
Run python3 -c "import torch; print(torch.cuda.is_available())" (if PyTorch
is installed) or run a TensorRT sample to confirm the GPU is accessible and functional.
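The version checks above are worth scripting so the output lands in the node's deployment log. A minimal sketch, assuming standard dpkg -l output formatting; the package name patterns in the comments are examples to adapt:

```shell
# pkg_version: extract the version column from a `dpkg -l` output line,
# e.g. "ii  tensorrt  8.6.2.3-1+cuda12.2  arm64  TensorRT meta package"
pkg_version() {
  echo "$1" | awk '{print $3}'
}

# On the Jetson itself, record the baseline for the node documentation:
# dpkg -l | grep -E "cuda-toolkit|libcudnn|tensorrt" | while read -r line; do
#   printf '%s => %s\n' "$(echo "$line" | awk '{print $2}')" "$(pkg_version "$line")"
# done
```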
Phase 2: Storage and Power Configuration
This phase moves the node from demo storage assumptions to production durability and power behavior.
[ ] Install and format NVMe SSD
Insert the NVMe drive, partition it, and format according to your storage layout plan.
For the recommended 4-partition layout, see
storage layout and ring buffer design.
Mount partitions persistently via /etc/fstab with noatime on
the video partition.
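A sketch of the corresponding /etc/fstab entry for the video partition; the mount point is a hypothetical example, and the UUID should come from blkid rather than a device name, which can change across boots:

```
# /etc/fstab — video ring buffer partition (UUID is a placeholder from blkid)
UUID=<uuid-from-blkid>  /mnt/video  ext4  defaults,noatime  0  2
```

After adding the entry, run sudo mount -a to catch syntax errors, then confirm with a test reboot that the mount persists.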
[ ] Move Docker root to NVMe
By default, Docker stores layers in /var/lib/docker on the OS drive.
On systems with a small eMMC or SD card boot drive, this fills quickly. Update
/etc/docker/daemon.json to set "data-root": "/opt/docker"
pointing to the NVMe.
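A minimal /etc/docker/daemon.json sketch; the nvidia runtime entry is what the NVIDIA Container Toolkit normally configures, and /opt/docker is assumed to be a directory on the NVMe mount:

```json
{
  "data-root": "/opt/docker",
  "runtimes": {
    "nvidia": {
      "path": "nvidia-container-runtime",
      "runtimeArgs": []
    }
  }
}
```

After editing, run sudo systemctl restart docker and confirm the move with docker info | grep "Docker Root Dir".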
[ ] Configure nvpmodel power mode
Select the power mode matching your thermal design:
sudo nvpmodel -m 1 for 7W mode (Orin Nano), -m 0 for max
power. Make the setting persistent across reboots by editing /etc/nvpmodel.conf.
For thermal context, see fanless enclosure
thermal constraints.
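To persist the mode, change the default entry in /etc/nvpmodel.conf. A sketch assuming the JetPack 6-era file format; verify the exact syntax against your installed file before editing:

```
# /etc/nvpmodel.conf — power mode applied at boot (here: mode 1, 7W on Orin Nano)
< PM_CONFIG DEFAULT=1 >
```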
[ ] Set jetson_clocks for sustained inference performance
Run sudo jetson_clocks to lock CPU, GPU, and EMC clocks to maximum for
benchmarking. For production, evaluate whether jetson_clocks or the
dynamic clock governor better matches your latency vs. power requirements.
[ ] Verify swap configuration
Check swap with free -h. Jetson enables zRAM swap by default. For
production inference nodes where memory pressure should be managed by right-sizing
hardware rather than swapping, consider disabling swap to prevent unpredictable
latency spikes. See RAM sizing for
edge inference for guidance on sizing correctly.
Phase 3: Security Hardening
This phase closes the most common attack paths before the node is exposed to customer or enterprise networks.
[ ] Change default OS user password
The default JetPack account uses a well-known password. Change it immediately:
passwd. Disable or remove the default account if a dedicated service
account is used for deployment automation.
[ ] Configure SSH key authentication; disable password SSH
Add a public key to ~/.ssh/authorized_keys. Then set
PasswordAuthentication no in /etc/ssh/sshd_config and
restart sshd. This prevents brute-force password attacks on SSH.
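The relevant /etc/ssh/sshd_config lines, as a sketch (on Ubuntu-based JetPack, restart with sudo systemctl restart ssh, and keep an existing session open while testing so a mistake cannot lock you out):

```
# /etc/ssh/sshd_config — key-only access
PubkeyAuthentication yes
PasswordAuthentication no
KbdInteractiveAuthentication no
PermitRootLogin no
```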
[ ] Configure UFW firewall
Enable UFW with a default-deny policy. Allow only necessary ports:
SSH (22), RTSP ingestion if cameras connect to the node on a routed path, and
any application-specific management ports. Block all inbound traffic by default.
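A default-deny rule set sketch; the RTSP subnet and port are examples that depend on your camera VLAN layout:

```shell
sudo ufw default deny incoming
sudo ufw default allow outgoing
sudo ufw allow 22/tcp   # SSH; ideally restrict to the management subnet
sudo ufw allow from 10.0.20.0/24 to any port 554 proto tcp  # RTSP from camera VLAN (example subnet)
sudo ufw enable
sudo ufw status verbose
```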
[ ] Disable unnecessary services
Disable services not needed in production, for example:
sudo systemctl disable --now bluetooth avahi-daemon cups
(--now also stops the service immediately; disable alone only prevents it from starting at the next boot). Reduce attack surface by running only required services.
[ ] Configure unattended-upgrades for security patches
Install unattended-upgrades and configure it to automatically apply
security updates. Review the configuration to exclude kernel updates (which may
require JetPack re-flash) while allowing userspace security patches.
[ ] Change default camera credentials
Log in to each IP camera's web interface and change the default username/password.
Document credentials securely. Cameras with default credentials are one of the
most common entry points for network intrusions.
[ ] Verify VLAN isolation
From a device on the camera VLAN, verify that it cannot reach the management VLAN
or the internet (unless explicitly permitted). From the compute node's management
interface, verify that it can reach the camera VLAN for RTSP but camera devices
cannot initiate connections to the management VLAN.
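These reachability checks can be scripted. A sketch using bash's built-in /dev/tcp, so it works even where nc is not installed; the IP addresses in the comments are examples:

```shell
# can_reach HOST PORT: exit 0 if a TCP connection succeeds within 3 seconds.
can_reach() {
  timeout 3 bash -c "echo > /dev/tcp/$1/$2" 2>/dev/null
}

# Run from a device on the camera VLAN; every line below should print BLOCKED:
# can_reach 10.0.10.1 22 && echo "REACHABLE: management SSH" || echo "BLOCKED"
# can_reach 8.8.8.8 443  && echo "REACHABLE: internet"       || echo "BLOCKED"
```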
Phase 4: Inference Pipeline Deployment
This phase makes the inference application persistent, reproducible, and verifiably correct on the target hardware.
[ ] Deploy inference application via Docker
Use the NVIDIA container toolkit with --runtime nvidia for GPU access.
Pin container image versions explicitly — avoid :latest tags in production.
Test container startup time; long startup delays affect recovery time after node reboots.
[ ] Configure all RTSP stream URLs
Enumerate each camera's RTSP URL in the application configuration file. Use static
IP assignments or DNS entries — avoid relying on mDNS or dynamic hostnames.
Test each stream with ffprobe before starting the inference pipeline.
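A pre-flight loop sketch; the URLs and credentials are placeholders for your configured streams:

```shell
# Confirm every configured stream decodes before starting the pipeline.
for url in \
  "rtsp://user:pass@10.0.20.11/stream1" \
  "rtsp://user:pass@10.0.20.12/stream1"; do
  ffprobe -v error -rtsp_transport tcp -show_streams "$url" >/dev/null \
    && echo "OK   $url" || echo "FAIL $url"
done
```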
[ ] Run TensorRT engine build and verify
TensorRT engines are device-specific — build engines on the target Jetson, not on a
development machine. Engine build can take 5–30 minutes depending on model size.
Verify engine output with a test image before integrating into the full pipeline.
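A typical build command sketch using trtexec, which ships with TensorRT on Jetson; the ONNX filename, precision flag, and workspace size are examples to adapt:

```shell
# Build on the target Jetson: engines are tied to the GPU architecture
# and TensorRT version of the machine that built them.
/usr/src/tensorrt/bin/trtexec \
  --onnx=model.onnx \
  --saveEngine=model_fp16.engine \
  --fp16
```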
[ ] Configure ring buffer and log rotation
Set up the ring buffer cleanup process and configure logrotate. Test ring buffer
behavior by filling the video partition to capacity during QA. Verify that the
cleanup process runs correctly and that no recording gaps result from deletion timing.
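The deletion side of the ring buffer can be sketched as a keep-newest-N prune. Production cleanup would normally key on partition usage (df) rather than a fixed file count, and the flat segment-file layout is an assumption:

```shell
# prune_oldest DIR KEEP: delete the oldest files in DIR until at most KEEP remain.
# Orders by modification time; assumes a flat directory of recording segments.
prune_oldest() {
  local dir=$1 keep=$2
  ls -1t "$dir" | tail -n +"$((keep + 1))" | while read -r f; do
    rm -f -- "$dir/$f"
  done
}
```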
[ ] Configure systemd service for auto-start
Create a systemd service unit for the inference application. Set
Restart=on-failure and RestartSec=10s for automatic
recovery. Enable the service: sudo systemctl enable inference-pipeline.service.
Test by rebooting the node and verifying the pipeline starts automatically.
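A minimal unit file sketch for a Docker-based pipeline; the container image name and service details are placeholders:

```ini
# /etc/systemd/system/inference-pipeline.service — minimal sketch
# (image "my-inference-app:1.4.2" is a placeholder; pin your real version)
[Unit]
Description=Edge inference pipeline
Wants=network-online.target
After=network-online.target docker.service
Requires=docker.service

[Service]
Restart=on-failure
RestartSec=10s
ExecStart=/usr/bin/docker run --rm --name inference --runtime nvidia my-inference-app:1.4.2
ExecStop=/usr/bin/docker stop inference

[Install]
WantedBy=multi-user.target
```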
[ ] Validate inference accuracy end-to-end
Run a known test scenario through the pipeline — a person walking in frame, a
vehicle of known type, or an object in a known position — and verify that detections
are correct and alerts fire as expected. Do not assume the pipeline works correctly
in production without end-to-end validation.
Phase 5: Monitoring and Maintenance
This phase ensures the first sign of a problem is an alert, not a site visit.
[ ] Configure SMART monitoring for NVMe
Install smartmontools and configure smartd to run weekly
health checks. Log SMART attributes to a remote monitoring system. Alert when
Percentage Used exceeds 80% or Available Spare drops below 20%.
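The alert threshold can be checked by parsing smartctl output. A sketch, assuming the standard "Percentage Used" field name that smartmontools reports for NVMe drives:

```shell
# pct_used: extract the NVMe "Percentage Used" value from smartctl output on stdin.
pct_used() {
  grep -i "percentage used" | awk -F: '{gsub(/[ %]/, "", $2); print $2}'
}

# sudo smartctl -a /dev/nvme0 | pct_used    # alert when this exceeds 80
```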
[ ] Set up thermal monitoring
Log junction temperatures via tegrastats or a custom script reading
/sys/devices/virtual/thermal/thermal_zone*/temp. Alert when any thermal
zone exceeds 90°C. Sustained throttling indicates an enclosure thermal design issue
that must be resolved before leaving the node in production.
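A polling sketch for the sysfs path above; the function takes an optional base glob so it can be pointed at a test directory, and zone files report millidegrees Celsius:

```shell
# max_temp_c [GLOB]: highest thermal zone temperature in whole degrees C.
# Defaults to the Jetson sysfs path; each zone's temp file holds millidegrees.
max_temp_c() {
  local base=${1:-/sys/devices/virtual/thermal/thermal_zone*}
  local max=0 t f
  for f in $base/temp; do          # unquoted on purpose: expand the glob
    [ -r "$f" ] || continue
    t=$(cat "$f")
    [ "$t" -gt "$max" ] && max=$t
  done
  echo $((max / 1000))
}

# Cron or systemd-timer usage sketch:
# [ "$(max_temp_c)" -ge 90 ] && logger -p user.warning "thermal zone >= 90C"
```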
[ ] Configure NUT for UPS monitoring
If a UPS is installed (recommended — see
power and UPS for edge deployments),
configure NUT to monitor battery state and trigger graceful shutdown when battery
falls below 30% charge or 3 minutes estimated runtime.
[ ] Establish pipeline health metrics
Export metrics from the inference pipeline: frames processed per second, inference
latency (p50, p95, p99), dropped frames, missed detections on test scenes.
A healthy pipeline should show stable FPS and latency over 24-hour periods with no
degradation trend.
[ ] Document node configuration
Record: JetPack version, TensorRT version, model name and version, RTSP stream URLs,
power mode, partition layout, NVMe model and serial number, UPS model, and network
configuration (IPs, VLANs). Store this in a version-controlled configuration repository.
Edge nodes deployed without documentation become unmaintainable.
Checklist Summary Table
| Phase | Key Tasks | Risk If Skipped |
|---|---|---|
| 1. Initial Setup | Flash JetPack, update packages, verify GPU | Unpatched CVEs, runtime version mismatch |
| 2. Storage & Power | NVMe layout, Docker root, nvpmodel, swap | OS storage overflow, thermal throttling, latency spikes |
| 3. Security Hardening | SSH keys, firewall, service disable, camera passwords, VLAN verify | Network intrusion, camera takeover, data exfiltration |
| 4. Pipeline Deployment | Docker, RTSP config, TensorRT build, ring buffer, systemd, validation | Pipeline failure on reboot, recording gaps, incorrect inference |
| 5. Monitoring | SMART, thermal, UPS, pipeline metrics, documentation | Silent failures, undetected drive wear, unrecoverable outages |
Common Pitfalls
- Skipping security hardening because the node is "on a private network": Private networks get breached. A Jetson on the corporate LAN with default SSH passwords and no firewall is one lateral move away from a significant incident.
- Not testing auto-start after reboot: A pipeline configured to start manually works fine during setup and fails silently after the first unexpected reboot. Test reboot recovery during QA, not after deployment.
- Building TensorRT engines on a different Jetson model: TensorRT engine files are not portable between different Jetson modules. An engine built on an Orin NX will not run on an Orin Nano and vice versa. Build engines on the exact target hardware.
- Deploying without end-to-end pipeline validation: Testing the inference model in isolation does not validate the full pipeline — RTSP ingestion, decode, pre-processing, inference, post-processing, and alert logic all need end-to-end testing with real camera input.
- Not pinning Docker image versions: docker pull my-inference-app:latest in a production deployment script will silently pull a new image version on the next update run, potentially breaking the pipeline with incompatible changes.
- Documenting nothing: Edge nodes deployed without configuration documentation are effectively black boxes. When the node fails 18 months later, the team has to reconstruct every configuration decision from scratch.
Final Sign-Off Checklist
- ☐ Node rebooted and inference pipeline confirmed auto-started via systemd?
- ☐ End-to-end pipeline validation completed with real camera input and a known test scenario?
- ☐ SMART monitoring confirmed active with alert threshold set at 80% Percentage Used?
- ☐ Thermal monitoring confirmed active; no thermal zones exceeding 85°C under sustained load?
- ☐ Node configuration documented and stored in version control (JetPack version, model, NVMe serial, VLAN layout)?
Frequently Asked Questions
How long does TensorRT engine optimization take?
Engine build time depends on model size and optimization level. YOLOv8n INT8: 3–8 minutes on Orin Nano. YOLOv8l INT8: 15–30 minutes on Orin NX. Build once, cache the engine file. Rebuild only when the model changes or JetPack is updated.
Should I use Docker or run the inference application natively?
Docker is strongly recommended for production deployments. It provides dependency isolation, reproducible environments, and simplified updates via image version pinning. The NVIDIA Container Toolkit provides full GPU access from within Docker containers on Jetson.
How do I remotely access a deployed Jetson?
SSH over the management VLAN is standard. For nodes without a VPN or direct network path, a reverse SSH tunnel or a lightweight VPN (WireGuard) allows secure remote access without exposing SSH to the internet. Never expose SSH directly to a public IP.
What is the recommended JetPack upgrade procedure?
JetPack upgrades (e.g., from 6.0 to 6.1) typically require a full re-flash. Document the current configuration, snapshot any custom configuration files, re-flash the new JetPack, then re-apply configuration from your documentation. A configuration-as-code approach (Ansible, custom provisioning scripts) makes this substantially faster.
How do I handle failed inference pipeline containers?
Configure the systemd service or Docker restart policy (--restart=on-failure:5) for automatic restart. Add a watchdog process that monitors pipeline liveness (e.g., checks that frames are being processed at expected rate) and restarts the container if the pipeline stalls without crashing.
Is there a way to A/B test model versions on a deployed node?
One approach: maintain two model directories (model-a and model-b) and an environment variable in the container configuration that selects the active model. Switching models becomes a container restart with the environment variable changed — no re-deployment of the container image required.
The Bottom Line
A Jetson deployment is production-ready only when it survives reboot, writes to durable storage, runs with monitoring, and is hardened before network exposure. A model working once in the lab is not a production sign-off.
Recommended Reading
- Jetson vs Coral TPU: Choosing the Right Accelerator Before Deployment
- RAM Sizing for Edge AI Inference
- Storage Layout and Ring Buffer Design for Edge AI
- Network Hardening and VLAN Setup for Edge AI Nodes
- Fanless Enclosure Selection and Thermal Validation
- YOLOv8 RAM Requirements on Jetson: What to Expect