UPS Sizing for Edge AI Systems: Runtime Calculation, Power Draw, and Capacity Planning

Q: How do I calculate UPS capacity for multi-GPU edge systems?

Sum peak power of all GPUs, CPU, storage, and network under full inference load. Apply formula: (Peak W × Runtime Hours × 1.25) = Required Wh. Example: 500W × 0.25h × 1.25 = 156Wh. Verify both Wh capacity and VA/W power rating. Real-world runtime typically 70–80% of rated; at 45°C ambient, deduct additional 20–30%.

Q: What runtime is sufficient for edge AI graceful shutdown?

5–10 min for stateless inference; 10–20 min for model checkpointing and state persistence. Critical applications: 20+ min to ensure network notification and fallback activation. Measure your shutdown procedure, add 20% buffer, and account for network infrastructure protection (switch, modem, router).

Q: How does temperature affect UPS battery runtime?

Lithium-ion capacity degrades ~10–15% per 10°C above 25°C. At 45°C ambient, expect 20–30% runtime loss; at 55°C, 40–50% loss. Factor this into capacity planning: multiply calculated Wh by 1.25 for warm climates. Implement thermal management: ventilation, UPS isolation from heat sources, continuous temperature monitoring.

Q: What UPS topology is best for edge AI redundancy?

Hot-standby dual UPS (primary + secondary with automatic switchover) reduces MTTR and eliminates single-point failure. Requires compatible models and controller. Parallel UPS (both supply simultaneously) provides better load sharing. For non-critical: single UPS with quarterly health checks and documented replacement plan.

Last updated: March 2026

This page is the canonical authority on UPS sizing math for edge AI: measuring peak power draw, calculating Wh capacity with thermal derating, distinguishing Wh vs VA ratings, and planning for real-world runtime (typically 70–80% of rated). Most deployments at 300–800W require 150–300Wh with proper thermal management and shutdown buffer planning.

3 core inputs

1.25× multiplier

70–80% real runtime

Wh ≠ VA rating

Scope of This Page: This article focuses on UPS capacity calculation, runtime planning, Wh vs VA distinction, thermal derating, and shutdown window math. For full system power architecture spanning compute selection, PoE switch sizing, camera power budgets, and deployment topology strategy, see Power and UPS for Edge Deployments.

Quick Answer: UPS sizing requires three inputs: (1) peak power draw measured under realistic load, (2) minimum runtime for graceful shutdown (15–20 min typical), (3) ambient temperature and battery age derating. Formula: (Peak Power W × Runtime Hours × 1.25) = Required Wh. Example: 500W × 0.25 hours × 1.25 = 156Wh minimum. Critical: Actual runtime is typically 70–80% of rated time due to thermal derating, inverter losses, and aging. At 45°C ambient, expect 20–30% runtime loss. Always verify that the UPS power rating (W or VA) exceeds your peak load; Wh capacity is only half the picture. A 200Wh UPS whose inverter cannot deliver 500W will fail under a 500W load, even though it stores enough energy.

Who This Page Is For

Deployment engineers sizing UPS backup for single-node or distributed edge AI systems
Operations teams designing power resilience for mission-critical inference pipelines
Infrastructure planners calculating total power budget including backup and redundancy
System integrators evaluating UPS battery chemistry (lithium-ion vs. lead-acid) trade-offs for edge hardware

Power Draw Estimation for Edge AI Inference

Accurate UPS sizing begins with measuring or estimating the peak power consumption of all edge AI hardware under full inference load. Edge AI accelerators (GPUs, TPUs, and NPUs) are the primary power consumers, typically drawing 15–150W per device depending on architecture and utilization. NVIDIA Jetson modules range from 5W (Jetson Nano) to 70W (Jetson AGX Orin), while dedicated inference accelerators like Google TPUs consume 15–40W per unit. CPU overhead adds 10–40W depending on processor class and concurrent workload complexity. The battery figures on this page assume lithium-ion chemistry (which has a different thermal and aging profile from lead-acid) and a 25°C ambient baseline; verify derating for your deployment temperature.

Beyond compute, storage and network interfaces contribute measurable power. NVMe SSDs consume 5–10W during sustained writes; Ethernet switches and wireless modules add 2–8W. In a multi-device edge cluster, these peripheral loads accumulate quickly. A realistic power budget for a compact edge AI box running dual GPUs, CPU, storage, and networking often reaches 300–500W sustained, with peak transient spikes 20–30% higher during model loading or batch inference.

Measurement is preferable to estimation. Deploy the target inference workload on candidate hardware and log power draw over 10–15 minutes using inline power meters or system monitoring APIs. This captures realistic sustained consumption rather than theoretical maximums. Document peak values, average utilization, and idle baselines separately—UPS sizing should target sustained peak load, not absolute transient spikes, unless your deployment includes sudden burst patterns.
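As a rough sketch of the logging step above, the following Python summarizes a power log into idle, average, sustained-peak, and transient values. The one-sample-per-second rate, the 30-sample rolling window, and the sample readings are illustrative assumptions, not values from a specific meter or monitoring API.

```python
from statistics import mean

def summarize_power_log(samples_w, window=30):
    """Summarize power readings in watts (assumed one sample per second).

    sustained_peak_w is the highest rolling mean over `window` samples;
    transient_peak_w is the single highest reading.
    """
    if len(samples_w) < window:
        raise ValueError("need at least one full window of samples")
    rolling = [
        mean(samples_w[i:i + window])
        for i in range(len(samples_w) - window + 1)
    ]
    return {
        "idle_w": min(samples_w),
        "average_w": mean(samples_w),
        "sustained_peak_w": max(rolling),
        "transient_peak_w": max(samples_w),
    }

# Hypothetical log: 60 s idle, then 120 s of inference with a 2 s spike.
log = [80.0] * 60 + [450.0] * 60 + [600.0] * 2 + [450.0] * 58
summary = summarize_power_log(log, window=30)
for name, value in summary.items():
    print(f"{name}: {value:.1f}")
```

A rolling mean keeps a two-second spike from dominating the energy (Wh) calculation, while the transient value is still reported separately for checking the UPS power rating.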

Hardware Component	Typical Power Range	Notes
GPU (inference-optimized)	5–70W	Jetson Nano to AGX Orin; scales with model complexity
TPU / NPU	15–40W	Edge TPUs, Qualcomm Hexagon; lower power than GPU
CPU (ARM/x86)	10–40W	Quad-core ARM at 2GHz to 8-core x86 at 3GHz
Storage (NVMe)	5–10W	Active write; lower during reads and idle
Network (Ethernet/WiFi)	2–8W	Active transmission; varies by interface type
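When hardware is not yet available to measure, a component-level budget can be summed from ranges like those in the table. The sketch below is only an illustration: the component names and wattages are hypothetical picks from the upper end of each range, and the 25% transient margin is an assumed midpoint of the 20–30% spike range mentioned earlier.

```python
def power_budget(components_w, transient_margin=0.25):
    """Return (sustained_w, transient_w) for a dict of component loads."""
    sustained = sum(components_w.values())
    transient = sustained * (1 + transient_margin)
    return sustained, transient

# Hypothetical box using the upper end of each range in the table.
box = {
    "gpu_0": 70,
    "gpu_1": 70,
    "cpu": 40,
    "nvme": 10,
    "network": 8,
}
sustained_w, transient_w = power_budget(box)
print(f"sustained {sustained_w} W, transient allowance {transient_w:.1f} W")
```

Fans, power-supply losses, and other system overhead are not in the table, so a measured total can be noticeably higher than a sum like this.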

Runtime Requirements and Shutdown Scenarios

UPS runtime must accommodate the time needed to safely terminate inference workloads and persist critical state. Stateless inference—single-request processing with no persistent model state—requires only 5–10 minutes to complete in-flight requests, flush output queues, and notify upstream systems. However, most production edge AI deployments involve model checkpointing, session state persistence, or database synchronization, extending safe shutdown to 10–20 minutes.

Critical applications demand additional margin. If your edge AI system manages real-time control loops, autonomous decisions, or financial transactions, allocate 20+ minutes to ensure model state is safely written to persistent storage, network acknowledgments are received, and fallback systems are notified. This prevents data loss and orphaned transactions during unexpected power loss.

Graceful shutdown procedures should be automated and tested. Define a shutdown sequence: (1) stop accepting new inference requests, (2) allow in-flight requests to complete (timeout after 2–3 minutes), (3) flush output buffers and message queues, (4) checkpoint model state and session data, (5) notify upstream services and monitoring systems, (6) unmount network filesystems cleanly. Measure this sequence under realistic load and add 20% buffer to your calculated runtime requirement.
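One way to measure the sequence above is to time each step and apply the 20% buffer to the total. This is a minimal sketch: the step names mirror the list above, but the placeholder action (a short sleep) is a stand-in that would be replaced by real shutdown hooks in your own stack.

```python
import time

def timed_shutdown(steps, buffer=0.20):
    """Run (name, action) steps in order; return per-step timings,
    the measured total, and the total with the safety buffer applied."""
    timings = {}
    for name, action in steps:
        start = time.perf_counter()
        action()
        timings[name] = time.perf_counter() - start
    total_s = sum(timings.values())
    return timings, total_s, total_s * (1 + buffer)

def placeholder_action():
    """Stand-in for a real shutdown action (hypothetical)."""
    time.sleep(0.01)

steps = [
    ("stop_accepting_requests", placeholder_action),
    ("drain_in_flight_requests", placeholder_action),
    ("flush_buffers_and_queues", placeholder_action),
    ("checkpoint_state", placeholder_action),
    ("notify_upstream", placeholder_action),
    ("unmount_network_filesystems", placeholder_action),
]
timings, total_s, required_s = timed_shutdown(steps)
print(f"measured {total_s:.2f} s, plan for {required_s:.2f} s")
```

Running this under realistic load, rather than on an idle system, matters because checkpoint and flush times grow with queue depth and model size.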

Network connectivity during power loss complicates shutdown. If your edge AI system relies on cloud synchronization or remote logging, ensure the UPS also protects network infrastructure (switch, modem, router). A UPS protecting only the compute hardware while network infrastructure loses power will fail to notify upstream systems. Plan accordingly in your power budget and UPS topology.

UPS Capacity Calculation and Headroom

UPS capacity is measured in watt-hours (Wh), representing the total energy available from the battery. The capacity calculation formula is:

(Peak Power in Watts × Runtime in Hours × 1.25) = Required Wh Rating

The 1.25 multiplier adds a 25% safety margin. This headroom accounts for: (1) battery aging (capacity decreases ~2–3% annually), (2) thermal derating (10–15% per 10°C above 25°C ambient), (3) inverter losses (inverter efficiency is typically 85–98%), and (4) depth-of-discharge margin (avoid cycling to 100% DoD to extend battery life). Critical: Real-world runtime is typically 70–80% of rated time. A UPS rated for 20 minutes at 500W under ideal conditions will deliver ~14–16 minutes at 45°C ambient or after 2–3 years of aging.
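The 70–80% rule of thumb can be applied directly to a vendor's rated runtime. The helper below is a small sketch; the function name is hypothetical, and the 0.70 and 0.80 factors are simply the bounds quoted above.

```python
def real_world_runtime(rated_min, low=0.70, high=0.80):
    """Return the (pessimistic, optimistic) expected runtime in minutes."""
    return rated_min * low, rated_min * high

low_min, high_min = real_world_runtime(20)
print(f"expect roughly {low_min:.0f}-{high_min:.0f} minutes")
```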

Example calculation: A 500W peak load requiring 15 minutes of runtime:

500W × 0.25 hours = 125 Wh base requirement
125 Wh × 1.25 (safety margin) = 156 Wh minimum UPS capacity
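The same arithmetic can be expressed as a small function. This is a sketch of the formula as stated on this page; the function and parameter names are illustrative.

```python
def required_wh(peak_w, runtime_min, safety=1.25):
    """Required UPS energy in Wh: peak watts x runtime hours x safety."""
    return peak_w * (runtime_min / 60.0) * safety

for load_w, minutes in [(300, 15), (500, 15), (800, 15), (500, 20)]:
    print(f"{load_w} W for {minutes} min: {required_wh(load_w, minutes):.1f} Wh")
```

Taking runtime in minutes and converting inside the function avoids a common slip of passing 15 where 0.25 hours is expected.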

Planning Takeaway — Wh vs. VA Distinction: UPS specifications often list both watt-hours (Wh = energy capacity) and volt-amperes (VA = power handling). For edge AI systems, verify both: calculate Wh for runtime duration, and verify VA rating exceeds your peak power draw (e.g., 500W peak requires a 600–750VA UPS minimum).

In practice, select a UPS rated for 160–200Wh to provide additional aging margin and accommodate unexpected power surges. Larger deployments need more: an 800W load with a 20-minute runtime works out to about 333Wh by the same formula, so 350–400Wh units become necessary.

Verify the UPS specifications include both capacity (Wh) and power rating (W). A UPS must handle peak power draw; an undersized unit may fail to supply full current even if total energy is adequate. Check the inverter's sustained and peak power ratings match your hardware's demands.
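The two checks described above (energy and power) can be sketched as follows. The 0.8 power factor is an assumption about the ratio of a UPS's watt rating to its VA rating; many units fall somewhere around 0.6–0.9, so check the datasheet. The function names are hypothetical.

```python
def min_va_rating(peak_w, power_factor=0.8):
    """Smallest VA rating that covers peak_w at the assumed W/VA ratio."""
    if not 0 < power_factor <= 1:
        raise ValueError("power_factor must be in (0, 1]")
    return peak_w / power_factor

def ups_shortfalls(ups_wh, ups_w, need_wh, peak_w):
    """List which requirements a candidate UPS fails to meet."""
    problems = []
    if ups_wh < need_wh:
        problems.append("energy")
    if ups_w < peak_w:
        problems.append("power")
    return problems

print(min_va_rating(500))                    # VA needed at the assumed 0.8 ratio
print(ups_shortfalls(200, 400, 156.25, 500)) # enough Wh, inverter too small
print(ups_shortfalls(200, 600, 156.25, 500)) # passes both checks
```

The second call illustrates the failure mode described above: a unit with enough stored energy can still be unable to carry the load.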

Sustained Load	Runtime Target	Recommended UPS Capacity
300W	15 min	100–150Wh
500W	15 min	160–200Wh
800W	15 min	250–300Wh
500W	20 min	210–250Wh

Battery Chemistry and Thermal Considerations

Lithium-ion UPS systems dominate edge AI deployments due to higher energy density, longer cycle life (1000+ cycles), and faster charging. They offer 95–98% inverter efficiency, minimizing energy waste as heat. Lead-acid alternatives are cheaper but heavier, less efficient (85–90%), and degrade faster in high-temperature environments—typically unsuitable for edge AI where space and thermal management are critical.

Temperature dramatically affects battery runtime. Lithium-ion capacity degrades approximately 10–15% per 10°C above 25°C ambient. At 45°C (common in outdoor or industrial edge deployments), expect 20–30% runtime loss. At 55°C, runtime may drop 40–50%. Conversely, cold temperatures below 0°C also reduce capacity, though less severely than heat.
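The per-10°C rule above can be turned into a simple derating factor. This is a linear sketch under stated assumptions: it uses the 25°C baseline and the 10–15% per 10°C range from this page, ignores cold-temperature effects, and will understate the loss at 55°C relative to the 40–50% figure quoted above, so treat it as a first estimate rather than a battery model.

```python
def thermal_runtime_factor(ambient_c, loss_per_10c=0.125, baseline_c=25.0):
    """Fraction of rated runtime expected at ambient_c (linear model)."""
    excess_c = max(0.0, ambient_c - baseline_c)
    loss = min(1.0, loss_per_10c * excess_c / 10.0)
    return 1.0 - loss

for temp_c in (25, 35, 45):
    best = thermal_runtime_factor(temp_c, loss_per_10c=0.10)
    worst = thermal_runtime_factor(temp_c, loss_per_10c=0.15)
    print(f"{temp_c} C: {worst:.2f}-{best:.2f} of rated runtime")
```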

For edge AI systems deployed in warm climates or enclosed cabinets, implement thermal management: ensure adequate ventilation around the UPS, use thermal padding to isolate the UPS from external heat sources, and monitor battery temperature via UPS management software. Some advanced UPS models include temperature sensors and automatic load shedding if battery temperature exceeds safe thresholds (typically 50–55°C).

Cycle life and depth of discharge (DoD) matter for long-term reliability. Lithium-ion batteries rated for 1000 cycles at 80% DoD will degrade faster if regularly discharged to 100%. Design your UPS capacity with 20–30% reserve—do not size a UPS to discharge completely under normal shutdown scenarios. This extends battery lifespan from 3–5 years to 5–7 years in typical edge deployments.
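To see how the reserve relates to depth of discharge, the sketch below computes DoD for a shutdown event and the minimum rated capacity for a DoD ceiling. It assumes a battery still at its rated capacity (no aging or thermal loss), and the function names are illustrative.

```python
def depth_of_discharge(energy_drawn_wh, rated_wh):
    """Fraction of rated capacity consumed by one discharge event."""
    return energy_drawn_wh / rated_wh

def min_rated_wh(energy_drawn_wh, max_dod=0.80):
    """Smallest rated capacity that keeps the event under max_dod."""
    return energy_drawn_wh / max_dod

# A 125 Wh shutdown drawn from a 156.25 Wh battery.
print(depth_of_discharge(125, 156.25))
print(min_rated_wh(125, max_dod=0.80))
```

Under these assumptions, a 1.25 sizing multiplier corresponds to an 80% DoD ceiling, which means aging or heat will push the actual DoD higher unless extra capacity is added.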

Failure Mode Protection and Redundancy

Single-UPS configurations carry inherent risk: a UPS failure or battery degradation leaves edge AI systems vulnerable to power loss. Redundancy strategies mitigate this:

Hot-Standby Dual UPS: Two UPS units feed the load through automatic switchover logic (for example, an automatic transfer switch). The primary UPS supplies power; the secondary stays on standby and takes over if the primary fails. This approach reduces mean time to repair (MTTR) and protects against single-unit failure. Requires compatible UPS models and a management controller to coordinate switchover.

Parallel UPS Configuration: Both UPS units supply power simultaneously, sharing the load. If one fails, the other continues supplying full power (assuming adequate capacity). This provides better load balancing and faster recovery but requires careful synchronization of inverter outputs.

Hot-Swap Battery Modules: Some enterprise UPS systems support battery cartridge replacement without powering down. This reduces downtime during battery maintenance or replacement and is valuable for mission-critical edge AI deployments.

For most edge AI deployments, a single UPS with quarterly battery health checks and a documented replacement plan is sufficient. However, if your edge AI system is deployed in a remote location or supports critical infrastructure, dual UPS with automatic switchover is justified.

Monitoring and Load Shedding Strategies

UPS management software provides visibility into battery health, remaining runtime, and load draw. Integrate this data into your edge AI monitoring stack: alert operations if UPS capacity drops below 80%, battery age exceeds 4 years, or estimated runtime falls below your shutdown window.
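The alert thresholds above could be evaluated with a check like the following. This is a sketch only: the `UpsStatus` fields are hypothetical and would need to be mapped from whatever your UPS management software actually reports.

```python
from dataclasses import dataclass

@dataclass
class UpsStatus:
    capacity_pct: float       # remaining capacity vs. rated, in percent
    battery_age_years: float
    est_runtime_min: float    # runtime estimate at current load

def ups_alerts(status, shutdown_window_min):
    """Return a list of alert strings for the thresholds on this page."""
    alerts = []
    if status.capacity_pct < 80:
        alerts.append("capacity below 80%")
    if status.battery_age_years > 4:
        alerts.append("battery older than 4 years")
    if status.est_runtime_min < shutdown_window_min:
        alerts.append("runtime below shutdown window")
    return alerts

status = UpsStatus(capacity_pct=76, battery_age_years=2.5, est_runtime_min=12)
print(ups_alerts(status, shutdown_window_min=15))
```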

Load shedding extends runtime significantly. During battery mode, disable non-critical services to reduce power consumption: suspend logging and telemetry, disable secondary network interfaces, pause non-essential background tasks. This can extend runtime 30–50% without affecting core inference. For example, reducing load from 500W to 350W extends a 15-minute runtime to roughly 21 minutes (about 43% longer), assuming runtime scales inversely with load.
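The effect of shedding load can be estimated by assuming the battery delivers the same usable energy at either load, so runtime scales inversely with power. That assumption ignores rate-dependent battery and inverter effects, so the sketch below is a first-order estimate; the function name is illustrative.

```python
def runtime_after_shedding(base_runtime_min, base_load_w, shed_load_w):
    """Estimated runtime at the reduced load, assuming constant energy."""
    if shed_load_w <= 0:
        raise ValueError("shed_load_w must be positive")
    return base_runtime_min * base_load_w / shed_load_w

for shed_w in (400, 350, 300):
    minutes = runtime_after_shedding(15, 500, shed_w)
    gain_pct = (minutes / 15 - 1) * 100
    print(f"500 W -> {shed_w} W: {minutes:.1f} min (+{gain_pct:.0f}%)")
```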

Automate load shedding via UPS management APIs. When battery mode is triggered, send a signal to your edge AI orchestration layer to activate a "power conservation" profile. This profile stops accepting new inference requests, disables non-critical services, and prioritizes state persistence. After power is restored, the system returns to normal operation.

Monitor battery voltage and temperature continuously. If temperature exceeds 50°C or battery voltage falls more than 10% below nominal, activate emergency shutdown rather than continuing inference. This prevents battery damage and data loss from unexpected power failure.
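The emergency-shutdown rule can be sketched as a single predicate. Note the assumption: the voltage condition is interpreted here as the battery voltage sagging more than 10% below nominal, and the 50°C limit is taken from the text. Both thresholds should be checked against your UPS vendor's guidance.

```python
def should_emergency_shutdown(temp_c, voltage_v, nominal_v,
                              max_temp_c=50.0, max_sag=0.10):
    """True if the battery is too hot or its voltage has sagged too far.

    max_sag is the allowed fractional drop below nominal voltage
    (assumed interpretation of the 10% threshold).
    """
    too_hot = temp_c > max_temp_c
    too_low = voltage_v < nominal_v * (1 - max_sag)
    return too_hot or too_low

print(should_emergency_shutdown(temp_c=51, voltage_v=48.0, nominal_v=48.0))
print(should_emergency_shutdown(temp_c=40, voltage_v=42.0, nominal_v=48.0))
print(should_emergency_shutdown(temp_c=40, voltage_v=47.0, nominal_v=48.0))
```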

Decision Framework: From Measurement to Deployment

UPS sizing follows a structured four-step process:

Step 1: Measure Peak Power Draw
Deploy your target inference workload on candidate hardware for 10–15 minutes under realistic load. Log power consumption using inline meters or system APIs. Record sustained peak, average, and idle values. If hardware is not yet available, consult manufacturer datasheets and add 20% for system-level overhead (power conditioning, fans, etc.).

Step 2: Define Required Runtime
Document your graceful shutdown procedure and measure its duration under load. Add 20% buffer. For stateless inference, 10 minutes suffices; for state-persistent systems, target 15–20 minutes. Critical applications should plan for 20+ minutes to ensure network notification and fallback activation.

Step 3: Calculate Capacity and Verify at Ambient Temperature
Apply the formula: (Peak Power × Runtime Hours × 1.25). Verify the result against expected ambient temperature using the 10–15% per 10°C degradation curve. If your deployment is in a 45°C environment, multiply the calculated capacity by 1.25 to compensate for thermal losses; for the 500W, 15-minute example, that raises 156Wh to about 195Wh. Select a UPS model that meets this adjusted requirement and verify its power rating (W) exceeds your peak load.

Step 4: Assess Redundancy and Monitoring Needs
For non-critical deployments, a single UPS with quarterly health checks is acceptable. For mission-critical systems, implement dual UPS with automatic switchover and continuous monitoring integration. Define alert thresholds and test failover scenarios quarterly.
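The numeric parts of the steps above can be chained into one calculation. This is a sketch under stated assumptions: the `hot_site` flag stands in for "deployment around 45°C ambient", the multipliers are the ones quoted on this page, and the function name and return keys are hypothetical.

```python
def size_ups(peak_w, measured_shutdown_min, hot_site=False,
             runtime_buffer=0.20, safety=1.25, thermal=1.25):
    """Return the runtime target, required Wh, and minimum power rating."""
    runtime_min = measured_shutdown_min * (1 + runtime_buffer)
    wh = peak_w * (runtime_min / 60.0) * safety
    if hot_site:
        wh *= thermal  # extra capacity for roughly 45 C ambient
    return {
        "runtime_min": runtime_min,
        "required_wh": wh,
        "min_power_w": peak_w,
    }

# Hypothetical site: 500 W peak, 12.5 min measured shutdown, hot enclosure.
plan = size_ups(peak_w=500, measured_shutdown_min=12.5, hot_site=True)
for key, value in plan.items():
    print(f"{key}: {value:.1f}")
```

Redundancy and monitoring decisions are not numeric in the same way, so they are left out of the function and handled as a separate review.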

Frequently Asked Questions

How do I calculate UPS capacity for multi-GPU edge systems?

Sum the peak power of all GPUs, CPU, storage, and network hardware under full inference load. Multiply by desired runtime in hours, then apply a 1.25 safety multiplier. Example: 500W peak × 0.25 hours × 1.25 = 156Wh minimum. Select a UPS rated 160–200Wh to provide aging margin.

What runtime is sufficient for edge AI graceful shutdown?

5–10 minutes for stateless inference; 10–20 minutes for model checkpointing and queue flushing. Critical applications should target 20+ minutes to ensure safe state persistence, network notification, and fallback system activation.

How does temperature affect UPS battery runtime?

Lithium-ion runtime degrades approximately 10–15% per 10°C above 25°C ambient. At 45°C, expect 20–30% runtime loss. Implement thermal management (ventilation, isolation, monitoring) to maintain consistent performance.

Should I use load shedding during battery mode?

Yes. Disable non-critical services (logging, telemetry, secondary networks) to extend runtime 30–50%. Prioritize inference and state persistence. Automate via UPS management software triggered on battery mode detection.

What UPS topology is best for edge AI redundancy?

Dual UPS units with automatic switchover (hot-standby) reduce single-point failure risk and support maintenance without downtime. Requires compatible UPS models and a management controller. For non-critical deployments, a single UPS with quarterly health checks is sufficient.

Bottom Line

UPS sizing for edge AI is a practical exercise grounded in three measurements: peak power draw, required runtime for graceful shutdown, and ambient temperature effects on battery chemistry. A typical deployment consuming 300–800W requires a 150–300Wh UPS rated for 15-minute runtime with 20–30% safety headroom. Lithium-ion chemistry, thermal management, and automated load shedding extend reliability. For mission-critical systems, redundant UPS configurations and continuous monitoring provide additional assurance. Test your shutdown procedures and battery health quarterly to maintain uptime and prevent data loss during unexpected power events.

Recommended next step: Use the Power Budget Planner to measure or estimate peak hardware load, then apply the capacity formula with your shutdown runtime. For multi-site deployments, use the Full Deployment Planner to coordinate UPS capacity across all edge nodes and define redundancy strategy.
