I decided to write this blog post after attempting to configure APC’s PowerChute Network Shutdown (PCNS) software to gracefully shut down a number of Windows and Linux computers on my network, all of which are attached to various APC SmartUPS Uninterruptible Power Supply (UPS) units with AP9617 and AP9630 Network Management Cards (NMC) installed.
Throughout that process, I found the configuration options very confusing. Even after reading the online documentation and the software’s built-in help files, I still scratched my head over terms like Start of Shutdown, Low Battery Duration, Shutdown Delay, Return Delay, and Maximum Required Delay. But after some reading, experimentation, a phone call to APC, and some excellent help from a couple of techs in APC’s forums (word to my homies Angela and Bill!), I think I’ve finally figured it out… and I hope this article can help you figure it out, too. Please note that the information in this post applies directly to traditional APC UPS units that communicate via the UPS Link protocol. Some of the terms, principles, and methods discussed here will also apply to their newer MicroLink units, but the timelines discussed are specific to the UPS Link units.
OS Shutdown vs. UPS Turn Off
When we talk about “gracefully shutting down” in the event of a prolonged power failure, we’re really talking about shutting down or turning off of two separate items:
- Shutting down the operating system on the computer(s) plugged in to the UPS unit, and
- Turning off the actual UPS unit itself.
Of course, you may only care about item #1 — and figure that as long as your computer is safely shut down before the UPS unit’s battery runs out, it doesn’t really matter whether the UPS proactively turns itself off later in some “smart” way (based on some software setting, timer, or signal) or whether the UPS gets turned off in a “dumb” way by just running out of juice and dying.
However, the pieces of software we’re dealing with here do care about that distinction, and so whenever an APC product interface or documentation refers to “shutdown” (especially in older firmware versions), it can be hard to tell whether that means “shutting down” the OS on the PCNS client computer or “powering off” the UPS unit itself. So throughout this article, I’ll use the terms “OS shutdown” and “UPS turn off” to make it clear which one we’re talking about.
Choose Your Shutdown Method: “Time Since” (PCNS-driven) vs. “Time Remaining” (UPS-driven)
When your system loses power and needs to switch over to battery backup, APC calls that an “On Battery Event.” Depending on your UPS, the age and size of its battery, what type of stuff you have plugged into it, and whether it had a full charge when the On Battery Event occurred, your UPS can keep your system running without external power for some period of time (called “runtime”). If the power comes back before you’re out of runtime, great. Your UPS switches back off battery and begins to recharge itself. But with a prolonged On Battery Event, you’ll eventually reach a point where you can’t wait for the power to come back any longer — let’s call that the “point of no return” — and that’s the point where you’ll want your system to safely shut itself down. Deciding what type of “point of no return” you want will determine how you configure your UPS’s NMC and your client systems’ PCNS settings. Basically, you need to ask yourself two questions:
- Is my “point of no return” based on how much time my system has been running on battery since losing power?
- Or is my “point of no return” based on how much time until my UPS runs out of batteries?
Your answer determines whether you prefer a “time since” shutdown method (which is driven by the Event Shutdown Delay setting in the PCNS client software), or a “time until” shutdown method (which is driven by the Low Battery Duration setting in the UPS’s NMC software).
The “time since” (or PCNS-driven) method means that PCNS is responsible for initiating the OS shutdown and UPS turn off sequences after X number of minutes have passed since the On Battery Event. This X number of minutes is the Event Shutdown Delay and is manually configured via the Configure Events page on your PCNS client web interface. The benefit of the “time since” approach is that it’s predictable. Assuming you choose an Event Shutdown Delay of 10 minutes, then you know that once the UPS has been on battery for 10 minutes, PCNS will launch the process of gracefully shutting everything down. The downside to this approach, however, is that its timeline is inflexible, which can be somewhat risky if you’re running low on runtime. If, for example, you set an Event Shutdown Delay of 10 minutes, but you only have 8 minutes of battery life available when the On Battery Event starts, your system won’t have enough time to wait through the Event Shutdown Delay, let alone any additional time to shut down the operating system, before your UPS runs out of juice and dies (although don’t worry, that’s not exactly what would happen). However, if you have plenty of battery runtime in the event of a power loss, this is probably a good option for you. Again, remember that a “time since losing power” OS shutdown is driven primarily by the PCNS client, which is where the Event Shutdown Delay is configured by the user.
The second option is the “time until” (or UPS-driven, or Low Battery Shutdown) method, meaning you configure the NMC on the UPS to initiate the OS shutdown and UPS turn off sequences when X number of minutes remain until the UPS runs out of battery. In this method, the X number of minutes is called the Low Battery Duration, which is configured by the user on the NMC. We’ll talk about how to pick an appropriate Low Battery Duration a bit later. For now, just know that when the On Battery Event occurs, your UPS will take over and provide backup power for as long as possible. Assuming you chose a Low Battery Duration of 15 minutes, then when your UPS has only 15 minutes of runtime remaining, the UPS will signal the PCNS client to begin the OS shutdown sequence. The UPS will then wait a pre-determined amount of time (we’ll discuss exactly how long in a little bit) to give the client(s) enough time to safely shut down the OS, after which the UPS will turn itself off. This “time until” approach is a little less predictable, but with the main benefit of allowing your system to keep running as long as possible during an On Battery Event. Again, remember that a “time until the battery dies” OS shutdown is driven primarily by the UPS, because that’s where the Low Battery Duration is configured by the user.
Now that you know the general difference between these two approaches, let’s take a closer look at how each of them works, and discuss how to set them up.
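If it helps to see the two triggers side by side, here’s a minimal Python sketch of which one would fire first during an outage. It’s my own illustration, not anything from APC: the function name is made up, and it assumes the UPS runtime counts down one second per second, which a real battery under a changing load won’t do exactly.

```python
def first_trigger(runtime_at_outage_s, event_shutdown_delay_s, low_battery_duration_s):
    """Return (method, seconds_after_outage) for whichever trigger fires first.

    Hypothetical helper for illustration only. Assumes runtime drains
    linearly (one second of runtime per second on battery).
    """
    # "Time since": PCNS acts once the Event Shutdown Delay has elapsed.
    time_since_at = event_shutdown_delay_s

    # "Time until": the UPS acts once remaining runtime drops to the
    # Low Battery Duration. If we start below that, it fires immediately.
    time_until_at = max(0, runtime_at_outage_s - low_battery_duration_s)

    # A tie is reported as "time since" (an arbitrary choice for this sketch).
    if time_since_at <= time_until_at:
        return ("time since", time_since_at)
    return ("time until", time_until_at)


# 30 minutes of runtime, 10 minute Event Shutdown Delay, 15 minute Low Battery Duration
print(first_trigger(30 * 60, 10 * 60, 15 * 60))
# Only 8 minutes of runtime, same 10 minute delay, 5 minute Low Battery Duration
print(first_trigger(8 * 60, 10 * 60, 5 * 60))
```

In the first case the PCNS timer wins at the 10 minute mark. In the second, the Low Battery Duration is reached after 3 minutes, well before the Event Shutdown Delay would have finished.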
“Time Since” (PCNS-driven) Shutdown Details
To explain exactly how PCNS drives the “time since the On Battery Event” method of shutting down, I stole this graphic from an APC help file:
The top line represents the client system where the OS and PCNS are installed (APC assumes it’s a server, but I also have PCNS running on my desktop workstation). The bottom line represents the NMC-connected UPS into which the computer is plugged. For this example, let’s assume an On Battery Event occurs at midnight, and your Event Shutdown Delay is set for 10 minutes. When the On Battery Event occurs at 12:00AM, PCNS will wait until 12:10AM (the length of the Event Shutdown Delay) before doing anything else. If the power comes back on at 12:07AM, the shutdown process stops and everything’s cool. But if the power’s still out when the clock reaches 12:10AM, this baby is shutting down… even if the power comes back on at 12:11AM.
At 12:10AM (where the graphic says “UPS turn off initiates”), PCNS reaches its “point of no return” and decides to shut down the OS. At the same time, PCNS sends a signal to the UPS saying “DUDE! THIS $#%& JUST GOT REAL!” and triggers the UPS turn off sequence. From this point forward, the OS shutdown sequence and the UPS turn off sequence occur in parallel.
On the PCNS client:
- When the end of the Event Shutdown Delay is reached and PCNS commits to shutting things down, it first checks to see if you have a Shutdown Command File configured on the Configure Shutdown page of your PCNS web interface. If not, it skips to the next step. The Shutdown Command File is a single file (likely some sort of batch script) that can execute a list of commands on the system that you’d like to happen before the OS shutdown command is issued. Maybe it saves files, or creates backups, or sends an email alert, or whatever. When configuring a Shutdown Command File in PCNS, you’ll also enter a time (in seconds) of how long the file requires to run. If a Shutdown Command File exists, PCNS will run it, wait the number of seconds you selected (called the Command File Execution Time), and then move on to the next step.
- The graphic shows a 70 second delay at this point, but what really happens is the combination of two separate delays — presumably to provide a bit of extra cushion to your Command File Execution Time. However, these two delays always occur, even if no Shutdown Command File is configured in PCNS. The first delay is a 10 second wait called the Shutdown Delay (to prevent confusion, I call this the “OS Shutdown Delay”) where PCNS simply counts to ten before issuing the final shutdown command to the OS. You can change this wait period via the shutdownDelay option in the PCNS configuration file, but there’s probably no need to.
- After the 10-second OS Shutdown Delay, PCNS issues the final shutdown command to the operating system. That OS-specific command includes an additional 60 second delay called the Shutdown Command Duration (which I call the “OS Shutdown Command Duration”), and that’s how the graphic arrives at a total of 70 seconds. For the super geeky, in Linux, the command is:
/sbin/shutdown -h -t 60
and in Windows it’s:
shutdown /s /f /t 60 /d UP:6:12
See the 60 in both commands? You can change this delay value via the shutdownCommandDuration option in the PCNS configuration file (but don’t unless you really need to, which you shouldn’t). At this point, the operating system has already received the command to shut down, and is just counting to 60 before it pulls the trigger.
- After the 10-second OS Shutdown Delay and the 60-second OS Shutdown Command Duration, the OS really and truly begins shutting itself down. If your operating system’s shutdown command includes a system halt (and it probably does), and your hardware supports it (which it also probably does), the computer will power off once the OS shutdown is complete.
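The client-side timeline above is just addition, so here’s a small Python sketch of it. The constant and function names are mine, and the 10 and 60 second values are the defaults described above (they would differ if you’ve changed shutdownDelay or shutdownCommandDuration).

```python
# Defaults as described in this post; both can be changed in the PCNS config file.
OS_SHUTDOWN_DELAY_S = 10             # the shutdownDelay option
OS_SHUTDOWN_COMMAND_DURATION_S = 60  # the shutdownCommandDuration option


def seconds_until_os_shutdown(command_file_execution_time_s=0):
    """Seconds from PCNS's "point of no return" until the OS starts shutting down.

    Illustrative helper: Command File Execution Time (0 if no command file),
    plus the OS Shutdown Delay, plus the OS Shutdown Command Duration.
    """
    return (command_file_execution_time_s
            + OS_SHUTDOWN_DELAY_S
            + OS_SHUTDOWN_COMMAND_DURATION_S)


print(seconds_until_os_shutdown())    # no Shutdown Command File configured
print(seconds_until_os_shutdown(60))  # a command file that is given 60 seconds
```

With no command file you get the 70 seconds shown in the graphic; a command file with a 60 second execution time pushes that to 130 seconds.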
On the UPS:
- After the UPS’s NMC receives the command from PCNS to initiate the UPS turn off sequence, it waits… for some… period… of time. Exactly how long it waits (and exactly how it determines that wait period) is the subject of much discussion in various help forums (including the APC support forums). I also found the documentation a bit confusing. To quote one forum member: “Its like the manual was written in Chinese and translated to Russian and back to English.” 🙂 My opinion is that they probably just let an engineer write the documentation. However, I’ve confirmed that this is what actually happens. The first thing the UPS does after receiving the signal to initiate the turn off sequence is check the Command File Execution Time configured in PCNS. Then it compares that time to the Low Battery Duration value set by the user. Whichever time is longer becomes what’s called the Maximum Shutdown Time. That’s how long the UPS initially waits before moving on to the next step.
- Next, the UPS waits for an additional two minutes, regardless of any other settings. My guess is that those two minutes serve as padding to make extra sure that the Command File (if one exists) has plenty of time to do its thing. The Maximum Shutdown Time + this 2 minute cushion is called the Shutdown Timer.
- Third, since the UPS knows it’s getting really close to turning off, it nervously waits again… for a third period of time! This wait is called the Shutdown Delay in the NMC, and it’s totally different than the 10 second “Shutdown Delay” on the PCNS client. So to prevent confusion I refer to this delay as the “UPS Shutdown Delay.” Crystal clear, right? 🙂
- Finally, after three separate waiting periods, during which it hopes your system has had plenty of time to run any final commands and shut down gracefully, the UPS turns itself off.
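Those three waiting periods can be added up in a few lines of Python. This is only a sketch of the sequence as I’ve described it in this post, with a made-up function name, not something taken from APC’s firmware.

```python
TWO_MINUTE_CUSHION_S = 120  # the fixed extra wait described above


def ups_turn_off_wait_s(command_file_execution_time_s,
                        low_battery_duration_s,
                        ups_shutdown_delay_s):
    """Total seconds the UPS waits before turning off in a "time since" shutdown.

    Illustrative helper following the three waits described in this post.
    """
    # Wait 1: Maximum Shutdown Time is the longer of the two values.
    maximum_shutdown_time_s = max(command_file_execution_time_s,
                                  low_battery_duration_s)
    # Wait 2: the two minute cushion; the sum so far is the Shutdown Timer.
    shutdown_timer_s = maximum_shutdown_time_s + TWO_MINUTE_CUSHION_S
    # Wait 3: the UPS Shutdown Delay configured on the NMC.
    return shutdown_timer_s + ups_shutdown_delay_s


# 60 s command file, 8 minute Low Battery Duration, 90 s UPS Shutdown Delay
total = ups_turn_off_wait_s(60, 8 * 60, 90)
print(divmod(total, 60))  # (minutes, seconds)
```

In that example the Low Battery Duration is the longer value, so the UPS waits 8 minutes, then 2 more, then 90 seconds: 11 minutes 30 seconds in total.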
“Time Until” (UPS-driven or “Low Battery”) Shutdown Details
For the “time until” approach, the most important setting is the Low Battery Duration on the NMC, which determines how much remaining battery life counts as a Low Battery Condition. For this example, let’s assume you chose 15 minutes. When the On Battery Event occurs, the UPS will continue to power your system. If power isn’t restored, the battery level on the UPS will eventually decrease to where only 15 minutes of runtime remains. At that point, the UPS tells everyone “it’s time to get out of the pool” by sending the OS shutdown signal to the PCNS client(s). Once that happens, both the PCNS client and the UPS follow their same respective timelines as shown in the graphic, starting at the point where it says “UPS Turn Off Initiates.”
On the PCNS client: If a Shutdown Command File exists on the client, PCNS will run it (if not, it won’t), then the PCNS client waits 70 seconds, and then it shuts down the OS.
On the UPS: Since the UPS is in a “Low Battery Condition,” and is therefore in a hurry to shut things down, it doesn’t use the Maximum Shutdown Time as a waiting period. Instead, it takes the Shutdown Timer (which is the Maximum Shutdown Time + 2 mins), subtracts the Low Battery Duration time, and waits that long instead. Then it waits for the Shutdown Delay period configured on the NMC, and then it turns itself off.
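Here’s the same kind of Python sketch for the Low Battery case. Again, the function name is my own and this just follows the description above; treat it as a way to sanity check your numbers rather than a statement of what the firmware does internally.

```python
def low_battery_turn_off_wait_s(command_file_execution_time_s,
                                low_battery_duration_s,
                                ups_shutdown_delay_s):
    """Seconds the UPS waits before turning off in a Low Battery shutdown.

    Illustrative helper based on the description in this post.
    """
    maximum_shutdown_time_s = max(command_file_execution_time_s,
                                  low_battery_duration_s)
    shutdown_timer_s = maximum_shutdown_time_s + 120  # + 2 minutes

    # In a hurry: wait only the Shutdown Timer minus the Low Battery Duration.
    first_wait_s = shutdown_timer_s - low_battery_duration_s

    # Then the UPS Shutdown Delay configured on the NMC.
    return first_wait_s + ups_shutdown_delay_s


# No command file, 15 minute Low Battery Duration, 90 s UPS Shutdown Delay
print(low_battery_turn_off_wait_s(0, 15 * 60, 90))
# A 20 minute command file with the same Low Battery Duration
print(low_battery_turn_off_wait_s(20 * 60, 15 * 60, 90))
```

Notice that whenever the Low Battery Duration is at least as long as the Command File Execution Time, the first wait collapses to just the two minute cushion. Only a command file longer than the Low Battery Duration stretches it out.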
It’s critical in a UPS-driven approach to make sure that both the Low Battery Duration and Shutdown Delay are set properly on the NMC. The best way to do that is to understand the Maximum Required Delay, explained below.
Understanding the Maximum Required Delay
The Maximum Required Delay is calculated by the NMC and answers the question “How much time is sufficient to safely shut down all the operating systems connected to a particular NMC, rounded up to the nearest whole minute, and with an additional two-minute buffer?” It’s a useful guide to selecting an appropriate Low Battery Duration in the NMC, the UPS Shutdown Delay in the NMC, and/or Event Shutdown Delay in PCNS.
To see your NMC’s calculated Maximum Required Delay, open the NMC web interface, go to the Configuration / Shutdown section for the UPS, and look in the Start of Shutdown subsection.
The Maximum Required Delay is automatically recalculated by the NMC every time it boots (so the fastest way to recalculate it is to simply reboot the NMC). To arrive at this value, the APC documentation states that the NMC does the following:
- Queries every PCNS client attached to the UPS
- Calculates how long each client requires to safely shut down the operating system, factoring in any Command File Execution Time, the OS Shutdown Delay, and the OS Shutdown Command Duration
- Finds the client that reports the longest amount of required time, then rounds that amount up to the nearest whole minute: if the value is 3:50 it rounds up to 4:00, and if the value is 2:00 it rounds up to 3:00
- Adds an additional two minutes to that rounded up value
- The resulting time becomes the Maximum Required Delay for that NMC
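Taken literally, the documented steps above work out to something like this Python sketch. The function names are mine, the 10 and 60 second values are the PCNS defaults discussed earlier, and the rounding follows the examples in the list (even an exact 2:00 goes up to 3:00). This is what the documentation describes, not necessarily what your NMC will report.

```python
def client_shutdown_seconds(command_file_execution_time_s):
    """Time one PCNS client needs: command file + OS Shutdown Delay (10 s)
    + OS Shutdown Command Duration (60 s), using the default values."""
    return command_file_execution_time_s + 10 + 60


def maximum_required_delay_minutes(command_file_times_s):
    """Maximum Required Delay per the documented steps (illustrative only).

    command_file_times_s: one Command File Execution Time per attached
    PCNS client (use 0 for clients without a command file). Needs at
    least one client.
    """
    slowest_s = max(client_shutdown_seconds(t) for t in command_file_times_s)
    # Round up to the next whole minute; an exact minute still moves up,
    # matching the "2:00 rounds up to 3:00" example.
    rounded_min = slowest_s // 60 + 1
    return rounded_min + 2  # the additional two minute buffer


# Two clients: one without a command file, one with a 60 second command file
print(maximum_required_delay_minutes([0, 60]))
```

For those two clients the slowest needs 130 seconds (2:10), which rounds up to 3 minutes, and the buffer brings it to 5.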
However, in practice, I ended up with slightly lower calculated Maximum Required Delays than I expected in my tests. With a Command File Execution time of 60 seconds on my PCNS client, I expected a Maximum Required Delay of at least 4 minutes, but got 3 instead.
I also noted that when you increase your Low Battery Duration higher than the Maximum Required Delay, the Maximum Required Delay gets recalculated to match it. As long as your Low Battery Duration is not lower than the Maximum Required Delay, you’re good.
The bottom line is that you should understand your system, and use conservative, common sense configuration values that represent the actual needs of your setup.
Understanding and Selecting Low Battery Duration
Low Battery Duration is a user-configurable value that answers the question “At what point (measured in minutes of remaining runtime) should I consider my UPS as being in a ‘Low Battery Condition’?” In other words, you want your UPS to be designated as in a “Low Battery Condition” whenever this many minutes (or less) of runtime remain. Your Low Battery Duration should be at least as long as the NMC’s Maximum Required Delay.
The Low Battery Duration is the most important user-selected value. The longer the Low Battery Duration, the more time you have available to safely shut down the client(s) in a “Low Battery Condition.” But if you set this value higher than the available runtime of your UPS, you’ll get an error in your UPS log that says something like “UPS: The battery power is too low to support the load; if power fails, the UPS will be shut down immediately.”
Low Battery Duration must be at least as high as the Maximum Required Delay, but lower than the amount of runtime your UPS has with a full charge (you can check your current runtime on the Status/UPS tab).
My ideal Low Battery Duration value is calculated like this:
Fully Charged UPS Runtime Remaining – PCNS Event Shutdown Delay – 5 minutes = Low Battery Duration
If that equation results in a Low Battery Duration value that’s lower than the Maximum Required Delay, that’s no good. Consider lowering your PCNS Event Shutdown Delay and try again. If the resulting Low Battery Duration is still too low, you can reduce that 5 minute buffer amount in the equation.
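Here’s that equation, plus the “at least the Maximum Required Delay, but less than full runtime” check, as a short Python sketch. The function names and the example numbers are made up for illustration; plug in the values from your own NMC.

```python
def ideal_low_battery_duration_min(full_runtime_min,
                                   event_shutdown_delay_min,
                                   buffer_min=5):
    """My rule of thumb: full runtime - Event Shutdown Delay - buffer."""
    return full_runtime_min - event_shutdown_delay_min - buffer_min


def is_acceptable(low_battery_duration_min,
                  maximum_required_delay_min,
                  full_runtime_min):
    """True if the value is at least the Maximum Required Delay
    and lower than the fully charged runtime."""
    return (maximum_required_delay_min
            <= low_battery_duration_min
            < full_runtime_min)


# Hypothetical UPS: 30 minutes of runtime, 10 minute Event Shutdown Delay,
# and a Maximum Required Delay of 4 minutes.
lbd = ideal_low_battery_duration_min(30, 10)
print(lbd, is_acceptable(lbd, 4, 30))

# A smaller UPS: 12 minutes of runtime with an 8 minute delay comes out too low...
lbd = ideal_low_battery_duration_min(12, 8)
print(lbd, is_acceptable(lbd, 4, 12))

# ...so lower the Event Shutdown Delay to 2 minutes and try again.
lbd = ideal_low_battery_duration_min(12, 2)
print(lbd, is_acceptable(lbd, 4, 12))
```

The first case gives 15 minutes, which is fine. The second gives a negative number, so the delay has to come down; with a 2 minute delay it lands on 5 minutes, just above the Maximum Required Delay.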
For me, the ideal Low Battery Duration is low enough so that after losing power, the PCNS client has plenty of runtime available to gracefully shut down the OS without ever reaching the Low Battery Condition, but still high enough that you have enough time to successfully perform a Low Battery Shutdown if needed. Experiment and test things out to find the Low Battery Duration that works best for your situation.
How to Configure Your PCNS Clients and UPS for Shutdown
With all that explanation out of the way, it’s now time to focus on how to configure your PCNS clients and UPS to gracefully shut down. My recommended configuration is to set up your PCNS client and NMC settings for a “time since” method, but also configure the “time until” settings as a safety net. That way, if you lose power and your UPS enters a Low Battery State before your Event Shutdown Delay finishes, your system will switch over to the somewhat faster “Low Battery Shutdown” method in an effort to shut down safely before the UPS runs out of batteries.
On the PCNS client:
- In the PCNS client web interface, go to the Configure Events page and click the Shut Down System box for the UPS: On Battery event. Select “Yes, I want to shut down the PCNS operating system” and enter the number of seconds to use as your Event Shutdown Delay (i.e. how long to run on battery before the “point of no return” decision to shut down is made). The default is 120 seconds, which is 2 minutes. If you have at least 15 minutes of UPS run time available under normal conditions, then I recommend something closer to 480-600 seconds (8-10 minutes) to allow some more time for the power to come back on.
- On the Configure Shutdown page, enter the full path to your Command File and the amount of time (in seconds) it needs. You may not need a Command File (I don’t use one), but the option’s available here if you do.
- Also on the Configure Shutdown page, select “Turn off the UPS after the shutdown finishes.” When you’re done, press Apply.
On the UPS NMC:
- In the NMC web interface, go to the Configuration / Shutdown section for the UPS. Take note of the Maximum Required Delay value. That’s how long the NMC believes your “slowest” PCNS client needs to shut down safely. If this value is higher than your Low Battery Duration, then you need to set a higher Low Battery Duration.
- Select a Low Battery Duration that is at least as high as the Maximum Required Delay, but lower than the amount of runtime your UPS has with a full charge (see above for a discussion of picking the perfect Low Battery Duration).
- Pick a Shutdown Delay. This is how long the UPS will remain running after it believes the PCNS clients are shut down. In practice, shutting down the PCNS client will drastically reduce the load on the UPS, which will drastically increase the UPS’s available runtime, so you should actually have time for a decent delay here. If your UPS is powering other equipment (like a network switch) in addition to a PCNS client, I tend to go for the longest delays possible to keep those things running as long as I can.
Testing Things Out
You should test out your PCNS and NMC configurations before trusting them in the real world (although, of course, that’s not practical with mission critical server equipment that’s already deployed). Once you’ve got everything configured how you think you want it, save all open files on your system, grab a stopwatch (I use an app on my phone), open the NMC admin interface on a device not connected to the PCNS client system (again, I use my phone), and pull your UPS’s power plug out of the outlet. Listen for the beeps, watch the monitor for alerts, watch the messages on the NMC, and make sure things are happening in the time period(s) you expect. Better to find out now that you had something configured incorrectly than during a real outage. Tweak, test, and repeat until you’re happy.
If, after reading this post, you’re still having trouble understanding all the terms involved with configuring a graceful shutdown with your UPS, don’t feel bad. You’re not alone. Feel free to sound off in the comments, but I can’t promise I’ll have time to reply. Your best bet is to check out APC’s excellent support forums at http://forums.apc.com/ and ask your question(s) in there.
This blog post is gracefully shutting down in 3… 2… 1…
UPDATE: If you want to better understand your currently-configured shutdown times, and see them in action, log in to your UPS Network Management Card’s web interface, go to the UPS tab, select Scheduling, and then configure a One-time Shutdown. You’ll see a dialog like this:
The numbers that appear in the intro paragraph will explain the exact timing of a shutdown, given your current configuration.
In the above example, I configured a manual shutdown to begin at 2:10 AM on January 9, 2016, although it’s more accurate to say that I want the UPS to “signal” a shutdown at that time. The above screenshot is taken from an actual Smart-UPS 1000 unit in use at my office, and I have its Max Shutdown Time set at 8 minutes. As it explains in the dialog text, at 2:10 AM the UPS will send a signal to the PCNS service running on any attached servers to begin shutting down (there’s only one attached to this unit). It will wait until 2:20 AM (8 + 2 minutes later) for the server to shut down, then wait an extra 90 seconds (just to be safe), and shut itself off (which would be at 2:21:30 AM). After you’ve got PCNS set up on your system, doing a manual shutdown like this (while you’re standing there watching it) is a smart idea, and will allow you to make sure the shutdown will happen as you expect in a real-life shutdown scenario when nobody is around to monitor it.
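If you want to predict those clock times for your own settings before you schedule a test, the arithmetic looks like this in Python. The function name is made up; the inputs are the values from my example above (8 minute Max Shutdown Time, 90 second Shutdown Delay).

```python
from datetime import datetime, timedelta


def scheduled_shutdown_timeline(signal_at, max_shutdown_time_min, ups_shutdown_delay_s):
    """Return (end of the server wait, UPS turn off time) for a scheduled shutdown.

    Illustrative helper: Max Shutdown Time + 2 minutes, then the Shutdown Delay.
    """
    wait_ends = signal_at + timedelta(minutes=max_shutdown_time_min + 2)
    turn_off = wait_ends + timedelta(seconds=ups_shutdown_delay_s)
    return wait_ends, turn_off


signal = datetime(2016, 1, 9, 2, 10)  # 2:10 AM on January 9, 2016
wait_ends, turn_off = scheduled_shutdown_timeline(signal, 8, 90)
print(wait_ends.time(), turn_off.time())  # 02:20:00 02:21:30
```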
- APC Support Forums – Keep an eye out for Angela and Bill, who’ve always been very helpful.
- PCNS Regular Shutdown – retrieved from the APC Forums, this flowchart helps explain the logic decisions made by PCNS when shutting down.
- PCNS Low Battery Shutdown – retrieved from the APC Forums, this flowchart helps explain the logic decisions made by PCNS when shutting down in a Low Battery Condition.