Recent Outages - Continuity of Operations Plan

June 11, 2024

The ESIF HPC data center has experienced a significant number of planned and unplanned outages this year. Many of the planned outages were driven by electrical and mechanical work required to upgrade the ESIF data center from 5 MW to 7.5 MW of electrical and cooling capacity. As a final step, we were also required to verify the Emergency Power Off (EPO). All of that disruptive work is complete, and we do not anticipate as many impactful planned outages for the next couple of years.

As for the unplanned outages, we were notified by Xcel Energy, the electrical utility provider for NREL, that due to high winds and high fire danger they might interrupt service to NREL. To protect the HPC systems, we chose to preemptively power those systems off. Moving forward, we anticipate that these sorts of utility-driven outages may become more common. To better plan for those events and mitigate some of the risks to equipment and research productivity, the Advanced Computing Operations team is improving its continuity of operations plan. As those plans are further developed, we will share more details.
