Learn about an experiment aboard the International Space Station that simulated a compute failure and recovered from it in under 30 seconds, with no data or transaction loss.
Imagine an outage in space rather than on Earth: the consequences of downtime could be catastrophic. Fortunately, at HPE we have a solution that helps prevent downtime, both on Earth and in space. HPE Serviceguard for Linux (SGLX) maximizes application availability by continuously monitoring the health of your infrastructure, including hardware, operating system, virtualization, storage, network, application, and any other parameter that may affect how applications function. When a failure is detected, SGLX automatically and transparently resumes normal operations on a healthy node in mere seconds.
SGLX has been protecting mission-critical environments for decades, and as further proof of its robustness and resiliency, the solution has just been successfully tested in the harsh conditions of space, as part of the HPE Spaceborne project.
Environment setup
The Spaceborne Computer-2 consists of HPE ProLiant DL360 servers and HPE Edgeline Converged Edge systems. For the failover experiment, we used two HPE ProLiant DL360 servers running Red Hat Enterprise Linux to form a two-node cluster: one node acted as the primary and the other as the hot standby, ready to take over in case of any failure.
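To make the primary and hot-standby arrangement concrete, here is a minimal Python sketch of a generic active/standby pair. It is only an illustration of the idea, not how SGLX is implemented or configured: the `Node` and `TwoNodeCluster` classes, the node names, and the `missed_limit` threshold of three missed health checks are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str             # hypothetical node name
    healthy: bool = True  # result of the latest health check


class TwoNodeCluster:
    """Toy active/standby pair. Illustration only, not SGLX internals."""

    def __init__(self, primary: Node, standby: Node, missed_limit: int = 3):
        self.active = primary
        self.standby = standby
        self.missed_limit = missed_limit  # assumed failure threshold
        self.missed = 0                   # consecutive failed checks

    def tick(self) -> str:
        """Run one monitoring interval and return the active node's name."""
        if self.active.healthy:
            self.missed = 0
        else:
            self.missed += 1
            if self.missed >= self.missed_limit and self.standby.healthy:
                # Promote the standby; the failed node becomes the standby.
                self.active, self.standby = self.standby, self.active
                self.missed = 0
        return self.active.name


def simulate_failure() -> list[str]:
    """Record which node is active before and after a simulated failure."""
    node_a, node_b = Node("node-a"), Node("node-b")
    cluster = TwoNodeCluster(node_a, node_b)
    history = [cluster.tick()]   # one healthy interval
    node_a.healthy = False       # simulated compute failure on the primary
    history += [cluster.tick() for _ in range(3)]
    return history


if __name__ == "__main__":
    print(simulate_failure())
```

In this sketch the standby is promoted only after several consecutive failed checks, so a single missed check does not trigger a failover. A real cluster manager also has to handle data consistency and split-brain protection, which this toy model leaves out.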
We ran this experiment in three environments: a Test and Development system, a Production system located on Earth at NASA premises, and a Production system located in space aboard the International Space Station. All three environments were configured identically.