A clinical-stage company approached us to resolve the challenges they were having with their cloud infrastructure each time they deployed a critical line of business applications on the system.
Kubernetes on GCP
The company had an IT team with little knowledge on the best strategy to deal with a recurring bug when specific service level applications are deployed which made their cloud operations to randomly shut down several times within a 24hr period, and made especially worse during off-hours.
We used “MatosSphere” to run a deep observability test on the system to detect the root cause of the issues. This enabled us to 1-click remediate their cloud environment by reverting to the last working state of the infrastructure. Then we set up replicas for working pods so that at least one workload pod is always active at any instance of time in case of k8’s. After that, we auto remediate using snapshots to trigger new instances and finally setup DR to enable continuous uptime.
Our 1-click remediation solution fully restored their cloud infra into operation and they no longer had issues whenever applications were deployed.The company achieved 99.99% seamless performance and optimum security of their cloud infrastructure and constantly used snapshots to trigger new instances and auto-remediate issues immediately.