Chaos Engineering – A Breakthrough in Digital Immunity

Across organizational infrastructure, the rapid growth of cloud assessment and deployment is driven by scalability, flexibility, accessibility, and enhanced IT security. However, the surge in cloud adoption, migration, and cloud-native development programs during the pandemic has exposed potential vulnerabilities across distributed networks.

In several instances, a sudden spike in online traffic or an unforeseen cyberattack has led to service failures, monetary losses, and damage to organizational reputation and brand integrity. Such costly outages create a domino effect, eroding customer confidence and, in some cases, prompting regulatory action against the organization. Organizations therefore need robust, resilient solutions that address these challenges and guard against potential threats. This is where chaos engineering can help.

What is Chaos Engineering?

Chaos engineering is a disciplined approach to identifying unanticipated and unknown system weaknesses. It deliberately disrupts a system under controlled conditions to expose weak points, anticipate failures, and correct the architecture so that the user experience stays predictable. Quality assurance (QA) engineers can find chaos testing more effective than performance and disaster recovery testing at unearthing latent bugs. The technique helps engineering teams redesign and harden the organization’s infrastructure so that it stays resilient through crises.
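
The idea of deliberately disrupting a system and checking whether it still behaves acceptably can be illustrated with a small, self-contained sketch. Everything in it is hypothetical and chosen for illustration: the `FlakyDependency` wrapper, the `lookup_balance` call, the 30% injected failure rate, and the 99% steady-state threshold are assumptions, not part of any particular chaos tool.

```python
import random


class FlakyDependency:
    """Wraps a callable and injects failures at a given rate (the 'chaos')."""

    def __init__(self, func, failure_rate, seed=0):
        self.func = func
        self.failure_rate = failure_rate
        self.rng = random.Random(seed)  # seeded so runs are repeatable

    def __call__(self, *args):
        if self.rng.random() < self.failure_rate:
            raise ConnectionError("injected fault")
        return self.func(*args)


def lookup_balance(account_id):
    """Hypothetical downstream call that normally always succeeds."""
    return {"account": account_id, "balance": 100}


def handle_request(dependency, account_id, use_fallback):
    """Hypothetical service handler that can optionally degrade gracefully."""
    try:
        return dependency(account_id)
    except ConnectionError:
        if use_fallback:
            # Serve a degraded response instead of failing the request.
            return {"account": account_id, "balance": None, "stale": True}
        raise


def run_experiment(use_fallback, failure_rate=0.3, requests=1000):
    """Return the fraction of requests that completed without an error."""
    dependency = FlakyDependency(lookup_balance, failure_rate)
    ok = 0
    for i in range(requests):
        try:
            handle_request(dependency, i, use_fallback)
            ok += 1
        except ConnectionError:
            pass
    return ok / requests


STEADY_STATE_THRESHOLD = 0.99  # assumed target success rate

without_fallback = run_experiment(use_fallback=False)
with_fallback = run_experiment(use_fallback=True)
print(f"success rate without fallback: {without_fallback:.3f}")
print(f"success rate with fallback:    {with_fallback:.3f}")
```

In this toy setup, the handler without a fallback falls well below the assumed threshold once faults are injected, while the handler with a fallback keeps serving every request. That gap is the kind of latent weakness a chaos experiment is meant to surface before a real outage does.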

Is Chaos Engineering a Game Changer?

Chaos engineering has gained considerable traction, with global leaders increasingly adopting the practice to boost their organizations’ digital immunity. Its main benefits are:

1. Financial security: By surfacing failure modes in a controlled environment before they cause large-scale outages, chaos engineering helps prevent financial loss.

2. Technical advantage: By giving developers a better understanding of how the production environment behaves, it helps reduce data and application loss during an outage.

3. Enhanced user experience: By minimizing system disruptions, it supports a smooth user experience.

Proof Point: Chaos Testing and its Organizational Impact

A leading global bank collaborated with Infosys Validation Solutions to build a highly resilient platform targeting 99.99% availability without degradation. Infosys designed a one-touch intelligent automation framework and implemented chaos engineering with GameDay. It used tools such as Gremlin, k6, and a Jenkins CI/CD pipeline to run chaos tests in parallel with load tests. The solution strengthened the bank’s environment configurations and reduced failure detection time to less than 15 minutes. The bank now provides a best-in-class user experience while saving costs and meeting compliance regulations.
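
Running a fault alongside steady load and measuring how long monitoring takes to notice can be sketched with a simple simulation. This is only an illustrative model under stated assumptions: it does not use Gremlin, k6, or Jenkins, and the `simulate` function, the tick-based clock, the sliding window size, and the 50% error threshold are all hypothetical choices rather than details of the engagement described above.

```python
from collections import deque


def simulate(fault_start, total_ticks=60, requests_per_tick=20,
             window=5, error_threshold=0.5):
    """Simulate steady load with a fault injected at `fault_start`.

    Each tick stands for one unit of time (for example, one second).
    Returns the number of ticks between fault injection and detection,
    or None if the monitor never fires.
    """
    recent = deque(maxlen=window)  # per-tick error rates in the sliding window
    for tick in range(total_ticks):
        fault_active = tick >= fault_start
        # Under the injected fault every request fails; otherwise none do.
        errors = requests_per_tick if fault_active else 0
        recent.append(errors / requests_per_tick)
        # Monitor: alert when the mean error rate over the window
        # reaches the threshold.
        if sum(recent) / len(recent) >= error_threshold:
            return tick - fault_start
    return None


detection_delay = simulate(fault_start=30)
print(f"fault detected {detection_delay} ticks after injection")
```

With these assumed settings the monitor fires 2 ticks after the fault starts, and widening the window to 10 ticks delays detection to 4 ticks. Varying the monitoring parameters in this way shows how alerting configuration, and not only the fault itself, determines failure detection time.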



Author Details

Jack Hinduja

Jack Hinduja has been with Infosys for 7+ years and has 15+ years of total experience in the telecom and banking domains. He has extensive experience in leading Quality Assurance (QA) and validation delivery for leading telecom providers and banks across geographies. He has led the enablement of digital quality assurance transformation, implementing performance engineering and chaos engineering practices across financial services enterprise clients. He holds a bachelor’s degree in Electronics and Communication Engineering. Outside of his professional life, he enjoys travelling and football.

