Chaos Engineering in Clinical Microservices: Implementing Targeted Pod Disruptions to Ensure Graceful Degradation of Critical Medical Clients
DOI: https://doi.org/10.63282/3050-9246.IJETCSIT-V3I1P117
Keywords: Chaos Engineering, Kubernetes Resiliency, Clinical Microservices, Fault Tolerance, Medical Alert Systems, DevSecOps, SRE, Distributed Systems
Abstract
As healthcare systems move to microservices to manage high-throughput data, the resilience of mission-critical patient monitoring becomes paramount. This paper investigates the application of Chaos Engineering, specifically targeted pod disruptions, within clinical Kubernetes environments. By simulating production-level failures in non-critical supporting services, I evaluate their impact on the delivery of high-priority medical alerts. I propose a “Graceful Degradation Framework” that uses circuit breakers and prioritized data ingestion to maintain system integrity during infrastructure turbulence. Experimental results demonstrate that proactive fault injection reduces recovery time by 92% and prevents cascading failures across ICU, Surgery, and Maternity monitoring departments.
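The circuit-breaker idea named in the abstract can be illustrated with a minimal sketch. This is not the paper's framework: the class `CircuitBreaker`, the function `deliver_alert`, the thresholds, and the alert fields are all hypothetical names chosen for illustration. It assumes a critical alert path that calls a non-critical enrichment service and must still deliver the alert when that service's pod has been disrupted.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitBreaker:
    """Hypothetical breaker: opens after `max_failures` consecutive errors
    and allows a trial call again once `reset_after` seconds have passed."""

    def __init__(self, max_failures: int = 3, reset_after: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock  # injectable so the cool-down can be tested
        self.failures = 0
        self.opened_at: Optional[float] = None

    def is_open(self) -> bool:
        if self.opened_at is None:
            return False
        # Once the cool-down has elapsed, the next call is let through (half-open).
        return self.clock() - self.opened_at < self.reset_after

    def call(self, func: Callable[[], T], fallback: Callable[[], T]) -> T:
        if self.is_open():
            return fallback()  # fail fast without touching the unhealthy dependency
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()  # open (or re-open) the circuit
            return fallback()
        # A success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result


def deliver_alert(alert: dict, enrich: Callable[[], dict],
                  breaker: CircuitBreaker) -> dict:
    """Always return the alert; enrichment from the supporting service is optional.
    `degraded` is True when no enrichment data was available."""
    extra = breaker.call(enrich, fallback=dict)
    return {**alert, **extra, "degraded": not extra}
```

In this sketch the alert itself never depends on the supporting service: a failed or skipped enrichment call only sets the `degraded` flag, which is one simple way to express graceful degradation for a high-priority path.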
