Accelerating Defect and Vulnerability Discovery with ML + HPC: High-Throughput Simulation Analytics for Software Quality Engineering

Authors

  • Aditi Mishra, Information Technology, Manipal University Jaipur, Jaipur, Rajasthan, India
  • Harsh Vardhan, Artificial Intelligence, Manipal University Jaipur, Jaipur, Rajasthan, India
  • Rohan Shetty, Information Technology, Manipal University Jaipur, Jaipur, Rajasthan, India
  • Pallavi Deshmukh, Artificial Intelligence, Manipal University Jaipur, Jaipur, Rajasthan, India

DOI:

https://doi.org/10.63282/3050-9246.IJETCSIT-V7I1P118

Keywords:

Defect Prediction, Vulnerability Detection, High-Performance Computing, Simulation Analytics, Test Amplification, Cloud-Native Software Engineering, Explainable AI, Federated Learning

Abstract

Modern software systems evolve under tight delivery cycles, heterogeneous cloud deployments, and increasingly stringent security and compliance requirements. While machine learning (ML) has demonstrated promise for predicting defect-prone and vulnerability-prone components, many deployments remain bounded by data sparsity, limited execution context, and constrained throughput of dynamic testing. Meanwhile, high-performance computing (HPC) infrastructures provide abundant parallelism, yet they are often underutilized for software quality engineering workflows that require large-scale simulation, test amplification, and telemetry-driven analytics. This manuscript proposes an integrated ML + HPC methodology for accelerating defect and vulnerability discovery through high-throughput simulation analytics. The approach treats quality discovery as a throughput-optimized, data-centric pipeline: (i) generate execution diversity via scalable simulation and test amplification, (ii) capture and normalize multi-level telemetry from builds, tests, runtime traces, and security checks, (iii) learn ranking and classification models that prioritize code regions and change sets for deeper analysis, and (iv) close the loop with root-cause triage and remediation signals. We formalize the research gap as the missing coupling between predictive models and scalable execution diversity, and we outline an evaluation framework spanning prediction quality, defect and vulnerability yield, and end-to-end cost-per-finding. The resulting methodology enables software teams to scale discovery beyond conventional CI constraints, improving early warning capability and accelerating remediation in complex enterprise-grade systems.
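
As a rough illustration of steps (iii) and (iv) and of the cost-per-finding metric named in the abstract, the following is a minimal Python sketch, not the authors' implementation. It assumes each change set already carries a model-produced risk score, an estimated deep-analysis cost (for example, core-hours on an HPC cluster), and a count of findings confirmed after analysis. The names `ChangeSet`, `prioritize`, and `cost_per_finding`, the greedy budget policy, and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ChangeSet:
    change_id: str
    risk_score: float     # hypothetical model output in [0, 1]
    analysis_cost: float  # hypothetical cost of deep analysis, e.g. core-hours
    findings: int         # defects or vulnerabilities confirmed after analysis


def prioritize(change_sets, budget):
    """Greedily pick change sets in descending risk order while they fit the budget."""
    selected, spent = [], 0.0
    for cs in sorted(change_sets, key=lambda c: c.risk_score, reverse=True):
        if spent + cs.analysis_cost > budget:
            continue  # too expensive for the remaining budget; try lower-ranked items
        selected.append(cs)
        spent += cs.analysis_cost
    return selected


def cost_per_finding(selected):
    """Total analysis cost divided by confirmed findings (infinite if none)."""
    total_cost = sum(cs.analysis_cost for cs in selected)
    total_findings = sum(cs.findings for cs in selected)
    return total_cost / total_findings if total_findings else float("inf")


if __name__ == "__main__":
    # Illustrative values only; not data from the paper.
    candidates = [
        ChangeSet("a", 0.9, 4.0, 2),
        ChangeSet("b", 0.2, 1.0, 0),
        ChangeSet("c", 0.7, 3.0, 1),
        ChangeSet("d", 0.5, 5.0, 1),
    ]
    chosen = prioritize(candidates, budget=8.0)
    print([cs.change_id for cs in chosen])
    print(round(cost_per_finding(chosen), 2))
```

Under these assumptions, a better ranking model lowers cost-per-finding for the same budget, which is one way the abstract's three evaluation axes (prediction quality, yield, and end-to-end cost) could be tied together in practice.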

Published

2026-02-08

Section

Articles

How to Cite

Mishra A, Vardhan H, Shetty R, Deshmukh P. Accelerating Defect and Vulnerability Discovery with ML + HPC: High-Throughput Simulation Analytics for Software Quality Engineering. IJETCSIT [Internet]. 2026 Feb. 8 [cited 2026 Feb. 14];7(1):124-31. Available from: https://www.ijetcsit.org/index.php/ijetcsit/article/view/571
