Machine Learning-Based Prediction of Technical Debt in Continuous Software Development Using Repository Mining and Developer Activity Analytics

Authors

  • Dr. Deepshikha Tiwari, Assistant Professor, CSE, Thapar Institute of Engineering & Technology, Patiala, Punjab, India.
  • Dr. Shivani Sharma, Assistant Professor, CSE, Thapar Institute of Engineering & Technology, Patiala, Punjab, India.
  • Dr. Kapil Tomar, Assistant Professor, AI & ML, Thapar Institute of Engineering & Technology, Patiala, Punjab, India.
  • Dr. Saurabh Arora, Assistant Professor, AI & ML, Thapar Institute of Engineering & Technology, Patiala, Punjab, India.

DOI:

https://doi.org/10.63282/3050-9246.IJETCSIT-V5I4P122

Keywords:

Technical Debt Prediction, Repository Mining, Developer Analytics, Machine Learning, Continuous Software Development, Self-Admitted Technical Debt, Software Maintenance, CI/CD Risk, Explainable AI

Abstract

Technical debt is no longer an occasional by-product of rushed implementation; it has become a persistent socio-technical risk in continuous software development environments where code, build scripts, configuration artifacts, tests, issue tickets, pull requests, and deployment workflows evolve at high velocity. Conventional technical debt management methods often depend on manual code reviews, static analysis alerts, or retrospective refactoring decisions, which are insufficient for detecting emerging debt before it becomes costly to repay. This paper proposes a machine learning-based framework for predicting technical debt in continuous software development by integrating repository mining, developer activity analytics, change-process metrics, self-admitted technical debt signals, and software lifecycle governance indicators. The proposed framework models technical debt as an evolving risk state rather than a static code smell, combining structural code metrics, commit-level churn, ownership dispersion, pull-request review behavior, issue-tracker semantics, CI/CD instability, and developer workload signals. The study introduces a multi-source feature architecture, a temporal labeling strategy, an explainable ensemble learning pipeline, and a debt-prioritization layer designed for continuous integration environments. Unlike review-oriented approaches that summarize technical debt literature, this manuscript develops a research-oriented predictive design that can be operationalized in modern software engineering pipelines. The expected contribution is a decision-support mechanism that helps engineering teams identify debt-prone modules, anticipate repayment urgency, improve sprint planning, and align refactoring decisions with delivery risk, developer capacity, and long-term maintainability.
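
The temporal labeling strategy named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify window lengths, feature definitions, or the label source, so everything below is an assumption made for illustration. The sketch assumes that features for a module are computed only from commits in a lookback window before a cutoff date, and that the label marks whether a self-admitted technical debt marker appears in that module's commit messages during a horizon window after the cutoff. The names `Commit`, `SATD_MARKERS`, `has_satd`, and `build_example`, and the sample data, are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical marker list (an assumption); a real study would use a curated,
# validated pattern set rather than naive substring matching.
SATD_MARKERS = ("todo", "fixme", "hack", "workaround")


@dataclass(frozen=True)
class Commit:
    module: str
    author: str
    when: datetime
    lines_changed: int
    message: str


def has_satd(message: str) -> bool:
    """Return True if the message contains any assumed SATD marker."""
    text = message.lower()
    return any(marker in text for marker in SATD_MARKERS)


def build_example(commits, module, cutoff, lookback, horizon):
    """Build one (features, label) pair for a module at a cutoff date.

    Features use commits in [cutoff - lookback, cutoff).
    The label uses commits in [cutoff, cutoff + horizon).
    Keeping the two windows disjoint prevents future information
    from leaking into the features.
    """
    past = [c for c in commits
            if c.module == module and cutoff - lookback <= c.when < cutoff]
    future = [c for c in commits
              if c.module == module and cutoff <= c.when < cutoff + horizon]
    features = {
        "churn": sum(c.lines_changed for c in past),    # commit-level churn
        "n_commits": len(past),                         # change frequency
        "n_authors": len({c.author for c in past}),     # ownership dispersion
    }
    label = int(any(has_satd(c.message) for c in future))
    return features, label


# Illustrative data only (fabricated for the sketch, not study data).
commits = [
    Commit("billing", "ana", datetime(2024, 1, 5), 120, "add invoice export"),
    Commit("billing", "raj", datetime(2024, 1, 20), 80, "refactor tax rules"),
    Commit("billing", "ana", datetime(2024, 2, 10), 15, "HACK: bypass rounding check"),
    Commit("auth", "li", datetime(2024, 1, 12), 30, "rotate token keys"),
]
cutoff = datetime(2024, 2, 1)
lookback = timedelta(days=30)
horizon = timedelta(days=30)

for name in ("billing", "auth"):
    print(name, build_example(commits, name, cutoff, lookback, horizon))
```

Pairs produced this way could feed any supervised learner. The sketch deliberately stops before model training, since the abstract does not specify which ensemble methods or explanation techniques the framework uses.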

Published

2024-12-30

Section

Articles

How to Cite

1.
Tiwari D, Sharma S, Tomar K, Arora S. Machine Learning-Based Prediction of Technical Debt in Continuous Software Development Using Repository Mining and Developer Activity Analytics. IJETCSIT [Internet]. 2024 Dec. 30 [cited 2026 Jun. 23];5(4):192-20. Available from: https://www.ijetcsit.org/index.php/ijetcsit/article/view/754
