Automated QA Testing for AI-Generated Game Content: Using LLMs to Validate NPC Behavior and Narrative Integrity

Authors

  • Mr. Mohnish Neelapu, Automation Lead, Numeric Technologies Inc., USA.

DOI:

https://doi.org/10.63282/3050-9246/ICRTCSIT-128

Keywords:

Quality Assurance, Procedural Content Generation, Large Language Models, Game Testing, Narrative Consistency, NPC Behavior

Abstract

Procedural content generation (PCG) and generative artificial intelligence (AI) have enabled modern games to offer dynamic conversation, procedural non-player character (NPC) behavior, and story-based choices. While these techniques add depth and replayability, they pose a major challenge for quality assurance (QA): the resulting worlds are non-deterministic, so scripted or rule-based tests are no longer sufficient. This paper describes an LLM-based QA system designed to automate the testing of narrative consistency, NPC coherence, and rule compliance in AI-generated game content. The system comprises four main components: a game simulation environment, an LLM validation engine, a log extraction layer, and a developer-facing QA report module. The evaluation shows that the LLM-based approach is more efficient and more accurate than traditional QA, with higher narrative coherence (95% vs. 82%), higher behavior coherence (93% vs. 80%), and more consistent rule enforcement. These results suggest that LLM-based semantic reasoning enables scalable, automated QA testing that can cover more player interactions and strengthen the development pipeline.
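The four components named in the abstract can be pictured as a simple pipeline. The Python sketch below is illustrative only and is not the paper's implementation: every name in it (`simulate`, `extract`, `validate`, `report`, `stub_validator`, the `npc|text` log format, and the `facts` dictionary) is a hypothetical stand-in, and the LLM validation engine is replaced by a keyword check so that the example runs without any model or API. The pass rates it produces are unrelated to the percentages reported above.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class LogEvent:
    npc: str
    text: str


@dataclass
class Finding:
    check: str  # "narrative", "behavior", or "rule"
    npc: str
    passed: bool
    reason: str


# A validator maps (check name, world facts, event) to (passed, reason).
# In a real system this would wrap an LLM call; here it is an assumption.
Validator = Callable[[str, Dict[str, List[str]], LogEvent], Tuple[bool, str]]


def simulate() -> List[str]:
    """Stand-in for the game simulation environment: emits raw log lines."""
    return [
        "guard|The king rules from the northern keep.",
        "merchant|The king died last winter.",
    ]


def extract(raw_lines: List[str]) -> List[LogEvent]:
    """Log extraction layer: parse hypothetical 'npc|text' lines into events."""
    events = []
    for line in raw_lines:
        npc, _, text = line.partition("|")
        events.append(LogEvent(npc.strip(), text.strip()))
    return events


def stub_validator(check: str, facts: Dict[str, List[str]],
                   event: LogEvent) -> Tuple[bool, str]:
    """Placeholder for the LLM validation engine: flags forbidden phrases."""
    for phrase in facts.get(check, []):
        if phrase in event.text.lower():
            return False, f"contradicts world facts: '{phrase}'"
    return True, "ok"


def validate(events: List[LogEvent], facts: Dict[str, List[str]],
             validator: Validator,
             checks: Tuple[str, ...] = ("narrative", "behavior", "rule"),
             ) -> List[Finding]:
    """Run every check on every event and collect the findings."""
    findings = []
    for event in events:
        for check in checks:
            passed, reason = validator(check, facts, event)
            findings.append(Finding(check, event.npc, passed, reason))
    return findings


def report(findings: List[Finding]) -> Dict[str, float]:
    """QA report module: pass rate per check category."""
    counts: Dict[str, Tuple[int, int]] = {}
    for f in findings:
        passed, total = counts.get(f.check, (0, 0))
        counts[f.check] = (passed + int(f.passed), total + 1)
    return {check: p / t for check, (p, t) in counts.items()}


# Hypothetical world facts: statements that would contradict the story so far.
facts = {"narrative": ["king died"]}
findings = validate(extract(simulate()), facts, stub_validator)
rates = report(findings)
```

Passing the validator in as a callable keeps the LLM behind a narrow interface, so the deterministic stub can be swapped for a model-backed check without touching the simulation, extraction, or reporting steps.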

Published

2025-10-10

How to Cite

1. Neelapu M. Automated QA Testing for AI-Generated Game Content: Using LLMs to Validate NPC Behavior and Narrative Integrity. IJETCSIT [Internet]. 2025 Oct. 10 [cited 2025 Nov. 8];:198-20. Available from: https://www.ijetcsit.org/index.php/ijetcsit/article/view/448
