ML Models That Learn Query Patterns and Suggest Execution Plans

Authors

  • Nagireddy Karri Senior IT Administrator Database, Sherwin-Williams, USA. Author

DOI:

https://doi.org/10.63282/3050-9246.IJETCSIT-V4I1P115

Keywords:

Query Optimization, Machine Learning Models, Execution Plans, Query Workload Patterns, Database Systems, Reinforcement Learning, Predictive Optimization

Abstract

Planning of efficient query execution is the key aspect of the modern database systems as the increasing demand of using data-driven decision making placed an unprecedented computational load on both relational and non-relational databases. Although robust, traditional cost-based query optimizers fail to dynamically scale in a complex or ad hoc environment. The most recent development in the field of the machine learning (ML) models can be treated as another viewpoint paradigm that can capture the patterns of query execution and forecast efficient plans, yet with near real-time time scale. The paper provides the in-depth discussion of ML models, which are intended to learn query patterns and recommend an execution strategy, such as supervised, unsupervised, and reinforcement learning strategies. The proposed methodology incorporates a hybrid ML-based query optimizer that learns query history using past query logs to model query workloads and predict the execution time and offer optimized execution plans. Decades of experiments have shown that ML- based models are more adaptable in workloads, more robust, and less latent than their traditional optimizers. The findings point to the fact that ML-based models can lower query execution time up to 30% on average than traditional optimizers and generalize across heterogeneous workloads. Other issues that we solve in this work include the engineering of features used to represent a query, cross-database transferability, and interpretability of the ML decisions in the execution planning. There are three important contributions in this work: (i) literature review summing up the progress in the field of ML to optimize queries; (ii) a common methodology where the models in ML can be integrated into query optimizers, and (iii) empirical evidence on the superiority of ML under dynamic workloads. The purpose of this paper is to create a logical framework that helps researchers and practitioners easily incorporate ML models into query optimization pipelines, to improve the performance of database systems in commercial operations

Downloads

Download data is not yet available.

References

[1] Selinger, P. G., Astrahan, M. M., Chamberlin, D. D., Lorie, R. A., & Price, T. G. (1979, May). Access path selection in a relational database management system. In Proceedings of the 1979 ACM SIGMOD international conference on Management of data (pp. 23-34).

[2] Chaudhuri, S. (1998, May). An overview of query optimization in relational systems. In Proceedings of the seventeenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems (pp. 34-43).

[3] Kruse, S., Kaoudi, Z., Contreras-Rojas, B., Chawla, S., Naumann, F., & Quiané-Ruiz, J. A. (2020). RHEEMix in the data jungle: a cost-based optimizer for cross-platform systems. The VLDB Journal, 29(6), 1287-1310.

[4] Tekale, K. M. T., & Enjam, G. reddy . (2022). The Evolving Landscape of Cyber Risk Coverage in P&C Policies. International Journal of Emerging Trends in Computer Science and Information Technology, 3(3), 117-126. https://doi.org/10.63282/3050-9246.IJETCSIT-V3I1P113

[5] Krishnan, S., Yang, Z., Goldberg, K., Hellerstein, J., & Stoica, I. (2018). Learning to optimize join queries with deep reinforcement learning. arXiv preprint arXiv:1808.03196.

[6] Heitz, J., & Stockinger, K. (2019). Join query optimization with deep reinforcement learning algorithms. arXiv preprint arXiv:1911.11689.

[7] Marcus, R., & Papaemmanouil, O. (2018). Towards a hands-free query optimizer through deep learning. arXiv preprint arXiv:1809.10212.

[8] Yang, Z., Chiang, W. L., Luan, S., Mittal, G., Luo, M., & Stoica, I. (2022, June). Balsa: Learning a query optimizer without expert demonstrations. In Proceedings of the 2022 International Conference on Management of Data (pp. 931-944).

[9] Tekale, K. M., & Rahul, N. (2022). AI and Predictive Analytics in Underwriting, 2022 Advancements in Machine Learning for Loss Prediction and Customer Segmentation. International Journal of Artificial Intelligence, Data Science, and Machine Learning, 3(1), 95-113. https://doi.org/10.63282/3050-9262.IJAIDSML-V3I1P111

[10] Marcus, R., Negi, P., Mao, H., Tatbul, N., Alizadeh, M., & Kraska, T. (2021, June). Bao: Making learned query optimization practical. In Proceedings of the 2021 International Conference on Management of Data (pp. 1275-1288).

[11] Yu, X., Chai, C., Li, G., & Liu, J. (2022). Cost-based or learning-based? A hybrid query optimizer for query plan selection. Proceedings of the VLDB Endowment, 15(13), 3924-3936.

[12] Zahir, J., & El Qadi, A. (2016). A recommendation system for execution plans using machine learning. Mathematical and Computational Applications, 21(2), 23.

[13] Akdere, M., Çetintemel, U., Riondato, M., Upfal, E., & Zdonik, S. B. (2012, April). Learning-based query performance modeling and prediction. In 2012 IEEE 28th International Conference on Data Engineering (pp. 390-401). IEEE.

[14] Hellerstein, J. M. (1998). Optimization techniques for queries with expensive methods. ACM Transactions on Database Systems (TODS), 23(2), 113-157.

[15] Ramadan, M., El-Kilany, A., Mokhtar, H. M., & Sobh, I. (2022). Rl_qoptimizer: a reinforcement learning based query optimizer. IEEE Access, 10, 70502-70515.

[16] Fent, P., & Neumann, T. (2021). A practical approach to groupjoin and nested aggregates. Proceedings of the VLDB Endowment, 14(11), 2383-2396.

[17] Hamilton, W., Bajaj, P., Zitnik, M., Jurafsky, D., & Leskovec, J. (2018). Embedding logical queries on knowledge graphs. Advances in neural information processing systems, 31.

[18] Marcus, R., & Papaemmanouil, O. (2019). Flexible operator embeddings via deep learning. arXiv preprint arXiv:1901.09090.

[19] Zongheng Yang, Machine Learning for Query Optimization, online. https://www2.eecs.berkeley.edu/Pubs/TechRpts/2022/EECS-2022-194.pdf

[20] Jalali, S. M. J., Osório, G. J., Ahmadian, S., Lotfi, M., Campos, V. M., Shafie-khah, M., ... & Catalão, J. P. (2021). New hybrid deep neural architectural search-based ensemble reinforcement learning strategy for wind power forecasting. IEEE Transactions on Industry Applications, 58(1), 15-27.

[21] Sharma, M., Singh, G., & Singh, R. (2019). A review of different cost-based distributed query optimizers. Progress in Artificial Intelligence, 8(1), 45-62.

[22] Kaoudi, Z., Quiané-Ruiz, J. A., Contreras-Rojas, B., Pardo-Meza, R., Troudi, A., & Chawla, S. (2020, April). ML-based cross-platform query optimization. In 2020 IEEE 36th International Conference on Data Engineering (ICDE) (pp. 1489-1500). IEEE.

[23] Tekale, K. M. (2022). Claims Optimization in a High-Inflation Environment Provide Frameworks for Leveraging Automation and Predictive Analytics to Reduce Claims Leakage and Accelerate Settlements. International Journal of Emerging Research in Engineering and Technology, 3(2), 110-122. https://doi.org/10.63282/3050-922X.IJERET-V3I2P112

Published

2023-03-30

Issue

Section

Articles

How to Cite

1.
Karri N. ML Models That Learn Query Patterns and Suggest Execution Plans. IJETCSIT [Internet]. 2023 Mar. 30 [cited 2025 Oct. 27];4(1):133-41. Available from: https://www.ijetcsit.org/index.php/ijetcsit/article/view/406

Similar Articles

21-30 of 299

You may also start an advanced similarity search for this article.