Automated Hyperparameter Optimization for Deep Neural Networks Using Bayesian Optimization and Genetic Algorithms
DOI: https://doi.org/10.63282/3050-9246.IJETCSIT-V2I4P103

Keywords: Bayesian Optimization, Genetic Algorithm, Hyperparameter Optimization, Deep Neural Networks, Convergence Speed, Computational Cost, Validation Accuracy, Machine Learning, Neural Architecture Search, Optimization Techniques

Abstract
Deep Neural Networks (DNNs) have achieved remarkable success in various domains, including computer vision, natural language processing, and reinforcement learning. However, the performance of these models is highly dependent on the choice of hyperparameters, which are often set manually through trial and error. This process is time-consuming, resource-intensive, and requires significant expertise. To address this challenge, this paper explores the use of automated hyperparameter optimization (HPO) techniques, specifically Bayesian Optimization (BO) and Genetic Algorithms (GA), to improve the efficiency and effectiveness of hyperparameter tuning for DNNs. We provide a comprehensive review of the theoretical foundations of BO and GA, discuss their implementation in the context of DNNs, and evaluate their performance on a variety of benchmark datasets. Our results demonstrate that both BO and GA can significantly enhance the performance of DNNs, with BO generally outperforming GA in terms of convergence speed and final model performance. We also discuss the limitations and potential future directions for research in this area.
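To make the BO workflow concrete, the following is a minimal sketch of Bayesian hyperparameter optimization for a small neural network, assuming scikit-optimize (skopt) and scikit-learn are installed. The digits dataset, the three-dimensional search space, and the 25-evaluation budget are illustrative choices only and do not reflect the paper's experimental setup.

from sklearn.datasets import load_digits
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from skopt import gp_minimize
from skopt.space import Real, Integer
from skopt.utils import use_named_args

# Small benchmark dataset standing in for the paper's benchmarks (illustrative).
X, y = load_digits(return_X_y=True)

# Illustrative search space: learning rate, L2 penalty, hidden-layer width.
space = [
    Real(1e-4, 1e-1, prior="log-uniform", name="learning_rate_init"),
    Real(1e-6, 1e-2, prior="log-uniform", name="alpha"),
    Integer(16, 256, name="hidden_units"),
]

@use_named_args(space)
def objective(**params):
    # gp_minimize minimizes, so return negative cross-validated accuracy.
    model = MLPClassifier(
        hidden_layer_sizes=(int(params["hidden_units"]),),
        learning_rate_init=params["learning_rate_init"],
        alpha=params["alpha"],
        max_iter=200,
        random_state=0,
    )
    return -cross_val_score(model, X, y, cv=3, n_jobs=-1).mean()

# Gaussian-process surrogate with the expected-improvement acquisition
# function, 25 objective evaluations in total.
result = gp_minimize(objective, space, acq_func="EI", n_calls=25, random_state=0)
print("Best hyperparameters:", result.x)
print("Best CV accuracy:", -result.fun)

A GA-based alternative would evolve the same three-dimensional hyperparameter vector through selection, crossover, and mutation rather than fitting a probabilistic surrogate model, which is the trade-off the paper compares.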