Serverless Cloud Engineering Methodologies for Scalable and Efficient Data Pipeline Architectures

Authors

  • Dilliraja Sundar, Independent Researcher, USA

DOI:

https://doi.org/10.63282/3050-9246.IJETCSIT-V4I2P118

Keywords:

Serverless Computing, Function-as-a-Service (FaaS), Cloud-Native Design Patterns, Data Engineering, Cost Optimization, Auto-Scaling

Abstract

Serverless cloud engineering has emerged as a compelling paradigm for building highly scalable, operationally lean data pipelines. This paper proposes a holistic approach to serverless data pipeline architecture that combines event-driven ingestion, cloud-native design patterns, and multi-zone storage across lakehouse, object store, and NoSQL layers. It first compares serverless methods with traditional ETL systems, warehouse-based systems, and containerized microservice pipelines, highlighting the operational overhead and scaling constraints of server-ful models. It then codifies the key principles of no server management, automatic scaling, event-based execution, and pay-per-use economics, and translates them into concrete patterns including fan-out/fan-in, asynchronous task execution, and event streaming. The proposed architecture describes how APIs, workflow orchestration, event routers, and serverless analytics engines integrate into an end-to-end, cloud-native data platform that serves batch and streaming workloads as well as hybrid and multi-cloud environments. An efficiency framework is presented that addresses cost optimization, auto-scaling policies, latency minimization, and reliability techniques (idempotency, retries, and dead-letter handling). Drawing on representative experimental configurations from the recent literature, the paper synthesizes findings that compare serverless and non-serverless deployments in terms of throughput, latency, and cost. The analysis indicates that serverless pipelines are especially beneficial for bursty and elastic workloads, whereas hybrid patterns remain appropriate for long-running, stateful processing. Overall, the article provides a conceptual roadmap and an empirically grounded rationale for adopting serverless cloud engineering methodologies in modern data pipeline architectures.
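To make the patterns named in the abstract more concrete, the following is a minimal sketch of fan-out/fan-in combined with idempotency, retries, and dead-letter handling. It is an illustration under stated assumptions, not the architecture from the paper: the function names (`transform`, `handle`, `run_pipeline`), the retry budget, the backoff schedule, and the in-memory set and list that stand in for a NoSQL idempotency table and a managed dead-letter queue are all hypothetical.

```python
import asyncio

MAX_ATTEMPTS = 3  # assumed retry budget, not a value from the paper


async def transform(event):
    """Hypothetical per-event worker; fails on events flagged as bad."""
    if event.get("bad"):
        raise ValueError("malformed payload")
    return event["value"] * 2


async def handle(event, seen, dead_letters):
    """Process one event with idempotency, retries, and dead-lettering."""
    # Idempotency: claim the event ID before the first await so that a
    # concurrent duplicate delivery is skipped without side effects.
    if event["id"] in seen:
        return None
    seen.add(event["id"])

    for attempt in range(1, MAX_ATTEMPTS + 1):
        try:
            return await transform(event)
        except ValueError:
            if attempt == MAX_ATTEMPTS:
                # Retries exhausted: park the event for later inspection.
                dead_letters.append(event)
                return None
            # Exponential backoff between attempts (illustrative delays).
            await asyncio.sleep(0.01 * 2 ** attempt)
    return None


async def run_pipeline(events):
    """Fan out one task per event, then fan in by summing the results."""
    seen = set()        # stand-in for a NoSQL idempotency table
    dead_letters = []   # stand-in for a managed dead-letter queue
    results = await asyncio.gather(
        *(handle(e, seen, dead_letters) for e in events)
    )
    total = sum(r for r in results if r is not None)
    return total, dead_letters


if __name__ == "__main__":
    sample = [
        {"id": "a", "value": 1},
        {"id": "b", "value": 2},
        {"id": "a", "value": 1},              # duplicate delivery
        {"id": "c", "value": 3, "bad": True}, # always fails
    ]
    print(asyncio.run(run_pipeline(sample)))
```

In a managed serverless platform, the fan-out step would typically be an event router or workflow service invoking one function per event, and the fan-in step an aggregation function; the sketch uses `asyncio.gather` only to keep the example self-contained.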

Published

2023-06-30

Section

Articles

How to Cite

Sundar D. Serverless Cloud Engineering Methodologies for Scalable and Efficient Data Pipeline Architectures. IJETCSIT [Internet]. 2023 Jun. 30 [cited 2025 Dec. 24];4(2):182-9. Available from: https://www.ijetcsit.org/index.php/ijetcsit/article/view/507
