Orchestrating Distributed Machine Learning Pipelines Using Microservices and Kubernetes
Keywords:
Distributed Machine Learning, Microservices, Kubernetes, Pipeline Orchestration, Containerization, ML Deployment, CI/CD
Synopsis
Purpose: This paper explores the orchestration of distributed machine learning (ML) pipelines using microservices architecture and Kubernetes, focusing on scalability, modularity, and deployment flexibility.
Design/methodology/approach: We conducted a literature review of ten relevant publications, emphasizing architectural patterns, container orchestration, and machine learning workflows. Diagrams and tables provide a structured representation of pipeline orchestration and deployment.
Findings: Microservices offer agility and scalability for ML pipelines, while Kubernetes improves fault tolerance and resource utilization. Combining the two facilitates reproducible training and deployment at scale.
Practical implications: Our findings guide ML engineers and DevOps teams in designing distributed systems with robust CI/CD support, leveraging container orchestration and modular services.
Originality/value: This paper synthesizes advancements from the reviewed literature and proposes a standardized orchestration model, backed by real-world architectural patterns and deployment strategies.
License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.