Abstract
Serverless computing has shown vast potential for big data analytics applications, especially those involving machine learning algorithms. Nevertheless, little consideration has been given in the literature to cloud-agnostic serverless architectures that leverage existing parallel implementations of machine learning algorithms. This work bridges this gap by proposing a multi-cloud serverless architecture for distributed machine learning that enables machine learning engineers without cloud computing expertise to port already implemented parallel machine learning algorithms to serverless with little effort, whilst overcoming vendor lock-in. Two stateful machine learning algorithms have been ported to serverless: k-means clustering and logistic regression. The serverless implementation of k-means provided superior performance and scalability compared to a serverful implementation when using a number of workers equal to or slightly lower than the total number of vCPUs available on the VM running the serverful implementation. Additionally, it achieved an 87-fold speedup compared to a sequential implementation. Moreover, two storage designs for the shared state are proposed for the serverless implementations: one that requires locks for updating the shared state, and another that is lock-free. Our experimental evaluation demonstrates that the performance of the lock-free serverless implementation of k-means declines as the number of clusters increases.
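The abstract contrasts a lock-based and a lock-free design for the shared state but does not describe either layout. The following is a minimal sketch of the general idea only, not the paper's implementation: threads stand in for serverless workers, in-memory Python objects stand in for cloud storage, and all names (`step_locked`, `step_lock_free`, `partial_sums`, and so on) are hypothetical. In the assumed lock-based variant, every worker merges its partial result into one shared accumulator under a lock; in the assumed lock-free variant, each worker writes only to its own slot and the partial results are reduced afterwards.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def partial_sums(points, centroids):
    """Assign each point to its nearest centroid; return per-cluster sums and counts."""
    dim = len(centroids[0])
    sums = [[0.0] * dim for _ in centroids]
    counts = [0] * len(centroids)
    for p in points:
        nearest = min(
            range(len(centroids)),
            key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])),
        )
        counts[nearest] += 1
        for i, x in enumerate(p):
            sums[nearest][i] += x
    return sums, counts


def empty_total(centroids):
    """Zeroed (sums, counts) accumulator with one row per cluster."""
    return [[0.0] * len(centroids[0]) for _ in centroids], [0] * len(centroids)


def merge(total, part):
    """Add one worker's (sums, counts) into a running total, in place."""
    (t_sums, t_counts), (p_sums, p_counts) = total, part
    for j, row in enumerate(p_sums):
        t_counts[j] += p_counts[j]
        for i, x in enumerate(row):
            t_sums[j][i] += x


def new_centroids(total, old):
    """Mean of each cluster; an empty cluster keeps its old centroid."""
    sums, counts = total
    return [
        [x / counts[j] for x in sums[j]] if counts[j] else list(old[j])
        for j in range(len(old))
    ]


def step_locked(chunks, centroids):
    """Assumed design A: one shared accumulator, updated under a lock."""
    shared = empty_total(centroids)
    lock = threading.Lock()

    def worker(chunk):
        part = partial_sums(chunk, centroids)
        with lock:  # serialise the read-modify-write on the shared state
            merge(shared, part)

    with ThreadPoolExecutor() as pool:
        list(pool.map(worker, chunks))
    return new_centroids(shared, centroids)


def step_lock_free(chunks, centroids):
    """Assumed design B: each worker writes only its own slot; reduce afterwards."""
    slots = [None] * len(chunks)

    def worker(i):
        slots[i] = partial_sums(chunks[i], centroids)  # no slot is shared

    with ThreadPoolExecutor() as pool:
        list(pool.map(worker, range(len(chunks))))
    total = empty_total(centroids)
    for part in slots:
        merge(total, part)
    return new_centroids(total, centroids)
```

Both functions perform one k-means update step and should agree up to floating-point rounding. In this sketch the lock-free variant avoids contention at the cost of keeping one partial result per worker and running an extra reduction pass. Whether this trade-off relates to the results reported in the abstract is not something the sketch can establish.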
Original language | English |
---|---|
Title of host publication | Proceedings - 2024 IEEE/ACM International Conference on Big Data Computing, Applications and Technologies, BDCAT 2024 |
Publisher | IEEE |
Pages | 131-140 |
ISBN (Electronic) | 979-8-3503-6730-0 |
ISBN (Print) | 979-8-3503-6731-7 |
Publication status | Published - 8 Apr 2025 |
Event | 11th IEEE/ACM International Conference on Big Data Computing, Applications and Technologies, BDCAT 2024 - Sharjah, United Arab Emirates. Duration: 16 Dec 2024 → 19 Dec 2024 |
Conference
Conference | 11th IEEE/ACM International Conference on Big Data Computing, Applications and Technologies, BDCAT 2024 |
---|---|
Country/Territory | United Arab Emirates |
City | Sharjah |
Period | 16/12/24 → 19/12/24 |
Bibliographical note
This is an author-produced version of the published paper. Uploaded in accordance with the University’s Research Publications and Open Access policy.
Keywords
- Distributed Machine Learning
- Big Data
- Serverless Architectures
- Cloud Agnostic
- Multicloud
- Lithops