ML scalability
Introducing the cloud for ML training at scale
Business context
BICS is a communications platform company providing international communications, messaging and connectivity to operators and enterprises worldwide.
BICS relies heavily on ML predictive analytics in several use-cases: forecasting pricing evolution, combating fraud, analysing mobility, etc. Most of the data processing and analytics happens in a local data center. BICS wants to experiment with a hybrid setup, where some ML jobs run in the cloud. The main benefits of the cloud in this context are scalability and cost efficiency.
Scope & objectives
Execute one ML use-case in the cloud, fully integrated into the existing data processing framework.
Adapt the existing software tooling (CI/CD, scheduler, network connectivity) to work with cloud resources.
Key results
We set up a machine learning environment on AWS, built on SageMaker. Model prediction, model training and hyperparameter tuning jobs all run on SageMaker. The SageMaker jobs are integrated with the on-premises scheduler, and data flows seamlessly between the on-premises network and the cloud.
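As an illustration of what the scheduler integration can look like, here is a minimal sketch in which an on-premises scheduler task blocks until a SageMaker training job finishes. It is not the actual BICS implementation: the function name wait_for_training_job, the polling interval and the job name are assumptions made for this example. The only SageMaker API relied on is describe_training_job, as exposed by the boto3 SageMaker client.

```python
import time

# Statuses after which a SageMaker training job no longer changes.
TERMINAL_STATES = {"Completed", "Failed", "Stopped"}


def wait_for_training_job(client, job_name, poll_seconds=60, sleep=time.sleep):
    """Poll a SageMaker training job until it reaches a terminal state.

    `client` is assumed to behave like boto3.client("sagemaker").
    Returns "Completed" on success and raises RuntimeError otherwise,
    so that the calling scheduler can mark the task as failed.
    """
    while True:
        response = client.describe_training_job(TrainingJobName=job_name)
        status = response["TrainingJobStatus"]
        if status in TERMINAL_STATES:
            break
        sleep(poll_seconds)  # still InProgress or Stopping: wait and retry

    if status != "Completed":
        reason = response.get("FailureReason", "no reason given")
        raise RuntimeError(f"{job_name} ended with status {status}: {reason}")
    return status
```

A scheduler task would call it as wait_for_training_job(boto3.client("sagemaker"), "my-training-job"), where the job name is hypothetical and the job was started earlier in the same task. Passing the client in as an argument keeps the function independent of AWS credentials, which makes it easy to exercise with a stub.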
Impact
The scalability of the cloud means that BICS can run on-demand ML tuning jobs, while keeping its existing on-premises cluster for scheduled jobs.
This drives more innovation and experimentation, resulting in a more scalable and sustainable way of working for its data scientists.