Arrikto, the leader in machine learning on Kubernetes, participated in the announcement of Kubeflow 1.5, the latest version of the open source MLOps platform, with contributions from Google, Arrikto, IBM, Twitter, Rakuten, and numerous other contributors. Kubeflow 1.5 lowers infrastructure costs and simplifies the operation of the end-to-end machine learning platform. Originally developed by Google, Kubeflow is a complete MLOps toolkit, including integrated components for model development, model training, multi-step pipelines, AutoML, serving, monitoring, artifact management, and experiment tracking.
Running production machine learning workflows at scale is notoriously expensive due to outsized requirements on CPUs, GPUs, storage, and memory. Kubeflow 1.5 introduces several key features to reduce these costs. The PyTorch training operator can now scale up and down, introducing elastic training that can make use of ephemeral or spot instances. Arrikto contributed the ability to monitor notebook servers and shut down those that sit idle while consuming costly resources. Additionally, the new early validation of hyperparameter tuning helps reduce overfitting and improve model accuracy, eliminating costs that would otherwise be incurred.
Arrikto also delivered features that give Kubeflow’s UI a more uniform user experience across a variety of its components, reducing the complexity of machine learning at scale. The latest release also simplifies support for high-availability options in its AutoML component. Finally, Kubeflow 1.5 adds the MPI framework to Kubeflow’s Unified Training Operator, creating a single operator for handling the most popular frameworks, including TensorFlow, PyTorch, MXNet, XGBoost and, now, MPI.
“Kubeflow is one of the most powerful tools for reducing the complexity of machine learning at scale,” said Constantinos Venetsanopoulos, CEO at Arrikto. “As large-scale machine learning projects drive more and deeper value for the world’s largest companies, tools like Kubeflow will be instrumental in helping those companies not get bogged down in the complexity and cost that so often hamper those efforts. Arrikto’s mission to help lead the Kubeflow community will continue as we make the development, training and serving of models at scale less complex and more cost efficient, and create a more tightly integrated experience.”
Aside from making several feature contributions to the release, Arrikto members also spearheaded the release process of Kubeflow 1.5. Arrikto’s leadership in the Kubeflow community extends beyond its code contributions. The company is dedicated to making machine learning more accessible to data scientists. It has organized dozens of community meetups, attended by more than 3,000 members, and offers free MLOps training, in which more than 3,500 students have enrolled over the last six months. Arrikto also sponsors data scientists to make open source contributions to the Kubeflow project, rewarding the efforts of developers unaffiliated with commercial interests.