Source: containerjournal.com
Run:AI this week announced the general availability of a namesake platform based on Kubernetes that enables IT teams to virtualize graphics processing unit (GPU) resources.
Company CEO Omri Geller says the goal is to enable IT teams to maximize investments in expensive GPUs by plugging the platform into Kubernetes with a single line of code. IT teams can then take advantage of container orchestration to schedule artificial intelligence (AI) workloads across multiple GPUs and to prioritize certain AI workloads over others, he says.
Geller notes that GPUs don’t lend themselves well to traditional virtual machines. Kubernetes provides an alternative approach to virtualizing bare-metal GPU resources, which are among the most expensive IT infrastructure resources any IT organization can invoke in the cloud or deploy in an on-premises IT environment.
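To make the scheduling and prioritization idea concrete, the sketch below uses the official Kubernetes Python client to submit a training pod that requests GPUs through the standard nvidia.com/gpu device-plugin resource and carries a priority class. The scheduler name, priority class, and container image shown are illustrative placeholders, not Run:AI’s documented configuration; any GPU-aware scheduler plugged into Kubernetes would be addressed the same way.

```python
# Minimal sketch: submitting a GPU training pod with a priority class, using
# the official Kubernetes Python client. The scheduler name, priority class,
# and image below are placeholders, not Run:AI's documented settings.
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() inside a cluster

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="resnet-train", labels={"workload": "training"}),
    spec=client.V1PodSpec(
        scheduler_name="gpu-aware-scheduler",  # placeholder for a custom scheduler
        priority_class_name="training-high",   # placeholder PriorityClass name
        restart_policy="Never",
        containers=[
            client.V1Container(
                name="trainer",
                image="registry.example.com/resnet-train:latest",  # hypothetical image
                resources=client.V1ResourceRequirements(
                    # GPUs exposed by the NVIDIA device plugin are requested
                    # like any other extended resource.
                    limits={"nvidia.com/gpu": "2"}
                ),
            )
        ],
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```

The point of the sketch is simply that GPU requests and workload priority are expressed declaratively in the pod spec, and whichever scheduler is plugged into the cluster decides placement; a PriorityClass object with the referenced name would need to exist for the prioritization to take effect.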
Containers are widely employed for AI models because they provide a means to efficiently manage access to large amounts of data without having to update the entire AI application every time an element of the AI model changes. One of the biggest challenges organizations now face is maximizing GPU resources at a time when the number of AI models being built and deployed continues to expand rapidly. Most organizations not only have to update AI models frequently as new data sources become available, but over time they also discover that many AI models need to be replaced altogether as business conditions and circumstances evolve.
Of course, as those AI models are updated, they also need to be slipstreamed into applications, which in turn creates a series of new DevOps challenges for organizations.
For the most part, GPUs today are employed to train AI models because they are more efficient at that task than conventional processors. Most of the inference engines on which AI models run, however, are deployed on commodity x86 processors from Intel and AMD. NVIDIA, for its part, is making the case for replacing those x86 processors with GPUs that are steadily becoming less expensive. The Run:AI platform can be employed to maximize GPUs whether they are used to train AI models or to run inference engines, Geller says.
It’s not at all clear to what degree AI will be injected into applications. There’s no doubt AI will play a significant role in the future of application development. However, most organizations won’t be able to afford to inject AI instantly into every application or re-engineer business processes overnight, so it may be a while before AI is all-pervasive. In fact, there is more trial and error involved in AI than most data scientists would care to admit.
In the meantime, however, it’s becoming clear that Kubernetes will emerge as the de facto platform for both building and deploying AI applications, especially as platforms such as Kubeflow, a set of tools for building AI models, continue to mature. One major issue is determining how to efficiently manage all the AI workloads now heading toward Kubernetes.