VMware Private AI Foundation with NVIDIA Now Available

Introduction

Generative AI (Gen AI) stands out as a key emerging trend poised to revolutionize enterprises over the next 5 to 10 years. This AI wave centers on large language models (LLMs) that can handle vast and diverse datasets, enabling natural language text or speech interactions with AI models.



The investment and focus on developing LLMs have surged, leading to updates of existing models and the introduction of new ones such as Gemini (formerly Bard), Llama 2, PaLM 2, and DALL-E. Some of these models are open source, while others are proprietary to companies such as Google, Meta, and OpenAI. The future value of Gen AI lies in refining and tailoring domain-specific models unique to each business and industry. A significant advancement in applying LLMs is Retrieval Augmented Generation (RAG), which connects an LLM to extensive and varied datasets, enabling businesses to query the model about their own data.

VMware, now part of Broadcom, offers software that modernizes, optimizes, and secures the workloads of complex organizations in data centers across various clouds, applications, and out to the enterprise edge. VMware Cloud Foundation software empowers enterprises to innovate, transform their business, and adopt a wide array of AI applications and services. It provides a unified platform for managing all workloads, including VMs, containers, and AI technologies, through a self-service, automated IT environment.

In August 2023, at VMware Explore in Las Vegas, the company announced VMware Private AI and VMware Private AI Foundation in partnership with NVIDIA. Today, at NVIDIA GTC, VMware is excited to announce the Initial Availability of VMware Private AI Foundation with NVIDIA.

VMware Private AI Foundation with NVIDIA

Broadcom and NVIDIA are collaborating to harness the potential of Gen AI and enhance productivity through their joint Gen AI platform, VMware Private AI Foundation with NVIDIA.

Leveraging the robust private cloud platform, VMware Cloud Foundation, VMware Private AI Foundation with NVIDIA incorporates the new NVIDIA NIM inference microservices, AI models from NVIDIA and the broader community (like Hugging Face), and NVIDIA AI tools and frameworks, all accessible through NVIDIA AI Enterprise licenses.

This unified Gen AI platform empowers enterprises to execute RAG workflows, fine-tune and customize LLMs, and conduct inference workloads within their data centers, addressing concerns such as privacy, choice, cost, performance, and compliance. It simplifies Gen AI implementations for enterprises by providing an intuitive automation tool, deep learning VM images, a vector database, and GPU monitoring capabilities. VMware Private AI Foundation with NVIDIA is available as an add-on SKU for VMware Cloud Foundation. It's important to note that NVIDIA AI Enterprise licenses must be separately purchased from NVIDIA.



Key Benefits

Let’s explore the key advantages of VMware Private AI Foundation with NVIDIA:

  1. Ensuring Privacy, Security, and Compliance of AI Models: This solution offers an architectural approach to AI services that prioritizes privacy, security, and data control, along with integrated security and management features. VMware Cloud Foundation includes advanced security measures like Secure Boot, Virtual TPM, and VM encryption. NVIDIA AI Enterprise services come with management software that enhances workload and infrastructure utilization for scaling AI model development and deployment. The AI software stack includes over 4,500 open-source software packages, including third-party and NVIDIA software. NVIDIA AI Enterprise services also provide patches for critical and high CVEs, offer production and long-term support branches, and maintain API compatibility across the entire stack. VMware Private AI Foundation with NVIDIA facilitates on-premises deployments, enabling enterprises to address regulatory compliance challenges without major re-architecture of their existing environment.
  2. Achieving Accelerated Performance for Gen AI Models with Any LLMs: Broadcom and NVIDIA have integrated software and hardware capabilities to maximize performance for Gen AI models. These capabilities, integrated into the VMware Cloud Foundation platform, include GPU monitoring, live migration, and load balancing; Instant Cloning, allowing the deployment of multi-node clusters with pre-loaded models within seconds; virtualization and pooling of GPUs; and scaling of GPU input/output with NVIDIA NVLink and NVIDIA NVSwitch. A recent benchmark study comparing AI workloads on the VMware + NVIDIA AI-Ready Enterprise Platform against bare metal demonstrates comparable or superior performance. Running AI workloads on virtualized solutions maintains performance while offering the benefits of virtualization, such as simplified management and enhanced security. NVIDIA NIM enables enterprises to run inference on a range of optimized LLMs, from NVIDIA models to community models like Llama 2 to open-source LLMs available through Hugging Face, with exceptional performance.
  3. Simplifying GenAI Deployment and Optimizing Costs: VMware Private AI Foundation with NVIDIA simplifies deployment and provides a cost-effective solution for GenAI models. It includes features like a vector database for enabling RAG workflows, deep learning VMs, and a quick start automation wizard, streamlining the deployment process. The platform offers unified management tools and processes, leading to significant cost reductions. This approach enables the virtualization and sharing of infrastructure resources, such as GPUs, CPUs, memory, and networks, resulting in substantial cost savings, particularly for inference use cases where full GPUs may not be necessary.
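To make the RAG workflow mentioned above concrete, here is a minimal, hypothetical sketch in Python: retrieve the most relevant document for a question, then fold it into a prompt for an LLM. The keyword-overlap retriever and prompt template are illustrative stand-ins; a real deployment would use a vector database and an inference endpoint.

```python
# Minimal RAG sketch: retrieve the best-matching document for a
# question, then build a grounded prompt for the LLM.
# The keyword-overlap "retriever" stands in for a real vector search.

def retrieve(question: str, documents: list[str]) -> str:
    """Return the document sharing the most words with the question."""
    q_words = set(question.lower().split())
    return max(documents, key=lambda d: len(q_words & set(d.lower().split())))

def build_prompt(question: str, context: str) -> str:
    """Constrain the LLM's answer to the retrieved context."""
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

docs = [
    "VMware Cloud Foundation provides a unified private cloud platform.",
    "NVIDIA NIM accelerates deployment of generative AI inference.",
]
question = "What does NVIDIA NIM do?"
prompt = build_prompt(question, retrieve(question, docs))
print(prompt)
```

The augmented prompt, rather than the bare question, is what gets sent to the model, which is how RAG improves answers without retraining.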

Architecture 



VMware Cloud Foundation, a comprehensive private cloud infrastructure solution, and NVIDIA AI Enterprise, a cloud-native software platform, are the foundational components of the VMware Private AI Foundation with NVIDIA platform. Together, they empower enterprises to deploy private and secure Gen AI models.

VMware's specialized capabilities

  • Deep Learning VM templates: Setting up a deep learning VM can be complex and time-consuming, leading to inconsistencies and missed optimization opportunities. VMware Private AI Foundation with NVIDIA offers pre-configured deep learning VMs that include the necessary frameworks, libraries, and drivers from the NVIDIA NGC catalog, eliminating the need for manual setup.
  • Vector Databases for RAG workflows: Vector databases play a crucial role in RAG workflows, enabling fast data querying and real-time updates to improve LLM outputs without the need for costly retraining. VMware leverages the pgvector extension on PostgreSQL to enable vector databases, managed through native infrastructure automation in VMware Cloud Foundation's Data Services Manager, simplifying the deployment and management of databases.
  • Catalog Setup Wizard: AI project infrastructure provisioning involves complex steps typically handled by LOB admins. They select and deploy VM classes, Kubernetes clusters, vGPUs, and AI/ML software from the NGC catalog. However, this process can be time-consuming and may result in non-compliant or non-scalable infrastructure. To address this, VMware Cloud Foundation introduces the Catalog Setup Wizard, allowing LOB admins to efficiently design and publish optimized AI infrastructure catalog items through a self-service portal. This reduces the manual workload for admins and shortens the waiting time for DevOps and data scientists by simplifying the infrastructure creation process.
  • GPU Monitoring: Visibility into GPU usage and performance metrics enables organizations to optimize performance, ensure reliability, and manage costs in GPU-accelerated environments. VMware Private AI Foundation with NVIDIA introduces GPU monitoring capabilities in VMware Cloud Foundation, providing insights into GPU resource utilization across clusters and hosts. This allows admins to optimize GPU usage and improve performance and cost-efficiency.

NVIDIA AI Enterprise's Exciting Capabilities

  • NVIDIA NIM: NVIDIA NIM is a set of user-friendly microservices designed to accelerate the deployment of Gen AI across enterprises. These versatile microservices support a wide range of models, from leading community models to NVIDIA-built models to custom AI models optimized for the NVIDIA accelerated stack. Built on NVIDIA Triton Inference Server, NVIDIA TensorRT, TensorRT-LLM, and PyTorch, NVIDIA NIM facilitates seamless AI inferencing at scale, helping developers deploy AI in production efficiently and confidently.
  • NVIDIA NeMo Retriever: NVIDIA NeMo Retriever, part of the NVIDIA NeMo platform, consists of CUDA-X Gen AI microservices that enable organizations to connect custom models to diverse business data and provide highly accurate responses. NeMo Retriever offers top-notch information retrieval with low latency, high throughput, and maximum data privacy, allowing organizations to leverage their data for real-time business insights. NeMo Retriever enhances GenAI applications with improved RAG capabilities, connecting to business data wherever it's located.
  • NVIDIA RAG LLM Operator: The NVIDIA RAG LLM Operator simplifies the deployment of RAG applications into production. It streamlines the deployment of RAG pipelines developed using NVIDIA AI workflow examples, eliminating the need to rewrite code.
  • NVIDIA GPU Operator: The NVIDIA GPU Operator automates the lifecycle management of the software needed to utilize GPUs with Kubernetes. It provides advanced functionality, including improved GPU performance, utilization, and telemetry. GPU Operator enables organizations to focus on building applications rather than managing Kubernetes infrastructure.
  • Broad OEM Support: The platform is supported by leading server OEMs, including Dell, HPE, and Lenovo.
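Because NIM exposes an OpenAI-compatible HTTP API, a chat request is an ordinary JSON POST. The sketch below only builds the request payload; the endpoint URL and model name are placeholders for a locally deployed NIM, not references to a specific installation.

```python
import json

# Build a chat-completion request for a NIM microservice.
# NIM_URL and the model name are placeholders for a local deployment.
NIM_URL = "http://localhost:8000/v1/chat/completions"  # placeholder

payload = {
    "model": "meta/llama-2-7b-chat",  # illustrative model identifier
    "messages": [
        {"role": "user", "content": "Summarize our Q3 support tickets."}
    ],
    "max_tokens": 256,
    "temperature": 0.2,
}
body = json.dumps(payload)
print(body)
# Sending it is one call, e.g.:
#   requests.post(NIM_URL, json=payload, timeout=60)
```

Because the interface follows the OpenAI chat-completions convention, existing client code can often be pointed at a private NIM endpoint with only a URL change.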

