AI-IN-A-BOX Performance Evaluation: Kubernetes on VMware vSphere Platform with AMD EPYC 9004 Series
Kubernetes on VMware vSphere is a hybrid cloud platform designed for developing, modernizing, and deploying applications, including AI-enabled applications powered by containers and Kubernetes. It provides a comprehensive set of operational and developer services and tools, including Serverless, Service Mesh, and Pipelines.
Kubernetes Version: 1.28.13
VMware vSphere Platform Version: 8.0 U2
The AMD EPYC 9004 series, known as "Genoa," represents AMD's latest lineup of high-performance server processors, designed for the most demanding workloads across various industries.
AMD EPYC 9654: 4th Gen processor with 96 cores and 192 threads
This document provides a performance evaluation of an AI-IN-A-BOX Kubernetes cluster running on the VMware vSphere Platform, powered by AMD EPYC 9004 series CPUs, for serving Large Language Model (LLM) inference with the open-source LLM Llama2. It outlines the key findings from Infobell IT's comprehensive and independent performance benchmarking of the Llama2 model, conducted using Infobell IT's benchmarking tool, EchoSwift.
| LLM Model | SUT Config | Input and Output Token Combinations |
|---|---|---|
| Llama2 7B | 96-core CPU and 48 GB memory | |
EchoSwift is a specialized tool for benchmarking inference in Large Language Models (LLMs). It provides comprehensive performance and scalability assessments, measuring key metrics such as latency and throughput under varying parallel request loads. These insights give customers clear, actionable guidance for choosing the optimal configuration for their LLMs and help identify potential bottlenecks in LLM deployments under load.
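To illustrate the kind of measurement described above, the sketch below times individual requests and computes aggregate throughput under a configurable number of parallel workers. This is a minimal, hypothetical example of the metrics concept (per-request latency and requests/second under concurrent load), not EchoSwift's actual implementation; `fake_llm_call` is a stand-in for a real call to an LLM inference endpoint.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


def timed_request(send_fn, prompt):
    """Send one request and return its end-to-end latency in seconds."""
    start = time.perf_counter()
    send_fn(prompt)
    return time.perf_counter() - start


def benchmark(send_fn, prompts, parallel_users):
    """Issue all prompts using `parallel_users` concurrent workers.

    Returns (mean per-request latency in seconds, overall throughput
    in requests per second).
    """
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=parallel_users) as pool:
        latencies = list(pool.map(lambda p: timed_request(send_fn, p), prompts))
    elapsed = time.perf_counter() - start
    return mean(latencies), len(prompts) / elapsed


def fake_llm_call(prompt):
    """Stand-in for a real model-server call; sleeps to mimic inference time."""
    time.sleep(0.05)
    return "response"


if __name__ == "__main__":
    lat, tput = benchmark(fake_llm_call, ["hello"] * 20, parallel_users=4)
    print(f"mean latency: {lat:.3f}s, throughput: {tput:.1f} req/s")
```

Sweeping `parallel_users` over increasing values and watching where latency climbs while throughput plateaus is one way such a harness can expose the saturation point of a deployment.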