
NVIDIA Hopper H200 GPU Continues To Dominate In Latest MLPerf 4.0 Results: Up To 3x Gain In GenAI With TensorRT-LLM

NVIDIA continues to push the AI envelope with its strong TensorRT-LLM suite, boosting the H200 GPUs to new heights in the latest MLPerf v4.0 results.

Blackwell Is Here But NVIDIA Continues Pushing Hopper H100 & H200 AI GPUs With New TensorRT-LLM Optimizations For Up To 3x Gain In MLPerf v4.0

Generative AI, or GenAI, is an emerging market, and every hardware manufacturer is trying to grab a slice of the pie. But despite their best efforts, it's NVIDIA that has so far taken the bulk of the share, and there's no stopping the green giant, which has posted some impressively strong benchmarks and records in the MLPerf v4.0 inference results.


Tuning of TensorRT-LLM has been ongoing ever since the AI software suite was released last year. We saw a major performance increase in the previous MLPerf v3.1 results, and now, with MLPerf v4.0, NVIDIA is supercharging Hopper's performance. Inference matters because it accounted for around 40% of NVIDIA's data center revenue last year. Inference workloads span LLMs (Large Language Models), visual content, and recommenders, and as these models grow in size, so does their complexity, demanding both strong hardware and strong software.

That's where TensorRT-LLM comes in: a state-of-the-art inference compiler co-designed with NVIDIA's GPU architectures. Some features of TensorRT-LLM include:

  • In-Flight Sequence Batching (Optimizes GPU Utilization)
  • KV Cache Management (Higher GPU Memory Utilization)
  • Generalized Attention (XQA Kernel)
  • Multi-GPU Multi-Node (Tensor & Pipeline Parallel)
  • FP8 Quantization (Higher Perf & Fit Larger Models)

Using the latest TensorRT-LLM optimizations, NVIDIA has managed to squeeze up to 2.9x more performance out of its Hopper GPUs (such as the H100) in MLPerf v4.0 versus MLPerf v3.1. With today's benchmark results, NVIDIA has set new performance records in MLPerf.

Read more on wccftech.com