100,000 H100 Clusters: Power, Network Topology, Ethernet vs InfiniBand, Reliability, Failures, Checkpointing
SemiAnalysis
by Dylan Patel
1M ago
There is a camp that feels AI capabilities have stagnated ever since GPT-4’s release. This is generally true, but only because no one has been able to massively increase the amount of compute dedicated to a single model. Every model that has been released is roughly GPT-4 level (~2e25 FLOP of training compute). This is because the training compute dedicated to these models has also been at roughly the same level. In the case of Google’s Gemini Ultra, Nvidia Nemotron 340B, and Meta LLAMA 3 405B, the FLOPs dedicated were of similar magnitude or even higher when compared to GPT-4, but an inferior a ..read more
Visit website
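As a rough sanity check on the ~2e25 FLOP figure in the excerpt above, here is a back-of-the-envelope sketch of how long a cluster takes to deliver that training budget. The per-GPU peak throughput (H100 dense BF16, ~989 TFLOPS) and the sustained utilization (MFU) are assumptions for illustration, not figures from the article.

```python
# Back-of-the-envelope: wall-clock time for a GPU cluster to deliver a given
# training FLOP budget. The ~2e25 FLOP figure is from the excerpt; the H100
# peak throughput and MFU (model FLOP utilization) below are assumptions.

H100_PEAK_BF16_FLOPS = 989e12   # assumed dense BF16 peak per GPU, FLOP/s
MFU = 0.35                      # assumed sustained utilization fraction

def training_days(total_flop: float, num_gpus: int) -> float:
    """Days of wall-clock training to reach `total_flop` on `num_gpus` GPUs."""
    sustained = num_gpus * H100_PEAK_BF16_FLOPS * MFU  # effective FLOP/s
    return total_flop / sustained / 86_400             # 86,400 seconds per day

# GPT-4-class budget (~2e25 FLOP) on clusters of different sizes
for gpus in (20_000, 100_000):
    print(f"{gpus:>7,} H100s: ~{training_days(2e25, gpus):.0f} days")
```

Under these assumptions, a 100,000-GPU cluster covers a GPT-4-class budget in roughly a week of wall-clock time, which is consistent with the excerpt’s point that materially larger models require materially larger clusters.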
OpenAI Chip Team Is Now Serious
SemiAnalysis
by Dylan Patel
1M ago
While OpenAI’s chip ambitions have been rumored for a while, their actions towards building their own AI chips had essentially amounted to a lot of talking, but that’s all changed. ..read more
Visit website
How Dell Is Beating Supermicro
SemiAnalysis
by Dylan Patel
2M ago
At Nvidia’s GTC 2024, Jensen went to Dell’s booth and chanted Dell over and over. Jensen even called out Michael Dell in the audience on stage during his keynote speech. Nvidia is clearly excited about Dell’s prospects as an AI server company, but why? Dell was extremely late to AI servers. In the previous A100 generation, they only offered the low-volume, low-end 4xA100 servers (Redstone) geared towards the significantly smaller HPC market. Furthermore, in HPC, Dell only has 7% market share of publicly disclosed systems, whereas their chief competitors Lenovo and HPE have 32.6% and 22.4% respec ..read more
Visit website
Apple’s AI Strategy: Apple Datacenters, On-device, Cloud, And More
SemiAnalysis
by Dylan Patel
2M ago
Nvidia continues to ramp their production to service the world’s insatiable demand for GPUs, and yet, our Accelerator Model’s extensive checks show Apple’s purchases of GPUs are quite minuscule. In fact, they aren’t even a top 10 customer. Furthermore, while all eyes are on WWDC, Apple is only announcing AI there, not shipping it. The question on everyone’s mind is… what the heck is Apple doing in AI? Mark Gurman laid out the features Apple is announcing at WWDC. Furthermore, there’s a variety of rumors floating around from others, so let’s get to the bottom of what’s really happening, how, and wha ..read more
Visit website
OpenAI Is Doomed? - Et tu, Microsoft?
SemiAnalysis
by Dylan Patel
2M ago
All eyes are on how long the profitless spending on AI continues. H100 rental pricing is falling every month, and availability is growing quickly for medium-sized clusters at fair pricing. Despite this, it’s clear that demand dynamics are still strong. While the big tech firms are still the largest buyers, there is an increasingly diverse roster of buyers around the world still increasing GPU purchasing sequentially. Most of the exuberance isn’t due to any sort of revenue growth, but rather due to the rush to build ever larger models based on dreams about future business. The clear target that ..read more
Visit website
Intel’s 14A Magic Bullet: Directed Self-Assembly (DSA)
SemiAnalysis
by Dylan Patel
3M ago
Intel’s 18A node has gotten most of the spotlight recently – with an ongoing battle between TSMC’s and Intel’s management teams on the merits of TSMC N2 vs Intel’s 18A. However, it is 14A that will be the make-or-break node for Intel Foundry. Winning customers starts with process technology, and Intel is betting big here, but they need a generation where everyone gets comfortable. Customers will use 18A to dip their toes in the Intel waters with less critical chips that are not core to their business; if all goes well, they will look to 14A as the main process for their linchpin designs – think the ..read more
Visit website
Nvidia Blackwell Perf TCO Analysis - B100 vs B200 vs GB200NVL72
SemiAnalysis
by Dylan Patel
3M ago
Nvidia’s announcement of the B100, B200, and GB200 has garnered more attention than even iPhone launches, at least among the nerds of the world. The real question that everyone is asking is, what is the real performance increase? Nvidia’s claimed 30x, but is that true? Moreover, the question is really, what is the performance/TCO? In the last generation, with the H100, the performance/TCO uplift over the A100 was poor due to the huge increase in pricing, with the A100 actually having better TCO than the H100 in inference because of the H100’s anemic memory bandwidth gains and massive price inc ..read more
Visit website
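The performance/TCO framing in the excerpt above can be made concrete with a minimal sketch: throughput divided by total cost of ownership over the hardware’s lifetime. All numbers below are hypothetical placeholders, not SemiAnalysis or Nvidia figures; only the structure of the comparison is the point.

```python
# Minimal sketch of a performance-per-TCO comparison. Every number here is a
# hypothetical placeholder; the structure (throughput / lifetime cost) is what
# the article's perf/TCO framing refers to.

def perf_per_tco(throughput: float,        # e.g. relative training throughput per server
                 capex: float,             # server + networking purchase price
                 annual_opex: float,       # power, cooling, hosting per year
                 lifetime_years: float = 4.0) -> float:
    """Throughput per dollar of total cost of ownership over the lifetime."""
    tco = capex + annual_opex * lifetime_years
    return throughput / tco

# Hypothetical example: a new server is 2x faster but costs meaningfully more.
old = perf_per_tco(throughput=1.0, capex=250_000, annual_opex=60_000)
new = perf_per_tco(throughput=2.0, capex=400_000, annual_opex=75_000)
print(f"perf/TCO uplift: {new / old:.2f}x")
```

This is also why a large raw speedup can still translate into a modest perf/TCO uplift if pricing rises faster than throughput, which is the dynamic the excerpt describes for the A100-to-H100 transition in inference.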
Is Intel Back? Foundry & Product Resurgence Measured
SemiAnalysis
by Dylan Patel
4M ago
Before Pat Gelsinger took over Intel as CEO, the company spent over a decade in a slow descent due to a focus on financial engineering. The decline was set in motion by then-CEO Paul Otellini, who made the shortsighted decision to turn down the iPhone contract due to apprehension over margins. The main concern was that Apple's customization demands would be costly and would be amortized over low-volume projections that turned out to be woefully underestimated by Intel. This led to Intel missing out on the last decade’s largest area of growth: mobile. Intel’s own assessment of its proces ..read more
Visit website
Nvidia’s Optical Boogeyman – NVL72, Infiniband Scale Out, 800G & 1.6T Ramp
SemiAnalysis
by Dylan Patel
4M ago
At GTC, Nvidia announced 8+ different SKUs and configurations of the Blackwell architecture. While there are some chip-level differences such as memory and CUDA core counts, most of the configurations are system level, such as form factor, networking, CPU, and power consumption. Nvidia is offering multiple 8-GPU baseboard-style configurations, but the main focus for Nvidia at GTC was their vertically integrated DGX GB200 NVL72. Rather than the typical 8-GPU server we are accustomed to, it is a single integrated rack with 72 GPUs, 36 CPUs, 18 NVSwitches, 72 InfiniBand NICs for the backend networ ..read more
Visit website
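For reference, a small sketch of the NVL72 rack composition exactly as the excerpt above lists it (72 GPUs, 36 CPUs, 18 NVSwitches, 72 backend InfiniBand NICs). The derived ratios are simple arithmetic on those counts, not additional specifications.

```python
# Rack-level composition of the GB200 NVL72 as described in the excerpt;
# derived ratios are computed directly from those counts.

NVL72 = {"gpus": 72, "cpus": 36, "nvswitches": 18, "backend_nics": 72}

print("GPUs per CPU:      ", NVL72["gpus"] / NVL72["cpus"])          # 2.0
print("GPUs per NVSwitch: ", NVL72["gpus"] / NVL72["nvswitches"])    # 4.0
print("Backend NICs/GPU:  ", NVL72["backend_nics"] / NVL72["gpus"])  # 1.0
```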
Nvidia B100, B200, GB200 - COGS, Pricing, Margins, Ramp - Oberon, Umbriel, Miranda
SemiAnalysis
by Dylan Patel
4M ago
This post was published on 3/18 and then taken down due to moral reasons. It is republished on 3/23 with some content removed. We stand by the analysis regarding margin. Nvidia announced their new generation of Blackwell GPUs at GTC. We eagerly await the release of the full architecture white paper to detail the much-needed improvements to the tensor memory accelerator and the exact implementation of the new MX number formats, discussed here. We discussed many of the high-level features of the architecture, such as process node, package design, HBM capacity, and SerDes speeds, here, but let’s dive a bi ..read more
Visit website
