Wednesday, March 20, 2024

NVIDIA GTC Financial Analyst Q&A March 19, 2024 Video and Transcript

First up, from the company:

The stock is trading down $4.15 (-0.46%) at $889.83.

And from Seeking Alpha, Mar. 19, 2024 5:15 PM ET:  

Company Participants

Jensen Huang - Founder and Chief Executive Officer
Colette Kress - Executive Vice President and Chief Financial Officer

Conference Call Participants

Ben Reitzes - Melius Research
Vivek Arya - Bank of America Merrill Lynch
Stacy Rasgon - Bernstein Research
Matt Ramsay - TD Cowen
Tim Arcuri - UBS
Brett Simpson - Arete Research
C.J. Muse - Cantor Fitzgerald
Joseph Moore - Morgan Stanley
Atif Malik - Citi
Pierre Ferragu - New Street Research
Aaron Rakers - Wells Fargo
Will Stein - Truist Securities

Jensen Huang
Good morning. Nice to see all of you. All right. What's the game plan?

Colette Kress
Okay. Well, we've got a full house and we thank you all for coming out for our first in-person event in such a long time. Jensen and I are here to really go through any questions that you have, questions from yesterday.

And we're going to have a series of folks in the aisles that you can just reach out to. Raise your hand, we'll get to you with a mic, and Jensen and I are here to answer any questions from yesterday.

We thought that would be a better plan for you. I know you have already asked quite a few questions, both last night and this morning, but rather than giving you a formal presentation, we're just going to go through a good Q&A today. Sound like a good plan?

I'm going to turn it to Jensen to see if he wants to add some opening remarks because we have just a quick introduction. We'll do it that way. Okay.

Jensen Huang
Yeah. Thank you. First, great to see all of you. There were so many things I wanted to say yesterday and probably have said -- and wanted to say better, but I got to tell you, I've never presented at a rock concert before. I don't know about you guys, but I've never presented in a rock concert before. The -- I had simulated what it was going to be like, but when I walked on stage, it still took my breath away. And so anyways, I did the best I could.

Next, after the tour, I'm going to do a better job, I'm sure. I just need a lot more practice. But there were a few things I wanted to tell you. Is there a clicker -- oh, look at that. See, this is like spatial computing. It's -- by the way, if you get -- I don't know you'll get a chance, because it takes a little step up, but if you get a chance to see Omniverse in Vision Pro, it is insane. Completely incomprehensible how realistic it is.

All right. So we spoke about five things yesterday and I think the first one really deserves some explanation. I think the first one is, of course, this new industrial revolution. There were two -- there are two things that are happening, two transitions that are happening. The first is moving from general purpose computing to accelerated computing. If you just looked at the extraordinary trend of general-purpose computing, it has slowed down tremendously over the years.

And in fact, we've known that it's been slowing down for about a decade and people just didn't want to deal with it for a decade, but you really have to deal with it now. And you can see that people are extending the depreciation cycle of their data centers as a result. You could buy a whole new set of general purpose servers and it's not going to improve your throughput of your overall data center dramatically.

And so you might as well just continue to use what you have for a little longer. That trend is never going to reverse. General purpose computing has reached this end. We're going to continue to need it and there's a whole lot of software that runs on it, but it is very clear we should accelerate everything we can.

There are many different industries that have already been accelerated, some that are very large workloads that we really would like to accelerate more. But the benefits of accelerated computing are very, very clear.

One of the areas that I didn't spend time on yesterday that I really wanted to was data processing. NVIDIA has a suite of libraries for this, because before you can do almost anything in a company, you have to process the data. You have to, of course, ingest the data, and the amount of data is extraordinary. Zettabytes of data are being created around the world, doubling every couple of years, even though computing is not doubling every couple of years.

So you know that with data processing, you're on the wrong side of that curve already. If you don't move to accelerated computing, your data processing bills just keep going up and up and up. And so a lot of companies that recognize this -- AstraZeneca, Visa, Amex, Mastercard, so many companies that we work with -- have reduced their data processing expense by 95%, basically a 20 times reduction.
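As a quick back-of-envelope check on that figure (the dollar amount below is purely hypothetical; only the 95% reduction comes from the talk): a 95% reduction leaves 5% of the original bill, which is indeed a 20 times reduction.

```python
# Hypothetical annual data-processing bill; only the 95% reduction
# figure comes from the talk.
original_cost = 1_000_000                      # $ per year (illustrative)
reduced_cost = original_cost * (1 - 0.95)      # a 95% reduction leaves 5%
reduction_factor = original_cost / reduced_cost

print(f"reduced bill: ${reduced_cost:,.0f}")          # ~$50,000
print(f"reduction factor: ~{reduction_factor:.0f}x")  # ~20x
```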

The acceleration is so extraordinary now with our suite of libraries called RAPIDS that the inventor of Spark, who started a great company called Databricks -- they are the cloud large-scale data processing company -- announced that they're going to take Databricks' Photon engine, which is their crown jewel, and accelerate it with NVIDIA GPUs.

Okay. So the benefit of acceleration, of course, pass along savings to your customers, but very importantly, so that you can continue to sustainably compute. Otherwise, you're on the wrong side of that curve. You'll never get on the right side of the curve. You have to accelerate. The question is today or tomorrow? Okay. So accelerated computing. We accelerated algorithms so quickly that the marginal cost of computing has declined so tremendously over the last decade that it enabled this new way of doing software called generative AI.

Generative AI, as you know, requires a lot of flops, a lot of flops, a lot of computation. It is not a normal amount of computation -- an insane amount of computation. And yet it can now be done cost effectively enough that consumers can use this incredible service called ChatGPT. So, it's something to consider that accelerated computing has driven down the marginal cost of computing so far that it enabled a new way of doing something else.

And this new way is software written by computers with a raw material called data. You apply energy to it. There's an instrument called GPU supercomputers. And what comes out of it are tokens that we enjoy. When you're interacting with ChatGPT, you're getting all -- it's producing tokens.

Now, that data center is not a normal data center. It's not a data center that you know of in the past. The reason for that is this. It's not shared by a whole lot of people. It's not doing a whole lot of different things. It's running one application 24/7. And its job is not just to save money, its job is to make money. It's a factory.

This is no different than an AC generator of the last industrial revolution. And it's no different in that the raw material coming in is, of course, water. You apply energy to it and it turns into electricity. Now it's data that comes in. It's refined using data processing and then, of course, generative AI models.

And what comes out of it is valuable tokens. This idea that we would apply this basic method of software, token generation, what some people call inference, but token generation. This method of producing software, producing data, interacting with you, ChatGPT is interacting with you.

This method of working with you, collaborating with you, you extend this as far as you like, copilots to artificial intelligence agents, you extend the idea as long as you like, but it's basically the same idea. It's generating software, it's generating tokens and it's coming out of this thing called an AI generator that we call GPU supercomputers. Does that make sense?

And so the two ideas. One is the traditional data centers that we use today should be accelerated and they are. They're being modernized, lots and lots of it, and more and more industries one after another. And so what is a trillion dollars of data centers in the world will surely all be accelerated someday. The question is, how many years would it take to do? But because of the second dynamic, which is its incredible benefit in artificial intelligence, it's going to further accelerate that trend. Does that make sense?

However, the second data center, the second type of data center called AC generators or excuse me, AI generators or AI factories, as I've described it as, this is a brand new thing. It's a brand new type of software generating a brand new type of valuable resource and it's going to be created by companies, by industries, by countries, so on and so forth, a new industry.

I also spoke about our new platform. There are a lot of speculations about Blackwell. Blackwell is a chip at the heart of the system, but it's really a platform. It's basically a computer system. What NVIDIA does for a living is not build the chip. We build an entire supercomputer, from the chip to the system to the interconnects, the NVLinks, the networking, but very importantly, the software.

Could you imagine the mountain of electronics that is brought into your house -- how are you going to program it? Without all of the libraries that were created over the years in order to make it effective, you've got a couple of billion dollars' worth of assets you just brought into your company.

And anytime it's not utilized is costing you money. And the expense is too incredible. And so our ability to help companies not just buy the chips, but to bring up the systems and put it to use and then working with them all the time to make it -- put it to better and better and better use, that is really important.

Okay. That's what NVIDIA does for a living. The platform we call Blackwell has all of these components associated with it that I showed you at the end of the presentation to give you a sense of the magnitude of what we've built. All of that, we then disassemble. This is the hard -- this is the part that's incredibly hard about what we do.

We build this vertically integrated thing, but we build it in a way that can be disassembled later and for you to buy it in parts, because maybe you want to connect it to x86. Maybe you want to connect it to a PCI-Express fabric. Maybe you want to connect it across a whole bunch of fiber, okay, optics.

Maybe you want to have very large NVLink domains. Maybe you want smaller NVLink domains. Maybe you can use Arm, maybe so on and so forth. Does it make sense? Maybe you would like to use Ethernet. Okay, Ethernet is not great for AI. It doesn't matter what anybody says.

You can't change the facts. And there's a reason for that. There's a reason why Ethernet is not great for AI. But you can make Ethernet great for AI. In the case of the Ethernet industry, it's called Ultra Ethernet. So in about three or four years, Ultra Ethernet is going to come, and it'll be better for AI. But until then, it's not good for AI. It's a good network, but it's not good for AI. And so we've extended Ethernet, we've added something to it. We call it Spectrum-X. It basically does adaptive routing. It does congestion control. It does noise isolation.

Remember, when you have chatty neighbors, it takes away from the network traffic. And AI, AI is not about the average throughput. AI is not about the average throughput of the network, which is what Ethernet is designed for, maximum average throughput. AI only cares about when did the last student turn in their partial product? It's the last person. A fundamentally different design point. If you're optimizing for highest average versus the worst student, you will come up with a different architecture. Does it make sense?

Okay. And because AI has all-reduce, all-to-all, all-gather -- just look it up in the algorithm, the transformer algorithm, the mixture-of-experts algorithm, you'll see all of it. All these GPUs have to communicate with each other, and the last GPU to submit its answer holds everybody back. That's how it works. And so that's the reason why the networking has such a large impact.
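The "last student" point can be sketched with a toy model (all timings below are hypothetical): in a synchronous collective like all-reduce, a step completes only when the slowest GPU finishes, so the effective step time is the maximum of the per-GPU times, not the average.

```python
# Toy model of a synchronous all-reduce step across 8 GPUs.
# Per-GPU compute times are hypothetical milliseconds; one straggler.
gpu_times_ms = [10.0, 10.2, 9.8, 10.1, 10.0, 9.9, 10.0, 14.0]

average_ms = sum(gpu_times_ms) / len(gpu_times_ms)
step_ms = max(gpu_times_ms)   # everyone waits for the last GPU to finish

print(f"average per-GPU time: {average_ms:.1f} ms")
print(f"effective step time:  {step_ms:.1f} ms")
```

A network optimized for average throughput improves the mean; a network designed for AI has to shrink that worst-case tail.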

Can you network everything together? Yes. But will you lose 10%, 20% of utilization? Yes. And what's 10% to 20% utilization if the computer is $10,000? Not much. But what's 10% to 20% utilization if the computer is $2 billion? It paid for the whole network, which is the reason why supercomputers are paid -- are built the way they are. Okay.
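Writing that arithmetic out (the $2 billion figure is from the talk; the 15% midpoint is an illustrative choice within the stated 10-20% range):

```python
# Value lost to idle GPUs when the network costs 10-20% of utilization.
cluster_cost = 2_000_000_000          # $2B system, the figure from the talk
utilization_loss = 0.15               # midpoint of the 10-20% range

wasted_value = cluster_cost * utilization_loss
print(f"value lost to idle GPUs: ${wasted_value:,.0f}")
# A network that recovers that utilization can cost up to this much
# and still "pay for itself".
```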

And so anyways, I showed examples of all these different components and our company creates a platform and all the software associated with it, all the necessary electronics, and then we work with companies and customers to integrate that into their data center, because maybe their security is different, maybe their thermal management is different, maybe their management plane is different, maybe they want to use it just for one dedicated AI, maybe they want to rent it out for a lot of people to do different AI with.

The use cases are so broad. And maybe they want to build an on-prem and they want to run VMware on it. And maybe somebody just wants to run Kubernetes, somebody wants to run Slurm. Well, I could list off all of the different varieties of environments and it is completely mind blowing.

And we took all of those considerations and over the course of quite a long time, we've now figured out how to serve literally everybody. As a result, we could build supercomputers at scale. But basically what NVIDIA does is build data centers. Okay. We break it up into small parts and we sell it as components. People think as a result, we're a chip company.

The third thing that we did was we talked about this new type of software called NIMs. These large language models are miracles. ChatGPT is a miracle. It's a miracle not just in what it's able to do, but in the team that put it together so that you can interact with ChatGPT at a very high response rate. That is a world-class computer science organization. That is not a normal computer science organization.

The OpenAI team that's working on this stuff is world class, is a world class team, some of the best in the world. Well, in order for every company to be able to build their own AI, operate their own AI, deploy their own AI, run it across multiple clouds, somebody is going to have to go do that computer science for them. And so instead of doing this for every single model, for every single company, every single configuration, we decided to create the tools and tooling and the operations and we're going to package up large language models for the very first time.

And you could buy it. You could just come to our website, download it, and run it. And the way we charge you is: all of those models are free, but when you run it, when you deploy it in an enterprise, the cost of running it is $4,500 per GPU per year. It's basically the operating system for running that language model.
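For scale, the stated $4,500 per GPU per year works out to roughly fifty cents per GPU-hour (simple arithmetic on the quoted price, not official pricing guidance):

```python
# Convert the quoted $4,500 per GPU per year into an hourly rate.
annual_price = 4500              # $ per GPU per year, as quoted
hours_per_year = 365 * 24        # 8,760 hours, ignoring leap years

hourly_price = annual_price / hours_per_year
print(f"~${hourly_price:.2f} per GPU-hour")  # ~$0.51
```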

Okay. And so the per instance, the per-use cost is extremely low. It's very, very affordable. And -- but the benefit is really great. Okay. We call that NIMs, NVIDIA Inference Microservices. You take these NIMs and you're going to have NIMs of all kinds. You're going to have NIMs of computer vision. You're going to have NIMs of speech and speech recognition and text to speech and you're going to have facial animation. You're going to have robotic articulation. You're going to have all kinds of different types of NIMs.

These NIMs -- the way that you would use them is you would download one from our website and fine-tune it with your examples. You would give it examples. You say, the way that you responded to that question isn't exactly right. It might be right in another company, but it's not right in ours. And so I'm going to give you some examples that are exactly the way we would like to have it. You show it your work products. This is the way -- this is what a good answer looks like. This is what a right answer looks like -- a whole bunch of them.

And we have a system that helps you curate that process -- tokenize that, all of the AI processing that goes along with it, all the data processing that goes along with it, fine-tune that, evaluate that, guardrail that -- so that your AIs are very effective, number one, and also very narrow.

And the reason why you want it to be very narrow is because if you're a retail company, you would prefer your AI just didn't pontificate about some random stuff, okay. And so whatever the questions are, it guardrails them back to that lane. And that guardrailing system is another AI. So, we have all these different AIs that help you customize our NIMs, and you could create all kinds of different NIMs.

And we gave you some frameworks for many of them. And one of the very important ones is understanding proprietary data, because every company has proprietary data. And so we created a microservice called Retriever. It's state-of-the-art, and it helps you take your database, which is structured or unstructured -- images or graphs or charts or whatever it is -- and we help you embed them.

We help you extract the meaning out of that data. It's called semantics, and that semantic is embedded in a vector; that vector is then indexed into a new kind of database called a vector database, okay. And with that vector database, afterwards you can just talk to it. You say, hey, how many mammals do I have, for example. And it goes in there and says, hey, look at that. You've got a cat, you have a dog, you have a giraffe.
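A minimal sketch of that embed-and-query flow (the hand-made three-number "embeddings" below are purely illustrative stand-ins; a real pipeline such as the Retriever microservice described here would use a learned embedding model and a proper vector database):

```python
import math

# Toy "vector database": each inventory item maps to a hand-made
# embedding. Real systems embed data with a learned model; these
# three-number vectors are purely illustrative.
database = {
    "cat":      [0.9, 0.8, 0.1],
    "dog":      [0.8, 0.9, 0.2],
    "giraffe":  [0.7, 0.6, 0.3],
    "forklift": [0.1, 0.2, 0.9],
}

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def query(query_vec, k=3):
    """Return the k items whose embeddings are closest to the query."""
    ranked = sorted(database,
                    key=lambda item: cosine(database[item], query_vec),
                    reverse=True)
    return ranked[:k]

# Hypothetical embedding of the question "how many mammals do I have?"
mammal_query = [0.85, 0.85, 0.15]
print(query(mammal_query))  # the three mammals rank above the forklift
```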

This is what you have in inventory, in your warehouse, okay, so on and so forth, all right. And so all of that is called NeMo, and we have experts to help you. And then we put a canonical NVIDIA infrastructure we call DGX Cloud in all of the world's clouds. And so we have DGX Cloud in AWS, we have DGX Cloud in Azure, we have DGX Cloud in GCP and OCI. And so we work with the world's enterprise companies, particularly the enterprise IT companies, and we create these great AIs with them. But when they're done, they can run in DGX Cloud, which means we're effectively bringing customers to the world's clouds. A platform company like us brings system makers customers, and CSPs are system makers.

They rent systems instead of sell systems, but they're system makers. And so we bring customers to our CSPs, which is a very sensible thing to do just as we brought customers to HP and Dell and IBM and Lenovo and so on and so forth and Supermicro and CoreWeave, so on and so forth, we bring customers to CSPs because a platform company does that. Does that make sense?

If you're a platform company, you create opportunities for everybody in your ecosystem. And so DGX Cloud allows us to land all of these enterprise applications in the world's CSPs. And if they want to do it on-prem, we have great partnerships with Dell that we announced yesterday, HP and others, so that you can land those NIMs in their systems.

And then I talked about the next wave of AI, which is really about industrial AI. The vast majority of the world's industries, the largest in dollars, are heavy industries, and heavy industries have never really benefited from IT. They've not benefited from a lot of the design and all of the digital.

It's called not digitization, but digitalization, putting it to use. They've not benefited from digitalization, not like our industry. And because our industry is completely digitalized, our technology advance is insanely great. We don't call it chip discovery. We call it chip design. Why do they call it drug discovery, like, tomorrow could be different than yesterday? Because it is.

And it's so complicated -- biology is so complicated, it changes so much -- and the longitudinal impact is so great, because, as you know, life evolves at a different rate than transistors. And so therefore, cause and effect is harder to monitor because it happens over a large scale, a large scale of systems and a large scale of time. These are very complicated problems. Physics is very similar.

Okay. Industrial physics is very similar. And so we finally have the ability using large language models, the same technologies. If we can tokenize proteins, if we could tokenize -- if we can tokenize words, tokenize speech, tokenize images, we can tokenize articulation. This is no different than speech, right?

We can tokenize proteins moving, that's no different than speech, okay. Just -- we can tokenize all these different things. We can tokenize physics then we can understand its meaning just like we've understood the meaning of words.

If we can understand its meaning and we can connect it to other modalities then we can do generative AI. So I just explained very quickly that 12 years ago I saw it, our company saw it with ImageNet. The big breakthrough was literally 12 years ago.

We said, interesting, but what are we actually looking at? Interesting, but what are you looking at? ChatGPT -- I would say everybody should say, interesting, but what are we looking at? What are we looking at? We are looking at computer software that can emulate you -- emulate us.

By reading our words, it's emulating the production of our words. If you can tokenize words and you can tokenize articulation, for example, why can't it imitate us and generalize in the way that ChatGPT has? So the ChatGPT moment for robotics has got to be around the corner. And so we want to enable people to be able to do that. And so we created this operating system that enables these AIs to practice in a physically based world, and we call it Omniverse.

Omniverse is not a tool. Omniverse is not even an engine. Omniverse is APIs, technology APIs that supercharge other people's tools. And so I'm super excited about the announcement with Dassault. They're connecting to the Omniverse APIs to supercharge 3DEXCITE. Microsoft has connected it to Power BI.

Rockwell has connected it to their tools for industrial automation. Siemens has connected it to theirs. So it's a bunch of APIs that are physically based; they produce images or articulation and connect a whole bunch of different environments. And so these APIs are intended to supercharge third-party tools. And I'm super delighted to see the adoption across it, particularly in industrial automation. And so those are the five things that we did.

I'll do this next one very quickly. I'm sorry I took longer than I should, but let me do this next one really quickly. Look at that. All right. So this chart, don't over-stare at it, but it basically communicates several things. On top are developers. NVIDIA is a market maker, not a share taker. The reason for that is everything we do doesn't exist when we start doing it. There is no such thing -- you just go up and down. In fact, originally even 3D computer games didn't exist when we started working on them.

And so we had to go create the algorithms necessary. Real-time ray tracing did not exist until we created it. And so all of these different capabilities did not exist until we created them. And once we created them, there were no applications for them. So we had to go cultivate and work with developers to integrate this technology we had just created so that applications could benefit from it.

I just explained that for Omniverse. We invented Omniverse. We didn't take anything from anybody; it didn't exist. And in order for it to be useful, we now have to have developers -- Dassault, Ansys, Cadence, so on and so forth. Does that make sense? Rockwell, Siemens.

We need the developers to take advantage of our APIs, our technologies. Sometimes they're in the form of an SDK. In the case of Omniverse, I'm super proud that it's in the form of cloud APIs, because now it's so easy to use that you could use it in both ways, but APIs are much, much easier to use, okay. And we host Omniverse in the Azure cloud. And notice whenever we connect it to a customer, we create an opportunity for Azure.

So Azure is on the foundation, their system provider. Back in the old days, system providers used to be OEMs and they continue to be, but system providers on the bottom, developers on top. We invent technology in the middle. The technology that we invent happens to be chip last.

It's software first. And the reason for that is, without a developer, there will be no demand for chips. And so NVIDIA is an algorithm company first, and we create these SDKs. They call them DSLs, domain-specific libraries. SQL is a domain-specific library. You might have heard of Hadoop, which is a domain-specific library in storage computing.

NVIDIA's cuDNN is potentially the most successful domain-specific library, short of SQL, the world has ever seen. cuDNN is the domain-specific library -- the computation engine library -- for deep neural networks. Without cuDNN, none of them would have been able to use CUDA. So cuDNN was invented.

Real-time ray tracing, OptiX, which led to RTX -- makes sense. And we have hundreds of domain-specific libraries. Omniverse is a domain-specific library. And these domain-specific libraries are integrated with developers on the software side, which then, when the applications are created and there's demand for that application, creates opportunities for the foundation below. We are market makers, not share takers. Does that make sense?

And so what's the takeaway? The takeaway is you can't create markets without software. It has always been the case. That has never changed. You could build chips to make software run better, but you can't create a new market without software. What makes NVIDIA unique is that we're the only chip company I believe that can go create its own market and notice all the markets we're creating.

That's why we're always talking about the future. These are things that we're working on. We really -- nothing would give me more joy to work with the entire industry to create the computer aided drug design industry, not drug discovery industry, drug design industry.

We have to do drug design the way we do chip design, not chip discovery. And so I expect every single chip next year to be better than the one before, not as if I'm looking for truffles, which is discovery. Some days are good, some days are less good.

Okay, all right. So we have developers on top. We have our foundation on the bottom. The developers want something very, very simple. They want to make sure that your technology is performing, that it solves a problem they couldn't solve any other way. But the most important thing for a developer is installed base. And the reason for that is they don't sell hardware; their software doesn't get used if nobody has the hardware to run it.

Okay. So what developers want is installed base. That has not changed since the beginning of time, and it has not changed now. Artificial intelligence -- if you develop artificial intelligence software and you want to deploy it so that people can use it, you need installed base.

Second, the systems companies, the foundation companies, they want killer apps. That's the reason the term killer app exists: where there is a killer app, there is customer demand, and where there is customer demand, you can sell hardware.

And so, it turns out this loop is insanely hard to kick-start. And how many accelerated computing platforms can you really build? Can you have an accelerated computing platform for generative AI, as well as industrial robotics, as well as quantum, as well as 6G, as well as weather prediction, as well?

And can you have all these different versions, because some of it is good at fluids, some of it is good at particles, some of it is good at biology, some of it is good at robotics, some of it is good at AI, some of it is good at SQL? The answer is no. You need a sufficiently general-purpose accelerated computing platform, just as the last computing platform was insanely successful because it ran everything.

Now, it has taken NVIDIA a long time, but we basically run everything. If your software is accelerated, I am very certain it runs on NVIDIA. Does that make sense? Okay. If you have accelerated software, I am very, very certain it runs on NVIDIA. And the reason for that is because it probably ran on NVIDIA first.

Okay. All right. So this is the NVIDIA architecture. Whenever I give keynotes, I tend to touch on all of them, different pieces of it, some new things that we did in the middle -- in this case, Blackwell. There was so much good stuff, and you really have to go to our talks; it looks like a thousand talks. 6G research -- how is 6G going to happen? Of course, AI. And what do you use the AI for? Robotic MIMO.

Why is MIMO so pre-installed, meaning, why does the algorithm come before the site? We should have site-specific MIMO, just like robotic MIMO. And so, reinforcement learning that deals with the environment. And so 6G, of course, is going to be software-defined; of course, it's going to be AI.

Quantum computing -- of course, we should be a great partner for the quantum computing industry. How else are you going to drive a quantum computer than to have the world's fastest computer sitting next to it?

And how are you going to simulate a quantum computer, emulate a quantum computer? What is the programming model for a quantum computer? You can't just program a quantum computer all by itself. You need to have classical computing sitting next to it, and so the quantum computer would be kind of a quantum accelerator.

And so who should go do that? Well, we've done that, and so we work with all the industry on that. So across the board, some really, really great stuff I wish I could have covered -- we could have a whole keynote just on all that stuff. But we cover the whole gamut. Okay. So that was kind of yesterday. Thank you for that.

Question-and-Answer Session
A - Colette Kress
Okay. We have them going around and we'll see if we can grab your questions.
Jensen Huang
That's the question, I'm sure, that the first question goes to: if you could have done the keynote in 10 minutes, why didn't you just do yesterday's in 10 minutes? Good question....