Giannandrea: "So these models, when you run them at run time, it's called inference, and the inference of large language models is incredibly computationally expensive. And so it's a combination of the bandwidth in the device, it's the size of the Apple Neural Engine, it's the oomph in the device to actually do these models fast enough to be useful. You could, in theory, run these models on a very old device, but it would be so slow that it would not be useful."
Gruber: "So it's not a scheme to sell new iPhones?"
Joswiak: "No, not at all. Otherwise, we would have been smart enough just to do our most recent iPads and Macs, too, wouldn't we?"
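Giannandrea's bandwidth point can be made concrete with a back-of-envelope calculation: in the memory-bound decode phase of inference, each generated token requires streaming roughly all of the model's weights through memory once, so the ceiling on tokens per second is approximately memory bandwidth divided by model size in bytes. Below is a minimal sketch of that estimate in Python; the model size, quantization, and bandwidth figures are illustrative assumptions, not Apple's actual numbers.

```python
# Back-of-envelope estimate of on-device LLM decode speed.
# Rough model: each generated token streams every weight through memory
# once, so tokens/sec is bounded by bandwidth / model size in bytes.
# All figures below are illustrative assumptions, not Apple specs.

def tokens_per_sec(params_billion: float, bytes_per_param: float,
                   bandwidth_gb_s: float) -> float:
    """Upper-bound decode rate for a dense model when memory-bandwidth-bound."""
    model_gb = params_billion * bytes_per_param  # GB of weights read per token
    return bandwidth_gb_s / model_gb

# Hypothetical ~3B-parameter on-device model at 4-bit quantization (0.5 B/param):
for label, bw in [("older device (~34 GB/s, assumed)", 34.0),
                  ("recent device (~51 GB/s, assumed)", 51.0)]:
    print(f"{label}: ~{tokens_per_sec(3.0, 0.5, bw):.0f} tokens/s")
```

On these assumed numbers, throughput scales linearly with bandwidth, which is why an older device with less memory bandwidth (and a smaller Neural Engine) can be capable in principle yet too slow for a usable experience.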