Apple is quietly abandoning its cherished principle of vertical integration. According to a report by The Information, the next-generation Siri—powered by Google’s Gemini large language model—will not run solely on Apple’s in-house silicon. Instead, it will rely on NVIDIA’s Blackwell B200 data center GPUs hosted on Google Cloud, with hardware-based Confidential Compute enabled to protect user data during processing. This move marks Apple’s first significant opening of a core AI interface to external tech giants and reveals how geopolitical constraints and engineering bottlenecks are dismantling the myth of end-to-end control in the generative AI era.
For over a decade, Apple built the most tightly integrated consumer electronics ecosystem by pairing its A- and M-series chips with its own operating systems. But generative AI poses a fundamental challenge: on-device hardware cannot handle inference for models with tens of billions of parameters, and Apple’s own data centers have neither the scale nor the training infrastructure to fill the gap. Despite years of secret development on its Ajax large model, Apple lacks the computational capacity to support real-time voice interactions for over a billion users globally. Partnering with Google Cloud and NVIDIA becomes the pragmatic path forward: Google provides the Gemini model and cloud platform, while NVIDIA delivers the B200, rated at up to 20 petaFLOPS of FP4 compute per chip, along with a mature AI software stack.
Yet this arrangement is far more than a procurement deal. It reflects an emerging division of labor among U.S. tech titans in the AI arms race: Google supplies models and cloud services, NVIDIA monopolizes foundational compute, and Apple focuses on user experience and endpoint integration. Together, they form a complementary triangle of “model–compute–interface.” Crucially, this architecture depends heavily on NVIDIA’s CUDA ecosystem and Blackwell’s NVLink interconnect technology, further entrenching NVIDIA’s dominance in both AI training and inference. Market research firm TrendForce estimates NVIDIA will retain over 85% share of the global AI accelerator market in 2025, despite growing competition from custom chips by Broadcom, Amazon, and others.
This reliance, however, carries strategic risk. Apple has historically avoided overdependence on third parties; its shift from Intel to in-house silicon was motivated by precisely that concern. Now, by outsourcing a critical AI pathway to NVIDIA and Google, Apple cedes partial control over the user experience. While Confidential Compute mitigates data exposure, model behavior, response latency, and feature rollout cadence will all be constrained by the update cycles of external infrastructure. More delicate still, Google, the direct competitor behind Android and Pixel, is now an enabler of Apple’s AI capabilities, creating an unprecedented co-opetition dynamic.
Geopolitically, the partnership underscores coordination within the U.S. AI alliance. Amid U.S.-China tech decoupling, Washington prioritizes a trusted domestic AI supply chain. Apple, Google, and NVIDIA all fall squarely within this “trusted camp.” Although Blackwell chips are manufactured by TSMC (Taiwan, China), their design, software stack, and deployment remain fully within the U.S. technological sphere. In contrast, Huawei, despite innovations like its “Tau Law” and Ascend architecture, remains excluded from mainstream global AI infrastructure because restrictions cut it off from advanced process nodes. Apple’s choice is ultimately a calculated trade-off between political safety and technical efficiency.
In my judgment, this is not Apple’s endgame. The company is quietly building AI data centers in Arizona and aggressively hiring ML infrastructure engineers. Within two to three years, as Apple Silicon potentially scales to server-class designs (such as the rumored Project ACDC chips) and its private small language models mature, Apple will likely migrate Siri inference back to its own infrastructure. But during this transition, it must accept a new reality: in the AI era, true vertical integration is impossible. Even Apple must dance with former rivals on the geopolitical chessboard of compute.
When Siri’s voice is shaped by Google’s algorithms and driven by NVIDIA’s transistors, are we witnessing the birth of a new paradigm—where super-apps are no longer controlled by a single entity but woven together by multiple technological sovereignties? That may be a more profound question than raw chip performance.