The TEE and Remote Attestation Magic
Taking a step back, the idea of a secure TEE where you can bring in your data, run your application code, and know that others cannot see what is going on inside is a great one. The fundamental challenge for a customer is determining whether they actually have a secure TEE, rather than a random virtual machine on a compromised hypervisor. That is where remote attestation comes in. Remote attestation is the technical mechanism that allows an application to verify that its compute environment is secure before sharing secrets. Instead of simply trusting that a virtual machine is secure, you get a survey and verification process in the form of a cryptographic handshake.

How this works is that the TEE generates a signed “attestation document,” which acts as verifiable proof that the environment is a genuine TEE. Generating it means gathering information about the hardware, firmware, and other components in the environment to ensure none have been tampered with. Imagine keeping a baseline of the entire stack, from the hardware up into the virtual machine, receiving a cryptographically signed report of what is actually running in the TEE, and comparing the two. If the attestation fails, even for a benign reason, the customer knows not to treat the environment as trusted.
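To make that concrete, here is a minimal sketch of both sides of the handshake. It is illustrative only: the HMAC stands in for the asymmetric signature a real TEE produces with keys fused into the silicon, and the report fields are made up rather than any vendor's actual format.

```python
# Minimal attestation sketch: the TEE signs its measurements, and the
# customer refuses to proceed unless the signature checks out and every
# measurement matches a known-good baseline. Standard library only.
import hashlib
import hmac
import json

VENDOR_KEY = b"stand-in for the hardware vendor's fused signing key"

def tee_generate_report(measurements: dict) -> dict:
    """TEE side: produce a signed report over its measurements."""
    body = json.dumps(measurements, sort_keys=True).encode()
    return {
        "measurements": measurements,
        "signature": hmac.new(VENDOR_KEY, body, hashlib.sha384).hexdigest(),
    }

def customer_verify(report: dict, expected: dict) -> bool:
    """Customer side: verify the signature, then compare to the baseline."""
    body = json.dumps(report["measurements"], sort_keys=True).encode()
    good_sig = hmac.new(VENDOR_KEY, body, hashlib.sha384).hexdigest()
    if not hmac.compare_digest(report["signature"], good_sig):
        return False  # Report was not signed by genuine hardware.
    return report["measurements"] == expected  # Tampering changes hashes.

baseline = {"firmware": "aa11...", "initial_memory": "bb22..."}
report = tee_generate_report(baseline)
assert customer_verify(report, baseline)                     # Trusted.
assert not customer_verify(report, {"firmware": "cc33..."})  # Rejected.
```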

These days, the major cloud providers have reputations, earned over many years, for doing a lot to further security in the industry. Still, instead of just trusting that the compute environment a company like Microsoft or Google provides is secure, remote attestation of a TEE lets you verify that it is. Once you know that the TEE is secure, you can move your encrypted data (at rest) to the TEE over encrypted networking (data in transit) so that you can process the data (data in use).
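In code terms, that gating might look like the hypothetical key broker below, which reuses customer_verify() from the earlier sketch and only releases the data decryption key once attestation succeeds:

```python
# Hypothetical key broker: "verify first, then share secrets." The data
# stays encrypted at rest and in transit, and is only ever decrypted
# inside a TEE that has passed attestation.
def release_data_key(report: dict, expected_baseline: dict,
                     data_key: bytes) -> bytes:
    if not customer_verify(report, expected_baseline):
        raise PermissionError("TEE failed attestation; withholding key")
    # In practice the key would be wrapped to a public key held only by
    # the attested TEE, then delivered over TLS (data in transit).
    return data_key
```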
What About AI Accelerators?
At this point, you might be thinking that this works well for CPUs and locally attached memory, but what about AI accelerators? After all, we are in an AI supercycle, and so far we have been talking about the TEE in terms of server CPUs running VMs on locally attached memory. That is where TDISP, the TEE Device Interface Security Protocol, comes in. Here is a good PCI-SIG primer on TDISP if you want to get into more detail.

With TDISP, the goal is to get to Confidential AI and other accelerated computing platforms by securing the PCIe communication channels along with the accelerators attached to them. CPUs have been implementing confidential computing for years; AI accelerators are quickly adding the capability, and we expect that to continue.
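For a rough sense of how TDISP gates a device, here is a heavily simplified sketch of the device-interface lifecycle the specification describes, in which an interface must be locked and attested before it may touch confidential memory (see the PCI-SIG primer above for the real protocol):

```python
# Simplified TDISP device-interface (TDI) lifecycle. The state names
# follow the spec; the transition logic here is a toy reduction.
from enum import Enum, auto

class TdiState(Enum):
    CONFIG_UNLOCKED = auto()  # Hypervisor may still reconfigure the device.
    CONFIG_LOCKED = auto()    # Configuration frozen; ready for attestation.
    RUN = auto()              # Attested; may access confidential VM memory.
    ERROR = auto()            # Any violation lands the interface here.

def start_tdi(state: TdiState, attestation_passed: bool) -> TdiState:
    # Only a locked interface that passed attestation may enter RUN.
    if state is TdiState.CONFIG_LOCKED and attestation_passed:
        return TdiState.RUN
    return TdiState.ERROR

assert start_tdi(TdiState.CONFIG_LOCKED, True) is TdiState.RUN
assert start_tdi(TdiState.CONFIG_UNLOCKED, True) is TdiState.ERROR
```

The key property is that once the interface is locked, the hypervisor cannot quietly reconfigure the device, and anything suspicious drops it out of RUN.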

While at first you might think this section is just another “because everything needs to be AI these days” take, step back for a moment. Think of all the confidential datasets that are massive, but also sensitive because they are proprietary and/or cover so much that the details cannot be made public. Those are exactly the large datasets you might want to use to train or fine-tune an AI model. You might also want to perform inference on the existing data or use it to process new data. The kinds of companies and organizations that need confidential computing today are also looking at that same data as an advantage in the era of AI.
A Few Thoughts on Where We Go From Here
That brings us to who is leading the charge. These days, cloud providers see industries like financial services, healthcare, and government as the key confidential computing users. Whether it is keeping financial transaction data safe, maintaining HIPAA or GDPR compliance, or protecting government secrets, these are the use cases pushing confidential computing to proliferate. Those are large industries, but the broader AI industry may be the largest of all.
Given the importance and size of the markets that need confidential computing, it makes sense that there is an industry effort behind it and that modern hardware continues to add new confidential computing features. We expect cloud providers to eventually offer confidential computing as the default. Once the hardware supports it, and providers have to offer it for one class of customers, it is easier to roll out to everyone. If you recall, one of the big reasons Intel moved from its older SGX enclave model to TDX is the same idea AMD has pursued all along: confidential computing will eventually be everywhere, protecting entire VMs, not just small application enclaves.

I also think that we will see more innovation in this space. Security researchers are wildly creative. I remember seeing a demo in Austin, TX, back in 2017 of the original AMD EPYC 7001 “Naples” and its SEV and memory encryption features. Folks then were talking about the possibility of freezing DRAM chips and pulling data off them. That was just before the Spectre/Meltdown disclosures, and side-channel attacks remain some of the biggest threat vectors to confidential computing.

We are already seeing new capabilities, such as the AMD EPYC 9005 adding support for Trusted I/O via TDISP (what AMD had called SEV-TIO) to address confidential computing in the era of AI accelerators. My sense is that as systems get larger and confidential computing takes over, we will see further feature updates in upcoming chips.
One of the bigger challenges is that providing the TEE and remote attestation is not trivial, since you have to build a chain of trust all the way back to the hardware vendors. For a cloud provider, this is a capability it can build once and then deploy everywhere. It also gives the provider a way to demonstrate that it is a service provider with no visibility into what its customers are doing on its platform. For an organization with a rack full of virtualization servers, implementing this capability is very challenging.
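To see why, here is a sketch of the verification walk. For AMD SEV-SNP, the chain runs roughly from the AMD Root Key (ARK) to the AMD signing key (ASK) to the per-chip VCEK that signs the report; the verify() callable below is a stand-in for the real X.509/ECDSA verification:

```python
# Toy chain-of-trust walk: every link must verify, and the root must be
# a vendor key you pinned out-of-band, or the report means nothing.
def chain_verifies(ark, ask, vcek, report, trusted_roots, verify) -> bool:
    if ark not in trusted_roots:
        return False                    # Unknown root: reject everything.
    return (verify(ark, ask)            # Root signs the vendor signing key,
            and verify(ask, vcek)       # which signs this chip's key,
            and verify(vcek, report))   # which signs the attestation report.

# Usage with string stand-ins for keys; real code verifies certificates.
signed = {("ark", "ask"), ("ask", "vcek"), ("vcek", "report")}
assert chain_verifies("ark", "ask", "vcek", "report", {"ark"},
                      lambda signer, signee: (signer, signee) in signed)
```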
Final Words
If you asked any customer, “Would you prefer your computing environments to be secure, or would you prefer that others, including the cloud provider and strangers, could interact with your data?” I think most would answer that they want secure and confidential computing. Likewise, if asked, “Do you want the AI agents you interact with to run on confidential and secure platforms, or not?” most would opt for that security and confidentiality. It just seems like the way things should be by default. We now have hardware that supports these features and a mechanism to verify that it is working.
This is likely a capability we will see more of as traditional CPU compute servers are updated. If you are still running your cloud VMs on old Intel Xeon Cascade Lake processors, for example, the required hardware features are simply missing. Today, as an industry, we talk about this technology in the context of regulated industries, governments, and so forth, but really, it is a capability we should expect from all of our computing environments.



I wonder how many resources are sacrificed (silicon, power, efficiency, and headaches for programmers) to ensure secure encryption across all parts of the computer, while still having really fast compute capabilities.
Compare that to the old way of computing, which tried to squeeze every cycle just to make compute faster.
That is how much progress we have made in half a century.
Starting with Spectre and Meltdown, it became clear even to non-technical people that there are technical reasons why cloud vendors cannot securely provide virtual machines to adversarial renters on multi-tenant hardware. From what I can tell, the clouds are too well established to give up on such a profitable scheme.
On the other hand, if you can get the job done despite the overhead needed for secure cloud computing, it’s likely the problem you are solving is trivial relative to modern technologies. Creating useful AI is not currently a trivial problem that can afford the extra overhead. This is because smarter chatbots are universally more useful than stupid ones, and they’re all pretty stupid right now.
From a different point of view, hardware that prevents cloud vendors from snooping on their customers could be important to protect those vendors from legal complications related to search warrants and criminal activities.
In my opinion there are lots of advantages with a competitive free market for computing hardware that supports the construction of cost-effective on-premise datacenters.
No, it doesn’t.
It still has far more fundamental problems to solve.
Like finally having decent RAM that is actually immune to FRIGGIN ROWHAMMER attacks.
Or an architecture that has definitively solved timing-attack holes – the whole concept of a virtualized CPU core is flawed, as it turns out.
But why do that when advertisers have the next shiny thing to sell, even though the last ones have been proven defective, right?
When someone with time to spare finds a shocking hole in SEV, it’s time to sell a new version, right?
Marx Brothers economy…
Memory encryption (which is required but not sufficient for SEV) does actually help with Rowhammer, as you can’t target the raw bit patterns anymore once memory is encrypted.
Developers won’t refactor apps for confidential computing until it’s seamlessly available everywhere, and cloud providers won’t prioritize seamless integration until there’s proven demand. The performance overhead, while decreasing, is still a real tax.
My question is about the path to the “mainstream” tipping point:
Is it more likely to be driven from the top down by a regulatory hammer (e.g., a future GDPR amendment explicitly requiring data-in-use protection for certain classes of PII), or from the bottom up by a killer use case?
That use case might not be just “secure financial transactions.” Could it be the enabling technology for a cross-industry, privacy-preserving data consortium? For example, multiple competing hospitals confidentially pooling patient data in a single enclave to train a diagnostic AI model none could build alone—where the raw data is never exposed, even to the cloud provider. Once that model proves its value, the dam might break.
The technology is maturing, but the ecosystem needs that compelling, undeniable business reason to justify the refactor.
I really wish that ServeTheHome would have a SPONCON tag for all sponsored articles, rather than a throwaway sentence mentioning that it is sponsored.
Also the analogy proposed is incorrect:
“Data in use is like taking all of that money, putting it on a table at a coffee shop, and leaving it sitting there while you pay for your coffee.”
No. Data in use is like taking the play money out and placing it on the counter when paying for your coffee. That is, in use, not leaving it on a random tabletop that is not involved in the transaction.
There are some banking regulations (DORA) that now require data-in-use encryption. Of course, this is a bit silly since you need to access the plaintext if you’re actually using the data, but such pedestrian concerns never stop the regulators.
Of course, there is a massive issue where pretty much all transport and at-rest encryption that uses public cloud infrastructure or CDNs requires the provider to hold the keys, so there is always a third party that has access and can be compromised or compelled to disclose data.
I don’t know if you meant to in a piece that’s “AMD sponsored,” but you’ve created the best confidential computing primer on the web. Of everything I’ve read about it, this is the first time I’ve come away with it actually clicking how they’re doing it. We’ve been asked to do this at work next year, so I had been doing research for the last 3 hrs.
This is a really nice explainer on CC, thank you! But as a researcher working on CC and remote attestation, I would like to add a few caveats:
1- Current implementations aren’t nearly as secure as the manufacturers claim. Attacks like ‘TEE fail’ (Chuang et al.) and BatteringRAM (De Meulemeester et al.) have completely broken them using cheap hardware memory interposers. These could be mitigated, but AMD and Intel just shrugged and said “we consider physical attacks out-of-scope,” which is not realistic if you consider the cloud provider an adversary (which you should!). The result is that BatteringRAM can replay SEV-SNP attestations, making them completely useless (it can also replay full SGX enclaves, fully breaking their encryption). ‘TEE fail’ completely breaks TDX’s attestation by leaking a signing key from Intel’s Quoting Enclave. Both of these are fixable but will require significant effort from AMD and Intel. There is some hope that RISC-V CoVE or Arm CCA could resist these attacks, but that is unclear right now.
2- The process to create the attestation report for a confidential VM is currently also not very standardized or complete. A normal confidential VM will provide you with an attestation report that covers the platform (is this actually a hardware-backed environment with valid firmware?) and the virtual firmware (UEFI) that is booted (plus some metadata about the initial state of memory pages). This is far from a complete picture: it does not include the guest kernel or any of the userland. There are ways of doing it, by embedding a hash of the kernel into the virtual firmware or using vTPMs (via AMD’s VMPLs), but this is fairly new and far from standardized currently. Scopelliti et al. actually wrote an interesting paper analyzing different cloud providers and what kind of attestation they actually provide (title: “Understanding Trust Relationships in Cloud-Based Confidential Computing”, DOI: 10.1109/EuroSPW61312.2024.00023).
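As a rough illustration of closing part of that gap: folding a hash of the guest kernel into what the firmware measures means the hardware’s firmware measurement transitively pins the kernel too (a simplification of measured-boot schemes, not any vendor’s exact format):

```python
# Sketch: embed the kernel hash in the measured firmware so a change to
# either the UEFI image or the kernel changes the attested measurement.
import hashlib

def launch_measurement(uefi_image: bytes, kernel_image: bytes) -> str:
    kernel_hash = hashlib.sha384(kernel_image).digest()
    return hashlib.sha384(uefi_image + kernel_hash).hexdigest()

print(launch_measurement(b"uefi blob", b"kernel blob"))
```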
We’re working hard on improving all this and I am very excited about the future of this technology so thank you for giving it a platform!