The Agent Identity Stack: Five Decisions, Not One
The dominant industry framing of the agent economy is that the bottleneck is identity and that "Know Your Agent" (KYA) is the AI equivalent of KYC.
The frame is correct as far as it goes, but it also collapses what is actually a set of architectural decisions into a single concept.
A builder shipping a human-backed AI agent has to answer at least five substantial questions. How does the agent prove who or what it is to other agents and services? How does a relying party know a real human stands behind it? How is the human's authorization carried through a chain of delegated calls in a way that is auditable and revocable? How is the agent's behavior on the back end verified, and by whom? And what compliance regimes are those four answers exposed to?
None of the answers are independent. A choice at the agent identity layer constrains what is possible at the delegation layer, and a choice at the human-backing layer becomes a regulatory bet under GDPR Article 9.
Most of what is shipping in production right now addresses one or two of these questions while carrying unstated assumptions about the others.
Within a 24-hour window spanning the end of April and the start of May, Okta and Microsoft shipped agent-identity control planes to general availability. Both treat AI agents as a new class of non-human identity to be discovered, registered, scoped to least privilege, monitored at runtime, and revoked through a kill switch. Microsoft prices its product at $15/user/month inside the Microsoft 365 E7 license. Okta positions itself as the neutral fabric across vendors, anchored by a new open protocol, Cross App Access, for agent-to-app connection.
The control-plane model presumes you own the perimeter. It works for an enterprise governing agents that operate inside its existing identity boundary, on behalf of employees and contractors with managed credentials and a Defender stack already deployed.
Outside that perimeter, the answer looks different. When World ID 4.0 launched at the Lift Off event in San Francisco with launch integrations across Tinder, Zoom, Docusign, Okta, Vercel, and VanEck, it introduced an account-based architecture purpose-built for what the company is calling "human-backed" AI agents. World's bet is iris biometrics scanned at a custom Orb device and processed into a zero-knowledge proof of unique humanness.
Billions Network, built by the team that originated the same Circom zero-knowledge toolchain that powers World, takes the opposite bet and opts for document handling rather than biometric processing. For some, the choice between them might be closer to a regulatory decision than a cryptographic one.
Decentralized identity solves many key coordination problems. Whether it solves the deeper ones is still being decided. Since ERC-8004, the Ethereum standard for trustless agents, went live, roughly 130,000 agents have registered. And the April 7 academic paper AI Agents Under EU Law, the first systematic regulatory mapping of agent provider obligations across nine simultaneous EU instruments, concluded that high-risk agentic systems with untraceable behavioral drift cannot currently satisfy the essential requirements of the AI Act, which begins enforcement on August 2.
So where does that leave the state of production architecture? What decisions are builders left with?
Decision 1: Identifying the agent
What is the cryptographically verifiable handle that this particular agent owns, and how do other systems resolve and trust it?

ERC-8004 is the most-discussed answer in the open agent economy. Authored by contributors from MetaMask, the Ethereum Foundation, Google, and Coinbase, the standard ships three on-chain registries: Identity (an ERC-721 token whose URI resolves to a JSON registration file describing the agent's services, endpoints, and supported trust models), Reputation (a permissionless feedback registry), and Validation (a generic interface for verifier contracts to record independent checks). Per the spec, the supportedTrust field in the registration file is optional, and if it is absent or empty the registry serves only as a discovery directory. Allium's observation that the first two weeks brought only 401 feedback submissions points toward discovery-only use being the more common pattern.
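To make the discovery-versus-trust distinction concrete, here is a minimal sketch, under assumptions, of how a relying party might resolve an agent's registration file from the Identity Registry. The registry address, RPC endpoint, and any JSON field names beyond services, endpoints, and supportedTrust are placeholders rather than the normative ERC-8004 schema.

```typescript
import { ethers } from "ethers";

// Minimal ERC-721 surface: the Identity Registry token's URI resolves to the
// agent's JSON registration file.
const ERC721_ABI = ["function tokenURI(uint256 tokenId) view returns (string)"];

async function loadRegistration(registryAddress: string, agentId: bigint) {
  const provider = new ethers.JsonRpcProvider("https://rpc.example.org"); // placeholder RPC
  const registry = new ethers.Contract(registryAddress, ERC721_ABI, provider);

  const uri: string = await registry.tokenURI(agentId);
  const registration = await (await fetch(uri)).json();

  // supportedTrust is optional; absent or empty, the entry is useful for
  // discovery but carries no trust signal.
  const trustModels: string[] = registration.supportedTrust ?? [];
  if (trustModels.length === 0) {
    console.warn("Registration is discovery-only: no supported trust models declared.");
  }
  return { endpoints: registration.endpoints, trustModels };
}
```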
The two enterprise IAM answers, Okta for AI Agents and Microsoft Agent 365, both extend into the agent-builder platforms: Microsoft Agent 365 through public-preview cross-cloud sync into AWS Bedrock and Google Cloud, and Okta through Cross App Access (XAA), an open protocol for agent-to-app connection. Microsoft Agent 365 distinguishes three classes of agents, those acting with delegated user access, those operating with their own credentials behind the scenes, and those participating in team workflows with their own access (the third still in public preview), and provides a single control plane spanning all three. Okta extends Universal Directory to register agents alongside humans and machines, ships a Token Vault for OAuth-based API delegation to services like Gmail and Slack, an Async Authentication pattern that sends a push notification to a human supervisor when a long-running agent reaches a high-stakes decision point, and Identity Threat Protection for runtime behavioral anomaly detection.
Microsoft sells an integrated stack priced inside the Microsoft 365 E7 license: identity, observability, and governance; network controls extended through Microsoft Entra; and discovery of "shadow AI" agents like OpenClaw and (soon) Claude Code and the GitHub Copilot CLI through Defender and Intune, all in one product locked to the Microsoft license boundary.
Okta positions itself as the neutral fabric across vendors with XAA. For a builder, the choice tracks how much of the enterprise IAM stack is already Microsoft, and how much agent traffic needs to cross vendor boundaries.
Token Security focuses on non-human identity (NHI) discovery and shadow-agent inventory. SailPoint extends identity governance and certification workflows to AI agents from its IGA position. CyberArk extends privileged access management to agent credentials and runtime authorization. HashiCorp Vault and Delinea handle agent-secret rotation and just-in-time access.
The third pattern is open-source delegation infrastructure that targets the gap neither ERC-8004 nor the Microsoft control plane fills. ZeroID, released April 13, implements an identity and credentialing layer specifically for orchestrator agents that spawn sub-agents calling APIs and tools, addressing the multi-agent attribution problem that OAuth 2.0 and OIDC were not designed for. Adjacent to it sits the Privado ID team's DEEPTRUST framework, which proposes that agent identity should anchor on Decentralized Identifiers using the did:iden3 method (with key rotation that preserves attestations across credential changes), with on-chain attestations from owners and auditors composing a reputation graph.
The DEEPTRUST report is explicit about what it will and will not attempt. It builds for "architectural identity" — model, weights, code, execution environment — because the more rigorous behavioral identity it would prefer requires zkML attesting to runtime behavior, which the authors estimate is three to five years from production-ready for large language models.
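As a rough illustration of what "architectural identity" anchoring could look like in practice, the type sketch below captures the kinds of fields involved, a DID, hashes of model weights and code, and an attested execution environment, with attestations from owners and auditors forming the edges of a reputation graph. The field names are hypothetical, not the DEEPTRUST schema.

```typescript
// Hypothetical shape of an "architectural identity" attestation: an owner or
// auditor vouching for an agent's model, code, and execution environment,
// keyed by a did:iden3 identifier. These field names are illustrative only.
interface ArchitecturalIdentity {
  agentDid: `did:iden3:${string}`;
  modelId: string;              // model family and version
  weightsHash: string;          // hash of the deployed weights
  codeHash: string;             // hash of the orchestration code
  executionEnvironment: string; // e.g. an attested image digest or TEE measurement
}

interface Attestation {
  subject: ArchitecturalIdentity;
  attesterDid: string;          // owner or auditor DID: an edge in the reputation graph
  role: "owner" | "auditor";
  issuedAt: number;             // unix timestamp
  signature: string;            // should survive key rotation on the subject DID
}
```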
There are also several IETF Internet-Drafts circulating in early 2026, among them AIMS, WIMSE, Agentic JWT, and SCIM-for-agents, each targeting some slice of the problem from inside the existing IETF identity stack.
The trade-off across this layer is between portability and ecosystem fit. ERC-8004 is portable across organizational boundaries by design, but the trust models it composes with are not standardized. Microsoft Agent 365 ships an integrated answer to identity, observability, and basic governance, but only inside the Microsoft license boundary and primarily for agents an enterprise wants its own IT team to govern.
Decision 2: Proving a human is behind the agent
Proving an agent has an identifier and proving that identifier traces to a verified human are different operations with different trust assumptions and different regulatory exposure.

The most-deployed answer is World ID combined with AgentKit. AgentKit shipped as part of World ID 4.0. A user with a verified World ID can register their agents, and websites that accept the x402 micropayment protocol, developed by Coinbase and Cloudflare, can then verify that requests from those agents trace back to a unique person. The launch integrations span Tinder, Zoom (running a three-way biometric match to confirm the human on a video call is the verified human expected), Docusign, Okta (which is building a Human Principal product to expose the verification to API builders), Vercel (a human-in-the-loop step in its open-source Workflow SDK), and VanEck. World reports 18 million verified humans across 160 countries with 150 million credential uses.
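For a relying party, the shape of the check is roughly the following: a request is rejected unless it carries both a payment artifact and a proof that the calling agent traces to a verified human. This is a hedged sketch, not the x402 or AgentKit API; the header names and verification functions are placeholders.

```typescript
import express from "express";

// Stand-ins for settlement verification and ZK proof-of-human-backing checks.
async function verifyPayment(proof: string): Promise<boolean> { return proof.length > 0; }
async function verifyHumanBacking(proof: string): Promise<boolean> { return proof.length > 0; }

const app = express();

app.use(async (req, res, next) => {
  const payment = req.header("x-payment-proof");          // placeholder header name
  const humanProof = req.header("x-human-backing-proof"); // placeholder header name

  if (!payment) {
    res.status(402).send("Payment required");
    return;
  }
  if (!humanProof) {
    res.status(401).send("No human-backed identity presented");
    return;
  }
  // Settlement and proof-of-unique-human verification are stubbed above.
  const ok = (await verifyPayment(payment)) && (await verifyHumanBacking(humanProof));
  if (!ok) {
    res.status(403).send("Verification failed");
    return;
  }
  next();
});
```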
The competing answer is Privado ID's Billions Network, which uses a passport rather than iris biometrics as the source of uniqueness. The team behind Billions originated the iden3 protocol and Circom, the zero-knowledge toolchain that also underpins World. Both produce ZK proofs of human uniqueness; the question is what is being proved and where the input comes from. World requires an Orb scan; Billions requires a passport and a phone. Billions is in production in pilots with Deutsche Bank, HSBC, and Telefónica Tech, and is integrating with the Indian government's Aadhaar identity infrastructure.
Civic, Humanity Protocol (palm scanning), and several other projects sit in the same competitive landscape. At a different layer entirely, hardware-rooted approaches such as Yubico's Reusable Delegation Tokens signed with a YubiKey, IBM's CIBA-based human-in-the-loop integration with WatsonX, and the Mastercard Verifiable Intent proposal provide proof-of-human at the moment a high-stakes action is taken rather than as a persistent credential.
The trade-off across this layer is partly accessibility and partly regulatory. World ID is the most distributed network and the most mature integration story, and it carries the most regulatory exposure: iris hashes are processed in the EU under what regulators have treated as GDPR Article 9 biometric special category data, and the company has faced regulatory action in Kenya, Spain, Hong Kong, and Brazil over its data collection practices. Billions makes the opposite bet: passports are recognized identity documents in nearly every jurisdiction, the ZK layer prevents the document data from being stored, and the regulatory question becomes about document handling rather than biometric processing. The hardware-rooted approaches narrow the question to the moment of high-stakes action and avoid the persistent-credential problem entirely.
For a builder, the decision turns partly on which jurisdictions the deployment is exposed to and what the threat model around impersonation looks like.
Decision 3: Delegation and authorization
When an agent that has been authorized by a human delegates to another agent or invokes a tool, what carries the authority through the chain in a way that is auditable and revocable?

The status quo answer in the agent ecosystem is OAuth 2.1, which the Model Context Protocol formally adopted in 2026 with PKCE. OAuth has decades of mature ecosystem support and works well for single-domain human-to-service authentication. It has limitations as an agent identity layer, articulated in the Agent Identity Protocol Internet-Draft submitted to IETF: every trust domain requires its own authorization server with no mechanism for cross-domain verification without pre-established federation; OAuth tokens are opaque to intermediaries, so when delegation produces a new token via RFC 8693 token exchange, the original authorization chain is lost; and OAuth provides no holder-side scope attenuation, since only the authorization server can narrow scopes at issuance time.
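The chain-loss problem is easiest to see at the wire level. The sketch below shows an RFC 8693 token exchange in which an orchestrator's token is traded for a new token for a sub-agent; the token endpoint URL and scope are placeholders, and the point is that nothing in the exchange itself obliges the authorization server to embed the original delegation chain in what it mints.

```typescript
// RFC 8693 token exchange: the orchestrator's token goes in, a new token for
// the sub-agent comes out. The authorization server decides what the new
// token contains; the exchange does not guarantee the original authorization
// chain survives it.
async function exchangeForSubAgent(orchestratorToken: string, subAgentToken: string) {
  const body = new URLSearchParams({
    grant_type: "urn:ietf:params:oauth:grant-type:token-exchange",
    subject_token: orchestratorToken,
    subject_token_type: "urn:ietf:params:oauth:token-type:access_token",
    actor_token: subAgentToken,
    actor_token_type: "urn:ietf:params:oauth:token-type:access_token",
    scope: "mail.read", // scopes are set at issuance; the holder cannot attenuate later
  });
  const res = await fetch("https://auth.example.com/token", {
    method: "POST",
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body,
  });
  const json = await res.json();
  return json.access_token as string;
}
```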
The AIP paper's survey of approximately 2,000 MCP servers found that all of them lacked authentication. MCP, Anthropic's protocol, is the dominant agent-to-tool plumbing in production today, and the surveyed slice of that deployed surface was entirely unauthenticated.
AIP itself proposes Invocation-Bound Capability Tokens that fuse identity, attenuated authorization, and provenance binding into a single append-only token chain. The single-hop wire format is a signed JWT; the multi-hop format uses Biscuit tokens with append-only blocks and Datalog policy evaluation. Reference implementations are live in Python and Rust.
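For the single-hop case, the idea can be illustrated as a signed JWT that binds the three pieces together. The claim names below are illustrative stand-ins for identity, attenuated scope, and provenance, not the AIP wire format, and the multi-hop Biscuit construction is not shown.

```typescript
import { SignJWT, generateKeyPair } from "jose";

// Single-hop capability token as a signed JWT with illustrative claim names.
async function mintCapabilityToken() {
  const { privateKey } = await generateKeyPair("ES256");
  return new SignJWT({
    sub: "agent:orchestrator-123",        // the agent holding the capability
    act: { sub: "user:alice" },           // the human whose authority is carried
    scope: "mail.read",                   // attenuated scope for this hop
    prov: ["consent:alice:2026-05-02"],   // provenance binding back to the original grant
  })
    .setProtectedHeader({ alg: "ES256" })
    .setIssuer("https://issuer.example")
    .setAudience("https://mail.example/api")
    .setExpirationTime("5m")
    .sign(privateKey);
}
```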
The other shipped patterns in this layer are payment-coupled (x402 plus AgentKit, where a payment and an identity verification ride the same request), hardware-coupled (the Yubico-Delinea-Microsoft "Human-in-the-Loop" architecture announced at RSAC 2026, which requires a human sponsor to sign an envelope at high-stakes decision points), and commerce-scoped (Mastercard's Verifiable Intent, scoped specifically to agent-initiated purchases).
This is the layer where the EU AI Act collides hardest with the architecture. Article 15(4) requires that high-risk AI systems be designed and developed with appropriate cybersecurity measures, and the April 7 compliance paper is direct about what that implies for agents: a system prompt instructing an agent not to delete files is not a security control. Article 15(4) compliance requires architectural enforcement — an email-summarization agent should be issued an API capability that exposes a read endpoint, not send or delete. This is the holder-side scope attenuation that AIP argues OAuth 2.1 cannot deliver. A builder running production agents in the EU after August 2 needs to be able to demonstrate that the authorization scope of every agent action was narrowed to what was necessary, and that the narrowing was architectural.
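What "architectural" narrowing means in code terms can be as simple as never constructing the broader capability in the first place. The sketch below, with illustrative interface names, hands a summarization agent a client type on which send and delete do not exist; in a deployed system the same narrowing would typically live at the token or gateway layer rather than in the type system, but the principle is the same.

```typescript
// Illustrative interfaces: the summarization agent only ever receives a
// capability on which send and delete do not exist.
interface FullMailApi {
  read(messageId: string): Promise<string>;
  send(to: string, body: string): Promise<void>;
  delete(messageId: string): Promise<void>;
}

type ReadOnlyMail = Pick<FullMailApi, "read">;

function makeSummarizationCapability(api: FullMailApi): ReadOnlyMail {
  // The narrowing is structural: no prompt tells the agent not to delete,
  // because delete is unreachable through the capability it holds.
  return { read: (id) => api.read(id) };
}
```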
Decision 4: Behavior verification and audit
ERC-8004's Validation Registry defines two functions — validationRequest and validationResponse — and stores the response on-chain along with an optional URI pointing to off-chain evidence. The standard leaves the validator implementation open: stake-secured re-execution, zkML verifiers, TEE oracles, or trusted judges are all named as valid choices, and incentives and slashing are out of scope of the registry.
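A validator built against that interface is, in outline, a watcher that performs whatever check it chooses off-chain and records only the result plus an evidence pointer on-chain. The event and function signatures below are simplified placeholders around the two function names the spec defines, not the normative ERC-8004 ABI.

```typescript
import { ethers } from "ethers";

// Simplified placeholder ABI around the spec's two function names.
const REGISTRY_ABI = [
  "event ValidationRequested(uint256 indexed agentId, bytes32 requestHash)",
  "function validationResponse(uint256 agentId, bytes32 requestHash, bool passed, string evidenceURI)",
];

async function runValidator(registryAddress: string, signer: ethers.Signer) {
  const registry = new ethers.Contract(registryAddress, REGISTRY_ABI, signer);

  registry.on("ValidationRequested", async (agentId: bigint, requestHash: string) => {
    // The check itself is left open by the standard: re-execution, a zkML
    // proof, a TEE attestation, or a trusted judge. Only the result and an
    // evidence pointer are recorded on-chain.
    const { passed, evidenceURI } = await performCheck(agentId, requestHash);
    await registry.validationResponse(agentId, requestHash, passed, evidenceURI);
  });
}

async function performCheck(agentId: bigint, requestHash: string) {
  // Placeholder for whatever verification the validator implements.
  return { passed: true, evidenceURI: "ipfs://evidence-placeholder" };
}
```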

zkML is the option most often invoked. The DEEPTRUST report's assessment of zkML for production-scale large language models is that current implementations suit smaller models and specific applications, and that the computational demands of LLMs combined with proof-system overhead pose hurdles the report estimates will take three to five years to clear. TEE-based attestation is more mature and more deployable today: NVIDIA H100 GPUs introduced GPU-side confidential computing, and Azure confidential VMs combining AMD SEV-SNP with H100s have been generally available since September 2024, but the trust assumptions degraded materially over 2025 as multiple TEE vulnerability disclosures landed. Stake-secured re-execution works for some classes of computation but is expensive and adds latency. Trusted judges are the simplest remaining option, at the cost of reintroducing a trusted party.
The major AI labs, meanwhile, have shipped several different architectural answers to a related problem: what an AI provider should know about how its systems are being used, and what it should disclose about it. None of those answers compose with the agent-identity stack on the relying-party side.
Anthropic's Clio, released in December 2024, is a privacy-preserving clustering system that anonymizes conversations, clusters them semantically, filters out low-frequency clusters, and passes the output through a final identifier check before display. It now feeds the recurring Anthropic Economic Index reports, five of which had been published as of March 2026, along with the India Country Brief and the AI Fluency Index. Clio's defense-in-depth design lets Anthropic publish how its system is being used at the population level without exposing individual conversations.
Google's Provably Private Insights, built into the Recorder app, takes a related approach with stronger formal guarantees. PPI uses confidential federated analytics that combine LLM-based topic extraction with differentially-private aggregation inside trusted execution environments, with the entire pipeline open-source and the workflow signatures publicly logged. Devices can attest that the analysis they participated in matches the published code.
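The aggregation half of that pipeline is conceptually simple even though the TEE and federation machinery around it is not. The sketch below adds calibrated Laplace noise to per-topic counts before release; the mechanism and epsilon are illustrative, not Google's published parameters.

```typescript
// Laplace noise via inverse-CDF sampling.
function laplaceNoise(scale: number): number {
  const u = Math.random() - 0.5;
  return -scale * Math.sign(u) * Math.log(1 - 2 * Math.abs(u));
}

// Release per-topic counts only after adding noise calibrated to epsilon.
// Assumes each device contributes at most one count per topic (sensitivity 1).
function releaseTopicCounts(counts: Map<string, number>, epsilon = 1.0) {
  const noisy = new Map<string, number>();
  for (const [topic, count] of counts) {
    noisy.set(topic, Math.max(0, Math.round(count + laplaceNoise(1 / epsilon))));
  }
  return noisy;
}
```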
Apple's Private Cloud Compute sits at a different layer. PCC is not about observing what the AI is doing; it is about running the AI in a way that the provider cannot inspect at all. Stateless computation, enforceable guarantees, no privileged access, non-targetability, and verifiable transparency are its stated design principles, and the architecture extends iPhone-style hardware-rooted security into the data center.
Microsoft offers Azure AI Confidential Inferencing, a TEE-backed inference path with HPKE-encrypted prompts and key release gated on attestation of the inference container. The capability has been in preview for Azure OpenAI Service since late 2025; whether OpenAI uses it in production for any consumer-facing inference is not publicly stated. OpenAI separately released a Privacy Filter in late April, a bidirectional token-classification model for PII detection and masking, which sits at a third layer (pre-processing redaction rather than runtime confidentiality or aggregate observability).
None of these expose any standard interface by which a relying party, like a service that received an agent invocation through ERC-8004 and AgentKit, could query whether the agent's behavior in its session conformed to the policy bound to its identity. The AI providers have built privacy primitives for their own purposes; the agent identity work has built coordination primitives across organizations.
For a builder, this means the back-end verification answer for an agent today is whatever the AI provider gives you out of band: usage logs, system prompts, content policies, and whatever the provider's trust and safety stack catches, augmented by whatever the builder instruments themselves on the client side.
Decision 5: Compliance overlay
The August 2, 2026 EU AI Act enforcement deadline is a load-bearing date. After that date, high-risk AI systems must satisfy the essential requirements of the Act, and the April 7 compliance architecture paper concludes that high-risk agentic systems with untraceable behavioral drift cannot satisfy those requirements with current technology. The paper's twelve-step compliance architecture coordinates the AI Act with the GDPR, the Cyber Resilience Act, the Digital Services Act, the Data Act, the Data Governance Act, sector-specific legislation, the NIS2 Directive, and the revised Product Liability Directive. The cybersecurity standard prEN 18282 and the broader CRA standards programme under M/606 are the harmonised standards providers will eventually use to demonstrate conformity, and neither is finalized. That gap creates what the paper calls a standards-free zone from mid-2026 to late 2027, in which requirements are enforceable but harmonised standards are not yet available to anchor compliance.

Two European data protection authorities have already moved on agentic systems in advance of August. The Spanish DPA published 71 pages of guidance on February 18, 2026, adopting a "rule of 2" heuristic: an agent should not simultaneously process untrusted input, access sensitive data, and take autonomous action affecting individuals without human oversight. The Dutch DPA followed shortly after.
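Encoded as a deployment-time check, the heuristic reduces to flagging any agent configuration that combines all three risky properties without human oversight; the field names below are illustrative, not the Spanish DPA's wording.

```typescript
// Illustrative encoding of the heuristic: flag any agent that combines all
// three risky properties without human oversight.
interface AgentProfile {
  processesUntrustedInput: boolean;
  accessesSensitiveData: boolean;
  actsAutonomouslyOnIndividuals: boolean;
  humanOversight: boolean;
}

function violatesRuleOfTwo(agent: AgentProfile): boolean {
  const riskyProperties = [
    agent.processesUntrustedInput,
    agent.accessesSensitiveData,
    agent.actsAutonomouslyOnIndividuals,
  ].filter(Boolean).length;
  return riskyProperties === 3 && !agent.humanOversight;
}
```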
The Bank of England, in February 2026, called out the failure of literal "human-in-the-loop" approaches at agentic scale; the UK Competition and Markets Authority in March 2026 flagged interoperability, permissions, logging, data mobility, and closed vendor ecosystems as structural concerns. None of these are AI Act enforcement actions, but they establish the supervisory direction.
In the United States, NIST is working on AI agent identity standards through the National Cybersecurity Center of Excellence. Pindrop's April 14 submission to that process argued that agent identity frameworks must extend to verifying the human approval upstream of the agent's actions, particularly in voice and video contexts where deepfaked authorization is now a real attack vector. OpenAI's most recent public engagement with NIST, in February 2024, focused on dangerous-capability evaluation, red teaming, and synthetic media provenance via C2PA, and agent identity as a category was not part of the response.
Microsoft Agent 365 is positioned as the enterprise compliance answer to the requirements the EU AI Act and the Spanish DPA position imply. Whether the open-standards stack — ERC-8004 plus AgentKit plus AIP — assembles into something with comparable enterprise-audit defensibility before August 2 is unsettled.
The composition seams

ERC-8004 ships the hooks for validators but does not standardize what hangs from them, so a builder treating ERC-8004 as a complete trust layer is treating a discovery directory as a trust system.
OAuth 2.1, which MCP just adopted, does not carry the original authorization chain through token exchange. A builder using OAuth-extended MCP today has no protocol-level way to prove, after the fact, that a tool invocation by a sub-agent was authorized by the human who originally consented to the orchestrator. AIP and similar capability-token approaches solve this on paper.
When a human revokes their World ID, what happens to the agents that were registered against that ID? Whether outstanding agent actions are revoked, whether downstream services are notified, and on what timeline, is undefined.
The AI providers have shipped four conceptually distinct privacy primitives: Clio's privacy-preserving clustering, PPI's differentially-private federated analytics, PCC's hardware-rooted confidential execution, Privacy Filter's pre-processing redaction. None of them composes with any agent-identity standard.
The EU AI Act, GDPR, MiCA, the GENIUS Act, the SEC Crypto Task Force's ZK and MPC endorsement, the UK CMA position, NIST's in-progress work, and the EU's own M/613 standards programme were not written together, so a builder shipping a multi-jurisdictional agent product in the second half of 2026 is composing across regimes that do not coordinate, and the compliance posture is built deal-by-deal.
Many vendor pitches and the KYA framing assume the seams will be filled in.
What this piece does not settle
Several questions sit outside the scope of a decision-framework piece and will require their own coverage.
Whether ERC-8004's Validation Registry converges on zkML, TEE-attested oracles, stake-secured re-execution, or fragments across all three is open.
The provider-side privacy primitives are the most mature components in this entire stack and the least integrated with the rest of it. A standardized interface for asking whether an agent's session conformed to the policy bound to its identity, answerable in a privacy-preserving way by the AI provider, would close a seam that no other actor can close.
Proof Street tracks the convergence of privacy-preserving computation across deployment contexts: blockchain, identity, media provenance, AI, and enterprise. This analysis draws on public statements, company announcements, technical specifications, and published research. No company reviewed or approved this piece prior to publication.