Engineering

What 'verified' means after libcrux

Vitalik Buterin published a long essay on formal verification today. It argued that pairing AI-generated code with machine-checked mathematical proofs is the right response to AI lowering the cost of finding exploits. The gist is that Lean is the language of trust, and ZK proof systems and post-quantum cryptography are the priority targets.

More useful for a practitioner is one of the worked examples Vitalik chose to make a smaller point. He cites Nadim Kobeissi describing a November 2025 finding in libcrux-ml-dsa, a post-quantum signature library that the cryptographic stack treats as formally verified. The bug ran in the part that wasn't.

The essay's main reference is an April blog post by Yoichi Hirai, a former Ethereum Foundation researcher now at zkSecurity, which calls the assembly-plus-Lean pattern "the final form of software development." Hirai's claim is that letting AI agents write low-level code directly, paired with Lean proofs of correctness, sidesteps the historical problem of trusting the compiler. The agents produce hundreds of commits per day; CI checks the proofs; a human reviews and merges. Vitalik adopts the framing approvingly.

The concrete example on the cryptography side is ArkLib, a Lean library for formally verified arguments of knowledge developed under the Verified-zkEVM project. ArkLib targets the information-theoretic core of modern SNARKs, interactive oracle reductions, and proves completeness and round-by-round knowledge soundness for selected protocols. The same organization is working in parallel on verifying revm against Runtime Verification's KEVM semantics, with KEVM in turn being shown equivalent to a Lean EVM model maintained by Nethermind.

So Vitalik endorses Lean, ArkLib, Hirai-style agentic workflows, and a small set of firms doing the work.

libcrux is Cryspen's post-quantum cryptography library, used as an implementation of ML-DSA, the NIST post-quantum signature standard, and marketed on the strength of its formal-verification claims. In November 2025, Filippo Valsorda found that version 0.0.3 produced different public keys and signatures on different platforms given identical inputs, which is the kind of failure that makes a signature scheme useless.

The cause sat in a hardware-specific helper that handled SHA-3 on ARM64 chips without dedicated SHA-3 instructions. The helper passed wrong arguments to a bitwise shift and corrupted hash output. The library shipped this way because the relevant proof had been marked admitted, a Lean keyword that means "I have not proven this; trust me." The entire ARM64 backend had no completed proofs for runtime safety or correctness.

libcrux is not a hobby project. It is the kind of code that ships inside wallets, hardware security modules, and the post-quantum migration pipelines that Citi and Project Eleven spent the past two weeks telling financial institutions to accelerate. The library carries a formal-verification claim. The claim is true, in the narrow sense that proofs exist for parts of the codebase. The claim is also misleading in the sense that the parts attackers can reach were not the parts that were proven.

A proof only demonstrates that the code satisfies the specification it was given. If the specification is incomplete, or admitted in the wrong place, the proof is still valid and the system is still broken.

zkSecurity, the firm that hosts Hirai's blog, is also a co-author on one of the IEEE S&P 2026 papers being presented in San Francisco this week, a paper on detecting computation-constraint inconsistencies in ZKP programs. Its embedded Lean DSL for ZK circuits, clean, is now part of the Verified-zkEVM organization Vitalik points to. Runtime Verification's KEVM is the semantic anchor for the EVM-verification track. Nethermind maintains the equivalent Lean EVM model. ArkLib has 82 GitHub stars and a small, identifiable contributor list.

This is a narrow market. The set of firms capable of delivering Lean-grade proofs of cryptographic protocols at production scale is somewhere in the single digits.

For a team buying assurance for privacy-critical code, the question is who to call. Code4rena, one of the two competitive-audit platforms that has anchored Web3 security review for years, announced on May 13 that it is winding down, and Immunefi is onboarding its bug-bounty customers. The list of firms branded as auditors is one set, and the list capable of delivering formal proofs of cryptographic code is a smaller subset within it.

Vitalik's argument is built around code Lean can reason about: cryptographic protocols with crisp specifications, ZK proof systems, signature schemes. The libcrux example shows what happens when "is this verified" is yes in one place and silently no in another. It does not tell a team running an FHE-based confidential settlement system anything about its own exposure.

What 'verified' means after libcrux

Read more

"The transport layer should be generic": why OpenVTC added TSP

Visibility was never safety

Confidential computing is three different trust models with one label

The EUDI Wallet's privacy architecture, six months from the deadline