Lean Ethereum - LeanSig

January 20, 2026

Special thanks to Thomas for his great feedback and review

Introduction

Ethereum today relies on BLS signatures for attestations. These are compact, aggregatable and battle-tested, yet they are built on elliptic-curve assumptions, which means they are not post-quantum secure. Lean Ethereum is an experimental roadmap that aggressively simplifies the base layer of Ethereum. One of its tracks focuses on moving to a post-quantum secure setup. In that world, XMSS-style signatures become one of the main candidates to replace the existing BLS signatures.

In this post I want to walk through the XMSS signature scheme and how it is adapted for LeanSig, the proposed signature scheme for Lean Ethereum. Mostly, I'll explain this paper as best I can. The goal is to go through these ideas:

  • How XMSS roughly works
  • The notion of incomparable encodings
  • Building a generic protocol to verify signature aggregation via a succinct argument system
  • A nice optimization trick to make signature verification super efficient.

XMSS Signatures

XMSS is a hash-based signature scheme that is designed to be post-quantum secure. It's based on two things:

  1. A One Time Signature (OTS) scheme (Winternitz) built from hash chains.
  2. A Merkle Tree that bundles many one-time public keys into a single one.

We start with $L$ one-time keypairs $(sk_l, pk_l)$. Recall that for each $l$, the secret key is a vector $sk_l = (sk_{l,1}, \dots, sk_{l,v})$ and each component defines one hash chain. The corresponding public key $pk_l = (pk_{l,1}, \dots, pk_{l,v})$ consists of the results of repeatedly hashing each $sk_{l,i}$, meaning that for each component $i$: $pk_{l,i} = H^{2^w-1}(sk_{l,i})$
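
To make the hash-chain part concrete, here is a minimal Python sketch. The hash function, helper names and parameter values are illustrative assumptions for readability, not the actual LeanSig primitives.

```python
import hashlib

W = 4                       # assumed Winternitz parameter (illustrative)
CHAIN_LEN = 2**W - 1        # number of hash steps from sk_{l,i} to pk_{l,i}

def H(data: bytes) -> bytes:
    # Stand-in hash; the real scheme uses a dedicated, domain-separated hash.
    return hashlib.sha256(data).digest()

def chain(start: bytes, steps: int) -> bytes:
    """Apply H repeatedly: returns H^steps(start)."""
    out = start
    for _ in range(steps):
        out = H(out)
    return out

# One chain: the one-time public value is the end of the chain.
sk_li = b"illustrative secret chain seed"
pk_li = chain(sk_li, CHAIN_LEN)   # pk_{l,i} = H^(2^w - 1)(sk_{l,i})
```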

The idea is as follows:

  • For each pair, define $pk_l$ as a leaf in a Merkle Tree. After all leaves are defined, construct the tree.
  • The Merkle Root acts as a public root such that one can always show (via a Merkle Proof) that some $pk_l$ is a leaf in the tree whose root is $pk_R$

Now we need the Winternitz-OTS scheme, which works as follows:

  • To sign a message $m$, use a one-time secret key $sk_l$ to produce a one-time signature $OTS$ that reveals some information to the Verifier so that it can reconstruct the corresponding one-time public key $pk_l$
  • Finally, attach a Merkle Proof that shows $pk_l$ is a leaf in the Merkle Tree with root $pk_R$

The idea is that hashes are one-way functions and therefore one cannot efficiently obtain $sk_l$ from $pk_l$. It's important to note that each key should only be used once.

The signature scheme basically consists of two algorithms, signing and verifying. The signing algorithm would work as follows:

  1. To sign a message $m$, first encode the message $m$, with randomness $\rho$, into a vector $x = (x_1, ..., x_v) \in \{0, ..., 2^w - 1\}^v$
  2. Afterwards we use $x$ as the number of steps to take in each hash chain. So signing involves:
    • Converting $m$ to $x$, and
    • Hashing the $i$-th component of $sk_l$ (which is decomposed into $v$ values) exactly $x_i$ times.

Here, $v$ is the number of hash chains and $w$ is the Winternitz parameter (bit-width). Each chain has length $2^w$. Together, the one-time signature $\sigma_{OTS} = (H^{x_1}(sk_{l,1}), ..., H^{x_v}(sk_{l,v}))$ and the Merkle Proof $\theta_l$ form the signature over the message $m$
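
As a rough illustration of the signing step, here is a toy sketch using the same assumed helpers as before. The codeword $x$ is hard-coded; a real signer derives it from the message via the encoding discussed below.

```python
import hashlib

W, V = 4, 8   # illustrative parameters; the real scheme uses different values

def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def chain(start: bytes, steps: int) -> bytes:
    out = start
    for _ in range(steps):
        out = H(out)
    return out

def ots_sign(sk: list[bytes], x: list[int]) -> list[bytes]:
    """One-time signature: advance chain i by exactly x_i steps."""
    return [chain(sk_i, x_i) for sk_i, x_i in zip(sk, x)]

# Toy usage with a hard-coded codeword x (a real signer derives x from the message).
sk = [bytes([i]) * 32 for i in range(V)]
x = [3, 0, 7, 1, 5, 2, 6, 4]      # one value per chain, each in {0, ..., 2^w - 1}
sigma_ots = ots_sign(sk, x)
```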

Figure: XMSS approach, based on this presentation given in the Ethereum PQ workshop

Verification of a signature involves the following steps (a minimal sketch follows the list):

  • Recompute $x$ from the message $m$
  • For each $\sigma_{OTS,i}$, complete the hash chain by hashing the remaining $2^w - 1 - x_i$ steps, until reaching $pk_{l,i} = H^{2^w - 1 - x_i}(\sigma_{OTS,i})$
  • Finally, verify that $pk_l$ is included in the Merkle Tree $MT$ with root $pk_R$
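
The corresponding toy verifier, under the same illustrative assumptions (the Merkle-tree check is only indicated in a comment):

```python
import hashlib

W = 4   # must match the signer's parameter

def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def chain(start: bytes, steps: int) -> bytes:
    out = start
    for _ in range(steps):
        out = H(out)
    return out

def ots_verify(sigma: list[bytes], x: list[int], pk: list[bytes]) -> bool:
    """Finish each chain: H^(2^w - 1 - x_i)(sigma_i) must equal pk_{l,i}."""
    ends = [chain(s_i, (2**W - 1) - x_i) for s_i, x_i in zip(sigma, x)]
    # A full verifier would additionally check a Merkle proof tying pk to the root pk_R.
    return ends == pk
```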

Parameters

Before diving into what folks working on Lean Ethereum did, I want to give a picture of what happens when you change the parameters in this scheme. The two most important parameters are $v$ (the number of hash chains) and $2^w$ (the number of steps from $sk_l$ to $pk_l$). They target different issues:

  1. The number of hash chains $v$ matters for collision resistance of the encoding. If multiple messages map to the same encoding, the scheme is not binding, since the same signature could be reused for a different message. A larger $v$ expands the encoding space $(2^w)^v$, reducing collisions.
  2. The number of steps $2^w$ matters for pre-image resistance. If an attacker could effectively reverse a hash chain, it might recover $sk_l$, breaking the security of the scheme

The signature consists of roughly $v + \log L$ hash values, while the work that the signer and verifier together must do is $O(v \cdot 2^w + \log L)$ hashes. Note that in a blockchain context we want both small signatures and fast verifiers.
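
To get a feel for these formulas, here is a tiny back-of-the-envelope calculation with made-up parameters (none of these values come from the LeanSig spec):

```python
from math import log2

# Purely illustrative parameters (not the LeanSig spec values).
v, w, L = 64, 4, 2**20            # chains, chunk width, number of epochs
hash_len_bits = 256               # assumed hash output size

sig_hashes = v + int(log2(L))     # chain openings + Merkle path nodes
sig_bytes = sig_hashes * hash_len_bits // 8
total_chain_hashes = v * (2**w)   # signer + verifier work over all chains

print(f"signature ~ {sig_hashes} hash values ~ {sig_bytes} bytes")
print(f"combined chain work ~ {total_chain_hashes} hashes + {int(log2(L))} Merkle hashes")
```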

Incomparable encodings

The notion of incomparable encodings is important in order to build a scheme where one can't create signatures for messages $m$ without knowing the secret key $sk_l$

Definition (Comparability). Two different codewords $x$ and $x'$ are comparable if one dominates the other coordinate-wise, i.e., $x_i \le x'_i$ for all $i$ and $x_j < x'_j$ for at least one $j$.

For example, $(1, 2, 3)$ is comparable with $(2, 3, 4)$ or $(1, 2, 4)$, but it's incomparable with $(1, 3, 1)$ or $(2, 1, 3)$. The idea is to find an encoding function $IncEnc(P, m, \rho, l)$ that gives incomparable codewords from a message $m$. Having incomparable codewords is what makes the protocol secure. The reason is that whenever we get an encoding $(x_1, ..., x_v)$ from a message $m$, we are effectively getting the number of steps to hash on every chain. If there were some message $m'$ whose encoding dominates $x$, we could generate a valid signature for $m'$ without knowing the secret key, just by continuing to hash from the signature values for $x$

Figure: Encoding of messages $m$ and $m'$, where the encoding of $m'$ can be constructed solely from the encoding of $m$, without knowing the values of $sk_l$. Here, the encodings of $m$ and $m'$ are comparable

Therefore, to prevent this, one needs an incomparable encoding such that producing such an $x'$ is simply not possible.

Achieving incomparability

In order to achieve this property and ensure the signature scheme is secure, one needs to propose an encoding that never allows this to happen. Note that it suffices for at least one coordinate of $x'$ to be strictly smaller than the corresponding coordinate of $x$: to produce the signature value for that chain, a forger would have to step a hash chain backwards, and reversing a single hash (using a secure hash function) is not feasible. The easiest way to go about this is to append a new value to the codeword, which we call a checksum. Let $W_{max} = 2^w - 1$ be the maximum value any coordinate $x_i$ can take (since $x_i \in \{0, \dots, 2^w - 1\}$). The checksum can then be written as $checksum(x) := \sum_{i = 1}^v (W_{max} - x_i) = v \cdot W_{max} - \sum_i x_i$.

Note that this adds more elements to $x$ (increasing the signature size) but always yields an incomparable encoding, because $x'$ dominates $x$ if and only if $x_i \le x'_i$ for all $i$ and $x_j < x'_j$ for at least one $j$.

Hopefully that sounds familiar: it is the definition of comparability we just gave before! Hence, whenever we have an encoding $x'$ dominating $x$, the checksum of $x'$ will be smaller than the checksum of $x$, making the extended codewords incomparable, since:

$$v \cdot W_{max} - \sum_i x'_i < v \cdot W_{max} - \sum_i x_i$$

as long as $x'$ dominates $x$ component-wise (so $\sum_i x'_i > \sum_i x_i$).
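
A quick sanity check of this argument in Python (toy values; the chain maximum below is an assumption, not a spec parameter):

```python
# Illustrative check of the checksum argument (toy values, not spec parameters).
W_MAX = 2**4 - 1   # assumed chain maximum, W_max = 2^w - 1

def dominates(xp: list[int], x: list[int]) -> bool:
    """True if xp >= x coordinate-wise and strictly greater in at least one place."""
    return all(a >= b for a, b in zip(xp, x)) and any(a > b for a, b in zip(xp, x))

def checksum(x: list[int]) -> int:
    return sum(W_MAX - xi for xi in x)

x = [3, 7, 1]
xp = [3, 8, 2]                      # xp dominates x
assert dominates(xp, x)
assert checksum(xp) < checksum(x)   # so the extended codewords are incomparable
```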

There is a better way to do this which does not increase the signature size. It's called Winternitz Target Sum and it consists of simply adding an extra check to the verification phase:

  • Whenever you get $x$, verify that: $\sum_{i = 1}^v x_i = T$

Intuitively this works for the same reason as before. Since $T$ is a fixed parameter, if one were to obtain some $x'$ dominating $x$ component-wise, the sum of its components would be strictly greater than $T$, so $x'$ would fail the check.

This does add more work for the signer, but it's required to guarantee that the scheme is secure.

LeanSig

Now we are ready to talk about LeanSig, the new construction that is based on these ideas and will be used to replace BLS. Ethereum requires attestations from validators once per epoch. Therefore, each keypair will be used to attest in a single epoch.

The idea is that each validator, when joining the network, generates keys for a number of epochs in advance. The number of epochs is not yet fixed, but it should be enough to last multiple years, or even decades.

Key Generation

Let $L = 2^h$ be the number of epochs. The algorithm works as follows:

  1. Sample a random public salt $P$
  2. For each epoch $ep$:
    • Choose random secrets $sk_{ep, i}$
    • Compute the public value by running the chain to the end of its range ($2^w - 1$ steps):
      • $pk_{ep, i} := H^{2^w-1}(sk_{ep, i})$
    • Define $pk_{ep} = (pk_{ep, 1}, ..., pk_{ep, v})$ and $sk_{ep} = (sk_{ep, 1}, ..., sk_{ep, v})$
  3. Place each $pk_{ep}$ as a leaf of a Merkle Tree and compute the root $pk_R$

The public key is $pk = (pk_R, P)$
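
A compact, illustrative key-generation sketch (SHA-256, the parameter values and the way leaves are formed are all assumptions made for readability, not the spec's choices):

```python
import hashlib, os

# Toy parameters, chosen only for readability (the spec uses different values).
W, V, HEIGHT = 4, 8, 3
L = 2**HEIGHT                       # number of epochs / one-time keys

def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def chain(start: bytes, steps: int) -> bytes:
    out = start
    for _ in range(steps):
        out = H(out)
    return out

def merkle_root(leaves: list[bytes]) -> bytes:
    level = leaves
    while len(level) > 1:
        level = [H(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

P = os.urandom(32)                                       # public salt
sks = [[os.urandom(32) for _ in range(V)] for _ in range(L)]
pks = [[chain(sk_i, 2**W - 1) for sk_i in sk_ep] for sk_ep in sks]
leaves = [H(b"".join(pk_ep)) for pk_ep in pks]           # one leaf per epoch (assumed layout)
pk = (merkle_root(leaves), P)                            # public key (pk_R, P)
```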

Signing

In order to sign a message $m$ at epoch $ep$ with secret key $sk_{ep}$:

  1. Generate the Merkle Proof $\theta_{ep}$ from $pk_{ep}$ to $pk_R$
  2. Encode the message using an incomparable encoder: $x = \text{IncEnc}(P, m, \rho, ep)$ where $\rho$ is a fresh random value.
    • Note that $\text{IncEnc}$ is randomized and fallible: the signer keeps sampling new $\rho$ values until the resulting vector fulfills the target-sum condition. This makes signing more expensive but results in cheaper verification.
  3. For each $i$, compute $\sigma_{OTS,i} := H^{x_i}(sk_{ep, i})$
  4. Set $\sigma_{OTS} = (\sigma_{OTS,1}, ..., \sigma_{OTS,v})$

The final signature is: $\sigma = (\rho, \sigma_{OTS}, \theta_{ep})$
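
A toy version of this signing flow, showing the rejection sampling on $\rho$. The encoding function, the target sum and all parameters below are stand-ins, not the real $\text{IncEnc}$:

```python
import hashlib, os

W, V = 4, 8
TARGET_SUM = 60   # made-up target T; in the real scheme T is fixed by the parameters

def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def chain(start: bytes, steps: int) -> bytes:
    out = start
    for _ in range(steps):
        out = H(out)
    return out

def toy_encode(P: bytes, m: bytes, rho: bytes, ep: int) -> list[int]:
    """Stand-in for IncEnc: derive v chunks of w bits from a hash (not the real encoding)."""
    digest = H(P + m + rho + ep.to_bytes(8, "big"))
    return [digest[i] % (2**W) for i in range(V)]

def sign(P: bytes, sk_ep: list[bytes], m: bytes, ep: int):
    # Rejection-sample rho until the codeword hits the target sum.
    while True:
        rho = os.urandom(32)
        x = toy_encode(P, m, rho, ep)
        if sum(x) == TARGET_SUM:
            break
    sigma_ots = [chain(sk_i, x_i) for sk_i, x_i in zip(sk_ep, x)]
    return rho, sigma_ots   # the real signature also carries the Merkle proof theta_ep
```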

Verification

Given a public key $(pk_R, P)$, an epoch $ep$, a message $m$ and a signature $\sigma$, the verifier:

  1. Recomputes $x = \text{IncEnc}(P, m, \rho, ep)$ from the message $m$ and the randomness $\rho$
  2. Checks that $x$ is a valid element of the code (via the Winternitz Target Sum)
  3. For each $i$, finishes computing the hash chain: $pk_{ep, i} = H^{2^w - 1 - x_i}(\sigma_{OTS,i})$
  4. Sets $pk_{ep} = (pk_{ep, 1}, ..., pk_{ep, v})$

Finally, the verifier only needs to check that $\theta_{ep}$ is a valid Merkle Path from $pk_{ep}$ to $pk_R$, and it is done.
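
A matching toy verifier, reusing chain, toy_encode, W and TARGET_SUM from the signing sketch above. The Merkle-proof check is only indicated, and pk_ep stands in for the one-time public key that the proof would open:

```python
def verify(pk: tuple, ep: int, m: bytes, rho: bytes,
           sigma_ots: list[bytes], pk_ep: list[bytes]) -> bool:
    """Toy verifier; pk_ep stands in for the one-time public key opened by theta_ep."""
    pk_R, P = pk
    x = toy_encode(P, m, rho, ep)
    if sum(x) != TARGET_SUM:                 # Winternitz target-sum check
        return False
    # Finish every chain; the results must equal the one-time public key.
    ends = [chain(s_i, (2**W - 1) - x_i) for s_i, x_i in zip(sigma_ots, x)]
    if ends != pk_ep:
        return False
    # A real verifier additionally checks the Merkle proof theta_ep:
    # that pk_ep is a leaf of the tree whose root is pk_R.
    return True
```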

From many signatures to a single proof

So far we have only seen how a single validator produces a single LeanSig for a message $m$ at epoch $ep$. But we are really looking for an aggregated signature to replace BLS, such that:

  • We have $k$ validators, each with a public key $pk_i$
  • They all sign a message $m$ at the same epoch $ep$
  • We are looking for a single object convincing us that all signatures are valid.

We won't get into the entire aggregation approach, but rather talk about how to generalize it to a statement that can later be used by any proving system. Let's define a relation $\Gamma$ that captures that "all signatures are valid". We can define it as:

  • The statement: $x = (k, ep, m, \{pk_i\}_{i=1}^k)$
  • The witness: $w = \{\sigma_i\}_{i=1}^k$

Recall that $\sigma_i$ is the signature that each validator generated. Thus, the relation $\Gamma$ holds if and only if the verification for each validator is successful:

$$\forall i \in \{1, ..., k\}: \text{LeanSig.Ver}(pk_i, ep, m, \sigma_i) = 1$$
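
Expressed as code, the relation is just a conjunction of individual verifications (leansig_verify below is a placeholder for the LeanSig verification algorithm described above):

```python
def relation_gamma(statement, witness, leansig_verify) -> bool:
    """Gamma holds iff every validator's signature verifies for the same (ep, m)."""
    k, ep, m, pks = statement        # x = (k, ep, m, {pk_i})
    sigmas = witness                 # w = {sigma_i}
    return len(pks) == len(sigmas) == k and all(
        leansig_verify(pk_i, ep, m, sigma_i)
        for pk_i, sigma_i in zip(pks, sigmas)
    )
```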

Note: LeanMultiSig won't involve only aggregating signature verification, but also aggregating aggregation proofs. For simplicity, we will only talk about signature verification here. For the full aggregation protocol, see this post on folding-based post-quantum signature aggregation.

If we take any SNARK, where we define $\text{ArgProve}$ and $\text{ArgVer}$ as its proving and verification algorithms, then we can define the LeanMultiSig scheme as follows:

Key Generation and Signing

Both key generation and signing are the same as in LeanSig since nothing else is required.

Aggregation

To aggregate, given $ep$, $m$ and the list $(\{pk_i\}_{i=1}^k, \{\sigma_i\}_{i=1}^k)$:

  • Construct $x$ and $w$ as explained before

Proving

  • Compute the aggregated signature as the proof $\pi = \text{ArgProve}^H(x, w)$ where $H$ is the random oracle used by the SNARK implementation.

Verifying

  • Given the list of public keys $\{pk_i\}_{i=1}^k$, the epoch $ep$, the message $m$ and the proof $\pi$, run:
    • $b := \text{ArgVer}^H(x, \pi)$ and verify that $b = 1$

Instead of verifying $k$ separate hash-based signatures, the protocol only needs to verify a single succinct proof. The cost of validation is decoupled from the number of validators and tied instead to the cost of running the proving system itself.
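
Put together, the LeanMultiSig wrapper around an abstract argument system is little more than the following sketch. Here arg_prove and arg_verify stand in for $\text{ArgProve}^H$ and $\text{ArgVer}^H$ of whatever SNARK ends up being used; this is not a real API:

```python
# Sketch of the LeanMultiSig wrapper around an abstract argument system.
# arg_prove / arg_verify stand in for ArgProve^H / ArgVer^H; not a real API.

def aggregate(arg_prove, ep, m, pks, sigmas):
    statement = (len(pks), ep, m, tuple(pks))   # x = (k, ep, m, {pk_i})
    witness = tuple(sigmas)                     # w = {sigma_i}
    return arg_prove(statement, witness)        # pi, the aggregated signature

def verify_aggregate(arg_verify, ep, m, pks, pi) -> bool:
    statement = (len(pks), ep, m, tuple(pks))
    return arg_verify(statement, pi) == 1       # b = ArgVer(x, pi), accept iff b = 1
```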

A geometric detour: layers in a hypercube

The follow-up paper At the Top of the Hypercube digs into a very specific but important question:

How should we choose the codewords $x$ in order to get the best tradeoff between signature size and verification time?

The idea is as follows: since verifying signatures is what the Verification Algorithm of our proving system will have to do, can we make verification very fast while keeping the signature size small? In other words, could we encode in a way where the signer does most of the work, but without decreasing the hash-chain length $2^w$ (i.e., reducing $w$, which would require more chains and therefore a larger signature)?

The answer is yes! By forcing the encoding to lie in the top layers of the hypercube (very close to the chain ends), we force the signer to do the heavy lifting (hashing many times), while the verifier only needs to compute a few remaining hashes.

To understand it, we need to build a better intuition for the problem. So, consider all the possible codewords as points in a $v$-dimensional grid $\{0, 1, ..., N - 1\}^v$ where $N = 2^w$

For reference, a point may be $(1, 1, \dots, 1)$ where $1$ is repeated $v$ times. The idea is that there is a directed edge from a point $x$ to a point $y$ if:

  • For some $i$, $y_i = x_i + 1$ and $x_j = y_j$ for all $j \neq i$

So, for instance, if we have $x = (1, 1, \dots, 1)$ and $y = (2, 1, \dots, 1)$, then since $y_1 = x_1 + 1$ and $x_j = y_j$ for all $j \neq 1$, there is a directed edge from $x$ to $y$

Figure: A 2-dimensional grid showing a walk from the source $(0, 0)$ to the sink $(2, 2)$. Each point is an $(x, y)$ coordinate and each edge fulfills the aforementioned condition. The top layer is shown and corresponds to the points $(2, 1)$ and $(1, 2)$

Note that if we measure the distance from each point to the "sink" $(N-1, N-1, \dots, N-1)$, the points arrange themselves into layers of equal distance, and importantly two properties hold (a small sketch follows the list):

  • Points in the same layer form an incomparable set under the coordinate-wise order we care about. Since they are all at the same distance, the sum of their components is a constant $T$, which complies with the Winternitz Target Sum we saw before.
  • The value $x_i$ directly controls how many hashes the signer does. Therefore, if we could always get an encoding on one of the top layers, the verifier would have very little work to do.
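
A tiny brute-force check of this layer structure over a small grid (the grid dimensions are illustrative, not spec parameters):

```python
from itertools import product

# Small illustrative grid: v = 2 chains, N = 2^w = 4 values per coordinate.
V, N = 2, 4

def distance_to_sink(p) -> int:
    return sum((N - 1) - pi for pi in p)

def dominates(a, b) -> bool:
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

# Group points by their distance to the sink: each group is one "layer".
layers = {}
for p in product(range(N), repeat=V):
    layers.setdefault(distance_to_sink(p), []).append(p)

# Within any layer, no point dominates another (incomparable set),
# and every point has the same coordinate sum (the target sum T).
for d, points in layers.items():
    assert all(not dominates(a, b) for a in points for b in points)
    assert len({sum(p) for p in points}) == 1

# Points in the top layers (small distance) leave the verifier little work to do.
print({d: pts for d, pts in sorted(layers.items()) if d <= 2})
```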

The paper then proposes different encoding schemes that improve the verification cost compared with existing schemes. The idea is to use one of those encoding functions as $IncEnc(P, m, \rho, l)$

Note: This encoding is currently state-of-the-art in the spec, but it turned out to be unfriendly for SNARK implementations (because of big-integer arithmetic), and ongoing work is exploring alternatives. It might not be the encoding used in the final design.

Why this matters for Ethereum

Quantum computers threaten elliptic-curve based signatures like BLS, but hash-based signatures tend to be bigger and more expensive to compute. Ethereum is a p2p network with tight constraints on bandwidth usage; therefore, new research is required in order to create implementations that can provide good alternatives.

The proposed LeanSig design improves on existing constructions by:

  • Optimizing the generalized XMSS signatures with incomparable encodings that are cheap to verify and do not worsen the signature size.
  • Creating a generic statement that can be proven by a specialized argument system in order to aggregate every validator's signature.

There is also work done on the aggregation part regarding the concrete proving system to use for LeanMultiSig, though I'll leave that for a future post.


Written by Ziemann