A Simple Design for (Limited) Cross-Rollup Interoperability Using a Shared Sequencer Network
Exploring interoperability as a service offered by shared sequencers
In a world filled with rollups, achieving secure and efficient interoperability between them is a notoriously hard problem to solve.
Rollups on Ethereum have been gaining significant traction, with more rollups expected to launch once Celestia deploys their mainnet.
Transaction throughput, conventionally measured using TPS, is arguably not a problem anymore, as rollups are finely tuned systems capable of processing transactions at rates 10-100 times greater than Ethereum Mainnet, while delivering near-instant (soft) confirmations to end-users, resulting in swift and seamless UX1.
Over the past few months, we’ve also witnessed an uptick in the number of rollup frameworks that have launched, such as the OP Stack, Arbitrum Orbit, ZK Stack, and Rollkit.
Additionally, the difficulty of deploying rollups will drastically reduce with the imminent launch of projects offering Rollup-as-a-Service (RaaS). Some prominent RaaS projects today include Caldera, Conduit, and Vistara.
Rollup sequencers
All rollups today are operated by a single sequencer. Given that this threatens censorship resistance and liveness, decentralizing sequencers has been an important research topic for the past few years.
Now we’re starting to see this research bear fruit with the advent of based rollups as well as the imminent launch of multiple shared sequencer networks such as Espresso and Astria.
Given that the terminology around sequencers is still nascent, I define below what I mean by some of these terms before getting to the meat of this article.2
Glossary of terms
Sequencer: An entity responsible for the inclusion + ordering of rollup transactions and submitting them to the underlying L1 chain.
Inclusion: Process by which a transaction is added to a block.
Ordering: Process by which transactions are ordered in a block according to some predefined protocol rule.
Shared sequencer: A single sequencer used by one or more rollups.
Shared sequencer network (SSN): A network of sequencers used by one or more rollups. The SSN agrees upon the inclusion and ordering of rollup transactions by utilizing a consensus algorithm.3
The concept of a shared sequencer network (SSN) has frequently been touted as a potential solution for achieving cross-rollup interoperability. But none of these claims has been convincing, and all of them lack a clear description of how exactly such a system would work.
The lack of clarity surrounding this topic is understandable since SSNs have not yet been deployed in production. It is challenging to predict precisely how the components of rollups and SSNs will interact. The details remain uncertain.
This piece is an attempt at cleaning the foggy rear-view mirror of uncertainty and offers a potential solution to how we might achieve cross-rollup communication.
But before delving into the solution, let's briefly discuss the challenges of cross-rollup interoperability today.
Challenges of cross-rollup interoperability
Although all existing Ethereum rollups use the EVM, there are incompatibilities and issues that extend beyond the state machine, such as:
Bypassing the 7-day challenge window for optimistic rollups without deteriorating their security properties.
Creating and verifying proofs for different parts of the stack such as an execution proof from the state machine, proof of data availability (DA) from the DA layer, and a proof of ordering from the sequencer/SSN.
Building light clients for different parts of the stack (execution, DA+consensus, ordering).
Handling the differences between zero-knowledge (ZK) virtual machines (VMs): each uses distinct ZK circuits with its own methods of sending and processing transactions, and they may also use different proof systems. A cross-chain protocol would therefore need a separate implementation for each of these systems.
Safely handling hard forks.
These are some of the problems that interoperability projects grapple with in terms of building a solution for cross-rollup communication.
However, given that centralized sequencers are not a sustainable long-term solution, and since most rollup projects may opt for a shared sequencer network (SSN), it could be worth investigating interoperability at the SSN layer as a potential solution.
Simple design for limited cross-rollup interoperability
As highlighted by James Prestwich in this article, SSNs provide atomic inclusion guarantees but not atomic execution. This is because SSNs are not running execution nodes of the rollups and are therefore not state-aware. To quote him:
Shared Sequencer proponents envision a new structure where the user can specify atomicity of inclusion, i.e. that the sequencer can be forced to sequence a set of transactions in multiple rollups at the same time via a shared forced-sequencing mechanism. This would allow users to ensure that either all of those atomic transactions are included in the rollup histories or none.
This is not as good as it seems. Because only infallible transactions can be force-sequenced, only sets of infallible transactions guarantee atomic execution when atomically included.
A rollup filters out invalid transactions after inclusion, and before execution, via the filter function. Suppose the sequencer takes the user’s atomic set, and causes one transaction to fail or become invalid. That transaction will be filtered after sequencing, and will not execute. This means that atomic inclusion is not sufficient to guarantee atomic execution, unless all transactions involved are infallible.
To make it very concrete, simple sends and withdrawals can be executed atomically, but anything fallible, like a swap, or a DeFi interaction, can’t be. Most high-value interactions contain 1 or more fallible transactions, unfortunately, so it seems difficult to make atomic inclusion useful. This effectively rules out cross-rollup DeFi composability via a shared Sequencer. The shared Sequencer is not a magic bullet. Users are locked into the asynchronous cross-chain model until the end of time.
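To make the gap between atomic inclusion and atomic execution concrete, here is a minimal toy sketch. It assumes a drastically simplified rollup whose only validity rule is a balance check; the names Tx and filter_and_execute are hypothetical and do not come from any real rollup implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tx:
    rollup: str   # rollup the transaction targets
    sender: str
    amount: int   # tokens the sender must hold for the tx to be valid


def filter_and_execute(balances, ordered_txs):
    """Toy rollup node: filter invalid txs after sequencing, execute the rest."""
    executed = []
    for tx in ordered_txs:
        if balances.get(tx.sender, 0) < tx.amount:
            continue  # filtered out: part of the sequenced history, never executed
        balances[tx.sender] -= tx.amount
        executed.append(tx)
    return executed


# Both legs of the atomic set land in the shared block: atomic inclusion holds.
leg_a = Tx("A", "alice", 10)
leg_b = Tx("B", "alice", 10)
included = [leg_a, leg_b]

state_a = {"alice": 10}
state_b = {"alice": 4}  # assumption: leg_b is fallible because this balance is too low

executed_a = filter_and_execute(state_a, [t for t in included if t.rollup == "A"])
executed_b = filter_and_execute(state_b, [t for t in included if t.rollup == "B"])

# Atomic execution would require both legs to execute; here only the leg on A does.
atomic_execution = len(executed_a) == len(executed_b) == 1
```

Both transactions are sequenced, yet only the one on rollup A changes state. This is the failure mode the quote describes for fallible transactions.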
In the current design, sequencers are not aware of the state of the rollups that they service. But we could make individual sequencers within an SSN run execution nodes as well, and not just nodes with inclusion and ordering logic.
To maintain a clean separation of concerns within this system, we need a mechanism that splits block building, which requires execution, from block proposal, which covers inclusion and ordering. Proposer-Builder Separation (PBS), as implemented in MEV-Boost, offers a suitable approach. So, how would this concept work in practice?
A network of builders, responsible for block creation, would generate bundles. These bundles are block templates containing transactions from multiple rollups. Builders submit these bundles to proposers, attaching fees. Utilizing an appropriate auction mechanism, a proposer selects the most profitable bundle, compresses the data, and submits it to the underlying L1. Different rollup nodes can fetch4 their respective transactions and execute them.
Builders would operate execution nodes for all rollups within the system. This allows them to simulate all rollup transactions to verify their validity before including them in a bundle. By running these simulation engines, builders can construct bundles with specific ordering of rollup transactions, ensuring their inclusion at various points within a block.
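As a rough illustration of the simulation step, the sketch below assumes each rollup's state is just a dictionary of balances and that a builder keeps a copy of every rollup's state. RollupTx, apply_tx and build_bundle are hypothetical names, and the "burn"/"mint" actions are a toy stand-in for real rollup transactions.

```python
import copy
from dataclasses import dataclass


@dataclass(frozen=True)
class RollupTx:
    rollup: str
    sender: str
    action: str   # "burn" or "mint" in this toy model
    amount: int


def apply_tx(state, tx):
    """Apply tx to one rollup's balances; raise ValueError if it is invalid."""
    balance = state.get(tx.sender, 0)
    if tx.action == "burn":
        if balance < tx.amount:
            raise ValueError("insufficient balance")
        state[tx.sender] = balance - tx.amount
    elif tx.action == "mint":
        state[tx.sender] = balance + tx.amount
    else:
        raise ValueError("unknown action")


def build_bundle(states, candidate_txs):
    """Simulate candidates in order on scratch copies of every rollup's state.

    Returns the ordered bundle if every tx succeeds, otherwise None.
    """
    scratch = copy.deepcopy(states)  # never touch the builder's canonical view
    for tx in candidate_txs:
        try:
            apply_tx(scratch[tx.rollup], tx)
        except (KeyError, ValueError):
            return None  # one failing tx invalidates the whole cross-rollup set
    return list(candidate_txs)


states = {"A": {"alice": 50}, "B": {}}
transfer = [RollupTx("A", "alice", "burn", 30), RollupTx("B", "alice", "mint", 30)]
overdraft = [RollupTx("A", "alice", "burn", 80), RollupTx("B", "alice", "mint", 80)]

good_bundle = build_bundle(states, transfer)
bad_bundle = build_bundle(states, overdraft)
```

Because the builder simulates before bundling, a set containing a transaction that would fail (the overdraft) is never turned into a bundle in the first place.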
Let's illustrate this with an example involving two rollups, A and B, sharing the same SSN. Imagine Alice, a user on rollup A, intends to transfer tokens from A to B.
In this scenario, Alice initiates the cross-rollup transaction by submitting it to a network of builders. A builder that receives the transaction simulates it on both rollup A and rollup B to ensure its validity. Once the transactions are confirmed to be valid, the builder constructs a bundle, positioning the two rollup transactions at the top of the bundle: one to lock/burn tokens on rollup A and the other to unlock/mint an equivalent amount of token vouchers on rollup B.
Builders submit multiple such bundles to a network of proposers. A proposer is elected every round to choose a bundle and propose it as a block. At any given round, a single proposer selects the bundle that is the most profitable in terms of fees and proceeds to submit the block to the underlying L1, accompanied by a proof of sequencing/commitment to the block.5
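The proposer's side of a round might look something like the sketch below. The article does not specify how proposers are elected or how ties are broken, so the round-robin election and the first-highest-fee rule here are assumptions made purely for illustration, as are the names Bundle, elect_proposer and select_bundle.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bundle:
    builder: str
    fee: int                 # fee the builder attaches for the proposer
    txs: tuple[str, ...]     # opaque transaction payloads, already ordered


def elect_proposer(proposers, round_number):
    """Assumed election rule: simple round-robin over a fixed proposer set."""
    return proposers[round_number % len(proposers)]


def select_bundle(bundles):
    """Pick the most profitable bundle; on ties, the first one submitted wins."""
    if not bundles:
        return None
    return max(bundles, key=lambda b: b.fee)


proposers = ["p0", "p1", "p2"]
submitted = [
    Bundle("builder_x", fee=5, txs=("A:burn", "B:mint")),
    Bundle("builder_y", fee=9, txs=("A:swap",)),
    Bundle("builder_z", fee=9, txs=("B:stake",)),
]

proposer = elect_proposer(proposers, round_number=4)
winner = select_bundle(submitted)
```

A production SSN would presumably replace the round-robin rule with whatever leader election its consensus algorithm provides.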
In summary, the transaction flow looks as follows:
User submits a transaction to a network of builders
Builders simulate the transactions to ensure validity
Builders create bundles and submit them to the proposer with a fee
Proposer chooses the most profitable bundle
Proposer submits said bundle to the underlying L1
Rollup full nodes download their respective data to execute transactions and update their state
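The final step of this flow, in which each rollup node picks out only its own transactions from a shared block, can be sketched as follows. The block layout (a list of rollup-id/payload pairs) is an assumption for illustration; the actual encoding would depend on the SSN and the DA layer.

```python
# A proposed block: (rollup_id, payload) pairs in the order fixed by the proposer.
Block = list[tuple[str, str]]


def txs_for_rollup(block: Block, rollup_id: str) -> list[str]:
    """Return this rollup's payloads, preserving the proposer's ordering."""
    return [payload for rid, payload in block if rid == rollup_id]


block: Block = [
    ("A", "burn 30 X from alice"),
    ("B", "mint 30 X to alice"),
    ("A", "transfer 5 X bob->carol"),
]

rollup_a_txs = txs_for_rollup(block, "A")
rollup_b_txs = txs_for_rollup(block, "B")
```

Each node then executes only its own list, while the relative order of transactions across rollups stays fixed by the shared block.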
The above example is a simple cross-chain interaction, but it can be extended to relatively more complex transactions such as a cross-chain swap.
However, even more sophisticated actions such as transfer+action become complicated with this design. For example, a user sends token X from rollup A to B and, if that transaction is successful, sends another message to swap/LP/stake X on B.
User actions like these can potentially be executed by arranging transactions appropriately within a bundle. In the transfer+action example mentioned above, there are essentially four actions involved:
1. Burn/lock token X on A
2. Mint/unlock token X on B
3. Register a callback on A to confirm the success of steps 1 and 2
4. Execute a transaction for swapping/providing liquidity/staking token X on B
In theory, a builder could construct a bundle containing all four of these transactions, arranged in sequence, to enable more complex cross-chain interactions. However, this is an extremely naive solution and has significant shortcomings. One issue is synchronizing execution across rollups A and B in order to achieve the desired result. Additionally, there would need to be a relayer entity that can pass callbacks between A and B. Otherwise, rollup A has no way of knowing when step 2 (from the example above) was executed, so it cannot trigger the callback and subsequently send a message to execute step 4 on B.
There are multiple aspects here that are unclear to me and require further research for a more comprehensive understanding.
Why do we need PBS in this design?
Without implementing PBS, the sequencer is effectively both the builder and proposer. Within an SSN, a single sequencer entity would then have to run nodes for both ordering as well as execution which introduces the following undesirable properties:
Censorship: A sequencer can arbitrarily choose to censor users. It's important to note that a sequencer wouldn't have the ability to permanently censor a user, as users could opt to use the escape hatch in such scenarios and submit transactions directly to the L1. However, this approach would result in higher costs and latency for users.
Inefficient pricing: Establishing a builder market allows users to have their transactions included at the market-clearing price. If builder A decides to overcharge, builder B can offer the same service at a lower price, leading to a market equilibrium where supply meets demand. In contrast, when a sequencer has the power to both build and propose blocks, they can charge monopoly prices each time they are elected to propose a block.
MEV: Without PBS, sequencers could maximize MEV revenue at the expense of users. Toxic MEV extraction can result in unfavourable pricing for users, leading to poor UX. But with a network of builders highly optimized for MEV operations competing in an auction, each builder's best strategy is to bid nearly all, or even all, of the MEV revenue it expects from a block in order to win. For example, if builder A stands to generate $100 of MEV revenue from a block, then A’s best response is to bid as close to $100 as possible (and in some cases even the entire $100).
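The bidding argument can be illustrated with a small toy model. It assumes an ascending auction in which builders take turns outbidding each other by a fixed increment for as long as the bid does not exceed the MEV they expect to earn. The article does not prescribe an auction format, so this format and the function name competitive_bid are assumptions.

```python
def competitive_bid(mev_values, increment=1):
    """Toy ascending auction over a dict of builder -> expected MEV.

    A builder raises the bid only if it is not already leading and the new
    bid would not exceed its own expected MEV revenue.
    Returns (winning_builder, winning_bid).
    """
    bid = 0
    leader = None
    progressed = True
    while progressed:
        progressed = False
        for builder, value in mev_values.items():
            if builder != leader and bid + increment <= value:
                bid += increment
                leader = builder
                progressed = True
    return leader, bid


# Two builders who both see the same $100 opportunity bid it all away.
equal_competition = competitive_bid({"A": 100, "B": 100})

# A weaker competitor still forces A to bid just above B's valuation.
unequal_competition = competitive_bid({"A": 100, "B": 60})

# With no competition, the lone builder pays only the minimum increment.
monopoly = competitive_bid({"A": 100})
```

In this model, the winning bid approaches the full MEV value only when builders with similar opportunities compete, which is the intuition behind both the pricing and the MEV points above.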
Conclusion
Building infrastructure for secure and permissionless interoperability in a world filled with rollups is far from trivial. Each rollup framework is different in its own way and poses challenges for composability between heterogeneous rollups.
The development of SSNs is in its early stages, and the design space here feels grossly unexplored. SSNs offer a new shared layer on top of the L1 nodes for rollups to interface with and interact with the L1.
The goal of SSNs is to be as close to the L1 as possible in terms of decentralization, while at the same time providing a high-performance network for several rollups. While SSNs focus on inclusion and ordering of transactions, we can safely extend their responsibilities to incorporate execution. And in order to have a clean separation of concerns, we can introduce PBS into this layer.
By doing so, interoperability, albeit limited, could potentially be a service that SSNs can provide for their rollups. In the future, the services offered by SSNs might expand to more than just cross-rollup composability, acting as a means of differentiation at the shared sequencer part of the stack.
All rollups today are effectively Web2 applications given that they are operated by a single sequencer. In the current context, a sequencer is essentially a rollup full node hosted on platforms like AWS or other cloud service providers.
For example, Astria uses Tendermint as the consensus algorithm. Espresso uses HotShot.
Either from the underlying DA layer or by subscribing to a data stream from the proposer.
Block commitment is the root hash of all txs within a block submitted to the L1. It is necessary for constructing fault or validity proofs. For example, in order to prove that a certain tx was invalid, a rollup needs to first prove that it executed all transactions within a block that belong to that particular rollup (as one block might include txs pertaining to multiple rollups). Using Merkle proofs, a rollup can prove to the L1 bridge contract that all txs pertaining to it were indeed executed.
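As a minimal sketch of such a commitment, the code below builds a plain binary Merkle tree over a block's transactions with SHA-256 and verifies an inclusion proof for a single transaction. The tree layout (duplicating the last node on odd levels, no domain separation) is a simplifying assumption and not a scheme used by any particular SSN. Proving that all of a rollup's transactions are accounted for would likely need a richer structure, such as a tree keyed by rollup identifier, which this sketch does not attempt.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def _next_level(level):
    """Hash adjacent pairs; an odd level is padded by repeating its last node."""
    if len(level) % 2:
        level = level + [level[-1]]
    return [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(txs):
    """Block commitment: root hash over all transactions in the block."""
    if not txs:
        raise ValueError("empty block")
    level = [h(tx) for tx in txs]
    while len(level) > 1:
        level = _next_level(level)
    return level[0]


def merkle_proof(txs, index):
    """Sibling hashes from leaf to root, each tagged with whether it sits on the left."""
    level = [h(tx) for tx in txs]
    proof = []
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]
        sibling = index ^ 1  # the other node in the same pair
        proof.append((level[sibling], sibling < index))
        level = _next_level(level)
        index //= 2
    return proof


def verify_inclusion(tx, proof, root):
    """Recompute the path from the leaf upward and compare against the commitment."""
    node = h(tx)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


block_txs = [b"A:burn 30 X", b"B:mint 30 X", b"A:transfer 5 X"]
commitment = merkle_root(block_txs)
proof_for_last = merkle_proof(block_txs, 2)
```

A verifier holding only the commitment (for example, an L1 bridge contract) can check a transaction's inclusion from the proof alone, without seeing the rest of the block.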