Building on the idea of breaking through the bandwidth bottleneck discussed in Post 1, it’s clear that we cannot expect every node to download all transactions. Recall that a block consists of a header and a body (i.e., the content of the block). The block header contains a commitment to the block body (in the form of a Merkle root or a vector commitment). Our key design principle is to require each node to download only the block headers and a subset of the block bodies, rather than the entire content of every block.
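As a rough illustration (not the exact data structures of our protocol), the sketch below shows a shard block whose header carries only a Merkle root of its body; a farmer that stores the header can later verify any body it chooses to download against that commitment. The `BlockHeader`/`Block` shapes and field names here are hypothetical.

```python
import hashlib
from dataclasses import dataclass

def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(txs: list[bytes]) -> bytes:
    """Binary Merkle root over transaction hashes; the last node is duplicated on odd levels."""
    if not txs:
        return sha256(b"")
    level = [sha256(tx) for tx in txs]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

@dataclass(frozen=True)
class BlockHeader:
    shard_id: int
    height: int
    body_root: bytes          # commitment to the block body

@dataclass
class Block:
    header: BlockHeader
    body: list[bytes]         # raw transactions

def make_block(shard_id: int, height: int, txs: list[bytes]) -> Block:
    return Block(BlockHeader(shard_id, height, merkle_root(txs)), txs)

def verify_body(header: BlockHeader, body: list[bytes]) -> bool:
    """A farmer that keeps only this header can still check a body it fetches later."""
    return merkle_root(body) == header.body_root
```

Because every node keeps every header, downloading only a subset of bodies never costs the ability to verify those bodies against their commitments later.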
To make this concrete, let’s revisit the numerical example from Post 1. Previously, with an average bandwidth of 10 Mbps and an average transaction size of 250 bytes, every farmer had to download every transaction, limiting the system to fewer than 5,000 transactions per second (i.e., TPS < 5,000). If we instead divide the farmers evenly into 100 data shards, each farmer downloads only the 1% of transactions assigned to its shard, so its effective download cost per system-wide transaction drops from 250 bytes to 2.5 bytes. With the same 10 Mbps bandwidth, this approach can theoretically scale the TPS by a factor of 100. (We ignore the overhead of block headers for simplicity.)
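The back-of-the-envelope arithmetic behind these numbers, using the figures from Post 1 (10 Mbps per farmer, 250-byte transactions, 100 shards) and ignoring header overhead:

```python
BANDWIDTH_BPS = 10_000_000   # 10 Mbps per farmer
TX_SIZE_BYTES = 250          # average transaction size
NUM_SHARDS = 100             # data shards

# Without sharding: every farmer downloads every transaction.
tps_unsharded = BANDWIDTH_BPS / 8 / TX_SIZE_BYTES              # 5,000 TPS

# With sharding: each farmer downloads only its shard's 1% of transactions,
# so the effective bandwidth cost per system-wide transaction is 2.5 bytes.
effective_bytes_per_tx = TX_SIZE_BYTES / NUM_SHARDS            # 2.5 bytes
tps_sharded = BANDWIDTH_BPS / 8 / effective_bytes_per_tx       # 500,000 TPS

print(tps_unsharded, tps_sharded)   # 5000.0 500000.0
```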
This simple example already conveys the intuition behind overcoming the bandwidth constraint. Now, we would like to highlight some hidden issues in this sharded design and discuss one approach we plan to use to address them.
One such issue arises from the relatively small size of each data shard. Since the number of farmers in a single shard is much smaller than in the entire system, the risk of a malicious majority forming within a shard increases. A compromised shard can manipulate the data it is responsible for, for example by forking, which breaks the globally consistent order that all honest nodes should agree on. To address this, we require every farmer to download all block headers. Specifically, all farmers maintain a so-called beacon chain that stores and orders the block headers of every shard. Once the order of block headers is confirmed on the beacon chain, all honest nodes in the system can derive a consistent order for the block bodies (from different data shards), which in turn allows consistent execution within each data shard.
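A minimal sketch of this idea, assuming a simplified header type and a single append standing in for beacon-chain consensus (both are simplifications for illustration, not our actual implementation):

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Header:
    shard_id: int
    height: int
    body_root: bytes   # commitment to the shard block's body

@dataclass
class BeaconChain:
    """Every farmer stores the full sequence of headers, even for shards whose
    bodies it never downloads."""
    ordered_headers: list[Header] = field(default_factory=list)

    def confirm(self, header: Header) -> None:
        # In the real protocol, a header is appended only once it is confirmed by
        # consensus among all farmers; a plain append is a simplification here.
        self.ordered_headers.append(header)

    def execution_order(self, shard_id: int) -> list[Header]:
        """Once the header order is fixed on the beacon chain, every honest node that
        follows this shard executes its bodies in the same order, so a compromised
        shard cannot fork its history without rewriting the beacon chain itself."""
        return [h for h in self.ordered_headers if h.shard_id == shard_id]
```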
Another significant issue is closely related to data availability. Malicious farmers within a single shard may withhold a block body, making it inaccessible to other honest farmers. One of our proposed solutions is to leverage a variant of the longest-chain protocol. Through a sortition mechanism, each data shard selects one farmer as the leader to propose a new block for its shard chain. An honest shard leader only extends the longest chain whose block bodies are all available, which ensures data availability. In other words, the transactions in the longest chain of each shard remain accessible to the farmers within that shard, effectively mitigating data-withholding attacks.
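A sketch of such a fork-choice rule, under the assumption that each fork is represented as the list of headers from genesis to its tip and that a farmer’s local store is keyed by body commitment (both representations are ours, chosen for illustration):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Header:
    shard_id: int
    height: int
    body_root: bytes   # commitment to the block body

def is_available(header: Header, body_store: dict[bytes, bytes]) -> bool:
    """A block counts toward chain length only if a body matching its commitment
    is locally available."""
    return header.body_root in body_store

def choose_tip(forks: list[list[Header]], body_store: dict[bytes, bytes]) -> list[Header] | None:
    """Longest-chain variant: an honest shard leader extends the longest fork in which
    every block's body is available, so withheld bodies never enter the canonical chain."""
    available = [f for f in forks if all(is_available(h, body_store) for h in f)]
    return max(available, key=len, default=None)
```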
Having built intuition for these design concepts, we will next shift our focus to mathematical proofs of system security. In the following post, we will also explain how the sortition mechanism for data shards integrates smoothly with our PoAS leader election.