Faster way to sync non-archival nodes

With the growing blockchain size, the sync process takes longer and longer, which is a big user experience problem for farmers.

Substrate has fast sync and warp sync, but they are not functional in Subspace for various reasons; still, I think we can do better than we do now.

In Subspace we have sync from DSN, where we take advantage of the archival history collectively persisted by farmers, as well as regular Substrate sync for blocks that are not archived yet (which also sometimes acts as a fallback).

What we could do in DSN sync is download only the last few segments with blocks and, instead of importing them normally (expecting the parent block and state to already exist, as they would on an archival node), download the state of the first imported block from one of the nodes on the network and continue from there.

This way we skip both downloading and importing the majority of the blockchain history, getting farmers up to speed quickly and efficiently.
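To make this concrete, here is a minimal sketch of the flow, assuming hypothetical `DsnClient`/`FullClient` helpers rather than the actual subspace-service API:

```rust
use anyhow::ensure;

/// How many of the most recent archived segments to download.
const LAST_SEGMENTS: usize = 2;

async fn dsn_fast_sync(dsn: &DsnClient, client: &FullClient) -> anyhow::Result<()> {
    // 1. Fetch the most recent segment headers from the DSN.
    let segments = dsn.last_segment_headers(LAST_SEGMENTS).await?;

    // 2. Reconstruct the blocks contained in those segments.
    let blocks = dsn.reconstruct_blocks(&segments).await?;
    let first = blocks.first().expect("archived segments are never empty");

    // 3. Download the state of the first block from a peer and check it
    //    against the state root committed in the block header.
    let state = dsn.download_state(first.hash()).await?;
    ensure!(state.root() == first.header().state_root(), "state root mismatch");

    // 4. Import the first block together with its state; the remaining
    //    blocks import normally, since their parent and state now exist.
    client.import_block_with_state(first.clone(), state).await?;
    for block in &blocks[1..] {
        client.import_block(block.clone()).await?;
    }
    Ok(())
}
```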

Archival nodes will still need to go through the same process as they do now, though. We could also extend this later with a Substrate-like warp sync that downloads and imports older blocks in the background, but that is a much lower priority, relatively speaking.


The question here is about the security implications of such an implementation and whether they are acceptable. Implementation-wise, I think it is actually not that hard to do if this is considered secure enough.

Short answer: Yes, it works, but we should pay attention to several details.

Long answer: From the security perspective, we need to ensure the following:

  • If the archived history is unique, we are all good. Otherwise, a new node needs to download block headers to determine which archived history is compatible with the longest/heaviest chain.

  • After downloading “the state of the first imported block”, a new node should check the corresponding state root.

More discussion: Consider an ideal case where each full node maintains a chain of block headers, with each header containing a state root. Then we can define a secure node-sync problem. One solution is the following. First, a new node contacts several existing full nodes to download block headers. As long as one of them is honest, the new node can obtain a longest/heaviest chain of headers in the local view of that honest full node. Second, the new node downloads the state of a block buried deep enough (with respect to the longest/heaviest chain) and then checks the state against the state root. This ensures the integrity of the state. It is easy to see that this solution is as good as the standard solution where a new node downloads the entire chain of blocks. To sum up, we are all good as long as our new implementation “simulates” the above solution.
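A minimal sketch of that “simulated” solution, with illustrative `Peer`/chain types (not an existing API):

```rust
/// Ask several peers for their best header chains, keep the heaviest
/// verifiable one, then check a deep block's downloaded state against
/// the state root in its header.
fn secure_sync(peers: &[Peer], depth: u64) -> Result<VerifiedState, SyncError> {
    // Step 1: as long as one contacted peer is honest, the heaviest
    // well-linked chain is at least as good as that honest peer's view.
    let best_chain = peers
        .iter()
        .filter_map(|peer| peer.download_header_chain().ok())
        .filter(|chain| chain.verify_ancestry()) // each header links to its parent
        .max_by_key(|chain| chain.total_weight())
        .ok_or(SyncError::NoUsableChain)?;

    // Step 2: pick a block buried `depth` blocks below the tip, download
    // its state from any peer, and verify it against the header's state
    // root; the root check guarantees integrity regardless of the source.
    let target = best_chain.header_at(best_chain.tip_number() - depth)?;
    let state = peers[0].download_state(target.hash())?;
    if state.root() != target.state_root() {
        return Err(SyncError::StateRootMismatch);
    }
    Ok(VerifiedState { header: target, state })
}
```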

@Chen_Feng_2023 is this something where MMR can come in handy as well?

Unique and longest are orthogonal properties from my point of view. Check the spec on how we do DSN sync; we don’t actually look at the longest chain there.

Naturally, wouldn’t be done any other way.

As mentioned above, we are not downloading block headers in this case.

I don’t see how it does unless we verify all the block headers, and the current implementation is not able to do that without access to the runtime/state.

Well, that is not what I suggested, though; what you’re suggesting doesn’t save resources quite the way my suggestion does.

Related, may be of interest: Mina protocol (https://www.kraken.com/learn/what-is-mina-protocol)

They are based on very different cryptographic primitives (zk-SNARK-based recursive proofs), but worth checking out for the ideas themselves.

To @nazar-pc: We discussed this in detail during our R&D meeting. Please go ahead with the implementation, and we will then tell you what additional checks we need to do. (These additional checks can be made orthogonal to your implementation.)

Unique and longest are orthogonal properties from my point of view. Check the spec on how we do DSN sync; we don’t actually look at the longest chain there.

Yes, they are orthogonal. We just check the uniqueness for our purpose and we don’t look at the longest chain.

I don’t see how it does unless we verify all the block headers, and the current implementation is not able to do that without access to the runtime/state.

Verifying block headers is not the only way. Another way is described by Dariia here.


I don’t have time to work on this right now, just wanted to initiate conversation to collect feedback. Hopefully this will be one of the nice upgrades to Gemini 3h.

Recently, we found a conflict between consensus chain fast sync and fraud proof & XDM verification.

Suppose there are 100 blocks per segment and the best block now is #321. A newly joined consensus node will download the state at block #300 and then sync & execute blocks #301..#321. While this speeds up the sync process by bypassing the blocks before #300, it breaks some dependencies of fraud proof & XDM verification.

For fraud proofs, depending on the type of fraud proof, some historical states of the consensus chain are required during verification (e.g. block randomness, transaction byte fee, domain runtime code, etc.). Suppose there is a bad ER derived from consensus block #250 and a fraud proof is submitted at block #321: verification will fail for consensus nodes that use fast sync, because they bypassed the execution of blocks #1..#300 and the historical state at #250 is unavailable.

For XDM, an MMR proof is used to ensure the corresponding domain block of the src domain is confirmed (i.e. the funds are burnt on the src domain). Both generating and verifying this MMR proof require the MMR offchain storage, but the MMR data is added to the offchain storage in the on_initialize hook of pallet-mmr. Consensus nodes that use fast sync bypass the execution of blocks #1..#300, so this MMR data is never added to their offchain storage, and they will fail to relay (which requires generating an MMR proof) and to verify XDM related to consensus blocks #1..#300.

Potential solutions discussed with @dariolina and @shamil:

For fraud proof verification:

  • Define the challenge period in terms of consensus blocks (there is a potential unsolved attack here). When the challenge period is defined in terms of domain blocks, there is no limit on how many historical states may be required for a potential fraud proof, because the domain can stop progressing while the consensus chain keeps growing. When it is defined in terms of consensus blocks, fast sync can start the normal sync sooner and execute more blocks (e.g. from #200 in the above example) to ensure the historical state is available.

    • Another issue just found: there is no time limit for submitting an ER. Suppose there are bundles in block #100, but the corresponding ER is submitted at block #320 and a fraud proof at #321 that requires the state at #100, which is still unavailable.
  • Store duplicated state in the consensus chain. If there are bundles submitted at #n, there will be an ER for #n submitted later and a potential fraud proof, so store one extra copy of the state at #n in the runtime (e.g. block randomness, transaction byte fee, domain runtime code, etc.) for potential later use, and prune it once the ER is confirmed.

    • Obviously too expensive
  • Implement stateless fraud proofs: require the fraud proof to contain a storage proof for any state used during verification, so we don’t need to query the client via host functions. The storage proof can be verified against the state root of the consensus block header, which in turn requires fast sync to download the headers (see the sketch after this list).

    • Increases the size of the fraud proof.
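For the stateless option, the storage-proof check against a header’s state root could look roughly like the sketch below. It leans on sp_state_machine::read_proof_check as I understand it, so treat the exact signatures as approximate rather than authoritative:

```rust
use sp_core::H256;
use sp_runtime::traits::BlakeTwo256;
use sp_state_machine::read_proof_check;
use sp_trie::StorageProof;

/// Verify that `proof` commits to the given storage keys under `state_root`
/// (taken from a consensus block header) and return the proven values.
/// No client-side state access (host functions) is needed.
fn verify_fraud_proof_storage(
    state_root: H256,
    proof: StorageProof,
    keys: &[Vec<u8>],
) -> Result<Vec<Option<Vec<u8>>>, String> {
    let mut values = read_proof_check::<BlakeTwo256, _>(state_root, proof, keys)
        .map_err(|e| format!("invalid storage proof: {e}"))?;
    // Values come back keyed by storage key; missing keys are proven absent.
    Ok(keys.iter().map(|key| values.remove(key).flatten()).collect())
}
```

The trade-off noted above is visible here: every state item the verifier touches must ship inside the proof, which is what grows the fraud proof size.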

For XDM: download both the state and the MMR offchain storage at block #n during fast sync.
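A sketch of what that extra download could look like; `Peer`, `OffchainDb`, and the key helper are hypothetical, and the real node-key layout should be taken from pallet_mmr rather than from this example:

```rust
/// Fetch the MMR off-chain nodes accumulated up to the fast-sync target
/// block so MMR proofs touching older consensus blocks can still be
/// generated and verified locally.
async fn download_mmr_offchain_storage(
    peer: &Peer,
    offchain_db: &mut OffchainDb,
    leaf_count: u64, // number of MMR leaves at the fast-sync target block
) -> anyhow::Result<()> {
    // An MMR with `n` leaves has `2 * n - popcount(n)` nodes in total.
    let node_count = 2 * leaf_count - u64::from(leaf_count.count_ones());
    for index in 0..node_count {
        let node = peer.mmr_node(index).await?;
        // `mmr_node_offchain_key` stands in for pallet-mmr's key scheme.
        offchain_db.set(&mmr_node_offchain_key(index), &node.encode());
    }
    Ok(())
}
```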

I’m surprised this is a problem at all.

The proposed fast sync by definition starts with something that is already archived, meaning it is at least 100 blocks deep according to the archiving depth, in practice likely deeper than that. So we do have (or can have, if necessary) the state for some recent blocks.

And MMR was supposed to be a solution to this exact problem:

  • store all MMR peaks in the runtime state
  • generate a stateless proof, potentially with some off-chain data; as a result, the MMR proof generated and included in the FP can be used to verify any data in a past block or its corresponding state

The only constraint is that the proof must be generated against a corresponding MMR root that is still in the state of a non-pruned block by the time FP verification happens, or else the FP would have to be re-generated. A rough sketch of that flow follows below.
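Roughly, the verification step would only need the MMR root; the leaf layout and helper names below are assumptions on my part (pallet-mmr / sp-mmr-primitives hold the real logic):

```rust
use sp_core::H256;

/// Assumed leaf content: each MMR leaf commits to a consensus block
/// number and its state root.
struct MmrLeaf {
    block_number: u32,
    state_root: H256,
}

/// Prove "state root R belongs to consensus block #n" with nothing but
/// an MMR proof; no access to block #n, its header, or its state.
fn verify_historical_state_root(
    // MMR root read from the state of a recent, non-pruned block.
    mmr_root: H256,
    leaf: &MmrLeaf,
    proof: &MmrProof, // hypothetical proof type
) -> bool {
    // On success, `leaf.state_root` can anchor storage-proof checks,
    // as in the stateless fraud proof sketch earlier in the thread.
    mmr_verify(mmr_root, leaf, proof)
}
```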

Which of the mentioned assumptions does not hold, and why?

IIUC fraud proofs weren’t updated to make use of the MMR.
And currently there is no implemented way to download the MMR itself.

The proposed fast sync by definition starts with something that is already archived, meaning it is at least 100 blocks deep according to the archiving depth, in practice likely deeper than that. So we do have (or can have, if necessary) the state for some recent blocks.

There are 2 issues:

  • The challenge period is currently defined in terms of domain blocks. Because the domain can stop progressing while the consensus chain keeps growing, there is no limit on how many consensus historical states may be required for a potential fraud proof.
  • There is no time limit for ER submission. Suppose there are bundles included in block #1, then the domain stops and the corresponding ER is only submitted at block #100000, with a fraud proof submitted at #100001; to verify the fraud proof we need to query the state at #1.

And MMR was supposed to be a solution to this exact problem

Do you mean the “stateless fraud proof” approach but getting the state root from MMR proof instead of the block header?

Both of these are non-issues with a stateless MMR proof, if I understand correctly.

Yes, with an MMR proof you can prove which state root corresponds to which block without having access to the block or even its header (we will want to prove everything for blocks below the last few segment headers to achieve “Make node use bounded amount of space on disk”, subspace/subspace#2114 on GitHub, so any reliance on state or even block headers would prevent us from getting there).

I agree with Nazar. (1) Our fraud proofs are designed so that only information from block headers is required. (2) The use of MMR allows us to download only the recent headers (instead of all the headers). It seems that (1) + (2) solves the problem.