Cluster piece cache sync timeout

Issue Report

When running piece cache sync on a cluster, the sync pauses indefinitely after a while.

Environment

AMD 7950X3D
Windows 11
subspace-farmer-windows-x86_64-skylake-direct-io-linux-macos-backport
All cluster components and the node run on one machine

The issue is reproducible on every attempt.
Link to the log files:

(I hope it was actually reproduced, since I couldn’t understand what was going on with all that output. After a while I noticed that network usage had dropped significantly, so I presumed the process had paused.)

I was too slow, can you share the file again, please? Something like pastebin.com or gist.github.com might work better.

Well, I used WeTransfer again. If we miss it again, I’ll use the others you suggested.

I think the problem here is that the connection to the cache dropped somehow (at least on a logical level, like the timeouts seen during plotting). However, piece cache sync is unaware of this and keeps going, hitting a bunch of errors along the way, which in turn likely prevents the cache from being re-discovered and reinitialized.

I noticed this a while ago and have just created an issue describing the problem. Please subscribe there to follow future updates: Allow to respond to piece cache changes during piece cache sync in cluster setup · Issue #3116 · autonomys/subspace · GitHub

I want to know if a cluster cache can be copied to another cluster?

It can be copied without issue. Just ensure that the cache component is stopped before copying the piece_cache.bin.
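
For reference, here is a minimal shell sketch of that copy step. It assumes the cache component is already stopped and that each cache directory holds a piece_cache.bin file, as mentioned above. The helper name copy_piece_cache and the example paths are hypothetical and not part of subspace-farmer.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy piece_cache.bin from one cache directory to
# another and verify the copy. Stop the cache component before calling it.
copy_piece_cache() {
    local src_dir="$1" dst_dir="$2"
    local file="piece_cache.bin"

    # Refuse to continue if there is nothing to copy.
    if [ ! -f "$src_dir/$file" ]; then
        echo "no $file found in $src_dir" >&2
        return 1
    fi

    mkdir -p "$dst_dir" || return 1
    cp "$src_dir/$file" "$dst_dir/$file" || return 1

    # Byte-for-byte comparison so a truncated or corrupted copy is caught.
    cmp -s "$src_dir/$file" "$dst_dir/$file"
}

# Example with made-up paths:
# copy_piece_cache /cache01 /mnt/other-cluster/cache01
```

The comparison reads both files in full, so on a cache of a few hundred gigabytes it takes about as long as the copy itself. Drop it if you trust the transfer.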

subspace-farmer cluster --nats-server nats://127.0.0.1:4222 \
    cache \
    path=/cache01,size=200G \
    path=/cache02,size=200G

I would like to ask: when my cache component is configured with two paths like this, do both contain the same cache data, or is the cache split evenly between them?

Each cache group holds its own copy of the cache data, independent of the others. As long as each cache group is equal to or larger than the “Archived History Size” of the blockchain, your example will give you two complete caches once fully synced.
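
The size condition can be written as a quick check. This is only a sketch: cache_holds_full_history is a hypothetical helper, sizes are whole gigabytes, and the history size used in the example is a made-up figure that you would replace with the current “Archived History Size”.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when a cache of cache_gb gigabytes is large
# enough to hold the full archived history of history_gb gigabytes.
cache_holds_full_history() {
    local cache_gb="$1" history_gb="$2"
    [ "$cache_gb" -ge "$history_gb" ]
}

# Example: two 200G caches checked against an assumed 150G history.
# for size in 200 200; do
#     cache_holds_full_history "$size" 150 && echo "complete cache possible"
# done
```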

If you sync one cache group first, you can stop the cache component and copy the fully synced cache file to the second group. Because both hold the same data, this roughly halves the total sync time.