Understanding the Piece Cache in the Autonomys Network
One topic that comes up frequently in our growing farmer community is the behavior of the piece cache in the Autonomys Network, especially its effect on initial plotting performance. Let’s dive into how the piece cache works and how it affects the plotting process.
NOTE: This subject was covered on a recent Farmer Office Hours if you’d prefer to watch a video rather than read this guide: Autonomys Network - Farmer Office Hours - 25th September, 2024
What is the Piece Cache?
The Autonomys Network implements the Subspace Protocol, which uses a consensus mechanism called Proof-of-Archival-Storage (PoAS). Unlike traditional storage networks, PoAS uses the space that farmers pledge to store an archival history of the blockchain. You can think of it as a highly replicated, cryptographically encoded copy of the blockchain’s history, where each farmer stores a unique version. This unique copy is what makes the network resilient and verifiable.
The piece cache is a critical component in this process. It serves as temporary storage for raw blockchain history data (pieces) before those pieces are encoded into the farmer’s plots. Think of it as “hot” storage that accelerates both piece retrieval for the Distributed Storage Network (DSN) and the plotting process itself.
How Does the Piece Cache Work?
Here’s a step-by-step breakdown of what happens during the plotting process:
- Initialization: When you first start plotting, or when a plot size is adjusted, the piece cache is populated. On every software restart, the cache is checked and updated based on the pieces it should hold.
- Plotting: As the plotting process runs, it requests pieces from the cache to encode them into plots. If the cache holds the piece, the process continues quickly. If not, this triggers a “cache miss.”
- Cache Miss Handling: When there is a cache miss, the farmer’s node must reach out to the DSN to find and download the required piece from another farmer. This step is significantly slower because it involves searching the network and then transferring data at a rate limited by your internet connection.
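The steps above can be sketched in a few lines of Python. This is a toy model for illustration only, not the farmer's actual implementation: `PieceCache`, `fetch_from_dsn`, and `get_piece_for_plotting` are hypothetical names, and the DSN lookup is replaced by a placeholder that simply fabricates bytes.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class PieceCache:
    """Toy in-memory stand-in for the piece cache (hypothetical)."""
    pieces: Dict[int, bytes] = field(default_factory=dict)
    hits: int = 0
    misses: int = 0

    def get(self, piece_index: int) -> Optional[bytes]:
        # Fast path: a local lookup, no network involved.
        return self.pieces.get(piece_index)


def fetch_from_dsn(piece_index: int) -> bytes:
    """Placeholder for the slow path (assumption: the real farmer searches
    the DSN and downloads the piece from another farmer)."""
    return f"piece-{piece_index}".encode()


def get_piece_for_plotting(cache: PieceCache, piece_index: int) -> bytes:
    """Return a piece for encoding, counting cache hits and misses."""
    piece = cache.get(piece_index)
    if piece is not None:
        cache.hits += 1
        return piece
    # Cache miss: fall back to the much slower network retrieval.
    cache.misses += 1
    return fetch_from_dsn(piece_index)


# Usage: the cache holds pieces 0 and 1, plotting asks for 0, 1 and 2.
cache = PieceCache(pieces={0: b"piece-0", 1: b"piece-1"})
for index in (0, 1, 2):
    get_piece_for_plotting(cache, index)
print(cache.hits, cache.misses)  # prints: 2 1
```

The point of the sketch is the asymmetry between the two branches: a hit is a local lookup, while every miss pays for a network search and a download.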
Why Does This Matter?
Farmers have noticed that initial plotting speed varies widely, even when using similar hardware. A common reason for this is the size and configuration of the piece cache. By default, Space Acres allocates 1% of the total plot size to the piece cache. If the cache is not large enough to hold all the pieces of the blockchain history, cache misses will occur, resulting in slower plotting as the system needs to fetch missing pieces from the DSN.
Let’s take the current Gemini 3h test network as an example:
- The archived history size is about 107 GiB.
- To store all of this history in the piece cache up-front and enjoy uninterrupted plotting, you would need a plot size of around 10,700 GiB, or roughly 10.4 TiB (since 1% of that plot size would match the history size).
- Most farmers operate with much smaller plot sizes, leading to cache misses.
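The arithmetic behind this example is easy to reproduce. The sketch below uses the 107 GiB history size and 1% default quoted above; the function names and the 2 TiB example plot are assumptions chosen for illustration.

```python
GIB_PER_TIB = 1024


def plot_size_for_full_cache_gib(history_gib: float,
                                 cache_percent: float = 1.0) -> float:
    """Plot size (GiB) at which cache_percent of the plot equals the history."""
    if not 0 < cache_percent <= 100:
        raise ValueError("cache_percent must be in (0, 100]")
    return history_gib * 100 / cache_percent


def cache_coverage(plot_gib: float, history_gib: float,
                   cache_percent: float = 1.0) -> float:
    """Fraction of the archived history the default-sized cache can hold."""
    cache_gib = plot_gib * cache_percent / 100
    return min(1.0, cache_gib / history_gib)


needed_gib = plot_size_for_full_cache_gib(107)
print(needed_gib)                # 10700.0 GiB
print(needed_gib / GIB_PER_TIB)  # a little under 10.5 TiB

# A hypothetical 2 TiB (2048 GiB) plot covers only about 19% of the history.
print(round(cache_coverage(2048, 107), 3))
```

Note that 10,700 GiB is about 10.4 TiB when converted at 1024 GiB per TiB, or 10.7 if a TiB is read loosely as 1000 GiB. Either way, the conclusion is the same: most plots are far too small for a 1% cache to hold the full history.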
Cache Configuration Options
The piece cache size and its management differ based on the configuration:
- Space Acres: Defaults to using 1% of the specified plot size for the piece cache.
- Advanced CLI: Allows setting a specific piece cache size, balancing initial plotting speed against space allocated to farming.
- Farming Cluster: Separates the cache into its own component, allowing 100% of plot space for plot data.
Configuring the piece cache to match the size of the archived history can prevent cache misses and speed up initial plotting. However, this comes with a trade-off: larger cache sizes mean less space for farming, unless using an external cache component.
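To make the trade-off concrete, here is a minimal sketch of how one disk allocation is split between plot data and cache under each approach. `split_allocation` is a hypothetical helper and the 4 TiB allocation is an assumed example; only the 1% default and the 107 GiB history size come from this guide.

```python
from typing import Tuple


def split_allocation(total_gib: float, cache_gib: float,
                     external_cache: bool = False) -> Tuple[float, float]:
    """Return (plot_gib, local_cache_gib) for one farm allocation.

    With an external cache component (as in a farming cluster), the cache
    lives elsewhere and the whole allocation is available for plot data.
    """
    if external_cache:
        return float(total_gib), 0.0
    if not 0 <= cache_gib <= total_gib:
        raise ValueError("cache_gib must be between 0 and total_gib")
    return total_gib - cache_gib, float(cache_gib)


total = 4096  # assumed 4 TiB allocation, in GiB

default_split = split_allocation(total, total * 0.01)       # 1% default
full_history_split = split_allocation(total, 107)            # cache sized to history
cluster_split = split_allocation(total, 107, external_cache=True)

for name, (plot, cache) in [("default 1%", default_split),
                            ("full history", full_history_split),
                            ("external cache", cluster_split)]:
    print(f"{name}: plot={plot:.2f} GiB, local cache={cache:.2f} GiB")
```

In this example, sizing the cache to the full history costs about 66 GiB of farming space compared with the 1% default, while an external cache costs none of the plot allocation.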
Why Do I Still See Cache Misses?
As mentioned, cache misses occur when the piece cache cannot hold all the necessary pieces. This can happen even with larger caches, for two reasons:
- Dynamic Blockchain History: The piece cache is updated whenever new blocks are archived on the network. If the archived history grows beyond the configured cache size, cache misses will start to occur again.
- Supplemental Caching: During initial plotting, if a piece is missing from the cache, it is fetched from the network and stored in the unused plot space temporarily. However, as the plot fills up, this supplemental storage is overwritten, which means cache misses can resurface.
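Both effects can be illustrated with a deliberately simplified capacity model. It assumes that unplotted space can hold fetched pieces one-for-one and that the result is only an upper bound on how much history can be held locally, not a prediction of real hit rates. The function name and the example sizes (a 20 GiB cache on a 2 TiB plot) are hypothetical.

```python
def max_local_coverage(cache_gib: float, plot_gib: float,
                       plotted_fraction: float, history_gib: float) -> float:
    """Upper bound on the share of archived history that fits locally.

    Local capacity = dedicated piece cache + plot space not yet written,
    which (per the description above) can hold fetched pieces temporarily.
    """
    if not 0.0 <= plotted_fraction <= 1.0:
        raise ValueError("plotted_fraction must be between 0 and 1")
    unused_plot_gib = plot_gib * (1.0 - plotted_fraction)
    return min(1.0, (cache_gib + unused_plot_gib) / history_gib)


# Supplemental caching: capacity shrinks as the plot fills up.
for plotted in (0.0, 0.5, 0.96, 1.0):
    bound = max_local_coverage(20, 2048, plotted, history_gib=107)
    print(f"plotted {plotted:.0%}: at most {bound:.1%} of history held locally")

# Dynamic history: a cache that once held everything falls behind as
# the archived history grows past its configured size.
for history in (100, 107, 150):
    bound = max_local_coverage(107, 2048, 1.0, history_gib=history)
    print(f"history {history} GiB: at most {bound:.1%} covered by a 107 GiB cache")
```

Under these assumptions, the bound stays at 100% for most of the plotting run and only drops near the end, when the remaining unused plot space plus the dedicated cache is smaller than the archived history.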
Practical Recommendations
If you want to optimize plotting speed and minimize cache misses, here are a few strategies:
- Increase Cache Size: If you have the space, configure the piece cache to be large enough to hold all archived pieces.
- Use Faster Networking: The real bottleneck during cache misses is often network speed, not cache size. Increasing the number of peer connections can help retrieve pieces more quickly. Space Acres has a “Faster Networking” option for this purpose, but note that it can sometimes stress consumer routers.
- Monitor Plot Progress: Understand that initial plotting without a complete history in the cache will always have a slow start as cache misses are resolved. As your plot becomes more complete, the number of cache misses will reduce, and plotting will speed up.
Final Thoughts
The piece cache plays a crucial role in the Autonomys Network by accelerating both plotting and piece retrieval for the DSN. However, it is a nuanced tool with trade-offs that farmers need to understand. The key takeaway is that performance issues related to cache misses are often more about network retrieval than cache size. Configuring your piece cache correctly, along with optimizing your network settings, can help smooth out these bumps.
We encourage all farmers to continue providing feedback and to participate in testing new features as they become available. Your input is invaluable in refining and improving the Autonomys Network ecosystem.