Cache expired and removed

Issue Report

Cache expired and removed

Environment

  • Operating System: Ubuntu 18.04
  • Pulsar/Advanced CLI/Docker: CLI

Problem

When controller is restarted, the cache (200GB) is expired and reinitializing takes hours.

Why the cache cannot be reused after controller restart? Is it a necessary.
It costs too much time for cluster restarting.

node@node-8012:~/subspace$ ./controller.sh 
2024-06-28T21:36:26.751954Z  INFO subspace_farmer::commands::cluster::controller: Connecting to node RPC url=ws://192.168.0.39:9944
2024-06-28T21:36:26.752007Z  INFO async_nats: event: connected
2024-06-28T21:36:26.753668Z  INFO subspace_farmer::node_client::node_rpc_client: Downloading all segment headers from node...
2024-06-28T21:36:26.767497Z  INFO subspace_farmer::node_client::node_rpc_client: Downloaded all segment headers from node successfully
2024-06-28T21:36:26.771296Z  INFO subspace_networking::constructor: DSN instance configured. allow_non_global_addresses_in_dht=false peer_id=12D3KooWEDNfcijUdgGPFfcz3QF1PY55BvLe4quhotEUTmsGEUMd protocol_version=/subspace/2/0c121c75f4ef450f40619e1fca9d1e8e7fbabc42c895bc4790801e85d5a91c34
2024-06-28T21:36:26.771991Z  INFO libp2p_swarm: local_peer_id=12D3KooWEDNfcijUdgGPFfcz3QF1PY55BvLe4quhotEUTmsGEUMd
2024-06-28T21:36:27.257234Z  INFO subspace_farmer::commands::shared::network: DSN listening on /ip4/127.0.0.1/tcp/30533/p2p/12D3KooWEDNfcijUdgGPFfcz3QF1PY55BvLe4quhotEUTmsGEUMd
2024-06-28T21:36:27.257270Z  INFO subspace_farmer::commands::cluster::controller::caches: New cache discovered, scheduling reinitialization cache_id=01J1AETN9XFH85HN67WT9ZYAEH
2024-06-28T21:36:27.257319Z  INFO subspace_farmer::farmer_cache: Initializing piece cache
2024-06-28T21:36:27.258847Z  INFO subspace_farmer::commands::shared::network: DSN listening on /ip6/::1/tcp/30533/p2p/12D3KooWEDNfcijUdgGPFfcz3QF1PY55BvLe4quhotEUTmsGEUMd
2024-06-28T21:36:27.258859Z  INFO subspace_farmer::commands::shared::network: DSN listening on /ip4/192.168.0.40/tcp/30533/p2p/12D3KooWEDNfcijUdgGPFfcz3QF1PY55BvLe4quhotEUTmsGEUMd
2024-06-28T21:36:27.259180Z  INFO subspace_farmer::commands::shared::network: DSN listening on /ip6/fe80::3374:10ec:e86a:115a/tcp/30533/p2p/12D3KooWEDNfcijUdgGPFfcz3QF1PY55BvLe4quhotEUTmsGEUMd
2024-06-28T21:36:27.259187Z  INFO subspace_farmer::commands::shared::network: DSN listening on /ip4/172.17.0.1/tcp/30533/p2p/12D3KooWEDNfcijUdgGPFfcz3QF1PY55BvLe4quhotEUTmsGEUMd
2024-06-28T21:36:27.259196Z  INFO subspace_farmer::commands::shared::network: DSN listening on /ip6/fe80::42:eaff:fec4:a796/tcp/30533/p2p/12D3KooWEDNfcijUdgGPFfcz3QF1PY55BvLe4quhotEUTmsGEUMd
2024-06-28T21:36:27.259204Z  INFO subspace_farmer::commands::shared::network: DSN listening on /ip6/fe80::868:80ff:fe52:b9bb/tcp/30533/p2p/12D3KooWEDNfcijUdgGPFfcz3QF1PY55BvLe4quhotEUTmsGEUMd
2024-06-28T21:36:46.740873Z  INFO subspace_farmer::commands::cluster::controller::caches: New cache discovered, scheduling reinitialization cache_id=01J1GCF14J2F645EAD1R3TBEM4
2024-06-28T21:36:57.257730Z  WARN subspace_farmer::commands::cluster::controller::caches: Cache expired and removed, scheduling reinitialization cache_id=01J1AETN9XFH85HN67WT9ZYAEH
2024-06-28T21:38:15.103674Z  INFO subspace_farmer::commands::cluster::controller::farms: Discovered new farm, initializing farm_index=0 farm_id=01J064A906XK3GT42E4FA676XF
2024-06-28T21:38:15.103696Z  INFO subspace_farmer::commands::cluster::controller::farms: Discovered new farm, initializing farm_index=1 farm_id=01J064A906WE1RTTQQ9F2N1W4V
2024-06-28T21:38:15.103700Z  INFO subspace_farmer::commands::cluster::controller::farms: Discovered new farm, initializing farm_index=2 farm_id=01J064A903CJZD18FRXTQYV7G5
2024-06-28T21:38:15.103705Z  INFO subspace_farmer::commands::cluster::controller::farms: Discovered new farm, initializing farm_index=3 farm_id=01J064A903WYHBQF5V70XK9TV3
2024-06-28T21:38:15.103714Z  INFO subspace_farmer::commands::cluster::controller::farms: Discovered new farm, initializing farm_index=4 farm_id=01J064A906MATW2K6J975NHP5S
2024-06-28T21:38:15.103956Z  INFO subspace_farmer::commands::cluster::controller::farms: Discovered new farm, initializing farm_index=5 farm_id=01J064A906CPNBJXTP9C7Y51MY
2024-06-28T21:38:15.286732Z  INFO subspace_farmer::commands::cluster::controller::farms: Farm initialized successfully farm_index=0 farm_id=01J064A906XK3GT42E4FA676XF
2024-06-28T21:38:15.449355Z  INFO subspace_farmer::commands::cluster::controller::farms: Farm initialized successfully farm_index=1 farm_id=01J064A906WE1RTTQQ9F2N1W4V
2024-06-28T21:38:15.614816Z  INFO subspace_farmer::commands::cluster::controller::farms: Farm initialized successfully farm_index=2 farm_id=01J064A903CJZD18FRXTQYV7G5
2024-06-28T21:38:15.909422Z  INFO subspace_farmer::commands::cluster::controller::farms: Farm initialized successfully farm_index=3 farm_id=01J064A903WYHBQF5V70XK9TV3
2024-06-28T21:38:16.033658Z  INFO subspace_farmer::commands::cluster::controller::farms: Farm initialized successfully farm_index=4 farm_id=01J064A906MATW2K6J975NHP5S
2024-06-28T21:38:16.318214Z  INFO subspace_farmer::commands::cluster::controller::farms: Farm initialized successfully farm_index=5 farm_id=01J064A906CPNBJXTP9C7Y51MY

What build are you using? There were a few fixes to cache handling in case you have multiple that were merged after jun-18, specifically these might be relevant:

If you run farmer from one of the latest test builds (like Snapshot build · subspace/subspace@f6ed626 · GitHub) or compile from Skip duplicate node client notifications by nazar-pc · Pull Request #2882 · subspace/subspace · GitHub then it should handle multiple caches much better.

Jul-05 release test passed. The 200GiB piece cache initialization took 2 minutes.