Double Replotting With Clusters?

Issue Report

I am running the v2/legacy build with Docker on two servers: Server B and Server C.

Server B is running NATS, Node, Controller, Cache, Farmer, and Plotter, and has 1 disk
Server C is running only a Plotter and has 0 disks

Server B is replotting its disk, and the plotters on Server B and Server C are working on the same sectors. Is this normal? I’ve included logs below:

Environment

  • Operating System: Ubuntu 22.04 on both
  • Pulsar/Advanced CLI/Docker: Docker on both
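
For reference, here is a rough sketch of my compose layout. Image tags, ports, paths, sizes, and the exact cluster subcommand flags are approximate and written from memory for illustration, not copied from my real compose files.

# Server B – rough sketch only, values below are approximate
services:
  nats:
    image: nats
    command: ["-c", "/etc/nats/nats.config"]   # config with a raised max_payload, as the cluster setup guide suggests
    ports:
      - "4222:4222"

  node:
    image: ghcr.io/subspace/node:latest        # placeholder tag
    command: run --chain gemini-3h --base-path /var/subspace   # node flags unchanged from a non-cluster setup
    volumes:
      - node-data:/var/subspace

  controller:
    image: ghcr.io/subspace/farmer:latest      # placeholder tag
    command: cluster --nats-server nats://nats:4222 controller --base-path /controller --node-rpc-url ws://node:9944

  cache:
    image: ghcr.io/subspace/farmer:latest
    command: cluster --nats-server nats://nats:4222 cache path=/cache,size=200GiB

  farmer:
    image: ghcr.io/subspace/farmer:latest
    command: cluster --nats-server nats://nats:4222 farmer --reward-address st... path=/farm,size=4TiB   # the single disk
    volumes:
      - /mnt/farm0:/farm

  plotter:
    image: ghcr.io/subspace/farmer:latest
    command: cluster --nats-server nats://nats:4222 plotter

volumes:
  node-data:

# Server C runs only the plotter service, pointed at Server B's NATS
# (--nats-server nats://<server-b-ip>:4222); no controller, node, cache or farmer there.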

Problem

Farmer on Server B:

2024-06-26T13:10:18.739312Z  INFO {farm_index=0}:{sector_index=102}: subspace_farmer::single_disk_farm::plotting: Replotting sector (0.00% complete)
2024-06-26T13:10:18.741170Z  INFO {farm_index=0}:{sector_index=142}: subspace_farmer::single_disk_farm::plotting: Replotting sector (25.00% complete)
2024-06-26T13:10:18.742216Z  INFO {farm_index=0}:{sector_index=153}: subspace_farmer::single_disk_farm::plotting: Replotting sector (50.00% complete)
2024-06-26T13:10:18.742978Z  INFO {farm_index=0}:{sector_index=95}: subspace_farmer::single_disk_farm::plotting: Replotting sector (75.00% complete)
2024-06-26T13:28:13.065136Z  INFO {farm_index=0}:{sector_index=102}: subspace_farmer::single_disk_farm::plotting: Replotting sector (0.00% complete)
2024-06-26T13:28:13.067005Z  INFO {farm_index=0}:{sector_index=142}: subspace_farmer::single_disk_farm::plotting: Replotting sector (25.00% complete)
2024-06-26T13:28:13.067782Z  INFO {farm_index=0}:{sector_index=153}: subspace_farmer::single_disk_farm::plotting: Replotting sector (50.00% complete)

Plotter on Server B:

2024-06-26T13:09:15.844681Z  INFO async_nats: event: connected
2024-06-26T13:09:17.130429Z  INFO subspace_farmer::commands::cluster::plotter: Multiple L3 cache groups detected l3_cache_groups=2
2024-06-26T13:09:17.130478Z  INFO subspace_farmer::commands::cluster::plotter: Preparing plotting thread pools plotting_thread_pool_core_indices=[CpuCoreSet { cores: CpuSet(0-11,24-35), .. }, CpuCoreSet { cores: CpuSet(12-23,36-47), .. }]
2024-06-26T13:10:18.739723Z  INFO {public_key=a223fb7645488b418d8743b783d890f9132d1b3e17ab178a08b53d7e421a671b sector_index=102}: subspace_farmer::cluster::plotter: Plot sector request
2024-06-26T13:10:18.741512Z  INFO {public_key=a223fb7645488b418d8743b783d890f9132d1b3e17ab178a08b53d7e421a671b sector_index=142}: subspace_farmer::cluster::plotter: Plot sector request
2024-06-26T13:10:18.743392Z  INFO {public_key=a223fb7645488b418d8743b783d890f9132d1b3e17ab178a08b53d7e421a671b sector_index=95}: subspace_farmer::cluster::plotter: Plot sector request
2024-06-26T13:10:18.743442Z  INFO {public_key=a223fb7645488b418d8743b783d890f9132d1b3e17ab178a08b53d7e421a671b sector_index=153}: subspace_farmer::cluster::plotter: Plot sector request

Plotter on Server C:

2024-06-26T13:28:03.376067Z  INFO async_nats: event: connected
2024-06-26T13:28:04.512885Z  INFO subspace_farmer::commands::cluster::plotter: Multiple L3 cache groups detected l3_cache_groups=2
2024-06-26T13:28:04.512948Z  INFO subspace_farmer::commands::cluster::plotter: Preparing plotting thread pools plotting_thread_pool_core_indices=[CpuCoreSet { cores: CpuSet(0-11,24-35), .. }, CpuCoreSet { cores: CpuSet(12-23,36-47), .. }]
2024-06-26T13:28:13.066739Z  INFO {public_key=a223fb7645488b418d8743b783d890f9132d1b3e17ab178a08b53d7e421a671b sector_index=102}: subspace_farmer::cluster::plotter: Plot sector request
2024-06-26T13:28:13.068416Z  INFO {public_key=a223fb7645488b418d8743b783d890f9132d1b3e17ab178a08b53d7e421a671b sector_index=142}: subspace_farmer::cluster::plotter: Plot sector request
2024-06-26T13:28:13.069299Z  INFO {public_key=a223fb7645488b418d8743b783d890f9132d1b3e17ab178a08b53d7e421a671b sector_index=153}: subspace_farmer::cluster::plotter: Plot sector request

Yes, this was indeed possible when you had significant plotting capacity and just a few sectors to replot. It was already addressed in Skip duplicate node client notifications by nazar-pc · Pull Request #2882 · subspace/subspace · GitHub, based on logs from a different user, and the fix should be included in the next release.

That was possible with multiple controllers, or potentially when a controller was restarting. Does that match what might have happened on your end?

It’s different because I only have a single controller. Server C is only running a plotter, no controller/node/NATS/cache/etc…

Do you maybe have full, detailed logs? That would help a lot. If not, I can create a test build with the latest changes for you to try out.

I’ve restarted everything since then, so unfortunately I no longer have the logs.

Got it. Snapshot build · subspace/subspace@f6ed626 · GitHub includes a bunch of recent fixes. If you can run the apps with RUST_LOG=info,subspace_farmer=trace, it should collect a lot of useful details about what was happening, assuming the issue is reproducible.
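
For example, with docker-compose you could set it on each subspace-farmer service; the service and image names here are just placeholders (the Server C plotter is shown):

# Illustration only – the same RUST_LOG value goes on the farmer, controller and cache services too
services:
  plotter:
    image: ghcr.io/subspace/farmer:latest     # replace with whatever image/binary you get from the snapshot build above
    environment:
      - RUST_LOG=info,subspace_farmer=trace
    command: cluster --nats-server nats://<server-b-ip>:4222 plotter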

Note that new farms initialized with this test build are compatible with the future GPU plotter and can’t be used with older versions of the software.