There are too many warning logs in the farmer cluster

List of test devices

172.29.100.11 node
172.29.100.141 nats-server
172.29.100.142 controller, farmer
172.29.100.143 plotter, cache
172.29.100.144 plotter
172.29.100.145 plotter
172.29.100.146 plotter

CPU: E5-2697 v4 x 2
Memory: 64 GB

farmer log

2024-05-27T09:14:52.877419Z  INFO {farm_index=9}: subspace_farmer::single_disk_farm::plotting: Plotting sector retry sector_index=577
2024-05-27T09:14:56.333195Z  INFO {farm_index=8}:{public_key=dc8fb87f5cf0d7d96f1ad3c1294f732e372a8f84e4c8fe63642bdba34ab53e5b sector_index=537}: subspace_farmer::single_disk_farm::plotting: Plotting sector retry sector_index=536
2024-05-27T09:14:57.770142Z  INFO {farm_index=0}: subspace_farmer::single_disk_farm::plotting: Plotting sector (8.21% complete) sector_index=588
2024-05-27T09:15:01.736922Z  INFO {farm_index=8}: subspace_farmer::single_disk_farm::plotting: Plotting sector (7.50% complete) sector_index=537
2024-05-27T09:15:06.336977Z  WARN {farm_index=8}: subspace_farmer::single_disk_farm::plotting: Failed to plot sector, retrying in 1s sector_index=536 error=Low-level plotting error: Timed out without ping from plotter
2024-05-27T09:15:33.369328Z  INFO {farm_index=4}: subspace_farmer::single_disk_farm::plotting: Plotting sector (8.04% complete) sector_index=576
2024-05-27T09:15:43.969050Z  INFO {farm_index=10}: subspace_farmer::single_disk_farm::plotting: Plotting sector retry sector_index=580
2024-05-27T09:16:08.206055Z  INFO {farm_index=6}: subspace_farmer::single_disk_farm::plotting: Plotting sector (8.28% complete) sector_index=593
2024-05-27T09:16:55.398965Z  INFO {farm_index=3}:{public_key=50f9f6e5d33e120592e431c5f2c3c3ef6f870480ef8b64eb5fb82d1e5760d648 sector_index=578}: subspace_farmer::single_disk_farm::plotting: Plotting sector retry sector_index=577
2024-05-27T09:17:03.111753Z  INFO {farm_index=9}: subspace_farmer::single_disk_farm::plotting: Plotting sector (8.11% complete) sector_index=581
2024-05-27T09:17:13.119869Z  WARN {farm_index=9}: subspace_farmer::single_disk_farm::plotting: Failed to plot sector, retrying in 1s sector_index=581 error=Low-level plotting error: Timed out without ping from plotter
2024-05-27T09:17:39.862408Z  INFO {farm_index=2}: subspace_farmer::single_disk_farm::plotting: Plotting sector (8.01% complete) sector_index=574
2024-05-27T09:17:49.433589Z  INFO {farm_index=10}: subspace_farmer::single_disk_farm::plotting: Plotting sector (8.14% complete) sector_index=583
2024-05-27T09:18:21.634928Z  INFO {farm_index=11}: subspace_farmer::single_disk_farm::plotting: Plotting sector (8.10% complete) sector_index=580
2024-05-27T09:18:37.021913Z  INFO {farm_index=0}: subspace_farmer::single_disk_farm::plotting: Plotting sector (8.22% complete) sector_index=589
2024-05-27T09:18:50.333352Z  WARN {farm_index=0}: subspace_farmer::single_disk_farm::plotting: Failed to plot sector, retrying in 1s sector_index=589 error=Low-level plotting error: Timed out without ping from plotter
2024-05-27T09:18:50.936843Z  INFO {farm_index=9}: subspace_farmer::single_disk_farm::plotting: Plotting sector (8.12% complete) sector_index=582
2024-05-27T09:19:06.647317Z  INFO {farm_index=6}: subspace_farmer::single_disk_farm::plotting: Plotting sector (8.29% complete) sector_index=594
2024-05-27T09:19:42.152451Z  INFO {farm_index=7}: subspace_farmer::single_disk_farm::plotting: Plotting sector (8.18% complete) sector_index=586
2024-05-27T09:19:42.522823Z  INFO {farm_index=9}: subspace_farmer::single_disk_farm::plotting: Plotting sector retry sector_index=581
2024-05-27T09:19:53.200882Z  INFO {farm_index=2}: subspace_farmer::single_disk_farm::plotting: Plotting sector (8.03% complete) sector_index=575
2024-05-27T09:20:29.771104Z  INFO {farm_index=10}: subspace_farmer::single_disk_farm::plotting: Plotting sector (8.15% complete) sector_index=584
2024-05-27T09:20:39.775738Z  WARN {farm_index=10}: subspace_farmer::single_disk_farm::plotting: Failed to plot sector, retrying in 1s sector_index=584 error=Low-level plotting error: Timed out without ping from plotter
2024-05-27T09:20:40.403530Z  INFO {farm_index=0}: subspace_farmer::single_disk_farm::plotting: Plotting sector (8.24% complete) sector_index=590
2024-05-27T09:20:50.406742Z  WARN {farm_index=0}: subspace_farmer::single_disk_farm::plotting: Failed to plot sector, retrying in 1s sector_index=590 error=Low-level plotting error: Timed out without ping from plotter
2024-05-27T09:21:14.323801Z  INFO {farm_index=6}: subspace_farmer::single_disk_farm::plotting: Plotting sector (8.31% complete) sector_index=595
2024-05-27T09:21:14.882588Z  INFO {farm_index=11}: subspace_farmer::single_disk_farm::plotting: Plotting sector (8.11% complete) sector_index=581
2024-05-27T09:21:28.377462Z  INFO {farm_index=0}: subspace_farmer::single_disk_farm::plotting: Plotting sector retry sector_index=589
2024-05-27T09:21:45.558309Z  INFO {farm_index=7}: subspace_farmer::single_disk_farm::plotting: Plotting sector (8.19% complete) sector_index=587
2024-05-27T09:21:50.649036Z  INFO {farm_index=9}: subspace_farmer::single_disk_farm::plotting: Plotting sector (8.14% complete) sector_index=583
2024-05-27T09:21:55.579024Z  WARN {farm_index=7}: subspace_farmer::single_disk_farm::plotting: Failed to plot sector, retrying in 1s sector_index=587 error=Low-level plotting error: Timed out without ping from plotter
2024-05-27T09:21:58.392567Z  INFO {farm_index=10}: subspace_farmer::single_disk_farm::plotting: Plotting sector (8.17% complete) sector_index=585

plotter log

172.29.100.144
2024-05-27T09:17:03.114356Z  INFO {public_key=a8cde3c8c112dac065d319a048d0e147b2ccee38ad4778ba0a256bcb225d4e24 sector_index=581}: subspace_farmer::cluster::plotter: Plot sector request
2024-05-27T09:17:09.479353Z  INFO {public_key=e29c3bff319a6b1da300c3d794c1f089e0ef13bf4cfdc7a93f26724ae60ef77c sector_index=586}: subspace_farmer::cluster::plotter: Finished plotting sector successfully
2024-05-27T09:18:03.116975Z  WARN {public_key=a8cde3c8c112dac065d319a048d0e147b2ccee38ad4778ba0a256bcb225d4e24 sector_index=581}: subspace_farmer::cluster::nats_client: Acknowledgement wait timed out request_type=subspace_farmer::cluster::plotter::ClusterPlotterPlotSectorRequest response_type=subspace_farmer::cluster::plotter::ClusterSectorPlottingProgress
2024-05-27T09:18:03.117098Z  WARN {public_key=a8cde3c8c112dac065d319a048d0e147b2ccee38ad4778ba0a256bcb225d4e24 sector_index=581}: subspace_farmer::cluster::plotter: Response sending ended early
2024-05-27T09:19:40.051325Z  WARN subspace_farmer::plotter::cpu: Failed to send error progress update error=send failed because receiver is gone
2024-05-27T09:19:42.151178Z  INFO {public_key=740cd9c4e012360519db66c72e9d640dc2741a8c35ed5e60de4074b7505ab032 sector_index=586}: subspace_farmer::cluster::plotter: Plot sector request
2024-05-27T09:19:42.521588Z  INFO {public_key=a8cde3c8c112dac065d319a048d0e147b2ccee38ad4778ba0a256bcb225d4e24 sector_index=581}: subspace_farmer::cluster::plotter: Plot sector request
2024-05-27T09:19:42.748980Z  INFO {public_key=e44c90e164f93b96352924158eda5de8243e2d3239168a4cc3b0c41c37cea55c sector_index=579}: subspace_farmer::cluster::plotter: Finished plotting sector successfully
2024-05-27T09:19:53.203751Z  INFO {public_key=26ebd83540da3720fbd2c52b787a472b6dfd2632ad064bbb7bca1e5360ea7e43 sector_index=575}: subspace_farmer::cluster::plotter: Plot sector request
2024-05-27T09:19:59.200884Z  INFO {public_key=50f9f6e5d33e120592e431c5f2c3c3ef6f870480ef8b64eb5fb82d1e5760d648 sector_index=577}: subspace_farmer::cluster::plotter: Finished plotting sector successfully

172.29.100.145
2024-05-27T09:17:49.439759Z  INFO {public_key=7890b3a0ca7bbc5edbd3a8ebbcb326fa12dd8c8379cba44aa60423e9656b4503 sector_index=583}: subspace_farmer::cluster::plotter: Plot sector request
2024-05-27T09:17:55.573020Z  INFO {public_key=a8cde3c8c112dac065d319a048d0e147b2ccee38ad4778ba0a256bcb225d4e24 sector_index=577}: subspace_farmer::cluster::plotter: Finished plotting sector successfully
2024-05-27T09:20:29.772141Z  INFO {public_key=7890b3a0ca7bbc5edbd3a8ebbcb326fa12dd8c8379cba44aa60423e9656b4503 sector_index=584}: subspace_farmer::cluster::plotter: Plot sector request
2024-05-27T09:20:36.098681Z  INFO {public_key=e29c3bff319a6b1da300c3d794c1f089e0ef13bf4cfdc7a93f26724ae60ef77c sector_index=588}: subspace_farmer::cluster::plotter: Finished plotting sector successfully
2024-05-27T09:20:40.403596Z  INFO {public_key=e29c3bff319a6b1da300c3d794c1f089e0ef13bf4cfdc7a93f26724ae60ef77c sector_index=590}: subspace_farmer::cluster::plotter: Plot sector request
2024-05-27T09:20:47.498072Z  INFO {public_key=26ebd83540da3720fbd2c52b787a472b6dfd2632ad064bbb7bca1e5360ea7e43 sector_index=574}: subspace_farmer::cluster::plotter: Finished plotting sector successfully
2024-05-27T09:21:29.774612Z  WARN {public_key=7890b3a0ca7bbc5edbd3a8ebbcb326fa12dd8c8379cba44aa60423e9656b4503 sector_index=584}: subspace_farmer::cluster::nats_client: Acknowledgement wait timed out request_type=subspace_farmer::cluster::plotter::ClusterPlotterPlotSectorRequest response_type=subspace_farmer::cluster::plotter::ClusterSectorPlottingProgress
2024-05-27T09:21:29.775764Z  WARN {public_key=7890b3a0ca7bbc5edbd3a8ebbcb326fa12dd8c8379cba44aa60423e9656b4503 sector_index=584}: subspace_farmer::cluster::plotter: Response sending ended early
2024-05-27T09:21:40.405898Z  WARN {public_key=e29c3bff319a6b1da300c3d794c1f089e0ef13bf4cfdc7a93f26724ae60ef77c sector_index=590}: subspace_farmer::cluster::nats_client: Acknowledgement wait timed out request_type=subspace_farmer::cluster::plotter::ClusterPlotterPlotSectorRequest response_type=subspace_farmer::cluster::plotter::ClusterSectorPlottingProgress
2024-05-27T09:21:40.405997Z  WARN {public_key=e29c3bff319a6b1da300c3d794c1f089e0ef13bf4cfdc7a93f26724ae60ef77c sector_index=590}: subspace_farmer::cluster::plotter: Response sending ended early

172.29.100.146
2024-05-27T09:18:37.022415Z  INFO {public_key=e29c3bff319a6b1da300c3d794c1f089e0ef13bf4cfdc7a93f26724ae60ef77c sector_index=589}: subspace_farmer::cluster::plotter: Plot sector request
2024-05-27T09:18:40.330498Z  WARN {public_key=e29c3bff319a6b1da300c3d794c1f089e0ef13bf4cfdc7a93f26724ae60ef77c sector_index=589}: subspace_farmer::cluster::nats_client: Unexpected acknowledgement index received_index=1 expected_index=0 request_type=subspace_farmer::cluster::plotter::ClusterPlotterPlotSectorRequest response_type=subspace_farmer::cluster::plotter::ClusterSectorPlottingProgress 01000000
2024-05-27T09:18:40.330589Z  WARN {public_key=e29c3bff319a6b1da300c3d794c1f089e0ef13bf4cfdc7a93f26724ae60ef77c sector_index=589}: subspace_farmer::cluster::plotter: Response sending ended early
2024-05-27T09:18:42.506628Z  INFO {public_key=acb90a49fb722311858a3386a4d2761e6dae07c9e1c3ced4abe88a1de5380009 sector_index=576}: subspace_farmer::cluster::plotter: Finished plotting sector successfully
2024-05-27T09:21:11.739364Z  WARN subspace_farmer::plotter::cpu: Failed to send error progress update error=send failed because receiver is gone
2024-05-27T09:21:14.319916Z  INFO {public_key=7890b3a0ca7bbc5edbd3a8ebbcb326fa12dd8c8379cba44aa60423e9656b4503 sector_index=580}: subspace_farmer::cluster::plotter: Finished plotting sector successfully
2024-05-27T09:21:14.323975Z  INFO {public_key=969534d2f199cf1771d1eef156c20c2fcf54d75ceae3c23039495453e0ae6328 sector_index=595}: subspace_farmer::cluster::plotter: Plot sector request
2024-05-27T09:21:14.881714Z  INFO {public_key=e44c90e164f93b96352924158eda5de8243e2d3239168a4cc3b0c41c37cea55c sector_index=581}: subspace_farmer::cluster::plotter: Plot sector request
2024-05-27T09:21:28.425948Z  INFO {public_key=e29c3bff319a6b1da300c3d794c1f089e0ef13bf4cfdc7a93f26724ae60ef77c sector_index=589}: subspace_farmer::cluster::plotter: Plot sector request
2024-05-27T09:21:34.271659Z  INFO {public_key=e44c90e164f93b96352924158eda5de8243e2d3239168a4cc3b0c41c37cea55c sector_index=580}: subspace_farmer::cluster::plotter: Finished plotting sector successfully

172.29.100.143
2024-05-27T09:19:06.647314Z  INFO {public_key=969534d2f199cf1771d1eef156c20c2fcf54d75ceae3c23039495453e0ae6328 sector_index=594}: subspace_farmer::cluster::plotter: Plot sector request
2024-05-27T09:19:07.024099Z  INFO {public_key=dc8fb87f5cf0d7d96f1ad3c1294f732e372a8f84e4c8fe63642bdba34ab53e5b sector_index=537}: subspace_farmer::cluster::plotter: Finished plotting sector successfully
2024-05-27T09:21:45.558358Z  INFO {public_key=740cd9c4e012360519db66c72e9d640dc2741a8c35ed5e60de4074b7505ab032 sector_index=587}: subspace_farmer::cluster::plotter: Plot sector request
2024-05-27T09:21:45.579295Z  WARN {public_key=740cd9c4e012360519db66c72e9d640dc2741a8c35ed5e60de4074b7505ab032 sector_index=587}: subspace_farmer::cluster::nats_client: Unexpected acknowledgement index received_index=1 expected_index=0 request_type=subspace_farmer::cluster::plotter::ClusterPlotterPlotSectorRequest response_type=subspace_farmer::cluster::plotter::ClusterSectorPlottingProgress 01000000
2024-05-27T09:21:45.579368Z  WARN {public_key=740cd9c4e012360519db66c72e9d640dc2741a8c35ed5e60de4074b7505ab032 sector_index=587}: subspace_farmer::cluster::plotter: Response sending ended early
2024-05-27T09:21:49.305812Z  WARN subspace_farmer::plotter::cpu: Failed to send error progress update error=send failed because receiver is gone
2024-05-27T09:21:50.650729Z  INFO {public_key=a8cde3c8c112dac065d319a048d0e147b2ccee38ad4778ba0a256bcb225d4e24 sector_index=583}: subspace_farmer::cluster::plotter: Plot sector request
2024-05-27T09:21:51.797690Z  INFO {public_key=969534d2f199cf1771d1eef156c20c2fcf54d75ceae3c23039495453e0ae6328 sector_index=593}: subspace_farmer::cluster::plotter: Finished plotting sector successfully
2024-05-27T09:21:58.404308Z  INFO {public_key=7890b3a0ca7bbc5edbd3a8ebbcb326fa12dd8c8379cba44aa60423e9656b4503 sector_index=585}: subspace_farmer::cluster::plotter: Plot sector request
2024-05-27T09:22:04.770289Z  INFO {public_key=a8cde3c8c112dac065d319a048d0e147b2ccee38ad4778ba0a256bcb225d4e24 sector_index=582}: subspace_farmer::cluster::plotter: Finished plotting sector successfully

plotter warn log

root@172-29-100-145:~# grep -v INFO logs/ssc-plotter-cluster.log 

2024-05-27T13:10:26.306849Z  WARN {public_key=26ebd83540da3720fbd2c52b787a472b6dfd2632ad064bbb7bca1e5360ea7e43 sector_index=617}: subspace_farmer::cluster::plotter: Response sending ended early
2024-05-27T13:11:56.724192Z  WARN subspace_farmer::plotter::cpu: Failed to send error progress update error=send failed because receiver is gone
2024-05-27T13:12:57.719708Z  WARN {public_key=dc8fb87f5cf0d7d96f1ad3c1294f732e372a8f84e4c8fe63642bdba34ab53e5b sector_index=582}: subspace_farmer::cluster::nats_client: Acknowledgement wait timed out request_type=subspace_farmer::cluster::plotter::ClusterPlotterPlotSectorRequest response_type=subspace_farmer::cluster::plotter::ClusterSectorPlottingProgress
2024-05-27T13:12:57.719852Z  WARN {public_key=dc8fb87f5cf0d7d96f1ad3c1294f732e372a8f84e4c8fe63642bdba34ab53e5b sector_index=582}: subspace_farmer::cluster::plotter: Response sending ended early
2024-05-27T13:14:51.283626Z  WARN subspace_farmer::plotter::cpu: Failed to send error progress update error=send failed because receiver is gone
2024-05-27T13:16:03.036940Z  WARN {public_key=740cd9c4e012360519db66c72e9d640dc2741a8c35ed5e60de4074b7505ab032 sector_index=636}: subspace_farmer::cluster::nats_client: Acknowledgement wait timed out request_type=subspace_farmer::cluster::plotter::ClusterPlotterPlotSectorRequest response_type=subspace_farmer::cluster::plotter::ClusterSectorPlottingProgress
2024-05-27T13:16:03.037051Z  WARN {public_key=740cd9c4e012360519db66c72e9d640dc2741a8c35ed5e60de4074b7505ab032 sector_index=636}: subspace_farmer::cluster::plotter: Response sending ended early
2024-05-27T13:17:43.070937Z  WARN subspace_farmer::plotter::cpu: Failed to send error progress update error=send failed because receiver is gone
2024-05-27T13:33:05.141270Z  WARN {public_key=50f9f6e5d33e120592e431c5f2c3c3ef6f870480ef8b64eb5fb82d1e5760d648 sector_index=632}: subspace_farmer::cluster::nats_client: Acknowledgement wait timed out request_type=subspace_farmer::cluster::plotter::ClusterPlotterPlotSectorRequest response_type=subspace_farmer::cluster::plotter::ClusterSectorPlottingProgress
2024-05-27T13:33:05.141399Z  WARN {public_key=50f9f6e5d33e120592e431c5f2c3c3ef6f870480ef8b64eb5fb82d1e5760d648 sector_index=632}: subspace_farmer::cluster::plotter: Response sending ended early
2024-05-27T13:34:58.291417Z  WARN subspace_farmer::plotter::cpu: Failed to send error progress update error=send failed because receiver is gone
2024-05-27T13:44:35.719220Z  WARN {public_key=acb90a49fb722311858a3386a4d2761e6dae07c9e1c3ced4abe88a1de5380009 sector_index=618}: subspace_farmer::cluster::nats_client: Acknowledgement wait timed out request_type=subspace_farmer::cluster::plotter::ClusterPlotterPlotSectorRequest response_type=subspace_farmer::cluster::plotter::ClusterSectorPlottingProgress
2024-05-27T13:44:35.719357Z  WARN {public_key=acb90a49fb722311858a3386a4d2761e6dae07c9e1c3ced4abe88a1de5380009 sector_index=618}: subspace_farmer::cluster::plotter: Response sending ended early
2024-05-27T13:46:18.520189Z  WARN subspace_farmer::plotter::cpu: Failed to send error progress update error=send failed because receiver is gone
2024-05-27T13:49:21.313103Z  WARN {public_key=acb90a49fb722311858a3386a4d2761e6dae07c9e1c3ced4abe88a1de5380009 sector_index=621}: subspace_farmer::cluster::nats_client: Unexpected acknowledgement index received_index=1 expected_index=0 request_type=subspace_farmer::cluster::plotter::ClusterPlotterPlotSectorRequest response_type=subspace_farmer::cluster::plotter::ClusterSectorPlottingProgress 01000000
2024-05-27T13:49:21.313138Z  WARN {public_key=acb90a49fb722311858a3386a4d2761e6dae07c9e1c3ced4abe88a1de5380009 sector_index=621}: subspace_farmer::cluster::plotter: Response sending ended early
2024-05-27T13:49:25.766387Z  WARN subspace_farmer::plotter::cpu: Failed to send error progress update error=send failed because receiver is gone
2024-05-27T13:53:10.286988Z  WARN {public_key=740cd9c4e012360519db66c72e9d640dc2741a8c35ed5e60de4074b7505ab032 sector_index=643}: subspace_farmer::cluster::nats_client: Acknowledgement wait timed out request_type=subspace_farmer::cluster::plotter::ClusterPlotterPlotSectorRequest response_type=subspace_farmer::cluster::plotter::ClusterSectorPlottingProgress
2024-05-27T13:53:10.287058Z  WARN {public_key=740cd9c4e012360519db66c72e9d640dc2741a8c35ed5e60de4074b7505ab032 sector_index=643}: subspace_farmer::cluster::plotter: Response sending ended early
2024-05-27T13:55:07.669439Z  WARN subspace_farmer::plotter::cpu: Failed to send error progress update error=send failed because receiver is gone
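
A rough breakdown of which warnings dominate, grouped by logging target (a sketch only; adjust the log path to your setup):

# Count WARN lines per module in the plotter log
grep WARN logs/ssc-plotter-cluster.log | grep -oE 'subspace_farmer::[a-z_:]+: ' | sort | uniq -c | sort -rn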

Are they all running on identical machines? What is the network connectivity between them (link speed, switch, etc.)?

If you can run those instances with the RUST_LOG=info,subspace_farmer=debug environment variable and attach the logs after some warnings appear, that’d help a lot.
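
For example, something along these lines (just a sketch: the subcommand, NATS URL/port, and log path below are assumptions, so adapt them to how you actually launch each instance) puts the variable in front of the command:

# Illustrative plotter launch with increased subspace_farmer verbosity (hypothetical flags; adjust to your real command)
RUST_LOG=info,subspace_farmer=debug subspace-farmer cluster --nats-server nats://172.29.100.141:4222 plotter 2>&1 | tee logs/ssc-plotter-cluster.log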

They are running on different devices, connected over an internal 10G network.

172.29.100.11 node
172.29.100.141 nats-server
172.29.100.142 controller, farmer
172.29.100.143 plotter, cache
172.29.100.144 plotter
172.29.100.145 plotter
172.29.100.146 plotter

Here are the farmer and plotter logs after adding the environment variable:

farmer
plotter

By different, do you mean distinct machines, or that they have different hardware specs?

The six test devices all have the same hardware specifications, and the internal network connection speed is 10G

CPU: E5-2697 v4 x 2
Memory: 64 GB

The current issue is that there are many warnings in the farmer logs. The debug log files have already been uploaded

If you look at network usage, do you see it reaching 10G sometimes?
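
If you don’t have monitoring for it, something like sar from sysstat (assuming it is installed; any per-interface throughput tool works) will show whether the link gets close to 10G:

# Sample per-interface throughput every second; watch the rxkB/s and txkB/s columns
sar -n DEV 1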

Try this build once it is ready; it may help: Snapshot build · subspace/subspace@dec09c4 · GitHub

Snapshot build · subspace/subspace@a0b1bca · GitHub addresses it from a different side, potentially removing the root cause of the large number of requests; I’d appreciate feedback on it as well.

Peak traffic has never reached 10G; the highest was 9.1G.

Here are the debug logs from running this version of the farmer and plotter:

farmer
plotter

Anything in the logs of the NATS server? To me it looks like some messages were lost in transmission, as this line indicates:

2024-06-04T03:22:37.605702Z WARN {farm_index=9}:{sector_index=4509}: subspace_farmer::cluster::nats_client: Received unexpected response stream index, aborting stream actual_index=1 expected_index=0 message_type=subspace_farmer::cluster::plotter::ClusterSectorPlottingProgress

That would also explain the other timeouts.

Not sure if it was the case this time, but last time you only provided logs from a single plotter, which makes it confusing to read all the messages because many of them don’t match fully. It’d be much easier if you could run a single plotter and, if the issue still reproduces, provide logs from both ends. Increasing the log level to RUST_LOG=info,subspace_farmer=trace would help me as well.

Snapshot build · subspace/subspace@7a9f495 · GitHub should improve piece retrieval efficiency/bandwidth further, hopefully reducing the number of warnings further (it also includes a few other tweaks).

172.29.100.11 node
172.29.100.141 nats-server, farmer
172.29.100.142 controller
172.29.100.143 cache
172.29.100.131 plotter
172.29.100.132 plotter

The debug logs for this version are as follows:

nats server
farmer
plotter_01
plotter_02

With two plotters and one farmer, there are still warning logs:

2024-06-05T08:28:00.380551Z  INFO {farm_index=10}:{sector_index=4883}: subspace_farmer::single_disk_farm::plotting: Plotting sector (68.16% complete)
2024-06-05T08:28:00.838563Z  WARN {farm_index=11}:{sector_index=4864}: subspace_farmer::single_disk_farm::plotting: Failed to plot sector, retrying in 1s error=Low-level plotting error: Timed out without ping from plotter
2024-06-05T08:28:15.974821Z  INFO {farm_index=9}:{sector_index=4874}: subspace_farmer::single_disk_farm::plotting: Plotting sector (68.03% complete)
2024-06-05T08:29:00.393507Z  WARN {farm_index=10}:{sector_index=4883}: subspace_farmer::single_disk_farm::plotting: Failed to plot sector, retrying in 1s error=Low-level plotting error: Timed out without ping from plotter
2024-06-05T08:29:47.137852Z  INFO {farm_index=11}:{sector_index=4865}: subspace_farmer::single_disk_farm::plotting: Plotting sector (67.91% complete)
2024-06-05T08:29:47.138883Z  INFO {farm_index=11}:{sector_index=4866}: subspace_farmer::single_disk_farm::plotting: Plotting sector (67.92% complete)
2024-06-05T08:29:53.436912Z  INFO {farm_index=11}:{sector_index=4867}: subspace_farmer::single_disk_farm::plotting: Plotting sector (67.94% complete)
2024-06-05T08:29:53.464216Z  INFO {farm_index=11}:{sector_index=4868}: subspace_farmer::single_disk_farm::plotting: Plotting sector (67.95% complete)
2024-06-05T08:30:47.140711Z  WARN {farm_index=11}:{sector_index=4865}: subspace_farmer::single_disk_farm::plotting: Failed to plot sector, retrying in 1s error=Low-level plotting error: Timed out without ping from plotter

Just to confirm, you have a single NATS server instance, right?

The only remaining issue that I see is this:

2024-06-05T08:29:47.142598Z  WARN {public_key=58bd0dded5bc5edaa17e07300636f6acc14a9608582d58feecb75b38d60e0068 sector_index=4865}: subspace_farmer::cluster::nats_client: Unexpected acknowledgement index received_index=1 expected_index=0 request_type=subspace_farmer::cluster::plotter::ClusterPlotterPlotSectorRequest response_type=subspace_farmer::cluster::plotter::ClusterSectorPlottingProgress 01000000

Not sure how that is possible yet; the code seems to be organized in a way that guarantees ordering, and there is nothing about dropped messages in the logs on either end. Checking further.

Yes, there is only one NATS server

So I see that the above warning is the only one remaining. The only explanation I have right now is a race condition in the way NATS handles multiple connections. Assuming that is in fact the case, this build should help: Snapshot build · subspace/subspace@7d0d9ac · GitHub
But I’m wondering if it will have any negative side effects because of that.
Please give it a try and let me know if it is any better.

I have reported the potential race condition upstream here: Race condition with multiple connections · Issue #1274 · nats-io/nats.rs · GitHub

Or, if you don’t want to wait for the build to complete, you can restart the previous build with --nats-pool-size 1 right after the NATS server argument. The above build removes this argument completely and makes it always 1.
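
Purely as an illustration (the cluster subcommand and NATS URL below are assumptions based on a typical setup, not taken from your exact commands), the placement would look like this:

# Hypothetical plotter invocation; --nats-pool-size 1 goes immediately after the NATS server argument
subspace-farmer cluster --nats-server nats://172.29.100.141:4222 --nats-pool-size 1 plotter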