The remote farmer cannot resume the connection to the node without restarting. (error=RestartNeeded)

Issue Report

Environment

  • Operating System: Ubuntu 22.04.3 LTS on node+farmer server, Ubuntu 20.04.6 LTS on remote farmer.
  • Advanced CLI: gemini-3f-2023-sep-13-2

Problem

After the node is restarted, the farmer on the node’s local server resumes the connection as usual, while the remote farmer cannot resume its connection to the node without being restarted itself.

Sep 15 07:52:38 lenovo7 subspace-farmer[93395]: 2023-09-15T04:52:38.620046Z ERROR single_disk_farm{disk_farm_index=6}: subspace_farmer::utils::piece_validator: Failed tor retrieve segment headers from node piece_index=1575 error=RestartNeeded("Networking or low-level protocol error: WebSocket connection error: connection closed")
Sep 15 07:52:38 lenovo7 subspace-farmer[93395]: 2023-09-15T04:52:38.702851Z ERROR single_disk_farm{disk_farm_index=12}: subspace_farmer::utils::piece_validator: Failed tor retrieve segment headers from node piece_index=1669 error=RestartNeeded("Networking or low-level protocol error: WebSocket connection error: connection closed")
Sep 15 07:52:39 lenovo7 subspace-farmer[93395]: 2023-09-15T04:52:39.303228Z ERROR single_disk_farm{disk_farm_index=10}: subspace_farmer::utils::piece_validator: Failed tor retrieve segment headers from node piece_index=3014 error=RestartNeeded("Networking or low-level protocol error: WebSocket connection error: connection closed")
Sep 15 07:52:39 lenovo7 subspace-farmer[93395]: 2023-09-15T04:52:39.420706Z ERROR single_disk_farm{disk_farm_index=3}: subspace_farmer::utils::piece_validator: Failed tor retrieve segment headers from node piece_index=2271 error=RestartNeeded("Networking or low-level protocol error: WebSocket connection error: connection closed")

There is zero difference between a local and a remote farmer, so this must be a coincidence, and the remote one should also exit eventually.

Also, judging by disk_farm_index=12, you either have a lot of disks or you sliced a disk into multiple farms unnecessarily. If it is a sliced disk, it’ll be less efficient than one big farm.

Really? More rewards on smaller plots - #8 by nazar-pc
There are simply several relatively small plot files, like this:

path=/media/aorus-1/plot1,size=111679874151 \
path=/media/aorus-2/plot0,size=111679874151 \
path=/media/aorus-2/plot1,size=111679874151 \
path=/media/aorus-3/plot0,size=111679874151 \
path=/media/aorus-3/plot1,size=111679874151 \
path=/media/aorus-4/plot0,size=111679874151 \
path=/media/aorus-4/plot1,size=111679874151 \
path=/media/aorus-5/plot0,size=111679874151 \
path=/media/aorus-5/plot1,size=111679874151

I proceeded from the experience of colleagues. Should I now replot everything? :sob:

Things change quickly, and we have already fixed the underlying reasons why large plots earned fewer rewards in Release gemini-3f-2023-sep-11 · subspace/subspace · GitHub. It is not necessary to replot: things will continue to work, just somewhat suboptimally. There is no need to create small plots or to recommend them to anyone. You’ll just spend more RAM and CPU for no reason.
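For a sense of scale, here is a rough sketch that totals the plot list quoted above and counts farms per mount point. The `path=...,size=...` strings are copied from that list as posted; the helper name `parse_plot_spec` is made up for illustration and is not part of the farmer CLI.

```python
from collections import Counter
from pathlib import PurePosixPath

# Plot specs copied from the list quoted earlier in this thread.
PLOT_SPECS = [
    "path=/media/aorus-1/plot1,size=111679874151",
    "path=/media/aorus-2/plot0,size=111679874151",
    "path=/media/aorus-2/plot1,size=111679874151",
    "path=/media/aorus-3/plot0,size=111679874151",
    "path=/media/aorus-3/plot1,size=111679874151",
    "path=/media/aorus-4/plot0,size=111679874151",
    "path=/media/aorus-4/plot1,size=111679874151",
    "path=/media/aorus-5/plot0,size=111679874151",
    "path=/media/aorus-5/plot1,size=111679874151",
]


def parse_plot_spec(spec: str) -> tuple[str, int]:
    """Split 'path=...,size=...' into (path, size in bytes). Hypothetical helper."""
    fields = dict(part.split("=", 1) for part in spec.strip().split(","))
    return fields["path"], int(fields["size"])


plots = [parse_plot_spec(spec) for spec in PLOT_SPECS]

# Total allocated space across the listed farms.
total_bytes = sum(size for _, size in plots)

# Number of farms that share the same mount point (parent directory).
farms_per_disk = Counter(str(PurePosixPath(path).parent) for path, _ in plots)

print(f"{len(plots)} farms, {total_bytes} bytes ({total_bytes / 2**30:.2f} GiB)")
for disk, count in sorted(farms_per_disk.items()):
    print(f"{disk}: {count} farm(s)")
```

Any mount point that shows more than one farm is a candidate for a single larger farm the next time it is plotted from scratch.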

The problem looks a little different these days, but I believe this topic is still suitable for it. The remote farmer cannot resume the connection to the node if the node has been restarted (for example, to update the version).

Nov 10 18:31:26 lenovo7 subspace-farmer[2339391]: 2023-11-10T15:31:26.431653Z ERROR jsonrpsee_core::client::async_client: [backend]: Networking or low-level protocol error: WebSocket connection error: connection closed
Nov 10 18:31:26 lenovo7 subspace-farmer[2339391]: 2023-11-10T15:31:26.431652Z ERROR jsonrpsee_core::client::async_client: [backend]: Networking or low-level protocol error: WebSocket connection error: connection closed
Nov 10 18:31:26 lenovo7 subspace-farmer[2339391]: 2023-11-10T15:31:26.431662Z ERROR jsonrpsee_core::client::async_client: [backend]: Networking or low-level protocol error: WebSocket connection error: connection closed
Nov 10 18:31:26 lenovo7 subspace-farmer[2339391]: 2023-11-10T15:31:26.431651Z ERROR jsonrpsee_core::client::async_client: [backend]: Networking or low-level protocol error: WebSocket connection error: connection closed
Nov 10 18:31:26 lenovo7 subspace-farmer[2339391]: 2023-11-10T15:31:26.431705Z ERROR jsonrpsee_core::client::async_client: [backend]: Networking or low-level protocol error: WebSocket connection error: connection closed
Nov 10 18:31:26 lenovo7 subspace-farmer[2339391]: 2023-11-10T15:31:26.431715Z ERROR jsonrpsee_core::client::async_client: [backend]: Networking or low-level protocol error: WebSocket connection error: connection closed

So I need to restart the farmer(s) too.
This is on gemini-3g-2023-nov-09.
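Until the farmer handles this itself, a small watchdog could look for the messages above and trigger a restart. This is only a sketch: the marker strings are copied from the log lines quoted in this thread, while the function names and the idea of feeding it log lines from an external source are assumptions, not anything the farmer provides.

```python
# Substrings copied from the farmer log lines quoted in this thread.
FATAL_MARKERS = (
    "error=RestartNeeded",
    "WebSocket connection error: connection closed",
)


def needs_restart(log_line: str) -> bool:
    """Return True if the line carries one of the connection-lost markers."""
    return any(marker in log_line for marker in FATAL_MARKERS)


def count_connection_errors(lines) -> int:
    """Count how many lines in an iterable of log lines indicate a lost node."""
    return sum(1 for line in lines if needs_restart(line))


sample = (
    "Nov 10 18:31:26 lenovo7 subspace-farmer[2339391]: "
    "2023-11-10T15:31:26.431653Z ERROR jsonrpsee_core::client::async_client: "
    "[backend]: Networking or low-level protocol error: "
    "WebSocket connection error: connection closed"
)
print(needs_restart(sample))
```

If the farmer runs as a systemd service (the log prefix suggests journald or syslog, but that is a guess), the lines could come from something like `journalctl -u <farmer unit> -f`, with the unit name depending on your setup.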

Farmer and node are version-updated at the same time, so wouldn’t you want to restart the farmers to also update the version anyway? Otherwise you’d be running two different versions between node and farms.

My process has been: prep all my scripts to run the new version for both node and farmer, then shut down all farmers, then shut down the node, then restart the node, then restart the farmers.
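The order described above can be written down as a plan so that it is hard to get wrong. This sketch assumes the node and farmers run as systemd services; the unit names are hypothetical placeholders, and the code only builds and prints the commands rather than running them.

```python
def upgrade_plan(node_unit: str, farmer_units: list[str]) -> list[list[str]]:
    """Build the stop/start commands in the order described in the post above."""
    plan = [["systemctl", "stop", unit] for unit in farmer_units]   # 1. stop all farmers
    plan.append(["systemctl", "stop", node_unit])                   # 2. stop the node
    plan.append(["systemctl", "start", node_unit])                  # 3. start the node
    plan += [["systemctl", "start", unit] for unit in farmer_units]  # 4. start the farmers
    return plan


# Hypothetical unit names; replace them with whatever your services are called.
plan = upgrade_plan("subspace-node", ["subspace-farmer-1", "subspace-farmer-2"])
for command in plan:
    print(" ".join(command))
```

For a farmer on another machine, each farmer command would have to be run on that host (for example over SSH), which is left out here.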

This has always been the case and will likely remain so for some time, but the farmer should shut down shortly after those errors when the node does.
But generally you should upgrade both together anyway, because not all releases are compatible with each other.
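If something restarts the farmer automatically after it exits, it helps to wait until the node is accepting connections again first. Below is a minimal sketch of such a wait with exponential backoff. The host and port are whatever your node's RPC endpoint uses (an assumption on my side), and a successful TCP connect only shows that the port is open, not that the node is fully ready.

```python
import socket
import time


def node_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for_node(host: str, port: int, attempts: int = 10,
                  delay: float = 1.0, max_delay: float = 30.0) -> bool:
    """Probe the node up to `attempts` times, doubling the pause between tries."""
    for attempt in range(attempts):
        if node_reachable(host, port):
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
            delay = min(delay * 2, max_delay)
    return False
```

A supervisor script would call `wait_for_node(...)` and start the farmer only once it returns `True`.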

I knew that the details presented in the post would not be enough. But after all, the farmer cannot restore the connection even after restarting the node for any other reason. I wrote “for example”, didn’t I? :wink: