Is the ARM server a Pi or something similar? If so, I don't think you should enable plotting on it; it might just slow things down.
The ARM server has an 8-core aarch64 CPU: 4 cores at 1.8GHz and the other 4 cores at 2.8GHz. Not a Pi.
If you run a plotter on that slow ARM machine, then it will plot on that slow ARM machine in addition to x86-64. If you don't want to use it for plotting, don't run a plotter there.
I have 10 servers, one of which has a large SSD to plot to.
How should I arrange the controller, cache, plotter and farmer across the servers to plot quickly to the SSD?
For plotting, the primary thing that matters is the plotter. You'll want to run it only on the machine or machines that are fast. You don't need to run it everywhere. The faster the networking between machines, the better. Having a 10G network is ideal, but not required of course.
The log formats of the farmers are not consistent, and there are quite a few warning messages. Is this normal?
2024-05-24T07:50:32.111870Z INFO {farm_index=0}: subspace_farmer::single_disk_farm::plotting: Plotting sector (7.02% complete) sector_index=73
2024-05-24T07:50:32.183444Z INFO {farm_index=0}: subspace_farmer::single_disk_farm::plotting: Plotting sector (7.12% complete) sector_index=74
2024-05-24T07:50:36.485025Z INFO {farm_index=0}: subspace_farmer::single_disk_farm::plotting: Plotting sector (7.21% complete) sector_index=75
2024-05-24T07:50:42.236333Z WARN {farm_index=0}:{public_key=8cea533ae6691fd7167f9565993bbb4e7f4fd17150f8f41c306fc9778e10a071 sector_index=71}: subspace_farmer::single_disk_farm::plotting: Failed to plot sector, retrying in 1s sector_index=74 error=Low-level plotting error: Timed out without ping from plotter
2024-05-24T07:50:44.120945Z INFO {farm_index=0}:{public_key=8cea533ae6691fd7167f9565993bbb4e7f4fd17150f8f41c306fc9778e10a071 sector_index=71}: subspace_farmer::single_disk_farm::plotting: Plotting sector (7.31% complete) sector_index=76
2024-05-24T07:50:46.891632Z INFO {farm_index=0}: subspace_farmer::single_disk_farm::plotting: Plotting sector retry sector_index=71
2024-05-24T07:50:50.888259Z INFO {farm_index=0}: subspace_farmer::single_disk_farm::plotting: Plotting sector retry sector_index=74
2024-05-24T07:50:54.017888Z INFO {farm_index=0}: subspace_farmer::single_disk_farm::plotting: Plotting sector (7.40% complete) sector_index=77
2024-05-24T07:50:54.359936Z INFO {farm_index=0}: subspace_farmer::single_disk_farm::plotting: Plotting sector (7.50% complete) sector_index=78
2024-05-24T07:50:57.398469Z WARN {farm_index=0}: subspace_farmer::single_disk_farm::plotting: Failed to plot sector, retrying in 1s sector_index=76 error=Low-level plotting error: Timed out without ping from plotter
2024-05-24T07:51:00.939702Z WARN {farm_index=0}:{public_key=8cea533ae6691fd7167f9565993bbb4e7f4fd17150f8f41c306fc9778e10a071 sector_index=79}: subspace_farmer::single_disk_farm::plotting: Failed to plot sector, retrying in 1s sector_index=74 error=Low-level plotting error: Timed out without ping from plotter
2024-05-24T07:51:01.363339Z INFO {farm_index=0}:{public_key=8cea533ae6691fd7167f9565993bbb4e7f4fd17150f8f41c306fc9778e10a071 sector_index=79}: subspace_farmer::single_disk_farm::plotting: Plotting sector retry sector_index=76
2024-05-24T07:51:01.929071Z WARN {farm_index=0}:{public_key=8cea533ae6691fd7167f9565993bbb4e7f4fd17150f8f41c306fc9778e10a071 sector_index=79}: subspace_farmer::single_disk_farm::plotting: Failed to plot sector, retrying in 1s sector_index=71 error=Low-level plotting error: Timed out without ping from plotter
2024-05-24T07:51:04.598118Z INFO {farm_index=0}:{public_key=8cea533ae6691fd7167f9565993bbb4e7f4fd17150f8f41c306fc9778e10a071 sector_index=79}:{public_key=8cea533ae6691fd7167f9565993bbb4e7f4fd17150f8f41c306fc9778e10a071 sector_index=74}: subspace_farmer::single_disk_farm::plotting: Plotting sector (7.60% complete) sector_index=79
2024-05-24T07:51:16.403998Z WARN {farm_index=0}:{public_key=8cea533ae6691fd7167f9565993bbb4e7f4fd17150f8f41c306fc9778e10a071 sector_index=79}:{public_key=8cea533ae6691fd7167f9565993bbb4e7f4fd17150f8f41c306fc9778e10a071 sector_index=74}: subspace_farmer::single_disk_farm::plotting: Failed to plot sector, retrying in 1s sector_index=76 error=Low-level plotting error: Timed out without ping from plotter
2024-05-24T07:51:19.176133Z INFO {farm_index=0}:{public_key=8cea533ae6691fd7167f9565993bbb4e7f4fd17150f8f41c306fc9778e10a071 sector_index=79}:{public_key=8cea533ae6691fd7167f9565993bbb4e7f4fd17150f8f41c306fc9778e10a071 sector_index=74}:{public_key=8cea533ae6691fd7167f9565993bbb4e7f4fd17150f8f41c306fc9778e10a071 sector_index=71}:{public_key=8cea533ae6691fd7167f9565993bbb4e7f4fd17150f8f41c306fc9778e10a071 sector_index=80}: subspace_farmer::single_disk_farm::plotting: Plotting sector retry sector_index=71
2024-05-24T07:51:26.535976Z INFO {farm_index=0}:{public_key=8cea533ae6691fd7167f9565993bbb4e7f4fd17150f8f41c306fc9778e10a071 sector_index=79}:{public_key=8cea533ae6691fd7167f9565993bbb4e7f4fd17150f8f41c306fc9778e10a071 sector_index=74}: subspace_farmer::single_disk_farm::plotting: Plotting sector (7.69% complete) sector_index=80
2024-05-24T07:51:33.469966Z INFO {farm_index=0}:{public_key=8cea533ae6691fd7167f9565993bbb4e7f4fd17150f8f41c306fc9778e10a071 sector_index=79}:{public_key=8cea533ae6691fd7167f9565993bbb4e7f4fd17150f8f41c306fc9778e10a071 sector_index=74}: subspace_farmer::single_disk_farm::plotting: Plotting sector retry sector_index=76
2024-05-24T07:51:43.480165Z WARN {farm_index=0}:{public_key=8cea533ae6691fd7167f9565993bbb4e7f4fd17150f8f41c306fc9778e10a071 sector_index=79}:{public_key=8cea533ae6691fd7167f9565993bbb4e7f4fd17150f8f41c306fc9778e10a071 sector_index=74}:{public_key=8cea533ae6691fd7167f9565993bbb4e7f4fd17150f8f41c306fc9778e10a071 sector_index=81}: subspace_farmer::single_disk_farm::plotting: Failed to plot sector, retrying in 1s sector_index=76 error=Low-level plotting error: Timed out without ping from plotter
2024-05-24T07:53:30.455822Z INFO {farm_index=0}:{public_key=8cea533ae6691fd7167f9565993bbb4e7f4fd17150f8f41c306fc9778e10a071 sector_index=79}:{public_key=8cea533ae6691fd7167f9565993bbb4e7f4fd17150f8f41c306fc9778e10a071 sector_index=74}:{public_key=8cea533ae6691fd7167f9565993bbb4e7f4fd17150f8f41c306fc9778e10a071 sector_index=81}: subspace_farmer::single_disk_farm::plotting: Plotting sector retry sector_index=76
2024-05-24T07:53:52.804138Z INFO {farm_index=0}:{public_key=8cea533ae6691fd7167f9565993bbb4e7f4fd17150f8f41c306fc9778e10a071 sector_index=79}:{public_key=8cea533ae6691fd7167f9565993bbb4e7f4fd17150f8f41c306fc9778e10a071 sector_index=74}: subspace_farmer::single_disk_farm::plotting: Plotting sector (7.79% complete) sector_index=81
2024-05-24T07:53:52.806595Z INFO {farm_index=0}:{public_key=8cea533ae6691fd7167f9565993bbb4e7f4fd17150f8f41c306fc9778e10a071 sector_index=79}:{public_key=8cea533ae6691fd7167f9565993bbb4e7f4fd17150f8f41c306fc9778e10a071 sector_index=74}: subspace_farmer::single_disk_farm::plotting: Plotting sector (7.88% complete) sector_index=82
2024-05-24T07:53:52.807674Z INFO {farm_index=0}:{public_key=8cea533ae6691fd7167f9565993bbb4e7f4fd17150f8f41c306fc9778e10a071 sector_index=79}:{public_key=8cea533ae6691fd7167f9565993bbb4e7f4fd17150f8f41c306fc9778e10a071 sector_index=74}: subspace_farmer::single_disk_farm::plotting: Plotting sector (7.98% complete) sector_index=83
2024-05-24T07:53:52.809282Z INFO {farm_index=0}:{public_key=8cea533ae6691fd7167f9565993bbb4e7f4fd17150f8f41c306fc9778e10a071 sector_index=79}:{public_key=8cea533ae6691fd7167f9565993bbb4e7f4fd17150f8f41c306fc9778e10a071 sector_index=74}: subspace_farmer::single_disk_farm::plotting: Plotting sector (8.08% complete) sector_index=84
Yeah, logs are off. Plotting warnings are not good and should ideally not happen. Please create a separate forum thread with a description of your setup and we'll try to figure out why it happens.
Logs should be fixed with Improve farming cluster logging by nazar-pc · Pull Request #2789 · subspace/subspace · GitHub
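When describing the setup in a new thread, it can help to say how often each sector failed. The following is a minimal sketch, not part of the farmer itself: the helper name `count_plot_failures` is made up, and it assumes the log line layout shown in the excerpt above. Because the span prefixes in braces can carry stale `sector_index` values (the logging issue mentioned above), it only trusts the `sector_index` that directly precedes `error=`.

```python
import re
from collections import Counter

# Only the sector_index right before "error=" is used; the values inside the
# {...} span prefixes may belong to other sectors.
WARN_RE = re.compile(r"Failed to plot sector.*?sector_index=(\d+) error=(.*)$")

def count_plot_failures(lines):
    """Return a Counter mapping sector_index -> number of failed attempts."""
    failures = Counter()
    for line in lines:
        if " WARN " not in line:
            continue  # skip INFO lines such as progress and retry notices
        match = WARN_RE.search(line.rstrip())
        if match:
            failures[int(match.group(1))] += 1
    return failures

# Three lines copied from the excerpt above (one INFO, two WARN).
SAMPLE_LINES = [
    "2024-05-24T07:50:54.017888Z INFO {farm_index=0}: subspace_farmer::single_disk_farm::plotting: Plotting sector (7.40% complete) sector_index=77",
    "2024-05-24T07:50:57.398469Z WARN {farm_index=0}: subspace_farmer::single_disk_farm::plotting: Failed to plot sector, retrying in 1s sector_index=76 error=Low-level plotting error: Timed out without ping from plotter",
    "2024-05-24T07:50:42.236333Z WARN {farm_index=0}:{public_key=8cea533ae6691fd7167f9565993bbb4e7f4fd17150f8f41c306fc9778e10a071 sector_index=71}: subspace_farmer::single_disk_farm::plotting: Failed to plot sector, retrying in 1s sector_index=74 error=Low-level plotting error: Timed out without ping from plotter",
]

if __name__ == "__main__":
    for sector, count in sorted(count_plot_failures(SAMPLE_LINES).items()):
        print(f"sector {sector}: {count} failed attempt(s)")
```

To run it over a real log, pass an open file object instead of `SAMPLE_LINES`.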
I expect farming-cluster to accelerate my plotting phase, and here is my test.
Test ONE:
Server 1: controller, cache, plotter, farmer.
It took 172 mins to plot 20GB by itself.
Test TWO:
6 servers are used:
Server 1: controller, cache, plotter, farmer
Server 2, 3, 4, 5: controller, cache, plotter
Server 6: NATS
It took 179 mins to plot 20GB by the cluster.
Note: clear the farming directory before the test.
Why is there no plotting speed improvement from the farming-cluster feature? In which cases can we see the benefit of farming-cluster?
By default the farmer will not request to plot more than 8 sectors at a time in order to limit memory usage (though I think we'll increase that significantly due to some improvements already done to the plotting process), so if your servers provide more capacity in total, it will not actually be utilized. You can change that by increasing --sector-encoding-concurrency to something like 100.
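The effect of that cap can be illustrated with a toy model. This is only a sketch of the reasoning in the reply above: the per-plotter capacities are hypothetical numbers, and the model ignores networking, piece retrieval and timeouts, all of which can matter in practice.

```python
def effective_concurrency(farmer_limit, plotter_capacities):
    """Sectors encoded in parallel under a simple model: the farmer's request
    cap or the cluster's total capacity, whichever is smaller."""
    return min(farmer_limit, sum(plotter_capacities))

if __name__ == "__main__":
    # Hypothetical: five plotters that could each work on 8 sectors at once.
    capacities = [8, 8, 8, 8, 8]
    # With the default cap of 8 mentioned above, the extra plotters sit idle.
    print(effective_concurrency(8, capacities))    # 8
    # Raising the cap lets the farmer request enough work to fill them.
    print(effective_concurrency(100, capacities))  # 40
```

Under this model, adding plotters without raising the farmer's cap cannot increase parallelism, which is consistent with the two test runs taking about the same time.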
A plot directory created by the non-cluster version cannot be continued with the cluster version, right?
root@172-29-100-142:~# cat /disk/nvme1n1/ssc-3h/farmer-1/single_disk_farm.json |jq
{
"v0": {
"id": "01HY2PPKMTBXGD3VGMTV9BW5JW",
"genesisHash": "0c121c75f4ef450f40619e1fca9d1e8e7fbabc42c895bc4790801e85d5a91c34",
"publicKey": "e29c3bff319a6b1da300c3d794c1f089e0ef13bf4cfdc7a93f26724ae60ef77c",
"piecesInSector": 1000,
"allocatedSpace": 7573101084672
}
}
root@172-29-100-142:~#
root@172-29-100-142:~# /root/ssc/subspace-farmer-cluster info /disk/nvme1n1/ssc-3h/farmer-1/
Single disk farm 0:
ID: 01HY2PPKMTBXGD3VGMTV9BW5JW
Genesis hash: 0x0c121c75f4ef450f40619e1fca9d1e8e7fbabc42c895bc4790801e85d5a91c34
Public key: 0xe29c3bff319a6b1da300c3d794c1f089e0ef13bf4cfdc7a93f26724ae60ef77c
Allocated space: 6.9 TiB (7.6 TB)
Directory: /disk/nvme1n1/ssc-3h/farmer-1/
root@172-29-100-142:~#
root@172-29-100-142:~# /root/ssc/subspace-farmer-cluster cluster --nats-servers nats://172.29.100.141:4242 farmer --reward-address stC1HgpMEVpwYKEfiPPcDpLnwhmtV7RfaZHQjuUk1DfMbxxx path=/disk/nvme1n1/ssc-3h/farmer-1,size=7053GiB
2024-05-27T03:06:45.096417Z INFO async_nats: event: connected
2024-05-27T03:06:45.096494Z INFO async_nats: event: connected
2024-05-27T03:06:45.096464Z INFO async_nats: event: connected
2024-05-27T03:06:45.096488Z INFO async_nats: event: connected
2024-05-27T03:06:45.096414Z INFO async_nats: event: connected
2024-05-27T03:06:45.096488Z INFO async_nats: event: connected
2024-05-27T03:06:45.096517Z INFO async_nats: event: connected
2024-05-27T03:06:45.096516Z INFO async_nats: event: connected
2024-05-27T03:06:45.805534Z ERROR {farm_index=0}: subspace_farmer::commands::cluster::farmer: Farm creation failed error=Can't preallocate plot file, probably not enough space on disk: File exists (os error 17)
Error: Can't preallocate plot file, probably not enough space on disk: File exists (os error 17)
Simply reduce the plot size or remove the piece_cache.bin from each plot drive as they are not used in clustering.
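Before choosing a smaller `size=`, it may be useful to check what the existing farm has already allocated. A minimal sketch, using the `single_disk_farm.json` contents shown above; the helper name `allocated_gib` is made up for illustration.

```python
import json

GIB = 1024 ** 3  # bytes per GiB

def allocated_gib(farm_info_json):
    """Return the farm's allocatedSpace (bytes) converted to GiB."""
    info = json.loads(farm_info_json)
    return info["v0"]["allocatedSpace"] / GIB

# Contents of single_disk_farm.json as posted above.
SAMPLE = """
{
  "v0": {
    "id": "01HY2PPKMTBXGD3VGMTV9BW5JW",
    "genesisHash": "0c121c75f4ef450f40619e1fca9d1e8e7fbabc42c895bc4790801e85d5a91c34",
    "publicKey": "e29c3bff319a6b1da300c3d794c1f089e0ef13bf4cfdc7a93f26724ae60ef77c",
    "piecesInSector": 1000,
    "allocatedSpace": 7573101084672
  }
}
"""

if __name__ == "__main__":
    print(f"{allocated_gib(SAMPLE):.0f} GiB")  # 7053 GiB
```

For this farm the recorded allocation is exactly the 7053GiB passed on the command line, so any reduced size would need to be below that value.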
OK, thank you. This is indeed feasible.
With the --sector-encoding-concurrency 100 parameter, the plot time is 3 hours and 20 minutes, longer than without it.
In which cases can a farming cluster perform better than a regular farmer?
./subspace-farmer cluster --nats-server nats://192.168.0.10:4222 farmer --reward-address stXXX path=/data/farm_test,size=20GiB --sector-encoding-concurrency 100
2024-05-27T04:26:06.436001Z INFO async_nats: event: connected
2024-05-27T04:26:06.436282Z INFO async_nats: event: connected
2024-05-27T04:26:06.437065Z INFO async_nats: event: connected
2024-05-27T04:26:06.437555Z INFO async_nats: event: connected
2024-05-27T04:26:06.438044Z INFO async_nats: event: connected
2024-05-27T04:26:06.438299Z INFO async_nats: event: connected
2024-05-27T04:26:06.438353Z INFO async_nats: event: connected
2024-05-27T04:26:06.439017Z INFO async_nats: event: connected
2024-05-27T04:26:06.855588Z INFO {farm_index=0}: subspace_farmer::single_disk_farm::plot_cache: Checking plot cache contents, this can take a while
2024-05-27T04:26:06.856689Z INFO {farm_index=0}: subspace_farmer::single_disk_farm::plot_cache: Finished checking plot cache contents
2024-05-27T04:26:06.857521Z INFO {farm_index=0}: subspace_farmer::single_disk_farm: Benchmarking faster proving method
2024-05-27T04:26:08.188449Z INFO {farm_index=0}: subspace_farmer::single_disk_farm: Faster proving method found fastest_mode=ConcurrentChunks
2024-05-27T04:26:08.219843Z INFO {farm_index=0}: subspace_farmer::commands::cluster::farmer: Farm 0:
2024-05-27T04:26:08.219854Z INFO {farm_index=0}: subspace_farmer::commands::cluster::farmer: ID: 01HYW4RTM2AS2Y9V1FH3WDQXQQ
2024-05-27T04:26:08.219871Z INFO {farm_index=0}: subspace_farmer::commands::cluster::farmer: Genesis hash: 0x0c121c75f4ef450f40619e1fca9d1e8e7fbabc42c895bc4790801e85d5a91c34
2024-05-27T04:26:08.219873Z INFO {farm_index=0}: subspace_farmer::commands::cluster::farmer: Public key: 0xf0d0ee649c301a031dbbcd5d964e697fad354979a758e029d5dd9cb6c267711f
2024-05-27T04:26:08.219879Z INFO {farm_index=0}: subspace_farmer::commands::cluster::farmer: Allocated space: 20.0 GiB (21.5 GB)
2024-05-27T04:26:08.219881Z INFO {farm_index=0}: subspace_farmer::commands::cluster::farmer: Directory: /data/farm_test
2024-05-27T04:26:08.220270Z INFO {farm_index=0}: subspace_farmer::single_disk_farm::farming: Subscribing to slot info notifications
2024-05-27T04:26:08.220305Z INFO {farm_index=0}: subspace_farmer::reward_signing: Subscribing to reward signing notifications
2024-05-27T04:26:08.222332Z INFO {farm_index=0}: subspace_farmer::single_disk_farm::plotting: Subscribing to archived segments
2024-05-27T04:26:08.227048Z INFO {farm_index=0}: subspace_farmer::single_disk_farm::plotting: Plotting sector (0.00% complete) sector_index=0
2024-05-27T04:26:08.230237Z INFO {farm_index=0}: subspace_farmer::single_disk_farm::plotting: Plotting sector (5.00% complete) sector_index=1
2024-05-27T04:26:08.234533Z INFO {farm_index=0}: subspace_farmer::single_disk_farm::plotting: Plotting sector (10.00% complete) sector_index=2
2024-05-27T04:26:08.239297Z INFO {farm_index=0}: subspace_farmer::single_disk_farm::plotting: Plotting sector (15.00% complete) sector_index=3
2024-05-27T04:26:08.246499Z INFO {farm_index=0}: subspace_farmer::single_disk_farm::plotting: Plotting sector (20.00% complete) sector_index=4
2024-05-27T04:26:08.251019Z INFO {farm_index=0}: subspace_farmer::single_disk_farm::plotting: Plotting sector (25.00% complete) sector_index=5
2024-05-27T04:26:09.488217Z INFO {farm_index=0}: subspace_farmer::single_disk_farm::plotting: Plotting sector (30.00% complete) sector_index=6
2024-05-27T04:26:09.983376Z INFO {farm_index=0}: subspace_farmer::single_disk_farm::plotting: Plotting sector (35.00% complete) sector_index=7
2024-05-27T04:26:09.998948Z INFO {farm_index=0}: subspace_farmer::single_disk_farm::plotting: Plotting sector (40.00% complete) sector_index=8
2024-05-27T04:26:10.613134Z INFO {farm_index=0}: subspace_farmer::single_disk_farm::plotting: Plotting sector (45.00% complete) sector_index=9
2024-05-27T04:26:10.931575Z INFO {farm_index=0}: subspace_farmer::single_disk_farm::plotting: Plotting sector (50.00% complete) sector_index=10
2024-05-27T04:26:10.954486Z INFO {farm_index=0}: subspace_farmer::single_disk_farm::plotting: Plotting sector (55.00% complete) sector_index=11
2024-05-27T04:26:20.943550Z WARN {farm_index=0}:{public_key=f0d0ee649c301a031dbbcd5d964e697fad354979a758e029d5dd9cb6c267711f sector_index=12}: subspace_farmer::single_disk_farm::plotting: Failed to plot sector, retrying in 1s sector_index=10 error=Low-level plotting error: Timed out without ping from plotter
2024-05-27T04:38:32.309727Z INFO {farm_index=0}:{public_key=f0d0ee649c301a031dbbcd5d964e697fad354979a758e029d5dd9cb6c267711f sector_index=12}:{public_key=f0d0ee649c301a031dbbcd5d964e697fad354979a758e029d5dd9cb6c267711f sector_index=10}: subspace_farmer::single_disk_farm::plotting: Plotting sector (60.00% complete) sector_index=12
2024-05-27T04:44:20.580794Z INFO {farm_index=0}:{public_key=f0d0ee649c301a031dbbcd5d964e697fad354979a758e029d5dd9cb6c267711f sector_index=12}:{public_key=f0d0ee649c301a031dbbcd5d964e697fad354979a758e029d5dd9cb6c267711f sector_index=10}:{public_key=f0d0ee649c301a031dbbcd5d964e697fad354979a758e029d5dd9cb6c267711f sector_index=13}: subspace_farmer::single_disk_farm::plotting: Plotting sector retry sector_index=10
2024-05-27T05:42:56.406019Z INFO {farm_index=0}: subspace_farmer::single_disk_farm::plotting: Plotting sector (65.00% complete) sector_index=13
2024-05-27T05:44:34.541380Z INFO {farm_index=0}: subspace_farmer::single_disk_farm::plotting: Plotting sector (70.00% complete) sector_index=14
2024-05-27T06:01:38.951693Z INFO {farm_index=0}: subspace_farmer::single_disk_farm::plotting: Plotting sector (75.00% complete) sector_index=15
2024-05-27T06:03:51.978759Z INFO {farm_index=0}: subspace_farmer::single_disk_farm::plotting: Plotting sector (80.00% complete) sector_index=16
2024-05-27T06:04:09.576539Z INFO {farm_index=0}: subspace_farmer::single_disk_farm::plotting: Plotting sector (85.00% complete) sector_index=17
2024-05-27T06:06:04.091562Z INFO {farm_index=0}: subspace_farmer::single_disk_farm::plotting: Plotting sector (90.00% complete) sector_index=18
2024-05-27T06:06:45.322035Z INFO {farm_index=0}: subspace_farmer::single_disk_farm::plotting: Plotting sector (95.00% complete) sector_index=19
2024-05-27T07:46:47.125506Z INFO {farm_index=0}: subspace_farmer::single_disk_farm::plotting: Initial plotting complete
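Most of the 3 hours and 20 minutes in this run is spent in a few long silences between log lines rather than spread evenly. A small sketch for finding such stalls, assuming each line starts with a timestamp in the format shown above; `largest_gaps` is a hypothetical helper, not a farmer feature.

```python
from datetime import datetime

def parse_ts(line):
    """Parse the leading timestamp, e.g. 2024-05-27T04:26:08.227048Z."""
    return datetime.strptime(line.split()[0], "%Y-%m-%dT%H:%M:%S.%fZ")

def largest_gaps(lines, top=3):
    """Return the `top` largest (gap, start, end) tuples between consecutive lines."""
    stamps = [parse_ts(line) for line in lines if line.strip()]
    gaps = [(later - earlier, earlier, later)
            for earlier, later in zip(stamps, stamps[1:])]
    return sorted(gaps, reverse=True)[:top]

# The last three lines of the log above.
SAMPLE_LINES = [
    "2024-05-27T06:06:04.091562Z INFO {farm_index=0}: subspace_farmer::single_disk_farm::plotting: Plotting sector (90.00% complete) sector_index=18",
    "2024-05-27T06:06:45.322035Z INFO {farm_index=0}: subspace_farmer::single_disk_farm::plotting: Plotting sector (95.00% complete) sector_index=19",
    "2024-05-27T07:46:47.125506Z INFO {farm_index=0}: subspace_farmer::single_disk_farm::plotting: Initial plotting complete",
]

if __name__ == "__main__":
    for gap, start, end in largest_gaps(SAMPLE_LINES):
        print(f"{gap} between {start.time()} and {end.time()}")
```

On the three sample lines the largest gap is the roughly 1 hour 40 minutes between the last sector starting and "Initial plotting complete".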
Hard to answer without knowing what machines you have, what networking you have between them, etc. This thread is already quite long, please create a separate topic and describe all the machines you have, networking between them and what components each of them is running.
System Specs
Alpha has a 12c Ryzen and runs only node, cache and controller + nats.
Farmers are all 7950X or HP Z840s with a v4 Xeon.
Errors
Controller
2024-06-03T12:57:01.149958Z INFO async_nats: event: slow consumers for subscription 1
The above message was getting spammed until I restarted my controller.
I can upload logs for other parts of the cluster if needed.
System Diagram:
I have detailed logs with a similar issue; I'll ping you if more information is needed.
Turns out I can benefit from logs with RUST_LOG=info,subspace_farmer=trace, thanks!
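For readers unfamiliar with that setting: the value is a comma-separated list of directives, here a default level of `info` plus `trace` for the `subspace_farmer` target. The sketch below only illustrates how such a string breaks down; it is a simplification that handles the plain `level` and `target=level` forms, not a reimplementation of the real filter syntax, and `parse_rust_log` is a made-up name.

```python
def parse_rust_log(value):
    """Split a RUST_LOG-style string into (default_level, {target: level}).

    Simplified: a bare item is treated as the default level, and
    `target=level` items set a per-target override.
    """
    default_level = None
    per_target = {}
    for directive in value.split(","):
        directive = directive.strip()
        if not directive:
            continue
        if "=" in directive:
            target, level = directive.split("=", 1)
            per_target[target] = level
        else:
            default_level = directive
    return default_level, per_target

if __name__ == "__main__":
    default, overrides = parse_rust_log("info,subspace_farmer=trace")
    print("default:", default)
    for target, level in overrides.items():
        print(f"{target}: {level}")
```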
It took about two days to replicate the slow consumers for subscription 1 errors. I rotated the logs and was able to capture it with just under one day of logs. I then realized I only had trace logging on the controller, cache, and plotter (standard logging on the farmers). The logs are a pretty good size: ~1.9GiB (~158MiB compressed).
If you'd like me to run this again with the farmers using extended logs, let me know.
Snapshot build #342
System Specs
Role: NATS (nats.log)
Dual Socket 2680v3 24 cores (48 threads)
Link: 20Gbit
Role: Cache, Controller, Plotter, Farmer (cache.log controller.log plotter.log farmer1.log)
Dual Socket 7742 128 cores (256 threads)
Cache: 100GiB
Plots: 109T
Link: 100Gbit
Role: Node (node.log)
EPYC 7F72 24 Cores (48 Threads)
Plots: 0
Link: 100Gbit
Role: Farmer (farmer3.log)
Dual Socket 2687v4 24 cores (48 threads)
Plots: 189T
Link: 20Gbit
Role: Farmer (farmer4.log)
Dual Socket 2697A 32 Cores (64 Threads)
Plots: 189T
Link: 20Gbit
Role: Farmer (farmer6.log)
Dual Socket 6136 24 Cores (48 Threads)
Plots: 91TB
Link: 20Gbit
I'm wondering if you'll be able to reproduce it with There are too many warning logs in the farmer cluster - #22 by duanyz_aiyo