Hello, I have a node/farmer that uses an enterprise 18TB SATA HDD, with both the node and the farmer configured to run on the HDD only. It has been syncing for a while, and I noticed that the processes have created a lot of file descriptors.
Plot size 2T: 788 for subspace-farmer, 457 for subspace-node
Plot size 23T: 12918 for subspace-farmer, 647 for subspace-node
[srv_subspace@node-7d subspace]$ ls -l /proc/$(pidof subspace-farmer)/fd |wc -l
788
[srv_subspace@node-7d subspace]$ ls -l /proc/$(pidof subspace-node)/fd |wc -l
457
[srv_subspace@ma-plotter-02 subspace]$ ls -l /proc/$(pidof subspace-farmer)/fd |wc -l
12918
[srv_subspace@ma-plotter-02 subspace]$ ls -l /proc/$(pidof subspace-node)/fd |wc -l
647
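For anyone who wants to watch how the counts grow over time, here is a minimal shell sketch using the same approach as above (the output file name is just an example):

# log the file-descriptor count of both processes once a minute
while true; do
  echo "$(date -Is) farmer=$(ls /proc/$(pidof subspace-farmer)/fd | wc -l) node=$(ls /proc/$(pidof subspace-node)/fd | wc -l)" >> fd-count.log
  sleep 60
done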
I think as the project/community grows, we are going to need more comprehensive documentation on infrastructure and workflow.
I am not suggesting that the Subspace team should do the homework for whales (or ambitious farmers), but:
If this project gets the sort of attention that Chia received last year, there will be a lot of questions about infrastructure from newcomers and, specifically, about running larger farms. While a lot of experience and knowledge will carry over from Chia and similar projects in the storage space, there are important differences in how Subspace operates, and there are a few gotchas. This seems especially applicable to people with larger farms or growth ambitions.
The GUI client was clearly a way to simplify onboarding for beginners, and that’s important. But I think the community needs an infrastructure/scaling-focused knowledge base and a community channel (Discord?).
(I am new to this project so please take what I say with a grain of salt)
Your website says: "Energy-intensive mining and capital-intensive staking are replaced with a new form of disk-based farming." I don’t really see how Subspace is going to be disk farming, where "disk" has its usual meaning of a spindle/hard drive; a flash drive is not a disk. I have been running a couple of all-flash nodes successfully for days. The one farmer running on an HDD, the example used to create this thread, hasn’t synced yet; even with only a 2T plot it is too slow.
The IO profile right now is not a fit for HDD-based farming, except maybe in the sub-1T range. The node doesn’t seem to sync fast enough when running on an HDD.
I think you have to plot on SSD or some other storage with a decent amount of IOPS and throughput available, then move the plot to slower disks.
(not sure about the re-salting thing)
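A minimal sketch of how such a move could look, assuming the farmer keeps its plot in a single directory and is stopped during the copy (the paths and service name below are placeholders, not the farmer’s actual defaults):

# stop the farmer first so the plot files are not being written to
systemctl stop subspace-farmer      # or however the farmer is run in your setup
rsync -a --progress /mnt/ssd/subspace-farmer/ /mnt/hdd/subspace-farmer/
# then restart the farmer with its data directory pointed at the new path
# (however your setup specifies that)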
I moved a finished and synced 4.5T plot node/farmer from an SSD volume to a RAID-0 volume made of 5x HDD plus 4 SSDs for log and cache (a ZFS volume). It didn’t work; it couldn’t keep up with the chain.
My small temp node has been struggling to sync+plot 1.5T on a RAID5 (H710P) of 6x 10K 600G disks (granted, it’s a terrible setup, but I did not expect it to be as slow as it has been). I haven’t tried to plot Subspace on SSD or accelerate the disks in any way yet, as I’m still in the early exploratory stage with this, and I’m not sure what it will look like once the initial plotting process has finished.
Since you mentioned you’re running 4 SSDs for ZFS, I assume you’re well versed in ZFS performance tuning and the like (though RAID-0?), but have you tried running the same setup with a lighter filesystem and using those SSDs as a basic Linux disk cache? Just throwing darts in the dark.
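For reference, one way to do "SSDs as a basic Linux disk cache" outside of ZFS is bcache; a rough sketch, with placeholder device names (and note that make-bcache wipes the devices, so this is only an illustration):

# tie an SSD cache to an HDD-backed device with bcache
make-bcache -B /dev/md0 -C /dev/nvme0n1   # backing HDD array + caching SSD
mkfs.ext4 /dev/bcache0                    # lighter filesystem on top
mount /dev/bcache0 /mnt/subspace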
What do the IO metrics look like?
It would be great to get input from the team on this one. This is impactful if it’s not an isolated, setup-specific case.
Thanks for all of the feedback! The thing to consider is that we’re only running an incentivized testnet right now.
We know that users are facing some problems with node syncing on HDDs, especially bigger ones, and we already have a fix for that planned and in development. The farmer syncing algorithm is being reworked right now so that data will be synced peer-to-peer between farmers, which weakens the farmer’s dependency on the node and should also speed up initial sync and plotting a lot.
ZFS’s RAID-0 isn’t really called RAID-0… It’s simply a list of vdevs, one disk per vdev. Then I added SSDs for the ZFS log (a stripe of 2 SSDs) and SSDs for the ZFS read cache (a stripe of 2 SSDs):
2 SSDs for log (speeds up writes)
2 SSDs for cache (reads)
5 HDDs, each as its own vdev, which basically makes a RAID-0 of 5…
I set recordsize to 4K, since the Subspace website mentions millions of 4kB pieces in plots… The farmer didn’t even start, so I quickly dropped the idea; I never even got to the point of looking at IO stats.
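For anyone wanting to reproduce the layout described above, roughly these commands would build it (pool and device names are placeholders, and the 4K recordsize is just the experiment mentioned, not a recommendation):

# five single-disk vdevs -> striped pool ("RAID-0" of 5)
zpool create tank /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde
# striped SLOG of 2 SSDs and striped L2ARC of 2 SSDs
zpool add tank log /dev/nvme0n1 /dev/nvme1n1
zpool add tank cache /dev/nvme2n1 /dev/nvme3n1
# the recordsize experiment from above
zfs create -o recordsize=4K tank/subspace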
What would help is more predictability about the read and write workloads, almost like a benchmark. One cannot sit in front of a screen hoping that syncing kicks off (so you can look at the IO), or that plotting kicks off so you can look at the IO again.
You could capture stats into a log file with timestamps and compare them with the timestamps in the Subspace log. For example, zpool iostat -v 2 shows some first details of what’s happening on the HDDs and SSDs.
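Something like this would give timestamped samples that can be lined up with the Subspace log afterwards (the pool name is a placeholder):

# timestamped zpool stats every 2 seconds, appended to a file
zpool iostat -v -T d tank 2 >> zpool-iostat.log &
# plain-disk alternative without ZFS
iostat -x -t 2 >> iostat.log &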
I haven’t dug into the code yet, though I know some Rust. But from the log output I can infer a few things:
re: increasing open files… You don’t actually increase the limit; your log says you set the soft limit to the hard limit. Some users will still have a fairly low hard limit configured. I set mine to 1M.
subspace_farmer::utils: Increase file limit from soft to hard (limit is 1048576)
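A sketch of raising the hard limit so that "soft to hard" actually lands at 1M, assuming the farmer runs as a dedicated user or as a systemd unit (the user name is a placeholder):

# /etc/security/limits.conf (farmer running as user "subspace")
subspace  soft  nofile  1048576
subspace  hard  nofile  1048576

# or, for a systemd service, in the unit's [Service] section:
# LimitNOFILE=1048576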
I assume these lines mean you work with notifications; under the hood I guess something like inotify on Linux would achieve the same?
subspace_farmer::farming: Subscribing to slot info notifications
edit: I forgot to mention, I did look at plot.rs (subspace/plot.rs at main · subspace/subspace · GitHub) and found “random” in there related to processing. I didn’t dig further, but if processing random pieces leads to random IO, that is always bad for HDDs.
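If someone wants to quantify how much random reads hurt on a given plot disk, a quick read-only fio comparison is one way to get the benchmark-like numbers mentioned earlier (the plot file path is a placeholder):

# random 4k reads vs sequential 1M reads against an existing file, read-only
fio --name=randread --filename=/mnt/hdd/plot.bin --readonly --direct=1 --rw=randread --bs=4k --runtime=30 --time_based
fio --name=seqread --filename=/mnt/hdd/plot.bin --readonly --direct=1 --rw=read --bs=1M --runtime=30 --time_based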