This is a follow-up to a post I made in the general-chat channel on Discord earlier today.
While the minimum system requirements are very clear, it is not clear how to scale up a large farm. Based on my personal experience, along with questions and answers on Discord, I have been able to compile a few bits of information. Some of those bits are:
- Large plots take up tons of RAM, at least initially. 16 TB of plots (4 x 4 TB SSDs) takes up 222 GB of RAM on my server initially and then settles at about 80-82 GB used after 20-30 minutes.
- Internet usage is massive. Today I saw the receive rate on my NIC stay above 100 Mbps for a good 20 minutes, even hitting 200 Mbps at one point. Checking my usage at my ISP, it looks like I used an extra 3 TB of data during my last testing phase, which lasted a few weeks.
- It appears the node process is largely single-threaded and is limited by the speed of the core it is running on. It does take a considerable amount of time to sync up. I am getting about 2-4 blocks per second.
- The farmer has very brief periods of heavy CPU usage and is multi-threaded to some extent. It usually hovers around 20% of the processor and will only use one processor in a multi-processor system.
- Following posted recommendations, I am not using RAIDed SSDs. SATA SSDs appear to work fine, as I have never seen usage higher than a few hundred MB/s. Most of the time the drives are sitting idle while plotting (and farming). Every once in a while I will see all of them with a few hundred KB/s of usage.
- I have yet to determine what the bottlenecks are, other than the initial RAM usage when the farmer process starts. Most of the time (i.e. 95% or more), I see plenty of available RAM, CPU, disk I/O, and network bandwidth.
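To put the numbers above in per-unit terms, here is a quick back-of-the-envelope sketch in Python. It only uses the figures from my own observations; the 21-day duration is an assumption on my part (I only know the test phase lasted "a few weeks"), and the helper names are just made up for illustration.

```python
# Figures observed on my server (from the list above).
PLOT_TB = 16          # 4 x 4 TB SSDs
RAM_PEAK_GB = 222     # RAM used right after the farmer starts
RAM_STEADY_GB = 81    # midpoint of the 80-82 GB it settles at


def ram_per_tb(ram_gb, plot_tb):
    """GB of RAM used per TB of plot space."""
    return ram_gb / plot_tb


def transfer_gb(rate_mbps, minutes):
    """Data moved (decimal GB) at a sustained rate in megabits per second."""
    return rate_mbps * minutes * 60 / 8 / 1000


def average_mbps(total_tb, days):
    """Average rate (megabits per second) needed to move total_tb in `days` days."""
    megabits = total_tb * 1_000_000 * 8
    return megabits / (days * 86_400)


print(f"peak RAM per TB:   {ram_per_tb(RAM_PEAK_GB, PLOT_TB):.1f} GB")
print(f"steady RAM per TB: {ram_per_tb(RAM_STEADY_GB, PLOT_TB):.1f} GB")
print(f"20 min at 100 Mbps: {transfer_gb(100, 20):.0f} GB")
# ASSUMPTION: 21 days stands in for "a few weeks"; the real duration may differ.
print(f"3 TB over 21 days:  {average_mbps(3, 21):.1f} Mbps average")
```

So the initial spike works out to roughly 14 GB of RAM per TB plotted, settling to about 5 GB per TB, and a 20-minute burst at 100 Mbps is around 15 GB of download. If the 3 TB really was spread over about three weeks, that averages to roughly 13 Mbps around the clock.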
As I mentioned earlier in this post, I am interested in determining how to scale things up, and I have a ton of questions. Would it be best to add more SSDs to an existing PC or to run more PCs? Based on my observations, right now we are limited by RAM, but I have a few older workstations with 512 GB of RAM and Ivy Bridge processors. Could I run 32 TB on them? Is it always better to run one node process and multiple farmers? Is it better to use the largest plot size you can fit on an SSD, or should you split the plot into different files? Right now I am not CPU limited, but I have less than 1% of 16 TB plotted. Are CPU or RAM requirements going to go up the farther I get? Is there anything I can do to increase my plotting speed? When a network is brand new, there is a huge advantage in getting in early. But with such a slow plotting speed, does it make sense to run as many different farmers as possible with smaller (or fewer) drives to be able to plot quicker? How is internet bandwidth being used here? Are larger plots taking up more bandwidth?
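For the 32 TB question specifically, here is a rough Python sketch of what I would expect if RAM usage scales linearly with plot size. That linear scaling is purely my assumption (it is exactly what I am hoping the developers can confirm or correct), and `projected_ram_gb` is just a made-up helper name.

```python
def projected_ram_gb(target_tb, measured_ram_gb, measured_tb=16):
    """Scale a RAM measurement to a different plot size.

    ASSUMPTION: RAM usage grows linearly with plotted TB. This has not
    been confirmed; it is only an extrapolation from one 16 TB data point.
    """
    return measured_ram_gb * target_tb / measured_tb


WORKSTATION_RAM_GB = 512  # the older Ivy Bridge workstations

peak = projected_ram_gb(32, measured_ram_gb=222)    # initial spike
steady = projected_ram_gb(32, measured_ram_gb=81)   # after 20-30 minutes

print(f"projected peak:   {peak:.0f} GB")
print(f"projected steady: {steady:.0f} GB")
print(f"fits in {WORKSTATION_RAM_GB} GB at peak: {peak <= WORKSTATION_RAM_GB}")
```

Under that assumption the startup spike would be about 444 GB and the steady state about 162 GB, so 32 TB would just fit in 512 GB with around 68 GB to spare during the spike. If the real scaling is worse than linear, it would not fit.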
Tons of questions here, and I can understand if they can’t all be answered right away. But now that a date has been announced for the incentivized testnet, I am sure there are many people who are interested in the answers. I would appreciate some guidance from the developers here.
Thank you!