Assuming the issue is the number of streams, you can shut down all Subspace software and try iperf3 with the -P NUM_STREAMS option. I believe the default is 1, but you can try setting it to 10 or 100 and see whether the network goes offline after some reasonable number.
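As a concrete sketch (assuming iperf3 is installed on both ends and you control a machine to run the server side; `SERVER` below is a placeholder for its address):

```shell
# On the remote machine: start an iperf3 server (listens on 5201 by default).
iperf3 -s

# On your home machine: ramp up the number of parallel streams with -P.
# The default is a single stream; Subspace peers can open many more.
iperf3 -c SERVER -P 10 -t 30    # 10 parallel streams for 30 seconds
iperf3 -c SERVER -P 100 -t 30   # 100 streams, near Subspace's per-instance cap
```

If the connection dies somewhere between those runs, that points at a stream/state limit rather than raw bandwidth.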
For your reference, each Subspace instance (both node and farmer) is currently configured for up to 100 outgoing streams by default, but I don’t think it actually reaches that very often.
Just did a test: with the maximum (128) streams I can push almost 1 Gbps, which is the limit of what that server can do.
Current state is that I have Space Acres + 1 node/farmer, and the network is still doing well. The person from the ISP said they had done something, so who knows. Today I will slowly turn on more farmers. I also switched my setup up a bit so that I have one node and each farmer connects to that single node.
Well, it still isn’t fixed, unfortunately. I’ll call them again today and see if I can talk to a tech or someone who can give me more information. It’s a busy week, but I’ll see if I can carve out some time.
The VPN solution still works, and I’m setting up a VLAN anyway for just my crypto servers and plan to put all my crypto on a VPN. My fiancé is tolerating my network crashes, but in the future I’d like to make sure only the crypto stuff is impacted.
The VPN is the holy grail. Syncing is at 50+ blocks/s and the piece cache syncs in just a few minutes. Zero impact on my home network. The VPN keeps up with at least 3 node/farmers.
The ISP would not tell me what the issue was that they apparently noticed the other day. They also would not admit to any sort of rate limiting or blocking of traffic. So who knows. The only solution for me was to set up a VPN.
I now have pfSense with an OpenVPN client set up, connected to an EC2 instance running an OpenVPN server. I have this set as a gateway in pfSense, and I set up an alias that routes specific IPs through the VPN. I wanted to do it by VLAN, which I will eventually, but some of the servers are on a non-VLAN-capable switch that is shared with regular devices I don’t want to route through the VPN. Working on a solution to that.
EDIT: I spoke too soon. I’m getting high CPU usage and packet loss. I think I will need to upgrade my pfSense box now.
EDIT 2: After the piece cache sync, CPU usage went down quite a bit, so I’m slowly starting up each server. So far I have 5 PCs farming and the network is holding up well.
Alright, one more update. By staggering my start-ups, I now have 8 farmers running on the VPN perfectly, with no impact on my home network. I still plan to upgrade my pfSense box, as CPU usage is hovering around 50–60%.
Which software and version are you using right now? I have some ideas on what to experiment with and can create a custom build for that version with a single change to see if it helps.
@repost I’m wondering if Snapshot build · subspace/subspace@483db85 · GitHub helps you in any way. If my current theory is correct, it should. But it would be great to first confirm that you still have issues with mar-18 before trying that one, so that we have a proper comparison.
I tested this build today, and something does seem to be working differently with respect to the network saturation issue. This morning I was trying to run Space Acres 0.1.11-1, and it was impacting the network really badly: dropped connections and timeouts. I switched over to the Ubuntu build; the node synced, then started farming/plotting. There is still some issue, I would say, but something is very different. Happy to continue to test and provide logs, etc. I am running tests on both Windows and Ubuntu. My sense is that the network issue is a bigger problem on Linux/Ubuntu than on Windows, but I have no firm evidence at this point. Running Datadog on both systems to monitor.
Everything after the red line is the new build.
There is no difference network-wise between Windows and Ubuntu from an architecture point of view.
It would be good if you could quantify “very different”, but either way this topic is not about Ubuntu vs Windows.
What I am curious about right now is whether, if you have issues with the CLI of the current latest release, switching to the above experimental build changes anything for you. Ideally we’d see that it stops the networking from breaking.
I have been running the experimental version for over 16 hours, with good results.
What is different from the Feb 19th version is that my network is not “stopping”. The Feb 19th version would crash my network completely, and I would have to stop running the CLI or Space Acres; I had given up on Subspace due to this issue. With this experimental version (Ubuntu) it is better, but I sense there is still something impacting the network that is not bandwidth. Not sure if you can share the change, so I can narrow my focus on what to monitor. Ping times are good and dropped packets have been zero in the past few hours.
The images below are this morning’s ping test and the Ubuntu host monitor. Happy to revert to the Mar 22nd version if that helps test your theory.
My suspicion is that it was never bandwidth, but rather churn of connections. And because you run Space Acres, it still impacts things the same way as before. Since we were using UDP, which doesn’t have “connections”, it creates firewall states that then expire over time, and apparently some routers’ capabilities and/or ISP limits are low enough to be overwhelmed by the number of states being generated.
That test build removes UDP/QUIC support from the software completely, switching to TCP instead. While the same number of connections is made, TCP has actual connections with explicit closing, so firewalls/routers can drop the corresponding states immediately (if they wish to; they don’t always).
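The distinction can be sketched with plain sockets (a local illustration, not Subspace code): a TCP `close()` signals teardown on the wire with a FIN, giving a stateful middlebox an explicit event to expire the flow on, while closing a UDP socket is purely local, so the middlebox has nothing to key off and must hold the state until an idle timeout.

```python
import socket
import threading

def tcp_echo_server(srv):
    # Accept one connection, echo the first message back, then close.
    conn, _ = srv.accept()
    conn.sendall(conn.recv(16))
    conn.close()

# TCP: three-way handshake on connect(), FIN exchange on close().
srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))
srv.listen(1)
t = threading.Thread(target=tcp_echo_server, args=(srv,))
t.start()

cli = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
cli.connect(srv.getsockname())
cli.sendall(b"ping")
echoed = cli.recv(16)
cli.close()          # sends FIN: a stateful firewall can drop the flow state now
t.join()
srv.close()

# UDP: no connection at all. A single datagram creates flow state in a
# stateful middlebox that only goes away when its idle timer expires.
usrv = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
usrv.bind(("127.0.0.1", 0))
ucli = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
ucli.sendto(b"ping", usrv.getsockname())
data, addr = usrv.recvfrom(16)
ucli.close()         # closes the local descriptor; nothing is signalled on the wire
usrv.close()
```

Multiply that UDP pattern by hundreds of short-lived peer exchanges and a small router’s state table fills up, which matches the “not bandwidth” symptoms described above.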
Would be great to get more confirmations before removing QUIC for everyone.
Thanks for the explanation. I was planning to set up a “better” router to confirm a few other things I am working on, so I’m happy to pull stats on that; should be within the next week.
I privately messaged you in January about the infinite retry loop bug in UDP connections in libp2p. It has finally been removed in the current version, along with QUIC. Perhaps it’s time to find out the reason behind the QUIC issue.
Since there was never really a need to use UDP specifically for data transmission, considering the QoS policies of various countries and other factors, I had wanted you to disable UDP back then.