More rewards on smaller plots

We already provide a random-access hint, which tells the kernel there is no point in reading ahead or caching anything, since we'll be accessing random parts of the file anyway. At least that is how I read the docs; feel free to play with the flags and let me know what you learn. I do think you should generate random challenges to make it more realistic, though, and the benchmark that is already in the repo should probably do the same.

I added randomisation of global_challenge in each round. Now it takes ~1.79 seconds to audit all 936 sectors. IOWAIT is quite high. The parallel audit still shows ~46ms.
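For illustration, a benchmark loop that draws a fresh challenge every round could look roughly like the sketch below. This is not the code from the repo: `SplitMix64`, `next_challenge` and `bench_audit` are hypothetical names, the 32-byte challenge size is an assumption, and the tiny PRNG is there only so the example needs no external crates (a real benchmark would more likely use the `rand` crate).

```rust
use std::time::{Duration, Instant};

/// SplitMix64: a tiny deterministic PRNG, used here only so the sketch
/// needs no external crates.
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Produces a new challenge; the 32-byte size is an assumption of this sketch.
    fn next_challenge(&mut self) -> [u8; 32] {
        let mut challenge = [0u8; 32];
        for chunk in challenge.chunks_exact_mut(8) {
            chunk.copy_from_slice(&self.next_u64().to_le_bytes());
        }
        challenge
    }
}

/// Runs `rounds` audit rounds, each with a freshly generated challenge,
/// and returns how long each round took. `audit` is a hypothetical
/// callback standing in for the real audit of all sectors.
fn bench_audit<F>(seed: u64, rounds: usize, mut audit: F) -> Vec<Duration>
where
    F: FnMut(&[u8; 32]),
{
    let mut rng = SplitMix64(seed);
    (0..rounds)
        .map(|_| {
            // A new challenge per round, so no round hits the same file offsets.
            let global_challenge = rng.next_challenge();
            let start = Instant::now();
            audit(&global_challenge);
            start.elapsed()
        })
        .collect()
}
```

Calling `bench_audit(seed, rounds, |challenge| { /* audit all sectors */ })` then gives one timing per round, and because the challenge changes every round, pages cached by an earlier round are less likely to flatter the numbers.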

MADV_RANDOM, as I remember, just prevents unneeded read-ahead. MADV_DONTNEED signals that the data is no longer needed, so the cache can be dropped even when there is no memory pressure, but it didn't help, as expected.

Okay, so we do need to parallelize it. That is helpful; I'll see how to do it. I'd like the threads to still be named after their plots for easier debugging, although simply using rayon would of course be easier.
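To make the "threads named according to plots" idea concrete, here is a minimal sketch using only the standard library's scoped threads. It is not what the farmer or the pull request does: `audit_plot`, `audit_plots_in_parallel`, the `audit-plot-{index}` naming scheme and the toy matching rule are all hypothetical, and the real audit would read from disk instead of filtering numbers in memory.

```rust
use std::thread;

/// Hypothetical stand-in for the real audit: counts sectors whose value
/// "matches" the challenge. The real farmer would read from disk here.
fn audit_plot(sectors: &[u64], global_challenge: u64) -> usize {
    sectors
        .iter()
        .filter(|&&sector| (sector ^ global_challenge) % 4 == 0)
        .count()
}

/// Audits every plot on its own named thread and returns, per plot,
/// the name of the thread that did the work and the number of matches.
fn audit_plots_in_parallel(plots: &[Vec<u64>], global_challenge: u64) -> Vec<(String, usize)> {
    thread::scope(|scope| {
        let handles: Vec<_> = plots
            .iter()
            .enumerate()
            .map(|(index, sectors)| {
                thread::Builder::new()
                    // The thread name shows up in debuggers and panic messages.
                    .name(format!("audit-plot-{index}"))
                    .spawn_scoped(scope, move || {
                        let name = thread::current()
                            .name()
                            .unwrap_or("unnamed")
                            .to_string();
                        (name, audit_plot(sectors, global_challenge))
                    })
                    .expect("failed to spawn audit thread")
            })
            .collect();

        // Join in plot order so results line up with the input slice.
        handles
            .into_iter()
            .map(|handle| handle.join().expect("audit thread panicked"))
            .collect()
    })
}
```

One thread per plot keeps the names meaningful, but it parallelizes across plots only; a work-stealing pool such as rayon can also spread the sectors of a single large plot across cores, at the cost of generic worker names.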

Here is a snapshot build (still in progress) with parallel auditing as implemented in "Parallelize audit across multiple cores" by nazar-pc (Pull Request #1944, subspace/subspace): Snapshot build, subspace/subspace@81ed094

Once the CI run finishes, both executables will be attached to the workflow artifacts and container images will also be published. Give it a try and let me know how it works.

Once the team reviews it and there is positive feedback, we'll make another Gemini 3f release with this change included.

Thanks for testing and actionable feedback!

I’ve started the farmer with these changes. So far I notice a slight decrease in the average IOWAIT, but it takes time to get results.

One more thing, the farmer earned 2 rewards almost right away. That looks promising.

+9 rewards. That’s an all-time record. The farmer used to earn 3-4 rewards per hour, but here it’s as many as 9 in about 15 minutes.

Does it make sense and is it possible to implement parallelism for proving?

Proving is already highly parallel!

It was a major bottleneck initially and was very heavily optimized. In fact it sacrifices efficiency (by burning more CPU than would be necessary otherwise) and memory usage to achieve lower latency.

We started with something like 1.5 seconds for proving on my machine (Core i9-13900K with 8C/16T performance cores at 5.5GHz and 16 efficiency cores at 4.3GHz); now the same proving takes 150ms (in memory), and we know we can still do better, it is just very hard to actually get there. Disk reads are also parallelized there.

Yes, I noticed that initially, but decided to give it a try nonetheless.

I foolishly tried to parallelise it using Rayon and, on my machine, got roughly a 2x reduction in the proving benchmark time on 48 sectors.
11.7s/43.6s → 6.6s

Initially the proving benchmark time on 48 sectors was around 11.7 seconds, but now it is around 43.6 seconds. I can’t tell the exact reason at the moment.

You are unlikely to have 48 sectors to prove at once on a real network with a lot of space, so there is no point in optimizing throughput. As I mentioned above, it is designed to achieve lower latency for a single sector instead; that was the design goal.

Got it, thank you for the clarification.

Increasing the number of Rayon threads to values much higher than the number of threads in the system (via the RAYON_NUM_THREADS environment variable) significantly reduces IOWAIT and thus increases CPU utilisation. In auditing, the most costly part is reading, and SSDs can only benefit from parallel operations (if supported by the filesystem), so it helps. It helps a lot.

By setting the value to 3 times the number of threads in the system, I achieved an average IOWAIT of around 0.1%, with occasional peaks of 0.3-0.4%.
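The effect can be sketched without Rayon: when workers spend most of their time blocked on reads, running more workers than logical CPUs keeps more I/O requests in flight. The example below is only an illustration of that idea with plain standard-library threads, not how Rayon or the farmer is implemented; `oversubscribed_threads` and `read_sectors_oversubscribed` are hypothetical names, and `read_sector` stands in for a blocking disk read.

```rust
use std::num::NonZeroUsize;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

/// Number of worker threads when oversubscribing the logical CPUs by `factor`
/// (for example 3, as in the experiment described above).
fn oversubscribed_threads(factor: usize) -> usize {
    let logical_cpus = thread::available_parallelism()
        .map(NonZeroUsize::get)
        .unwrap_or(1);
    logical_cpus * factor.max(1)
}

/// Calls `read_sector` for every index in `0..sector_count` using `threads`
/// workers and returns the sum of the results. Workers pull the next index
/// from a shared counter, so a worker blocked on a slow read does not hold
/// up the others.
fn read_sectors_oversubscribed<F>(sector_count: usize, threads: usize, read_sector: F) -> u64
where
    F: Fn(usize) -> u64 + Sync,
{
    let next = AtomicUsize::new(0);
    thread::scope(|scope| {
        let workers: Vec<_> = (0..threads.max(1))
            .map(|_| {
                scope.spawn(|| {
                    let mut local_sum = 0u64;
                    loop {
                        let index = next.fetch_add(1, Ordering::Relaxed);
                        if index >= sector_count {
                            break local_sum;
                        }
                        // In a real farmer this would be a blocking disk read.
                        local_sum += read_sector(index);
                    }
                })
            })
            .collect();

        workers
            .into_iter()
            .map(|worker| worker.join().expect("worker thread panicked"))
            .sum::<u64>()
    })
}
```

With Rayon itself no code change is needed for this experiment: RAYON_NUM_THREADS sets the size of the global pool when it is set before the pool is first used.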

The goal is not to reduce IOWAIT to zero, though. We'll probably have to look into changing the way the farmer works. Right now it is designed to use memory-mapped I/O, but maybe that was not such a good decision after all.

@nazar-pc Parallel auditing has almost completely removed the impact of plotting on farming. Plotting used to cut the number of rewards a farmer earns by half (and sometimes even more), but now the rewards are either the same or the reduction is minimal.

Great. I have also created the issue "Make plot reads, at least for auditing, async" (Issue #1946, subspace/subspace). It'll be a bit tricky and maybe awkward to do, but it should remove the need to create many threads in the first place and will improve efficiency.

Thanks for testing and all the feedback!

Here is the official release: Release gemini-3f-2023-sep-11 (subspace/subspace on GitHub)
