Of all the disks in my PC that I've benchmarked so far, the concurrent-chunks prove method is always faster. I understand that when the benchmark runs, it takes 10 samples, so the result is accurate.
However, the logic in the Mar 18 release is that it takes concurrent chunks as the presumed faster method and runs it once per plot. If the result is under 3 s, it keeps concurrent chunks as the prove method; if not, it switches the prove method to whole sector.
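In other words (a rough sketch with names of my own invention, not the actual farmer code), the selection seems to boil down to:

```rust
use std::time::{Duration, Instant};

/// Threshold as described in this thread for the Mar 18 release.
const THRESHOLD: Duration = Duration::from_secs(3);

enum ProvingMethod {
    ConcurrentChunks,
    WholeSector,
}

/// Hypothetical illustration: time a single concurrent-chunks prove and
/// fall back to whole sector if it took too long, without ever timing
/// the whole-sector method itself.
fn pick_method(prove_concurrent_chunks: impl FnOnce()) -> ProvingMethod {
    let started = Instant::now();
    prove_concurrent_chunks(); // runs exactly once, no averaging
    if started.elapsed() < THRESHOLD {
        ProvingMethod::ConcurrentChunks
    } else {
        // Assumed to be the better choice, but never actually measured
        ProvingMethod::WholeSector
    }
}

fn main() {
    // Example: a fake prove that finishes in 100 ms stays on concurrent chunks
    let method = pick_method(|| std::thread::sleep(Duration::from_millis(100)));
    assert!(matches!(method, ProvingMethod::ConcurrentChunks));
}
```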
However, this is misleading. First, a single run can be inaccurate, especially with all the I/O from the back-to-back benchmark runs at startup. I sometimes see 1 or 2 of my disks switched to whole-sector proving even though they are the same disk model with the same connection type as the others, and those disks then miss every reward. This is painful, because I know that if the farmer uses whole-sector proving for that plot, I will never win a reward.
Second, if concurrent chunks takes more than 3 s on a disk, why doesn't the farmer also run a whole-sector prove and compare the times? What if the concurrent-chunks prove time at startup is 3.3 s but the whole-sector prove time is 10 s? Why does the farmer not test whole-sector proving at all, yet select it as the 'better' one? If the plot stays on concurrent chunks, it can at least still win some rewards later and miss some; if it is set to whole sector, it will always, always miss rewards.
I hope this is changed ASAP. I'm really missing rewards on some of my plots; otherwise I have to restart the farmer again and again until it completes the concurrent-chunks benchmark successfully for all disks at startup.
This behavior defeats the purpose of the benchmark run: more missed rewards (whenever a disk is set to whole-sector proving) and more wasted time for farmers like me (since I have to restart the farmer over and over).
It is not misleading; you are simply criticizing how you imagine things work, not how they actually work. See the source code for the actual, non-imaginary implementation.
Benchmarks do not run concurrently at start; they run exclusively, one by one. As you can see, the corresponding code is protected by barriers, and nothing happens on any of the farms while benchmarks are running.
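As a toy illustration only (this is not the farmer's actual code; see the source for that), the shape is a lock that serializes the benchmarks plus a barrier that holds every farm back until all of them are done:

```rust
use std::sync::{Arc, Barrier, Mutex};
use std::thread;

fn main() {
    let farms = 3;
    // Serializes the startup benchmarks: strictly one farm at a time.
    let bench_lock = Arc::new(Mutex::new(()));
    // Holds every farm back until all benchmarks have completed.
    let all_benchmarked = Arc::new(Barrier::new(farms));

    let handles: Vec<_> = (0..farms)
        .map(|id| {
            let bench_lock = Arc::clone(&bench_lock);
            let all_benchmarked = Arc::clone(&all_benchmarked);
            thread::spawn(move || {
                {
                    let _exclusive = bench_lock.lock().unwrap();
                    println!("benchmarking farm {id} (nothing else running)");
                }
                all_benchmarked.wait();
                println!("farm {id} starts farming");
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }
}
```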
Are you sure? Whole sector is less efficient, but on most SSDs it should still be able to win. The only reason for the farmer to pick it is that reading random chunks was too slow, which means proving would miss anyway if that happened.
It is an optimization to save startup time. In many cases problematic SSDs take MANY seconds to read chunks, so running that 3 times to compare results could mean waiting a whole minute benchmarking just one farm. And some users have tens of farms configured; they wouldn't want to wait an hour for the farmer to start.
I'm afraid you have other issues to resolve then; it doesn't pick whole sector for no reason. You absolutely did have some issue on that machine for the READ alone (not even the whole proving) to take over 3 seconds. That is something you need to debug and address.
Is it not still perfectly possible to win with 3.1 s proving, if the audit goes fast?
It would seem to me that if you're not going to actually benchmark whole sector, it should only be the default when the concurrent-chunks method cannot win. Raising that threshold to 4 s seems a reasonable request.
The benchmark command takes 10 samples, is that correct? I have a disk of the same model and same connection (SATA directly to the motherboard) as the others, but it misses rewards 100% of the time. Then I figured out it had failed at startup: when the farmer ran the concurrent-chunks check, the prove took more than 3 s for some reason, so the plot was assigned the whole-sector method.
I am sure the assumption "if concurrent chunks fails at farmer startup, there is surely nothing to lose in assigning the whole-sector method to the plot" is misleading, because I have run the benchmark for this disk and the result below is very clear.
PS C:\Subspace Farmer> .\subspace-farmer3h_18Mar benchmark prove L:\1
Benchmarking prove/plot/rayon/unbuffered/concurrent-chunks
Benchmarking prove/plot/rayon/unbuffered/concurrent-chunks: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 17.1s.
Benchmarking prove/plot/rayon/unbuffered/concurrent-chunks: Collecting 10 samples in estimated 17.073 s (10 iterations)
Benchmarking prove/plot/rayon/unbuffered/concurrent-chunks: Analyzing
prove/plot/rayon/unbuffered/concurrent-chunks
time: [1.7015 s 1.7338 s 1.7650 s]
change: [+96.532% +101.21% +105.65%] (p = 0.00 < 0.05)
Performance has regressed.
Benchmarking prove/plot/rayon/unbuffered/whole-sector
Benchmarking prove/plot/rayon/unbuffered/whole-sector: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 45.9s.
Benchmarking prove/plot/rayon/unbuffered/whole-sector: Collecting 10 samples in estimated 45.903 s (10 iterations)
Benchmarking prove/plot/rayon/unbuffered/whole-sector: Analyzing
prove/plot/rayon/unbuffered/whole-sector
time: [4.3469 s 4.4422 s 4.5344 s]
change: [-13.564% -10.446% -7.0690%] (p = 0.00 < 0.05)
Performance has improved.
Benchmarking prove/plot/rayon/regular/concurrent-chunks
Benchmarking prove/plot/rayon/regular/concurrent-chunks: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 20.8s.
Benchmarking prove/plot/rayon/regular/concurrent-chunks: Collecting 10 samples in estimated 20.793 s (10 iterations)
Benchmarking prove/plot/rayon/regular/concurrent-chunks: Analyzing
prove/plot/rayon/regular/concurrent-chunks
time: [1.7049 s 1.7269 s 1.7477 s]
change: [+95.397% +99.058% +102.62%] (p = 0.00 < 0.05)
Performance has regressed.
Benchmarking prove/plot/rayon/regular/whole-sector
Benchmarking prove/plot/rayon/regular/whole-sector: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 39.9s.
Benchmarking prove/plot/rayon/regular/whole-sector: Collecting 10 samples in estimated 39.885 s (10 iterations)
Benchmarking prove/plot/rayon/regular/whole-sector: Analyzing
prove/plot/rayon/regular/whole-sector
time: [3.9536 s 4.2475 s 4.7334 s]
change: [-4.3274% +18.032% +44.805%] (p = 0.15 > 0.05)
No change in performance detected.
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) high severe
Now I know why Mar 11 had a worse miss rate than Mar 8: this logic was simply introduced in Mar 11.
Yes, I sound a bit critical. But I've been staring at that disk, which shows normal active time like the others, same model, same connection, and it has missed 8/8 over more than a day, not a single win. I had to think about selling it. Maybe now you get my feeling.
So either way, I want an additional check with the whole-sector method whenever a plot goes beyond 3 s with the concurrent-chunks method, with the farmer deciding which is better after running both. Or let us farmers decide the prove method ourselves, without the blind assumption.
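Concretely, the fallback I'm asking for would look roughly like this (a hypothetical sketch, all names are mine):

```rust
use std::time::{Duration, Instant};

const THRESHOLD: Duration = Duration::from_secs(3);

enum ProvingMethod {
    ConcurrentChunks,
    WholeSector,
}

/// Hypothetical: only when concurrent chunks misses the threshold,
/// time whole sector too and keep whichever was actually faster.
fn pick_method(
    prove_concurrent_chunks: impl FnOnce(),
    prove_whole_sector: impl FnOnce(),
) -> ProvingMethod {
    let started = Instant::now();
    prove_concurrent_chunks();
    let chunks_time = started.elapsed();
    if chunks_time < THRESHOLD {
        return ProvingMethod::ConcurrentChunks;
    }
    let started = Instant::now();
    prove_whole_sector();
    if started.elapsed() < chunks_time {
        ProvingMethod::WholeSector
    } else {
        // e.g. chunks at 3.3 s still beats whole sector at 10 s
        ProvingMethod::ConcurrentChunks
    }
}

fn main() {
    let method = pick_method(
        || std::thread::sleep(Duration::from_millis(3500)), // slow chunks
        || std::thread::sleep(Duration::from_millis(100)),  // faster whole sector
    );
    assert!(matches!(method, ProvingMethod::WholeSector));
}
```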
Well, the line had to be drawn somewhere, and 4 s is already way too late. Any modern SSD should be able to read a few hundred MiB/s, so even a SATA drive that can't reach 500 MiB/s will still prove successfully. If you have something that reads at less than ~280 MiB/s sequentially, you must have serious issues somewhere.
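For reference, a back-of-envelope behind those figures (my own arithmetic; the ~1 GiB sector size read sequentially for whole-sector proving is my assumption here):

```rust
fn main() {
    let sector_mib = 1024.0_f64; // assumed ~1 GiB sector
    for secs in [3.0_f64, 4.0] {
        println!("{secs} s budget -> {:.0} MiB/s needed", sector_mib / secs);
    }
    // 3 s -> ~341 MiB/s, 4 s -> 256 MiB/s, i.e. the ~280 MiB/s ballpark
}
```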
No, that is something you imagined; I have no idea where you got that. Check the code I linked, line 2199.
That is a heuristic chosen for logical reasons. It is certainly unexpected for an SSD to be unable to read 1 GiB of data sequentially in 4 seconds, but I believe you that it happens in practice.
That sounds reasonable, thanks for the feedback. I'll try to implement something better in the next version.
If the benchmark at farmer start produces significantly slower results than the benchmark from the CLI, as those results seem to suggest, that's certainly an issue that should be looked into.
However, I think that no matter what other opinions there are on this subject, in the long run it would benefit everyone to have a startup parameter specifying the method. The vast majority of people seem to do much better with concurrent chunks on every disk, so having that as a farm-level parameter would let them skip the benchmarking on every run. (Of course, I think it'd be great if we could start adding plot-level parameters too, where that makes sense.)
I also wonder if the whole-sector method only ever showed better results due to the Windows bugs that have now been fixed, rendering it no longer necessary.
Fair enough. Query: is there a technical reason that makes it difficult to specify plot-level parameters aside from size? If you did add a --method parameter, it would be ideal to have it at the plot level rather than the farm level. (I guess the format would be ",method=chunk" instead of a --param.)
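To illustrate the idea (entirely hypothetical; the key name and accepted values are made up), such a plot-level override could be parsed out of the existing comma-separated farm spec:

```rust
#[derive(Debug, PartialEq)]
enum Method {
    Chunk,
    WholeSector,
    Auto, // no override: keep the automatic startup benchmark
}

/// Hypothetical parser for a "path,size=...,method=chunk" style spec.
fn parse_method(farm_spec: &str) -> Method {
    for part in farm_spec.split(',').skip(1) {
        if let Some(value) = part.strip_prefix("method=") {
            return match value {
                "chunk" => Method::Chunk,
                "whole-sector" => Method::WholeSector,
                _ => Method::Auto,
            };
        }
    }
    Method::Auto
}

fn main() {
    assert_eq!(parse_method(r"L:\1,size=4T,method=chunk"), Method::Chunk);
    assert_eq!(parse_method(r"D:\plot,size=4T"), Method::Auto);
}
```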
Ideally the farmer would reduce the number of parameters over time rather than adding even more. It should be possible to make most decisions automatically, and maintaining countless options increases the maintenance burden of ensuring they all work in every possible combination.
The separate benchmarking command and the internal benchmarking the farmer does during startup are completely different things that have almost nothing in common.
Dedicated benchmarking does actual proving in a more or less scientific way.
The internal benchmarking the farmer does on startup tries to quickly and roughly identify the best approach without running a lengthy benchmark on every single farm, because then some users would wait an hour before the farmer actually starts farming after a restart, and might refuse to upgrade as often given such massive downtime.
So you've got my dedicated benchmark result for the disk (shared above): it was assigned the whole-sector method and missed 8/8 (100%) after almost 2 days.
I bought it, and then I had to think about selling it off.
I spent time plotting to it, and then had to copy the plot to another disk. That is done.
Then I realised, wow, it was not working the way it should, after running one last benchmark before kissing it goodbye.
I'm sorry, but it hurt me a lot. I hope you can improve this in the next release.
Well, I do get that, but given that the automation in this case incurs a not-totally-insignificant delay on every single farmer start, AND, as you said, the benchmark used for this automation is less scientific and therefore less accurate than the dedicated benchmark, it seems to me that the benefits of letting users set their preference manually outweigh the cons. IMO, unless and until a perfectly accurate and trivially fast benchmark method that works on every possible piece of hardware becomes reality, the ability to set the method manually will inevitably be deemed necessary.
Still early, but the experimental build is looking promising so far... I already have multiple rewards on one particular drive that was benchmarked as too slow and previously saw no success for days...
I'm running the test build and it seems fine so far, but I feel it doesn't truly address the issue.
The root cause is that, due to some factors (I couldn't narrow it down so far, but I suspect either Windows shenanigans, controller firmware or system driver caching issues, or internal drive issues or firmware routines like garbage collection / write amplification; the last one may be caused by the drive being completely full), an SSD can have bouts of extremely poor performance for ~15 seconds every 10-15 minutes.
Therefore, having the farmer automatically run a benchmark on startup will eventually produce incorrect results and wrong farming modes for people who have this kind of problematic drive, like I do.