Adapting farmer benchmarks for existing plots and integrating them into CLI

This thread is about adapting the existing benchmarks to use existing plots and integrating them into the farmer CLI.

I have almost no experience in Rust development, but I am currently in the midst of adapting the benchmarks to use existing plots and integrating them into the CLI.
I ran into some problems while doing this, namely with the CLI that Criterion exposes. To be honest, I have no idea how to implement both that and parameter passing (the path to the plot, for example). If it is possible to do without the Criterion CLI (at least temporarily), or to pass values through environment variables, everything becomes much easier.


I think we can go without Criterion for this. As described in More rewards on smaller plots - #5 by nazar-pc, it would probably be better to have a separate root command bench or benchmark on subspace-farmer; then we can have multiple subcommands, starting with audit, and have any CLI arguments we want there.

Criterion can be added later if necessary though. The initial goal would be to make the command read farm information and be able to audit it in isolation, without plotting or proving.

Somewhat complementary to this: when we add metrics to subspace-farmer, we’ll be able to expose things like audit time per farm there for easier analysis and visualization.
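
For illustration, a minimal sketch of what exposing such a metric could look like, assuming the prometheus_client crate and a made-up metric name (neither is specified in this thread):

use prometheus_client::encoding::text::encode;
use prometheus_client::metrics::histogram::{exponential_buckets, Histogram};
use prometheus_client::registry::Registry;
use std::time::Instant;

fn main() {
    let mut registry = Registry::default();
    // Hypothetical metric name; the real one would be decided when metrics land.
    let audit_time = Histogram::new(exponential_buckets(0.001, 2.0, 16));
    registry.register(
        "farm_audit_duration_seconds",
        "Time spent auditing a single farm",
        audit_time.clone(),
    );

    // Time one (stand-in) audit pass and record its duration.
    let start = Instant::now();
    std::thread::sleep(std::time::Duration::from_millis(5));
    audit_time.observe(start.elapsed().as_secs_f64());

    // Encode everything in the Prometheus text exposition format.
    let mut buffer = String::new();
    encode(&mut buffer, &registry).expect("encoding into a String does not fail");
    println!("{buffer}");
}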

I meant integrating the adapted Criterion benchmarks into the farmer CLI: subspace-farmer bench audit and so on. It’s just that Criterion exposes its own CLI by default, but it can be removed if it’s not really necessary.
Some of the arguments for configuring Criterion can be implemented independently. An example of a simple program that works:
src/main.rs:

mod audit;

use crate::audit::audit_benchmark;
use std::path::PathBuf;
use clap::{Parser, Subcommand};

#[derive(Debug, Parser)]
#[clap(about, version)]
enum Command {
    Bench {
        #[command(subcommand)]
        subcommand: BenchCommand
    },
}

#[derive(Debug, Parser)]
struct BenchArgs {
    base_path: PathBuf,
}

#[derive(Debug, Subcommand)]
enum BenchCommand {
    Audit(BenchArgs),
}

fn main() {
    let command = Command::parse();
    match command {
        Command::Bench { subcommand } => {
            match subcommand {
                BenchCommand::Audit(bench_args) => {
                    audit_benchmark(bench_args.base_path);
                }
            }
        }
    }
}

src/audit.rs:

use std::thread::sleep;
use std::time::Duration;
use criterion::{Criterion, SamplingMode, Throughput};
use std::path::PathBuf;

pub fn audit_benchmark(path: PathBuf) {
    let mut c = Criterion::default();
    println!("Started benchmarking with {:?}", path);
    let mut group = c.benchmark_group("auditing");
    group.throughput(Throughput::Elements(1));
    group.sample_size(10);
    group.sampling_mode(SamplingMode::Flat);
    group.bench_function("memory", |b| {
        b.iter(|| {
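            // The sleep is only a stand-in workload; the real benchmark would audit the plot at `path` here.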
            sleep(Duration::from_secs(1));
        })
    });
    group.finish();
    c.final_summary();
}
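
For completeness: with the clap structure above, this standalone example is invoked as cargo run -- bench audit /path/to/farm, where bench and audit map to the Bench and BenchCommand::Audit variants and the positional argument fills base_path.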

Generally Criterion parses arguments in a way that treats positional arguments as names of the benches and I’m not sure we need that. So I think the kind of approach you have there is the way to go.

I’ve finished adapting the audit benchmarks, but it’s pretty much raw code that needs refinement.
Should I submit a pull request in this case? If so, against which branch: main?
I’d also like to adapt the reading and proving benchmarks, if that makes sense.

I generally prefer to review finished code. If you want to make my life a bit easier and avoid a lot of back and forth, here is our contributing guide: Contributing to Subspace. The PR should be against the main branch; we can backport it into Gemini 3f later.

For reading and proving, the benchmark we have is fairly decent, but if you see something to improve there, I’m open to suggestions. I think it would generally be nice to have proving in subspace-farmer so you don’t have to compile the benchmarks from scratch, but the performance targets we have should be feasible on any modern desktop computer, so it’s not like you’d use those benchmarks very often.

There are some decisions in the code that are, in my opinion, debatable, but I don’t know the right way to go about them in this case. I can send code snippets directly here if you don’t want to review such code in a pull request.

My implementation uses PlotMetadataHeader to decode the metadata header, but this structure is private. The same goes for the RESERVED_PLOT_METADATA constant. Can I make the structure public within the crate?

Links to GitHub are preferred; you can select specific lines in the sidebar and copy a permalink:

[Screenshot: selecting lines in the GitHub sidebar and copying a permalink]

Hm… I think that would be undesirable; I’d rather see some nicer API for this, but I’ll have to spend some time thinking about it. It’s not trivial to design a good API, and we already need some refactoring there.

For the benchmarks, is it enough to support one plot, or does it make sense to implement benchmarks for multiple plots in parallel? For now, I’m using internal duplicates of PlotMetadataHeader and RESERVED_PLOT_METADATA as a not-so-good solution.
Here’s a link to the commit on GitHub.


I think one is fine unless requested by the community. Different farms operate independently, so assuming there is enough CPU/RAM on the machine, it’ll generally work about the same with multiple farms as with one.

I think it might be acceptable to make the data structures public, but mark them with #[doc(hidden)] to indicate they are not supposed to be used in general, and leave a TODO to create a better abstraction for them. Duplicating data structures will likely break at some point, which is not good.
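
For illustration, a minimal sketch of that suggestion; the struct’s fields and the constant’s value below are placeholders, not the actual subspace-farmer definitions:

// TODO: replace with a proper abstraction for reading plot metadata.
/// Not a public API: exposed only so the benchmark command can decode the
/// plot metadata header without duplicating the type.
#[doc(hidden)]
#[derive(Debug, Clone, Copy)]
pub struct PlotMetadataHeader {
    pub version: u8,
    pub plotted_sector_count: u16,
}

/// Not a public API.
#[doc(hidden)]
pub const RESERVED_PLOT_METADATA: u64 = 1024 * 1024;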

Left some superficial comments there.

I’ve fixed the code, taking the comments into account.

What am I supposed to do with it now? Should I submit a pull request?

That’s what I would do! I will take the blame if it’s wrong 🙂


@nazar-pc Is this how it’s supposed to work? The original proving benchmarks use sector_index = 0 for all sectors; is that intended? Also, the metadata of the first sector is used for all of them.

That benchmark is trivialized. In order to avoid plotting multiple sectors individually, which would make the benchmark very painful to run each time, it plots one sector and writes it many times, meaning the sector index the sectors are created for is the same even though the sectors are written at different physical offsets.
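
For illustration, a rough sketch of that trick using only the standard library; the file name and sector size are placeholders, not values from the real benchmark:

use std::fs::OpenOptions;
use std::io::{Seek, SeekFrom, Write};

fn main() -> std::io::Result<()> {
    // Placeholder sizes; the real sector size is much larger.
    const SECTOR_SIZE: usize = 1024;
    const SECTOR_COUNT: usize = 10;

    // Stand-in for the expensive plotting step, done exactly once
    // (always for sector_index == 0 in the original benchmark).
    let sector = vec![0xAAu8; SECTOR_SIZE];

    let mut plot = OpenOptions::new()
        .create(true)
        .write(true)
        .open("bench-plot.bin")?;

    // The same plotted sector is written at different physical offsets,
    // which is also why the first sector's metadata is reused for all of them.
    for i in 0..SECTOR_COUNT {
        plot.seek(SeekFrom::Start((i * SECTOR_SIZE) as u64))?;
        plot.write_all(&sector)?;
    }

    Ok(())
}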

Here is a PR with the audit benchmark implemented: Farmer audit benchmark by nazar-pc · Pull Request #2124 · subspace/subspace · GitHub
Even though main is not compatible with Gemini 3f, the benchmark can still be used against Gemini 3f farms (at least for now).