czbenchmarks.cli.cli_run
Module Contents
- czbenchmarks.cli.cli_run.VALID_OUTPUT_FORMATS = ['json', 'yaml']
- czbenchmarks.cli.cli_run.DEFAULT_OUTPUT_FORMAT = 'json'
- czbenchmarks.cli.cli_run.add_arguments(parser: argparse.ArgumentParser) → None [source]
Add run command arguments to the parser.
- czbenchmarks.cli.cli_run.main(parsed_args: argparse.Namespace) → None [source]
Execute a series of tasks using multiple models on a collection of datasets.
This function handles the benchmarking process by iterating over the specified datasets, running inference with the provided models to generate results, and running the tasks to evaluate the generated outputs.
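The control flow this describes (iterate datasets, run each model-argument combination, then evaluate each task) can be sketched generically. Every name below is hypothetical; this is a minimal illustration of the described flow, not the module's actual implementation:

```python
def benchmark(datasets, models, tasks, run_inference, run_task):
    """Iterate datasets x (model, args) pairs, run inference, then evaluate tasks."""
    results = []
    for dataset in datasets:
        for model, args in models:
            # produce a processed dataset (embeddings etc.) for this combination
            processed = run_inference(dataset, model, args)
            for task in tasks:
                # each task may emit one or more results
                results.extend(run_task(task, processed))
    return results
```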
- czbenchmarks.cli.cli_run.run(dataset_names: list[str], model_args: list[czbenchmarks.cli.types.ModelArgs], task_args: list[czbenchmarks.cli.types.TaskArgs], cache_options: czbenchmarks.cli.types.CacheOptions) → list[czbenchmarks.cli.types.TaskResult] [source]
Run a set of tasks against a set of datasets. Runs inference if any model_args are specified.
- czbenchmarks.cli.cli_run.run_with_inference(dataset_names: list[str], model_args: list[czbenchmarks.cli.types.ModelArgs], task_args: list[czbenchmarks.cli.types.TaskArgs], cache_options: czbenchmarks.cli.types.CacheOptions) → list[czbenchmarks.cli.types.TaskResult] [source]
Execute a series of tasks using multiple models on a collection of datasets.
This function handles the benchmarking process by iterating over the specified datasets, running inference with the provided models to generate results, and running the tasks to evaluate the generated outputs.
- czbenchmarks.cli.cli_run.run_inference_or_load_from_cache(dataset_name: str, *, model_name: str, model_args: czbenchmarks.cli.types.ModelArgsDict, cache_options: czbenchmarks.cli.types.CacheOptions) → czbenchmarks.datasets.base.BaseDataset [source]
Load the processed dataset from the cache if it exists, else run inference and save to cache.
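This is the standard load-or-compute (cache-aside) pattern. A minimal generic sketch, assuming a pickle-serializable result and a local file cache; the helper names and serialization format are illustrative, not this module's:

```python
import pickle
from pathlib import Path

def load_or_compute(cache_path: Path, compute):
    """Return the cached object if present, else compute it, cache it, and return it."""
    if cache_path.exists():
        return pickle.loads(cache_path.read_bytes())
    result = compute()
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    cache_path.write_bytes(pickle.dumps(result))
    return result
```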
- czbenchmarks.cli.cli_run.run_without_inference(dataset_names: list[str], task_args: list[czbenchmarks.cli.types.TaskArgs]) → list[czbenchmarks.cli.types.TaskResult] [source]
Run a set of tasks directly against raw datasets without first running model inference.
- czbenchmarks.cli.cli_run.run_multi_dataset_task(dataset_names: list[str], embeddings: list[czbenchmarks.datasets.base.BaseDataset], model_args: dict[str, czbenchmarks.cli.types.ModelArgsDict], task_args: czbenchmarks.cli.types.TaskArgs) → list[czbenchmarks.cli.types.TaskResult] [source]
Run a task that operates across multiple datasets and return the results.
- czbenchmarks.cli.cli_run.run_task(dataset_name: str, dataset: czbenchmarks.datasets.base.BaseDataset, model_args: dict[str, czbenchmarks.cli.types.ModelArgsDict], task_args: czbenchmarks.cli.types.TaskArgs) → list[czbenchmarks.cli.types.TaskResult] [source]
Run a task and return the results.
- czbenchmarks.cli.cli_run.get_model_arg_permutations(model_args: list[czbenchmarks.cli.types.ModelArgs]) → dict[str, list[czbenchmarks.cli.types.ModelArgsDict]] [source]
Generate all the “permutations” of model arguments to run for each dataset. E.g., running 2 variants of scgenept at 2 chunk sizes yields 4 permutations.
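The expansion is a cartesian product over each argument's candidate values; a standalone sketch of the idea (not the library's implementation, and the argument names are made up):

```python
from itertools import product

def arg_permutations(arg_values: dict[str, list]) -> list[dict]:
    """Expand {arg: [values, ...]} into one dict per combination of values."""
    keys = list(arg_values)
    return [
        dict(zip(keys, combo))
        for combo in product(*(arg_values[k] for k in keys))
    ]

# 2 variants x 2 chunk sizes -> 4 permutations, as in the docstring's example
perms = arg_permutations({"variant": ["a", "b"], "chunk_size": [512, 1024]})
```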
- czbenchmarks.cli.cli_run.write_results(task_results: list[czbenchmarks.cli.types.TaskResult], *, cache_options: czbenchmarks.cli.types.CacheOptions, output_format: str = DEFAULT_OUTPUT_FORMAT, output_file: str | None = None) → None [source]
Format and write results to the given directory or file.
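Given VALID_OUTPUT_FORMATS = ['json', 'yaml'], the formatting step can be sketched as follows. This is a simplified stand-in for write_results, not its actual implementation; treating results as plain dicts is an assumption:

```python
import json

def format_results(results: list[dict], output_format: str = "json") -> str:
    """Serialize results in one of the supported output formats."""
    if output_format == "json":
        return json.dumps(results, indent=2)
    if output_format == "yaml":
        import yaml  # optional dependency in this sketch
        return yaml.safe_dump(results)
    raise ValueError(f"unsupported format: {output_format!r}")
```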
- czbenchmarks.cli.cli_run.set_processed_datasets_cache(dataset: czbenchmarks.datasets.base.BaseDataset, dataset_name: str, *, model_name: str, model_args: czbenchmarks.cli.types.ModelArgsDict, cache_options: czbenchmarks.cli.types.CacheOptions) → None [source]
Write a dataset to the cache. A “processed” dataset is one that has been run through model inference with the given arguments.
- czbenchmarks.cli.cli_run.try_processed_datasets_cache(dataset_name: str, *, model_name: str, model_args: czbenchmarks.cli.types.ModelArgsDict, cache_options: czbenchmarks.cli.types.CacheOptions) → czbenchmarks.datasets.base.BaseDataset | None [source]
Deserialize and return a processed dataset from the cache if it exists, else return None.
- czbenchmarks.cli.cli_run.get_remote_cache_prefix(cache_options: czbenchmarks.cli.types.CacheOptions)[source]
Get the prefix, ending in ‘/’, under which the remote processed datasets are stored.
- czbenchmarks.cli.cli_run.get_processed_dataset_cache_filename(dataset_name: str, *, model_name: str, model_args: czbenchmarks.cli.types.ModelArgsDict) → str [source]
Generate a unique filename for the given dataset and model arguments.
- czbenchmarks.cli.cli_run.get_processed_dataset_cache_path(dataset_name: str, *, model_name: str, model_args: czbenchmarks.cli.types.ModelArgsDict) → pathlib.Path [source]
Return a unique file path in the cache directory for the given dataset and model arguments.
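One common way to make such a path unique and deterministic is to hash a canonical serialization of the model arguments. A sketch of that approach; the naming scheme and the .pkl extension here are assumptions, not necessarily what this module does:

```python
import hashlib
import json
from pathlib import Path

def cache_filename(dataset_name: str, model_name: str, model_args: dict) -> str:
    """Stable filename: the same inputs always map to the same name."""
    canonical = json.dumps(model_args, sort_keys=True)  # key order must not matter
    digest = hashlib.sha256(canonical.encode()).hexdigest()[:16]
    return f"{dataset_name}_{model_name}_{digest}.pkl"

def cache_path(cache_dir: Path, dataset_name: str, model_name: str, model_args: dict) -> Path:
    """Unique file path in the cache directory for this dataset/model-args combination."""
    return cache_dir / cache_filename(dataset_name, model_name, model_args)
```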
- czbenchmarks.cli.cli_run.parse_model_args(model_name: str, args: argparse.Namespace) → czbenchmarks.cli.types.ModelArgs [source]
Populate a ModelArgs instance from the given argparse namespace.
- czbenchmarks.cli.cli_run.parse_task_args(task_name: str, TaskCls: type[czbenchmarks.cli.types.TaskType], args: argparse.Namespace) → czbenchmarks.cli.types.TaskArgs [source]
Populate a TaskArgs instance from the given argparse namespace.