CLI#

The ref command-line interface (CLI) is the primary way to interact with the Climate-REF framework. This CLI tool is installed as part of the climate-ref package and provides commands for managing configurations, datasets, diagnostics, and more.

app#

A CLI for the Assessment Fast Track Rapid Evaluation Framework

This CLI provides a number of commands for managing and executing diagnostics.

Usage#

app [OPTIONS] COMMAND [ARGS]...

Arguments#

No arguments available

Options#

Name Description Required Default
--configuration-directory PATH Configuration directory No -
-v, --verbose Set the log level to DEBUG No -
-q, --quiet Set the log level to WARNING No -
--log-level [error|warning|debug|info] Set the level of logging information to display [default: INFO] No -
--version Print the version and exit No -
--install-completion Install completion for the current shell. No -
--show-completion Show completion for the current shell, to copy it or customize the installation. No -
--help Show this message and exit. No -

Commands#

Name Description
solve Solve for executions that require...
config View and update the REF configuration
datasets View and ingest input datasets
db Database management commands
executions View execution groups and their results
providers Manage the REF providers.
test-cases Test data management commands for...
celery Managing remote celery workers

Subcommands#

app solve#

Solve for executions that require recalculation

This may trigger a number of additional calculations depending on what data has been ingested since the last solve. This command will block until all executions have been solved or the timeout is reached.

Filters can be applied to limit the diagnostics and providers that are considered; see the --diagnostic and --provider options for more information.

Usage#

app solve [OPTIONS]

Arguments#

No arguments available

Options#

Name Description Required Default
--dry-run / --no-dry-run Do not execute any diagnostics [default: no-dry-run] No -
--execute / --no-execute Solve the newly identified executions [default: execute] No -
--timeout INTEGER Timeout in seconds for waiting on executions to complete. Defaults to 6 hours. [default: 21600] No -
--wait / --no-wait Wait for executions to complete before exiting. Use --no-wait to queue executions and exit immediately. [default: wait] No -
--one-per-provider / --no-one-per-provider Limit to one execution per provider. This is useful for testing [default: no-one-per-provider] No -
--one-per-diagnostic / --no-one-per-diagnostic Limit to one execution per diagnostic. This is useful for testing [default: no-one-per-diagnostic] No -
--diagnostic TEXT Filters executions by the diagnostic slug. Diagnostics will be included if any of the filters match a case-insensitive subset of the diagnostic slug. Multiple values can be provided No -
--provider TEXT Filters executions by provider slug. Providers will be included if any of the filters match a case-insensitive subset of the provider slug. Multiple values can be provided No -
--dataset-filter TEXT Filter input datasets by facet values using key=value syntax. For example, --dataset-filter source_id=ACCESS-CM2 --dataset-filter variable_id=tas. Multiple values for the same facet are ORed (include any match), different facets are ANDed (must match all). Multiple values can be provided No -
--limit INTEGER Maximum number of executions to run. If not set, all executions are run. No -
--rerun-failed / --no-rerun-failed Re-run all previously failed executions, even if the execution group is not dirty. By default, failed executions are only retried if explicitly flagged dirty. [default: no-rerun-failed] No -
--help Show this message and exit. No -
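
As a rough illustration of how the options above combine, the following Python sketch assembles a `ref solve` invocation as an argument list. The helper `build_solve_command` is hypothetical and not part of climate-ref. It uses only flags listed in the table above, and it assumes the executable is named `ref` as stated in the introduction.

```python
import shlex


def build_solve_command(diagnostics=(), providers=(), dataset_filters=None,
                        timeout_hours=6, dry_run=False):
    """Assemble an argument list for `ref solve` (hypothetical helper)."""
    # --timeout is given in seconds; 6 hours corresponds to the default 21600.
    cmd = ["ref", "solve", "--timeout", str(int(timeout_hours * 3600))]
    if dry_run:
        cmd.append("--dry-run")
    # Repeatable options are passed once per value.
    for slug in diagnostics:
        cmd += ["--diagnostic", slug]
    for slug in providers:
        cmd += ["--provider", slug]
    for facet, values in (dataset_filters or {}).items():
        for value in values:
            cmd += ["--dataset-filter", f"{facet}={value}"]
    return cmd


command = build_solve_command(
    providers=["ilamb"],
    dataset_filters={"source_id": ["ACCESS-CM2"], "variable_id": ["tas"]},
    dry_run=True,
)
print(shlex.join(command))
```

The resulting list could be passed to `subprocess.run` on a machine where the CLI is installed. The sketch only prints the command, so it runs without climate-ref.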

app config#

View and update the REF configuration

Usage#

app config [OPTIONS] COMMAND [ARGS]...

Arguments#

No arguments available

Options#

Name Description Required Default
--help Show this message and exit. No -

Subcommands#

app config list#

Print the current climate_ref configuration

If a configuration directory is provided, the CLI attempts to load the configuration from that directory.

Usage#

app config list [OPTIONS]

Arguments#

No arguments available

Options#
Name Description Required Default
--help Show this message and exit. No -

app datasets#

View and ingest input datasets

The metadata from these datasets are stored in the database so that they can be used to determine which executions are required for a given diagnostic without having to re-parse the datasets.

Usage#

app datasets [OPTIONS] COMMAND [ARGS]...

Arguments#

No arguments available

Options#

Name Description Required Default
--help Show this message and exit. No -

Subcommands#

app datasets list#

List the datasets that have been ingested

The data catalog is sorted by the date that the dataset was ingested (first = newest).

Usage#

app datasets list [OPTIONS]

Arguments#

No arguments available

Options#
Name Description Required Default
--source-type [cmip6|cmip7|obs4mips|pmp-climatology] Type of source dataset [default: cmip6] No -
--include-files / --no-include-files Include files in the output [default: no-include-files] No -
--limit INTEGER Limit the number of datasets (or files when using --include-files) to display to this number. [default: 100] No -
--dataset-filter TEXT Filter datasets by facet values using key=value syntax. For example, --dataset-filter source_id=ACCESS-CM2 --dataset-filter variable_id=tas. Multiple values for the same facet are ORed (include any match), different facets are ANDed (must match all). Multiple values can be provided No -
--help Show this message and exit. No -
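
The OR-within-a-facet, AND-across-facets rule described for --dataset-filter can be sketched in a few lines of Python. This is an illustration of the stated semantics, not the project's implementation. The function names are hypothetical, exact string comparison is an assumption, and `MODEL-B` and `MODEL-C` are placeholder values.

```python
from collections import defaultdict


def parse_filters(raw_filters):
    """Group key=value strings by facet name."""
    grouped = defaultdict(set)
    for item in raw_filters:
        facet, _, value = item.partition("=")
        grouped[facet].add(value)
    return dict(grouped)


def matches(dataset, grouped_filters):
    """Values for one facet are ORed; different facets are ANDed."""
    return all(
        dataset.get(facet) in allowed
        for facet, allowed in grouped_filters.items()
    )


filters = parse_filters(
    ["source_id=ACCESS-CM2", "source_id=MODEL-B", "variable_id=tas"]
)
datasets = [
    {"source_id": "ACCESS-CM2", "variable_id": "tas"},  # matches both facets
    {"source_id": "MODEL-B", "variable_id": "pr"},      # wrong variable_id
    {"source_id": "MODEL-C", "variable_id": "tas"},     # wrong source_id
]
selected = [d for d in datasets if matches(d, filters)]
print(selected)
```

Only the first record is selected, because it is the only one that satisfies a value for every facet.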
app datasets list-columns#

List the available columns in the data catalog for the given source type.

Usage#

app datasets list-columns [OPTIONS]

Arguments#

No arguments available

Options#
Name Description Required Default
--source-type [cmip6|cmip7|obs4mips|pmp-climatology] Type of source dataset [default: cmip6] No -
--include-files / --no-include-files Include files in the output [default: no-include-files] No -
--help Show this message and exit. No -
app datasets ingest#

Ingest a directory of datasets into the database

Each dataset will be loaded and validated using the specified dataset adapter. This will extract metadata from the datasets and store it in the database.

A table of the datasets will be printed to the console at the end of the operation.

Usage#

app datasets ingest [OPTIONS] FILE_OR_DIRECTORY...

Arguments#
Name Description Required
FILE_OR_DIRECTORY... [required] Yes
Options#
Name Description Required Default
--source-type [cmip6|cmip7|obs4mips|pmp-climatology] Type of source dataset Yes -
--solve / --no-solve Solve for new diagnostic executions after ingestion [default: no-solve] No -
--dry-run / --no-dry-run Do not ingest datasets into the database [default: no-dry-run] No -
--n-jobs INTEGER Number of jobs to run in parallel No -
--skip-invalid / --no-skip-invalid Ignore (but log) any datasets that don't pass validation [default: skip-invalid] No -
--chunk-size INTEGER Stream the catalog in chunks of this many files instead of loading the whole directory at once. Bounds peak memory for large archives. Only supported by adapters that implement iter_local_datasets (currently CMIP6 and CMIP7). No -
--help Show this message and exit. No -
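
To show why --chunk-size bounds peak memory, here is a minimal Python sketch of chunked iteration over a lazily produced list of files. It is not the `iter_local_datasets` implementation. The helper `iter_chunks` and the file names are hypothetical.

```python
from itertools import islice


def iter_chunks(paths, chunk_size):
    """Yield lists of at most `chunk_size` items from any iterable."""
    iterator = iter(paths)
    # Only one chunk is materialised at a time.
    while chunk := list(islice(iterator, chunk_size)):
        yield chunk


# Placeholder file names; a real run would walk a directory lazily.
files = (f"file_{i}.nc" for i in range(7))
chunks = list(iter_chunks(files, chunk_size=3))
print([len(c) for c in chunks])
```

Seven files with a chunk size of three give chunks of 3, 3, and 1 files. A consumer that handles each chunk before requesting the next never holds more than `chunk_size` entries at once.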
app datasets stats#

Show summary statistics for datasets.

Displays counts of datasets grouped by dataset type, with finalisation status breakdown and file counts. Optionally expand by a facet using --group-by (e.g., --group-by source_id).

Usage#

app datasets stats [OPTIONS]

Arguments#

No arguments available

Options#
Name Description Required Default
--source-type [cmip6|cmip7|obs4mips|pmp-climatology] Filter by dataset type No -
--group-by TEXT Group results by a dataset facet. Allowed values: source_id, variable_id. Requires --source-type to be specified. No -
--help Show this message and exit. No -
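
The grouping that --group-by adds can be pictured as counting records per combination of facet values. The Python sketch below uses made-up records and hypothetical field names. It illustrates the idea only and does not reflect how the command queries the database.

```python
from collections import Counter


def count_by(records, *facets):
    """Count records per combination of the given facet values."""
    return Counter(tuple(record[f] for f in facets) for record in records)


# Illustrative records; field names and most values are placeholders.
records = [
    {"dataset_type": "cmip6", "source_id": "ACCESS-CM2"},
    {"dataset_type": "cmip6", "source_id": "ACCESS-CM2"},
    {"dataset_type": "cmip6", "source_id": "MODEL-B"},
    {"dataset_type": "obs4mips", "source_id": "OBS-A"},
]
by_type = count_by(records, "dataset_type")
by_type_and_source = count_by(records, "dataset_type", "source_id")
print(dict(by_type))
print(dict(by_type_and_source))
```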
app datasets fetch-sample-data#

Fetch the sample data for the given version.

These data will be written into the test data directory. This operation may fail if the test data directory does not exist, as is the case for non-source-based installations.

Usage#

app datasets fetch-sample-data [OPTIONS]

Arguments#

No arguments available

Options#
Name Description Required Default
--force-cleanup / --no-force-cleanup If True, remove any existing files [default: no-force-cleanup] No -
--symlink / --no-symlink If True, symlink files into the output directory, otherwise perform a copy [default: no-symlink] No -
--help Show this message and exit. No -
app datasets fetch-data#

Fetch REF-specific datasets

These datasets have been verified to have open licenses and are in the process of being added to Obs4MIPs.

Usage#

app datasets fetch-data [OPTIONS]

Arguments#

No arguments available

Options#
Name Description Required Default
--registry TEXT Name of the data registry to use Yes -
--output-directory PATH Output directory where files will be saved No -
--force-cleanup / --no-force-cleanup If True, remove any existing files [default: no-force-cleanup] No -
--symlink / --no-symlink If True, symlink files into the output directory, otherwise perform a copy [default: no-symlink] No -
--verify / --no-verify Verify the checksums of the fetched files [default: verify] No -
--help Show this message and exit. No -
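
Checksum verification of a fetched file generally means hashing it and comparing the digest with a registry entry. The sketch below assumes SHA-256 and a hypothetical `verify` helper. The hash algorithm and registry format actually used by the command are not stated in this reference.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, block_size=1 << 20):
    """Hash a file incrementally so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def verify(path, expected_hex):
    """Return True when the file's digest equals the expected value."""
    return sha256_of(path) == expected_hex


with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.nc"
    sample.write_bytes(b"example payload")
    expected = hashlib.sha256(b"example payload").hexdigest()
    ok = verify(sample, expected)        # digest matches
    tampered = verify(sample, "0" * 64)  # digest does not match
print(ok, tampered)
```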

app db#

Database management commands

Usage#

app db [OPTIONS] COMMAND [ARGS]...

Arguments#

No arguments available

Options#

Name Description Required Default
--help Show this message and exit. No -

Subcommands#

app db migrate#

Run database migrations to bring the schema up to date.

This applies any pending Alembic migrations. A backup is created before migrating (SQLite only).

Usage#

app db migrate [OPTIONS]

Arguments#

No arguments available

Options#
Name Description Required Default
--help Show this message and exit. No -
app db status#

Check if the database schema is up to date.

Shows the current revision, the latest available revision, and whether any migrations are pending.

Usage#

app db status [OPTIONS]

Arguments#

No arguments available

Options#
Name Description Required Default
--help Show this message and exit. No -
app db heads#

Show the latest migration revision(s).

Usage#

app db heads [OPTIONS]

Arguments#

No arguments available

Options#
Name Description Required Default
--help Show this message and exit. No -
app db history#

Show the migration history.

Usage#

app db history [OPTIONS]

Arguments#

No arguments available

Options#
Name Description Required Default
-n, --last INTEGER Show only the last N migrations No -
--help Show this message and exit. No -
app db backup#

Create a manual backup of the database (SQLite only).

Usage#

app db backup [OPTIONS]

Arguments#

No arguments available

Options#
Name Description Required Default
--help Show this message and exit. No -
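
For readers who want to see what a SQLite backup can look like, the following sketch copies a database with Python's standard `sqlite3` online backup API. It is an assumption-laden illustration. The command's actual backup mechanism and file naming are not described here, and `backup_sqlite` and the table used are hypothetical.

```python
import sqlite3
import tempfile
from pathlib import Path


def backup_sqlite(source_path, backup_path):
    """Copy a SQLite database using the online backup API."""
    source = sqlite3.connect(source_path)
    target = sqlite3.connect(backup_path)
    try:
        source.backup(target)
    finally:
        target.close()
        source.close()


with tempfile.TemporaryDirectory() as tmp:
    db_path = Path(tmp) / "example.db"
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE example (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO example (name) VALUES ('a')")
    conn.commit()
    conn.close()

    copy_path = Path(tmp) / "example.backup.db"
    backup_sqlite(db_path, copy_path)

    # Read the copy back to confirm the data survived.
    check = sqlite3.connect(copy_path)
    rows = check.execute("SELECT name FROM example").fetchall()
    check.close()
print(rows)
```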
app db sql#

Execute an arbitrary SQL query against the database.

SELECT queries display results as a table (default limit: 100 rows). Other statements report the number of rows affected.

Usage#

app db sql [OPTIONS] QUERY

Arguments#
Name Description Required
QUERY SQL query to execute Yes
Options#
Name Description Required Default
-l, --limit INTEGER Maximum number of rows to display [default: 100] No -
--help Show this message and exit. No -
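
The distinction drawn above between SELECT queries (rows, capped by a limit) and other statements (a count of affected rows) can be reproduced with the standard `sqlite3` module. This sketch uses an in-memory database and a hypothetical `run_query` helper. It mirrors the described behaviour and is not the command's own code.

```python
import sqlite3


def run_query(conn, query, limit=100):
    """Return up to `limit` rows for a SELECT, else the affected row count."""
    cursor = conn.execute(query)
    if cursor.description is not None:  # the statement produced a result set
        return cursor.fetchmany(limit)
    conn.commit()
    return cursor.rowcount


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE example (id INTEGER PRIMARY KEY, value TEXT)")
conn.executemany(
    "INSERT INTO example (value) VALUES (?)", [("a",), ("b",), ("c",)]
)

rows = run_query(conn, "SELECT id, value FROM example ORDER BY id", limit=2)
affected = run_query(conn, "UPDATE example SET value = 'z' WHERE id > 1")
conn.close()
print(rows, affected)
```

With three rows in the table, the SELECT returns the first two because of the limit, and the UPDATE reports two affected rows.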
app db tables#

List all tables in the database.

Usage#

app db tables [OPTIONS]

Arguments#

No arguments available

Options#
Name Description Required Default
--help Show this message and exit. No -

app executions#

View execution groups and their results

Usage#

app executions [OPTIONS] COMMAND [ARGS]...

Arguments#

No arguments available

Options#

Name Description Required Default
--help Show this message and exit. No -

Subcommands#

app executions list-groups#

List the diagnostic execution groups that have been identified

The list is sorted by the date that the execution group was created (first = newest). If the --column option is provided, only the specified columns will be displayed.

Filters can be combined using AND logic across filter types and OR logic within a filter type.

The output will be in a tabular format.

Usage#

app executions list-groups [OPTIONS]

Arguments#

No arguments available

Options#
Name Description Required Default
--column TEXT Only include specified columns in the output No -
--limit INTEGER Limit the number of rows to display [default: 100] No -
--diagnostic TEXT Filter by diagnostic slug (substring match, case-insensitive). Multiple values can be provided. No -
--provider TEXT Filter by provider slug (substring match, case-insensitive). Multiple values can be provided. No -
--filter TEXT Filter by facet key=value pairs (exact match). Multiple filters can be provided. No -
--successful / --not-successful Filter by successful or unsuccessful executions. No -
--dirty / --not-dirty Filter to include only dirty or clean execution groups. Dirty execution groups will be re-computed on the next run. No -
--help Show this message and exit. No -
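
The filter rules stated above (substring, case-insensitive matching for slugs; exact matching for facets; OR within a filter type; AND across types) can be expressed compactly. The Python sketch below is one reading of those rules. The record layout, the function names, and the `other-*` values are hypothetical.

```python
def slug_matches(slug, patterns):
    """Case-insensitive substring match; any pattern may match (OR)."""
    if not patterns:
        return True  # no filter of this type was given
    return any(p.lower() in slug.lower() for p in patterns)


def facets_match(selectors, facet_filters):
    """Exact key=value match; any of the given pairs may match (OR)."""
    if not facet_filters:
        return True
    return any(selectors.get(key) == value for key, value in facet_filters)


def group_matches(group, diagnostics=(), providers=(), facet_filters=()):
    """Combine the filter types with AND."""
    return (
        slug_matches(group["diagnostic"], diagnostics)
        and slug_matches(group["provider"], providers)
        and facets_match(group["selectors"], facet_filters)
    )


groups = [
    {"diagnostic": "global-mean-timeseries", "provider": "example",
     "selectors": {"source_id": "ACCESS-CM2"}},
    {"diagnostic": "other-diagnostic", "provider": "other-provider",
     "selectors": {"source_id": "ACCESS-CM2"}},
]
kept = [
    g["diagnostic"]
    for g in groups
    if group_matches(
        g,
        diagnostics=["MEAN", "missing"],
        facet_filters=[("source_id", "ACCESS-CM2")],
    )
]
print(kept)
```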
app executions delete-groups#

Delete execution groups matching the specified filters.

This command will delete execution groups and their associated executions. Use filters to specify which groups to delete. At least one filter must be provided to prevent accidental deletion of all groups.

Filters can be combined using AND logic across filter types and OR logic within a filter type.

Usage#

app executions delete-groups [OPTIONS]

Arguments#

No arguments available

Options#
Name Description Required Default
--diagnostic TEXT Filter by diagnostic slug (substring match, case-insensitive). Multiple values can be provided. No -
--provider TEXT Filter by provider slug (substring match, case-insensitive). Multiple values can be provided. No -
--filter TEXT Filter by facet key=value pairs (exact match). Multiple filters can be provided. No -
--successful / --not-successful Filter by successful or unsuccessful executions. No -
--dirty / --not-dirty Filter to include only dirty or clean execution groups. Dirty execution groups will be re-computed on the next run. No -
--remove-outputs Also remove output directories from the filesystem No -
--force / --no-force Skip confirmation prompt [default: no-force] No -
--help Show this message and exit. No -
app executions inspect#

Inspect a specific execution group by its ID

This will display the execution details, datasets, results directory, and logs if available.

Usage#

app executions inspect [OPTIONS] EXECUTION_ID

Arguments#
Name Description Required
EXECUTION_ID [required] Yes
Options#
Name Description Required Default
--help Show this message and exit. No -
app executions fail-running#

Mark running executions as failed.

Running executions (those with no success/failure status) block their execution group from being requeued. Use this command as an escape hatch to fail stuck executions so they can be retried on the next solve.

An optional age threshold can be provided with --older-than to only fail executions that have been running for longer than the specified number of hours.

Usage#

app executions fail-running [OPTIONS]

Arguments#

No arguments available

Options#
Name Description Required Default
--older-than FLOAT Only fail executions older than this many hours. If not specified, all running executions are failed. No -
--diagnostic TEXT Filter by diagnostic slug (substring match, case-insensitive). Multiple values can be provided. No -
--provider TEXT Filter by provider slug (substring match, case-insensitive). Multiple values can be provided. No -
--force / --no-force Skip confirmation prompt [default: no-force] No -
--help Show this message and exit. No -
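
The --older-than threshold can be read as a simple age comparison on executions that have no success or failure status yet. The sketch below assumes the age is measured from a creation timestamp. The field names and the `stuck_executions` helper are hypothetical and do not come from climate-ref.

```python
from datetime import datetime, timedelta, timezone


def stuck_executions(executions, older_than_hours=None, now=None):
    """Select running executions, optionally only those past an age threshold."""
    now = now or datetime.now(timezone.utc)
    selected = []
    for execution in executions:
        if execution["successful"] is not None:
            continue  # already finished (success or failure)
        age = now - execution["created_at"]
        if older_than_hours is None or age > timedelta(hours=older_than_hours):
            selected.append(execution["id"])
    return selected


now = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
executions = [
    {"id": 1, "successful": None, "created_at": now - timedelta(hours=30)},
    {"id": 2, "successful": None, "created_at": now - timedelta(hours=2)},
    {"id": 3, "successful": True, "created_at": now - timedelta(hours=48)},
]
older = stuck_executions(executions, older_than_hours=24.0, now=now)
everything = stuck_executions(executions, now=now)
print(older, everything)
```

With a 24-hour threshold only execution 1 is selected. Without a threshold both running executions are selected, matching the documented default of failing all running executions.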
app executions stats#

Show summary statistics for execution groups.

Displays counts of executions grouped by provider, broken down by status (running, failed, successful, not started, dirty).

Usage#

app executions stats [OPTIONS]

Arguments#

No arguments available

Options#
Name Description Required Default
--diagnostic TEXT Filter by diagnostic slug (substring match, case-insensitive). Multiple values can be provided. No -
--provider TEXT Filter by provider slug (substring match, case-insensitive). Multiple values can be provided. No -
--help Show this message and exit. No -
app executions reingest#

Reingest existing executions without re-running diagnostics.

Re-runs build_execution_result() on existing output files and re-ingests the results into the database. Useful when new series definitions or metadata extraction logic has been added.

A new Execution record is always created under the same ExecutionGroup, leaving the original execution untouched. Results are treated as immutable.

The dirty flag is never modified by this command.

Usage#

app executions reingest [OPTIONS] [GROUP_IDS]...

Arguments#
Name Description Required
[GROUP_IDS]... Execution group IDs to reingest. If omitted, uses filters. No
Options#
Name Description Required Default
--provider TEXT Filter by provider slug (substring match, case-insensitive). Multiple values can be provided. No -
--diagnostic TEXT Filter by diagnostic slug (substring match, case-insensitive). Multiple values can be provided. No -
--include-failed Also attempt reingest on failed executions. No -
--dry-run Show what would be reingested without making changes. No -
--force / --no-force Skip confirmation prompt [default: no-force] No -
--help Show this message and exit. No -
app executions flag-dirty#

Flag an execution group for recomputation

Usage#

app executions flag-dirty [OPTIONS] EXECUTION_ID

Arguments#
Name Description Required
EXECUTION_ID [required] Yes
Options#
Name Description Required Default
--help Show this message and exit. No -

app providers#

Manage the REF providers.

Usage#

app providers [OPTIONS] COMMAND [ARGS]...

Arguments#

No arguments available

Options#

Name Description Required Default
--help Show this message and exit. No -

Subcommands#

app providers list#

Print the available providers.

Usage#

app providers list [OPTIONS]

Arguments#

No arguments available

Options#
Name Description Required Default
--help Show this message and exit. No -
app providers show#

Show diagnostics and data requirements for a provider.

Usage#

app providers show [OPTIONS]

Arguments#

No arguments available

Options#
Name Description Required Default
--provider TEXT Slug of the provider to show diagnostics for. Yes -
--format [table|list] Output format: 'list' for detailed per-diagnostic output, 'table' for a compact table. [default: list] No -
--columns TEXT Columns to include in table output (e.g. --columns diagnostic --columns variables). No -
--help Show this message and exit. No -
app providers create-env#

Create a conda environment containing the provider software.

Deprecated: use ref providers setup instead, which handles both environment creation and data fetching in a single command.

If no provider is specified, all providers will be installed. If the provider is up to date or does not use a conda environment, it will be skipped.

Usage#

app providers create-env [OPTIONS]

Arguments#

No arguments available

Options#
Name Description Required Default
--provider TEXT Only install the environment for the named provider. No -
--help Show this message and exit. No -
app providers setup#

Run provider setup for offline execution.

This command prepares all providers for offline execution by:

  1. Creating conda environments (if applicable)

  2. Fetching required reference datasets into the pooch cache

  3. Ingesting provider-specific datasets into the database

All operations are idempotent and safe to run multiple times. Run this on a login node with internet access before solving on compute nodes.

Usage#

app providers setup [OPTIONS]

Arguments#

No arguments available

Options#
Name Description Required Default
--provider TEXT Only run setup for the named provider. No -
--skip-env / --no-skip-env Skip environment setup (e.g., conda). [default: no-skip-env] No -
--skip-data / --no-skip-data Skip data fetching and ingestion. [default: no-skip-data] No -
--skip-validate / --no-skip-validate Skip validation. [default: no-skip-validate] No -
--validate-only / --no-validate-only Only validate setup, don't run it. [default: no-validate-only] No -
--help Show this message and exit. No -

app test-cases#

Test data management commands for diagnostic development.

These commands are intended for developers working on diagnostics and require a source checkout of the project with test data directories available.

Usage#

app test-cases [OPTIONS] COMMAND [ARGS]...

Arguments#

No arguments available

Options#

Name Description Required Default
--help Show this message and exit. No -

Subcommands#

app test-cases fetch#

Fetch test data from ESGF for running diagnostic tests.

Downloads full-resolution ESGF data based on diagnostic test_data_spec. Use --provider or --diagnostic to limit scope.

Examples#

ref test-cases fetch                   # Fetch all test data
ref test-cases fetch --provider ilamb  # Fetch ILAMB test data only
ref test-cases fetch --diagnostic ecs  # Fetch ECS diagnostic data
ref test-cases fetch --only-missing    # Skip test cases with existing catalogs
Usage#

app test-cases fetch [OPTIONS]

Arguments#

No arguments available

Options#
Name Description Required Default
--provider TEXT Specific provider to fetch data for (e.g., 'esmvaltool', 'ilamb') No -
--diagnostic TEXT Specific diagnostic slug to fetch data for No -
--test-case TEXT Specific test case name to fetch data for No -
--dry-run / --no-dry-run Show what would be fetched without downloading [default: no-dry-run] No -
--only-missing / --no-only-missing Only fetch data for test cases without existing catalogs [default: no-only-missing] No -
--force / --no-force Force overwrite catalog even if unchanged [default: no-force] No -
--help Show this message and exit. No -
app test-cases list#

List test cases for all diagnostics.

Shows which test cases are defined for each diagnostic and their descriptions. Also shows whether catalog and regression data exist for each test case.

Usage#

app test-cases list [OPTIONS]

Arguments#

No arguments available

Options#
Name Description Required Default
--provider TEXT Filter by provider No -
--help Show this message and exit. No -
app test-cases run#

Run test cases for diagnostics.

Executes diagnostics using pre-defined datasets from the test_data_spec and optionally compares against regression baselines.

Use --provider to select which provider's diagnostics to run (required). Use --diagnostic and --test-case to further narrow the scope.

Examples#

ref test-cases run --provider ilamb              # Run all ILAMB test cases
ref test-cases run --provider example --diagnostic global-mean-timeseries
ref test-cases run --provider ilamb --test-case default --fetch
ref test-cases run --provider pmp --only-missing # Skip test cases with regression data
ref test-cases run --provider pmp --if-changed   # Only run if catalog changed
Usage#

app test-cases run [OPTIONS]

Arguments#

No arguments available

Options#
Name Description Required Default
--provider TEXT Provider slug (required, e.g., 'example', 'ilamb') Yes -
--diagnostic TEXT Specific diagnostic slug to run (e.g., 'global-mean-timeseries') No -
--test-case TEXT Specific test case name to run (e.g., 'default') No -
--output-directory PATH Output directory for execution results No -
--force-regen / --no-force-regen Force regeneration of regression baselines [default: no-force-regen] No -
--fetch / --no-fetch Fetch test data from ESGF before running [default: no-fetch] No -
--size-threshold FLOAT Flag files larger than this size in MB [default: 1.0] No -
--dry-run / --no-dry-run Show what would be run without executing [default: no-dry-run] No -
--only-missing / --no-only-missing Only run test cases without existing regression data [default: no-only-missing] No -
--if-changed / --no-if-changed Only run if catalog has changed since regression data was generated [default: no-if-changed] No -
--clean / --no-clean Delete existing output directory before running [default: no-clean] No -
--help Show this message and exit. No -
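
The effect of --size-threshold can be pictured as a scan of the output directory for files above a size limit. The sketch below is only an illustration. It assumes 1 MB means 2**20 bytes (the reference does not say), and `large_files` is a hypothetical helper.

```python
import tempfile
from pathlib import Path

MB = 1024 * 1024  # assumption: 1 MB is treated as 2**20 bytes


def large_files(directory, threshold_mb=1.0):
    """Return relative paths of files whose size exceeds the threshold."""
    root = Path(directory)
    return sorted(
        str(path.relative_to(root))
        for path in root.rglob("*")
        if path.is_file() and path.stat().st_size > threshold_mb * MB
    )


with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "small.json").write_bytes(b"x" * 10)
    (Path(tmp) / "big.nc").write_bytes(b"x" * (2 * MB))
    flagged = large_files(tmp, threshold_mb=1.0)
print(flagged)
```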

app celery#

Managing remote celery workers

This module is used to manage remote execution workers for the Climate REF project. It is added to the ref command line interface if the climate-ref-celery package is installed.

A celery worker should be run for each diagnostic provider.

Usage#

app celery [OPTIONS] COMMAND [ARGS]...

Arguments#

No arguments available

Options#

Name Description Required Default
--help Show this message and exit. No -

Subcommands#

app celery start-worker#

Start a Celery worker for the given provider.

A celery worker enables the execution of tasks in the background on multiple different nodes. This worker will register a celery task for each diagnostic in the provider. The worker tasks can be executed by sending a celery task with the name '{package_slug}_{diagnostic_slug}'.

Providers must be registered as entry points in the pyproject.toml file of the package. The entry point should be defined under the group climate-ref.providers (See import_provider for details).

Usage#

app celery start-worker [OPTIONS] [EXTRA_ARGS]...

Arguments#
Name Description Required
[EXTRA_ARGS]... Additional arguments for the worker No
Options#
Name Description Required Default
--loglevel TEXT Log level for the worker [default: info] No -
--provider TEXT Name of the provider to start a worker for. This argument may be supplied multiple times. If no provider is given, the worker will consume the default queue. No -
--package TEXT Deprecated. Use --provider instead No -
--help Show this message and exit. No -
app celery list-config#

List the celery configuration

Usage#

app celery list-config [OPTIONS]

Arguments#

No arguments available

Options#
Name Description Required Default
--help Show this message and exit. No -