climate_ref.solve_helpers
Helpers for understanding and regression-testing the solver's behavior.
This module provides functions to:
- Generate parquet catalogs from local dataset directories
- Load parquet catalogs for solver testing
- Run the solver on catalogs and format results
- Produce regression-friendly output for pytest-regressions
format_solve_results_json(results)
Serialize solve results to JSON.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| results | list[dict[str, Any]] | Results from :func: | required |
Returns:
| Type | Description |
|---|---|
| str | JSON string of the results list |
Source code in packages/climate-ref/src/climate_ref/solve_helpers.py
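As a rough sketch of what this kind of serialization might look like, the snippet below dumps a list of result dicts with the standard `json` module. The name `to_json_sketch`, the keys in the example entry, and the `indent`/`sort_keys` settings are assumptions made for illustration; the actual function may format its output differently.

```python
import json
from typing import Any


def to_json_sketch(results: list[dict[str, Any]]) -> str:
    """Hypothetical stand-in for format_solve_results_json."""
    # indent and sort_keys keep the text stable and diff-friendly.
    # Both settings are assumptions, not confirmed by the source.
    return json.dumps(results, indent=2, sort_keys=True)


# Illustrative result entry; the key names are assumed.
example_results = [
    {
        "provider": "example-provider",
        "diagnostic": "example-diagnostic",
        "dataset_key": "key-a",
        "datasets": {"cmip6": ["instance-1", "instance-2"]},
    }
]

json_text = to_json_sketch(example_results)
```

Because the result is plain JSON text, it can be parsed back with `json.loads` and compared against the original list in a test.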
format_solve_results_table(results)
Format solve results as a human-readable grouped text table.
Groups by provider, then diagnostic, showing dataset_key and matched instance_ids per source type.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| results | list[dict[str, Any]] | Results from :func: | required |
Returns:
| Type | Description |
|---|---|
| str | Human-readable text representation |
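The grouping described above (provider, then diagnostic, then one line per dataset_key with its instance_ids by source type) can be sketched as follows. This is a minimal illustration, not the module's implementation: the name `table_sketch`, the result keys (`provider`, `diagnostic`, `dataset_key`, `datasets`), and the indentation layout are all assumptions.

```python
from collections import defaultdict
from typing import Any


def table_sketch(results: list[dict[str, Any]]) -> str:
    """Hypothetical stand-in for format_solve_results_table."""
    # provider -> diagnostic -> list of result entries
    grouped: dict[str, dict[str, list[dict[str, Any]]]] = defaultdict(
        lambda: defaultdict(list)
    )
    for entry in results:
        grouped[entry["provider"]][entry["diagnostic"]].append(entry)

    lines: list[str] = []
    for provider in sorted(grouped):
        lines.append(f"Provider: {provider}")
        for diagnostic in sorted(grouped[provider]):
            lines.append(f"  Diagnostic: {diagnostic}")
            for entry in grouped[provider][diagnostic]:
                lines.append(f"    {entry['dataset_key']}")
                # One line per source type with its matched instance_ids
                for source_type, instance_ids in sorted(entry["datasets"].items()):
                    lines.append(f"      {source_type}: {', '.join(instance_ids)}")
    return "\n".join(lines)


# Illustrative input; the key names are assumed.
example_results = [
    {
        "provider": "example-provider",
        "diagnostic": "example-diagnostic",
        "dataset_key": "key-a",
        "datasets": {
            "cmip6": ["instance-1", "instance-2"],
            "obs4mips": ["instance-3"],
        },
    }
]

table_text = table_sketch(example_results)
```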
generate_catalog(source_type, directories, strip_path_prefix=None)
Scan directories using the appropriate DatasetAdapter and concatenate results.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| source_type | str | Dataset source type (e.g. "cmip6", "obs4mips") | required |
| directories | list[Path] | List of directories to scan for datasets | required |
| strip_path_prefix | str \| None | If provided, replace this prefix in path columns with | None |
Returns:
| Type | Description |
|---|---|
| DataFrame | DataFrame containing dataset metadata from all directories |
load_solve_catalog(catalog_dir)
Load parquet catalog files from a directory.
Looks for cmip6_catalog.parquet, cmip7_catalog.parquet,
obs4mips_catalog.parquet, and pmp_climatology_catalog.parquet.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| catalog_dir | Path | Directory containing parquet catalog files | required |
Returns:
| Type | Description |
|---|---|
| dict[SourceDatasetType, DataFrame] \| None | Mapping of source type to catalog DataFrame, or None if no catalogs found |
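The lookup behaviour described above (check for a fixed set of file names, return None when none exist) can be sketched with the standard library alone. This sketch only locates the files; it does not read them. The name `find_catalog_files` is hypothetical, and plain strings stand in for SourceDatasetType members, whose exact names are not given here.

```python
import tempfile
from pathlib import Path
from typing import Optional

# File names taken from the description above. The string keys are
# assumed stand-ins for SourceDatasetType members.
CATALOG_FILES = {
    "cmip6": "cmip6_catalog.parquet",
    "cmip7": "cmip7_catalog.parquet",
    "obs4mips": "obs4mips_catalog.parquet",
    "pmp_climatology": "pmp_climatology_catalog.parquet",
}


def find_catalog_files(catalog_dir: Path) -> Optional[dict[str, Path]]:
    """Hypothetical helper: map source type to each catalog file that exists."""
    found = {
        source: catalog_dir / name
        for source, name in CATALOG_FILES.items()
        if (catalog_dir / name).exists()
    }
    # Mirror the documented contract: None when no catalogs are found
    return found or None


with tempfile.TemporaryDirectory() as tmp:
    tmp_path = Path(tmp)
    empty_result = find_catalog_files(tmp_path)  # nothing there yet
    (tmp_path / "cmip6_catalog.parquet").touch()  # empty placeholder file
    found_sources = sorted(find_catalog_files(tmp_path))
```

Callers should handle the None case explicitly, for example by skipping a test when no catalog directory has been generated yet.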
solve_results_for_regression(results)
Convert solve results to the dict format used by data_regression.check().
The output has the shape {dataset_key: {source_type: [instance_id, ...]}}.
When called with results filtered to a single diagnostic (recommended),
dataset_key is unique and no data is lost. If results span multiple
diagnostics, duplicate dataset_key values will overwrite earlier entries.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| results | list[dict[str, Any]] | Results from :func: | required |
Returns:
| Type | Description |
|---|---|
| dict[str, dict[str, list[str]]] | Dict keyed by |
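The overwrite caveat is easiest to see in a small example. The sketch below builds the {dataset_key: {source_type: [instance_id, ...]}} shape from a list of result dicts and shows what happens when two diagnostics share a dataset_key. The name `regression_dict_sketch` and the input keys (`diagnostic`, `dataset_key`, `datasets`) are assumptions for illustration, not the module's confirmed structure.

```python
from typing import Any


def regression_dict_sketch(
    results: list[dict[str, Any]],
) -> dict[str, dict[str, list[str]]]:
    """Hypothetical stand-in for solve_results_for_regression."""
    output: dict[str, dict[str, list[str]]] = {}
    for entry in results:
        # A repeated dataset_key replaces the earlier entry
        output[entry["dataset_key"]] = {
            source_type: list(instance_ids)
            for source_type, instance_ids in entry["datasets"].items()
        }
    return output


# Two diagnostics that happen to share a dataset_key (illustrative data).
example_results = [
    {"diagnostic": "diag-1", "dataset_key": "key-a", "datasets": {"cmip6": ["instance-1"]}},
    {"diagnostic": "diag-2", "dataset_key": "key-a", "datasets": {"cmip6": ["instance-2"]}},
]

# Unfiltered: the diag-1 entry is silently lost.
mixed = regression_dict_sketch(example_results)

# Filtered to one diagnostic first: nothing is overwritten.
only_diag_1 = [r for r in example_results if r["diagnostic"] == "diag-1"]
filtered = regression_dict_sketch(only_diag_1)
```

In a pytest-regressions test, the filtered dict would then be passed to the `data_regression` fixture as `data_regression.check(filtered)`.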
solve_to_results(data_catalog, providers, filters=None)
Run the solver on a data catalog and collect results into a sorted list of dicts.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data_catalog | dict[SourceDatasetType, DataFrame] | Mapping of source type to catalog DataFrame | required |
| providers | list[DiagnosticProvider] | List of diagnostic providers to solve for | required |
| filters | SolveFilterOptions \| None | Optional filters to restrict which diagnostics are solved | None |
Returns:
| Type | Description |
|---|---|
| list[dict[str, Any]] | Sorted list of result dicts, each with keys: |
write_catalog_parquet(catalog, output_path)
Write a catalog DataFrame to parquet.
cftime.datetime objects in start_time/end_time are converted to
strings before writing because pyarrow cannot serialize them.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| catalog | DataFrame | DataFrame to write | required |
| output_path | Path | Path for the output parquet file | required |