climate_ref.datasets.netcdf_utils
Shared utilities for reading NetCDF metadata directly with netCDF4.
These functions avoid the overhead of xarray for metadata-only reads. By skipping Store construction, dask array creation, and full CF time decoding, they give a significant speedup per file.
read_global_attrs(ds, keys)
Read global attributes from a netCDF4 Dataset.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| ds | Dataset | Open netCDF4 Dataset | required |
| keys | list[str] \| tuple[str, ...] | Attribute names to read | required |
Returns:
| Type | Description |
|---|---|
| dict[str, Any] | Dictionary mapping attribute names to values (None if missing) |
Source code in packages/climate-ref/src/climate_ref/datasets/netcdf_utils.py
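As an illustration of the behaviour described above, the following is a minimal sketch rather than the package's actual implementation. It assumes that global attributes are reachable as plain Python attributes on the dataset object (as they are on a netCDF4 Dataset) and uses a `SimpleNamespace` with made-up values as a stand-in for an open file. The name `read_global_attrs_sketch` is hypothetical.

```python
from __future__ import annotations

from types import SimpleNamespace
from typing import Any


def read_global_attrs_sketch(ds: Any, keys: list[str] | tuple[str, ...]) -> dict[str, Any]:
    """Return {key: value} for each requested global attribute, None if absent."""
    # getattr with a default covers the "attribute is missing" case.
    return {key: getattr(ds, key, None) for key in keys}


# Stand-in for an open netCDF4 Dataset; the attribute values are hypothetical.
fake_ds = SimpleNamespace(source_id="ACCESS-ESM1-5", experiment_id="historical")

attrs = read_global_attrs_sketch(fake_ds, ("source_id", "experiment_id", "grid_label"))
print(attrs)
```

Missing attributes come back as `None` rather than raising, so a caller can request a fixed set of keys from every file and handle the gaps afterwards.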
read_mandatory_attr(ds, key)
Read a mandatory global attribute from a netCDF4 Dataset, ensuring it is present and non-empty.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| ds | Dataset | Open netCDF4 Dataset | required |
| key | str | Attribute name to read | required |
Returns:
| Type | Description |
|---|---|
| Any | Value of the attribute |
Raises:
| Type | Description |
|---|---|
| ValueError | If the attribute is missing or empty |
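The contract above (return the value, or raise `ValueError` when it is missing or empty) could look roughly like the sketch below. This is not the package's code: the function name is hypothetical, and treating whitespace-only strings as "empty" is an assumption made for the example. A `SimpleNamespace` with made-up values stands in for an open netCDF4 Dataset.

```python
from __future__ import annotations

from types import SimpleNamespace
from typing import Any


def read_mandatory_attr_sketch(ds: Any, key: str) -> Any:
    """Return a global attribute, raising ValueError if it is missing or empty."""
    value = getattr(ds, key, None)
    # Assumption: None and empty or whitespace-only strings both count as "empty".
    if value is None or (isinstance(value, str) and not value.strip()):
        raise ValueError(f"Mandatory attribute {key!r} is missing or empty")
    return value


# Stand-in for an open netCDF4 Dataset; the attribute values are hypothetical.
fake_ds = SimpleNamespace(source_id="ACCESS-ESM1-5", institution_id="")

print(read_mandatory_attr_sketch(fake_ds, "source_id"))

# Collect the keys that fail validation: one empty, one missing entirely.
failed = []
for key in ("institution_id", "variant_label"):
    try:
        read_mandatory_attr_sketch(fake_ds, key)
    except ValueError as exc:
        failed.append(key)
        print(exc)
```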
read_time_bounds(ds)
Read the first and last time values from a netCDF4 Dataset.
Only two raw numeric values are read, and they are decoded with cftime.num2date, matching xarray's CF time decoding output exactly.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| ds | Dataset | Open netCDF4 Dataset | required |
Returns:
| Type | Description |
|---|---|
| tuple[str \| None, str \| None] | Tuple of (start_time, end_time) as strings, or (None, None) if no time variable or time dimension is empty |
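The "read two values, decode two values" idea can be sketched as follows. This is an illustration under several assumptions, not the package's implementation: the time variable is assumed to be named `time`, the returned strings are simply `str()` of the decoded values, and the `decode` parameter is a hypothetical hook added so the sketch can run without cftime installed (it falls back to `cftime.num2date` when omitted). `FakeVariable` and `days_since_1850` are stand-ins invented for the example.

```python
from __future__ import annotations

from datetime import datetime, timedelta
from types import SimpleNamespace


class FakeVariable:
    """Minimal stand-in for a netCDF4 Variable: indexable, with shape and attributes."""

    def __init__(self, values, **attrs):
        self._values = list(values)
        self.shape = (len(self._values),)
        self.__dict__.update(attrs)

    def __getitem__(self, index):
        return self._values[index]


def read_time_bounds_sketch(ds, decode=None):
    """Return (start, end) as strings, or (None, None) if there is no usable time axis."""
    time_var = ds.variables.get("time")  # assumption: the variable is called "time"
    if time_var is None or not time_var.shape or time_var.shape[0] == 0:
        return None, None
    if decode is None:
        import cftime  # imported lazily so the sketch runs without it when decode is given

        decode = cftime.num2date
    units = time_var.units
    calendar = getattr(time_var, "calendar", "standard")
    # Only the first and last raw values are touched, never the whole axis.
    start = decode(time_var[0], units, calendar=calendar)
    end = decode(time_var[-1], units, calendar=calendar)
    return str(start), str(end)


def days_since_1850(value, units, calendar="standard"):
    """Toy decoder for the demo only: ignores units and calendar."""
    return datetime(1850, 1, 1) + timedelta(days=value)


time_var = FakeVariable([0.5, 15.5, 31.5], units="days since 1850-01-01", calendar="standard")
fake_ds = SimpleNamespace(variables={"time": time_var})

bounds = read_time_bounds_sketch(fake_ds, decode=days_since_1850)
empty = read_time_bounds_sketch(SimpleNamespace(variables={}), decode=days_since_1850)
print(bounds, empty)
```

Decoding two scalars instead of the full axis is what keeps the cost independent of the number of time steps in the file.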
read_time_metadata(ds)
Read time encoding metadata from a netCDF4 Dataset.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| ds | Dataset | Open netCDF4 Dataset | required |
Returns:
| Type | Description |
|---|---|
| tuple[str \| None, str \| None] | Tuple of (time_units, calendar). Returns (None, None) if no time variable. |
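A rough sketch of this lookup is shown below. It is not the package's code: it assumes the time variable is named `time`, that the encoding lives in its `units` and `calendar` attributes (the CF convention), and that a missing attribute is reported as `None`. The function name and the stand-in objects are hypothetical.

```python
from __future__ import annotations

from types import SimpleNamespace


def read_time_metadata_sketch(ds) -> tuple[str | None, str | None]:
    """Return (time_units, calendar), or (None, None) if there is no time variable."""
    time_var = ds.variables.get("time")  # assumption: the variable is called "time"
    if time_var is None:
        return None, None
    # Assumption: an absent attribute is reported as None rather than a default.
    return getattr(time_var, "units", None), getattr(time_var, "calendar", None)


# Stand-ins for open netCDF4 Datasets; values are hypothetical.
with_time = SimpleNamespace(
    variables={"time": SimpleNamespace(units="days since 1850-01-01", calendar="noleap")}
)
without_time = SimpleNamespace(variables={"lat": SimpleNamespace(units="degrees_north")})

time_meta = read_time_metadata_sketch(with_time)
no_time_meta = read_time_metadata_sketch(without_time)
print(time_meta, no_time_meta)
```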
read_variable_attrs(ds, variable_id, attr_names)
Read attributes from a specific variable in a netCDF4 Dataset.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| ds | Dataset | Open netCDF4 Dataset | required |
| variable_id | str | Name of the variable to read attributes from | required |
| attr_names | list[str] \| tuple[str, ...] | Attribute names to read | required |
Returns:
| Type | Description |
|---|---|
| dict[str, Any] | Dictionary mapping attribute names to values (None if missing or variable not found) |
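The two fallback cases in the return description (attribute missing, variable not found) can be sketched like this. The function name and the stand-in dataset are hypothetical, and the code is an illustration of the documented behaviour rather than the package's implementation.

```python
from __future__ import annotations

from types import SimpleNamespace
from typing import Any


def read_variable_attrs_sketch(
    ds: Any, variable_id: str, attr_names: list[str] | tuple[str, ...]
) -> dict[str, Any]:
    """Return {name: value} for a variable's attributes, None where unavailable."""
    var = ds.variables.get(variable_id)
    if var is None:
        # Variable not found: every requested attribute maps to None.
        return {name: None for name in attr_names}
    return {name: getattr(var, name, None) for name in attr_names}


# Stand-in for an open netCDF4 Dataset; values are hypothetical.
fake_ds = SimpleNamespace(
    variables={"tas": SimpleNamespace(standard_name="air_temperature", units="K")}
)

tas_attrs = read_variable_attrs_sketch(fake_ds, "tas", ("standard_name", "units", "long_name"))
pr_attrs = read_variable_attrs_sketch(fake_ds, "pr", ("standard_name", "units"))
print(tas_attrs, pr_attrs)
```

Both cases produce a dictionary with the same keys, so downstream code does not need to distinguish a missing variable from a missing attribute.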
read_vertical_levels(ds)
Count the number of vertical levels in a netCDF4 Dataset.
Checks known vertical dimension names and returns the size of the first match.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| ds | Dataset | Open netCDF4 Dataset | required |
Returns:
| Type | Description |
|---|---|
| int | Number of vertical levels (defaults to 1 if no vertical dimension found) |
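The "first matching dimension name wins" logic might look like the sketch below. The list of candidate names is hypothetical (the names the package actually checks are not given here), as is the function name. The stand-in dataset maps dimension names to `range` objects, whose `len()` plays the role of a netCDF4 Dimension's length.

```python
from __future__ import annotations

from types import SimpleNamespace

# Hypothetical candidate names, checked in order; not taken from the package.
VERTICAL_DIM_NAMES = ("lev", "plev", "olevel", "height")


def read_vertical_levels_sketch(ds, names: tuple[str, ...] = VERTICAL_DIM_NAMES) -> int:
    """Return the size of the first known vertical dimension, or 1 if none is present."""
    for name in names:
        if name in ds.dimensions:
            return len(ds.dimensions[name])
    return 1


# Stand-ins for open netCDF4 Datasets; dimension sizes are hypothetical.
pressure_levels = SimpleNamespace(
    dimensions={"time": range(12), "plev": range(19), "lat": range(145)}
)
surface_only = SimpleNamespace(dimensions={"time": range(12), "lat": range(145)})

n_plev = read_vertical_levels_sketch(pressure_levels)
n_surface = read_vertical_levels_sketch(surface_only)
print(n_plev, n_surface)
```

Defaulting to 1 means single-level (surface) fields and multi-level fields can be handled through the same integer field.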