SGRID / UGRID

SGRID

cf_xarray can parse the attributes on the grid_topology variable to identify dimension names with axes names X, Y, Z.

Consider this representative dataset

from cf_xarray.datasets import sgrid_roms

sgrid_roms
<xarray.Dataset> Size: 40B
Dimensions:  (xi_u: 2, eta_u: 2)
Dimensions without coordinates: xi_u, eta_u
Data variables:
    grid     int64 8B 0
    u        (xi_u, eta_u) float64 32B 1.0 1.0 1.0 1.0

A new SGRID section is added to the repr:

sgrid_roms.cf
SGRID:
             CF role:   grid_topology: ['grid']

                Axes: * X: ['xi_u']
                      * Y: ['eta_u']
                        Z: n/a

Coordinates:
             CF Axes:   X, Y, Z, T: n/a

      CF Coordinates:   longitude, latitude, vertical, time: n/a

       Cell Measures:   area, volume: n/a

      Standard Names:   n/a

              Bounds:   n/a

       Grid Mappings:   n/a

Data Variables:
       Cell Measures:   area, volume: n/a

      Standard Names:   n/a

              Bounds:   n/a

       Grid Mappings:   n/a

Topology variable

cf_xarray supports identifying grid_topology using the cf_role attribute.

sgrid_roms.cf["grid_topology"]
<xarray.DataArray 'grid' ()> Size: 8B
array(0)
Attributes:
    cf_role:              grid_topology
    topology_dimension:   2
    node_dimensions:      xi_psi eta_psi
    face_dimensions:      xi_rho: xi_psi (padding: both) eta_rho: eta_psi (pa...
    edge1_dimensions:     xi_u: xi_psi eta_u: eta_psi (padding: both)
    edge2_dimensions:     xi_v: xi_psi (padding: both) eta_v: eta_psi
    node_coordinates:     lon_psi lat_psi
    face_coordinates:     lon_rho lat_rho
    edge1_coordinates:    lon_u lat_u
    edge2_coordinates:    lon_v lat_v
    vertical_dimensions:  s_rho: s_w (padding: none)

Dimensions

Let’s look at the repr again:

sgrid_roms.cf
SGRID:
             CF role:   grid_topology: ['grid']

                Axes: * X: ['xi_u']
                      * Y: ['eta_u']
                        Z: n/a

Coordinates:
             CF Axes:   X, Y, Z, T: n/a

      CF Coordinates:   longitude, latitude, vertical, time: n/a

       Cell Measures:   area, volume: n/a

      Standard Names:   n/a

              Bounds:   n/a

       Grid Mappings:   n/a

Data Variables:
       Cell Measures:   area, volume: n/a

      Standard Names:   n/a

              Bounds:   n/a

       Grid Mappings:   n/a

Note that xi_u, eta_u were identified as X, Y axes even though there is no data associated with them. So now the following will return xi_u

sgrid_roms.cf["X"]
<xarray.DataArray 'xi_u' (xi_u: 2)> Size: 16B
[2 values with dtype=int64]
Dimensions without coordinates: xi_u

Tip

The repr only shows variable names that can be used as object[variable_name]. That is why only xi_u, eta_u are listed in the repr even though the attributes on the grid_topology variable grid list many more dimension names.

Coordinate variables

cf_xarray also follows the node_coordinates, face_coordinates, edge1_coordinates, edge2_coordinates, and volume_coordinates attributes on the grid_topology variable. When you select a data variable that references a grid_topology via its grid attribute, the referenced coordinate variables are pulled in alongside it:

ds.cf[["u"]]  # includes `grid`, lon_psi/lat_psi, lon_rho/lat_rho, ...

Only names actually present in the dataset are propagated. For the DataArray form (ds.cf["u"]) xarray only attaches coordinates whose dimensions are compatible with the variable, so e.g. only lon_u/lat_u appear as coords on u.

UGRID

Topology variable

cf_xarray supports identifying the mesh_topology variable using the cf_role attribute.

Connectivity and coordinate variables

When a data variable references a mesh_topology variable through its mesh attribute, cf_xarray follows that attribute when subsetting with .cf. The mesh topology variable, its connectivity variables (face_node_connectivity, edge_node_connectivity, face_edge_connectivity, face_face_connectivity, edge_face_connectivity, boundary_node_connectivity), and its coordinate variables (node_coordinates, edge_coordinates, face_coordinates) are pulled in alongside the data variable:

ds.cf[["h"]]  # includes `mesh`, face_nodes, node_lon/node_lat, ...

Only names actually present in the dataset are propagated.

More?

Further support for interpreting the SGRID and UGRID conventions can be added. Contributions are welcome!