Databrowser python module#

The following section gives an overview of the usage of the databrowser client Python module. Please see the Installation and configuration section on how to install and configure the library.

TLDR: Too long didn’t read#

To query the databrowser and search for data you have several options. You can use the following methods:

freva_client.databrowser: The main class for searching data is the freva_client.databrowser class. After creating an instance of the databrowser class with your specific search constraints you can retrieve all files or URIs that match your search constraints. You can also retrieve a count of the number of objects matching the search, get an overview of the available metadata, and create an intake-esm catalogue from your search. Searching for URIs instead of file paths can be useful to get information on the storage system where the files or object stores are located.
freva_client.databrowser.metadata_search(): This class method lists all search categories (facets) and their values.
freva_client.databrowser.count_values(): You can count the occurrences of search results with this method.
freva_client.databrowser.userdata(): This class method lets you add or delete your own metadata.

Library Reference#

Below you can find more detailed documentation.

Client software freva evaluation system framework (freva):

Freva, the free evaluation system framework, is a data search and analysis platform developed by the atmospheric science community for the atmospheric science community. With the help of Freva, researchers can:

quickly and intuitively search for data stored at typical data centers that host many datasets.
create a common interface for user defined data analysis tools.
apply data analysis tools in a reproducible manner.

The code described here is currently in testing phase. The client and server library described in the documentation only support searching for data. If you need to apply data analysis plugins, please visit the

class freva_client.databrowser(*facets: str, uniq_key: Literal['file', 'uri'] = 'file', flavour: Literal['freva', 'cmip6', 'cmip5', 'cordex', 'nextgems', 'user'] = 'freva', time: str | None = None, host: str | None = None, time_select: Literal['flexible', 'strict', 'file'] = 'flexible', bbox: Tuple[float, float, float, float] | None = None, bbox_select: Literal['flexible', 'strict', 'file'] = 'flexible', stream_zarr: bool = False, multiversion: bool = False, fail_on_error: bool = False, **search_keys: str | List[str])#

Find data in the system.

You can either search for files or URIs. URIs give you information on the storage system where the files or objects you are looking for are located. The query is of the form key=value. For value you can use wildcards such as *, ? or any regular expression.
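To get a feeling for how the wildcards * and ? narrow a search, the sketch below matches patterns against a few facet values locally with Python's fnmatch module. It only illustrates glob-style matching. The server performs the actual matching and may behave differently, for example regarding case sensitivity or regular expressions. The helper name matching_values is hypothetical; the values are taken from the examples in this section.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List


def matching_values(pattern: str, values: Iterable[str]) -> List[str]:
    """Return the values that match a glob-style pattern (``*`` and ``?``)."""
    return [value for value in values if fnmatchcase(value, pattern)]


# Facet values as they appear in the examples of this section.
models = ["cpc", "cpc-cmorph", "nodc"]

print(matching_values("cpc*", models))  # ['cpc', 'cpc-cmorph']
print(matching_values("c?c", models))   # ['cpc']
print(matching_values("*o*", models))   # ['cpc-cmorph', 'nodc']
```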

Parameters#

*facets: str: If you are not sure about the correct search keys you can use positional arguments to search for any matching entries. For example ‘era5’ would allow you to search for any entries containing era5, regardless of project, product etc.
**search_keys: str: The search constraints applied in the data search. If not given the whole dataset will be queried.
flavour: str, default: freva: The Data Reference Syntax (DRS) standard specifying the type of climate datasets to query. You can use the databrowser.overview() class method to retrieve information on the available search flavours and their different search keys.
time: str, default: “”: Special search key to refine/subset search results by time. This can be a string representation of a time range or a single timestamp. The timestamps have to follow ISO-8601. Valid strings are %Y-%m-%dT%H:%M to %Y-%m-%dT%H:%M for time ranges or %Y-%m-%dT%H:%M for single time stamps.

Note

You don’t have to give the full string format to subset time steps; %Y, %Y-%m etc. are also valid.
time_select: str, default: flexible: Operator that specifies how the time period is selected. Choose from flexible (default), strict or file. strict returns only those files that have the entire time period covered. The time search 2000 to 2012 will not select files containing data from 2010 to 2020 with the strict method. flexible will select those files as flexible returns those files that have either start or end period covered. file will only return files where the entire time period is contained within one single file.
bbox: str, default: “”: Special search facet to refine/subset search results by spatial extent. This can be a list representation of a bounding box or a WKT polygon. Valid lists are min_lon max_lon min_lat max_lat for bounding boxes and Well-Known Text (WKT) format for polygons.
bbox_select: str, default: flexible: Operator that specifies how the spatial extent is selected. Choose from flexible (default), strict or file. strict returns only those files that fully contain the query extent. The bbox search -10 10 -10 10 will not select files covering only 0 5 0 5 with the strict method. flexible will select those files as it returns files that have any overlap with the query extent. file will only return files where the entire spatial extent is contained by the query geometry.
uniq_key: str, default: file: Choose whether the Solr search query should return paths to files or URIs. URIs contain the file path along with the protocol of the storage system. URIs are useful when working with libraries like fsspec, which require protocol information.
host: str, default: None: Override the host name of the databrowser server. This is usually the URL where the freva web site can be found, such as www.freva.dkrz.de. By default no host name is given and the host name will be taken from the freva config file.
stream_zarr: bool, default: False: Create a zarr stream for all search results. When set to true the files are served in zarr format and can be opened from anywhere.
multiversion: bool, default: False: Select all versions and not just the latest version (default).
fail_on_error: bool, default: False: Make the call fail if the connection to the databrowser could not be established.
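The time strings accepted by the time parameter can be checked locally before a query is sent. The following is a minimal sketch, not the library's own parser: it tries the formats named in the parameter description and splits a range at the word "to". The helper names parse_timestamp and split_time_query are hypothetical, and treating %Y-%m-%d as valid is an assumption based on the "etc." in the note. Shortened timestamps are parsed to the start of the period; how the server rounds a shortened end timestamp is not covered here.

```python
from datetime import datetime
from typing import Tuple

# Formats named in the parameter description, from coarse to fine.
# "%Y-%m-%d" is an assumption covered by the "etc." in the note.
TIME_FORMATS = ("%Y", "%Y-%m", "%Y-%m-%d", "%Y-%m-%dT%H:%M")


def parse_timestamp(text: str) -> datetime:
    """Parse a single, possibly shortened, ISO-8601 style timestamp."""
    for fmt in TIME_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt)
        except ValueError:
            continue
    raise ValueError(f"Unsupported time string: {text!r}")


def split_time_query(query: str) -> Tuple[datetime, datetime]:
    """Split '<start> to <end>' (or a single timestamp) into two datetimes."""
    start, separator, end = query.partition(" to ")
    first = parse_timestamp(start)
    last = parse_timestamp(end) if separator else first
    return first, last


print(split_time_query("2000 to 2012"))
print(split_time_query("2016-09-02T22:10"))
```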

Attributes#

url: str: the url of the currently selected databrowser api server
metadata: dict[str, str]: The available search keys, or metadata, found for the applied search constraints. This can be useful for reverse searches.

Example#

Search for the cmorph datasets. Suppose we know that the experiment name of this dataset is cmorph. We can therefore create an instance of the databrowser class using the experiment search constraint. If you just ‘print’ the created object you will get a quick overview:

Code

from freva_client import databrowser
db = databrowser(experiment="cmorph", uniq_key="uri")
print(db)

Results

databrowser(flavour=freva, host=http://localhost:7777/api/freva-nextgen/databrowser, multi_version=False, experiment=cmorph)

After having created the search object you can acquire different kinds of information like the number of found objects:

Code

from freva_client import databrowser
db = databrowser(experiment="cmorph", uniq_key="uri")
print(len(db))

Results

Or you can retrieve the combined metadata of the search objects.

Code

from freva_client import databrowser
db = databrowser(experiment="cmorph", uniq_key="uri")
print(db.metadata)

Results

{'cmor_table': ['30min'], 'dataset': ['obs-fs', 'obs-hsm', 'obs-swfit'], 'driving_model': [], 'ensemble': ['r1i1p1'], 'experiment': ['cmorph'], 'format': ['nc', 'zarr'], 'fs_type': ['posix'], 'grid_id': [], 'grid_label': ['gn'], 'institute': ['cpc'], 'level_type': ['2d'], 'model': ['cpc', 'cpc-cmorph'], 'product': ['grid'], 'project': ['observations'], 'rcm_name': [], 'rcm_version': [], 'realm': ['atmos'], 'time_aggregation': ['mean'], 'time_frequency': ['1min'], 'user': [], 'variable': ['pr']}

Most importantly, you can retrieve the locations of all encountered objects:

Code

from freva_client import databrowser
db = databrowser(experiment="cmorph", uniq_key="uri")
for file in db:
    pass
all_files = sorted(db)
print(all_files[0])

Results

/home/runner/work/freva-nextgen/freva-nextgen/freva-rest/src/freva_rest/databrowser_api/mock/data/observations/grid/CPC/CPC/cmorph/30min/atmos/30min/r1i1p1/v20210618/pr/pr_30min_CPC_cmorph_r1i1p1_201609020000-201609020030.nc

You can also set a different flavour, for example according to cmip6 standard:

Code

from freva_client import databrowser
db = databrowser(flavour="cmip6", experiment_id="cmorph")
print(db.metadata)

Results

{'table_id': ['30min'], 'dataset': ['obs-fs', 'obs-hsm', 'obs-swfit'], 'driving_model': [], 'member_id': ['r1i1p1'], 'experiment_id': ['cmorph'], 'format': ['nc', 'zarr'], 'fs_type': ['posix'], 'grid_id': [], 'grid_label': ['gn'], 'institution_id': ['cpc'], 'level_type': ['2d'], 'source_id': ['cpc', 'cpc-cmorph'], 'activity_id': ['grid'], 'mip_era': ['observations'], 'rcm_name': [], 'rcm_version': [], 'realm': ['atmos'], 'time_aggregation': ['mean'], 'frequency': ['1min'], 'user': [], 'variable_id': ['pr']}
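Comparing the two metadata outputs above shows which freva keys correspond to which cmip6 keys. The sketch below collects those pairs in a lookup table and renames search keys accordingly. The table is inferred from the two example outputs only, and the helper to_cmip6_keys is hypothetical; it is not part of freva_client.

```python
from typing import Dict

# Key pairs read off the two metadata examples (freva -> cmip6).
# Inferred from those outputs, not taken from the library itself.
FREVA_TO_CMIP6: Dict[str, str] = {
    "cmor_table": "table_id",
    "ensemble": "member_id",
    "experiment": "experiment_id",
    "institute": "institution_id",
    "model": "source_id",
    "product": "activity_id",
    "project": "mip_era",
    "time_frequency": "frequency",
    "variable": "variable_id",
}


def to_cmip6_keys(search_keys: Dict[str, str]) -> Dict[str, str]:
    """Rename freva-flavour search keys; unknown keys are passed through."""
    return {FREVA_TO_CMIP6.get(key, key): value for key, value in search_keys.items()}


print(to_cmip6_keys({"experiment": "cmorph", "realm": "atmos"}))
# {'experiment_id': 'cmorph', 'realm': 'atmos'}
```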

Sometimes you don’t know the exact names of the search keys and want to retrieve all file objects that match a certain category. For example, to get all ocean reanalysis datasets you can apply the ‘reana*’ search key as a positional argument:

Code

from freva_client import databrowser
db = databrowser("reana*", realm="ocean", flavour="cmip6")
for file in db:
    print(file)

Results

https://swift.dkrz.de/v1/dkrz_a32dc0e8-2299-4239-a47d-6bf45c8b0160/freva_test/model/obs/reanalysis/reanalysis/NOAA/NODC/OC5/mon/ocean/Omon/r1i1p1/v20200101/hc700/hc700_mon_NODC_OC5_r1i1p1_201201-201212.zarr
/home/runner/work/freva-nextgen/freva-nextgen/freva-rest/src/freva_rest/databrowser_api/mock/data/model/obs/reanalysis/reanalysis/NOAA/NODC/OC5/mon/ocean/Omon/r1i1p1/v20200101/hc700/hc700_mon_NODC_OC5_r1i1p1_201201-201212.nc
/arch/bb1203/freva_test/model/obs/reanalysis/reanalysis/NOAA/NODC/OC5/mon/ocean/Omon/r1i1p1/v20200101/hc700/hc700_mon_NODC_OC5_r1i1p1_201201-201212.nc
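A result list like the one above can mix object-store URLs and plain file paths. As a rough sketch, the results can be grouped by storage protocol with urllib.parse from the standard library. The helper group_by_scheme is hypothetical, and the locations are shortened stand-ins rather than real search results.

```python
from collections import defaultdict
from typing import Dict, Iterable, List
from urllib.parse import urlparse


def group_by_scheme(locations: Iterable[str]) -> Dict[str, List[str]]:
    """Group locations by URI scheme; plain paths are filed under 'file'."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for location in locations:
        scheme = urlparse(location).scheme or "file"
        groups[scheme].append(location)
    return dict(groups)


# Shortened stand-ins for the kinds of results a search can return.
results = [
    "https://swift.dkrz.de/v1/some-container/hc700_mon.zarr",
    "/home/user/data/hc700_mon.nc",
    "slk:///arch/project/hc700_mon.nc",
]
for scheme, items in group_by_scheme(results).items():
    print(scheme, len(items))
```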

If you don’t have direct access to the data, for example because you are not logged in to the computer where the data is stored, you can set stream_zarr=True. The data will then be provisioned in zarr format and can be opened from anywhere. Bear in mind that zarr streams expire if they are not accessed in time. Since the data can be accessed from anywhere you will also have to authenticate before you are able to access the data. Refer also to the freva_client.authenticate() method.

Code

from freva_client import authenticate, databrowser
token_info = authenticate(username="janedoe")
db = databrowser(dataset="cmip6-fs", stream_zarr=True)
zarr_files = list(db)
print(zarr_files)

Results

['http://fv-az816-966:7777/api/freva-nextgen/data-portal/zarr/b5f98d86-b658-5c92-b879-9aeddf6e2f19.zarr', 'http://fv-az816-966:7777/api/freva-nextgen/data-portal/zarr/0fcbbf0c-f054-5085-a547-7374189ad7a0.zarr']

After you have created the paths to the zarr files you can open them:

import xarray as xr
dset = xr.open_dataset(
   zarr_files[0],
   chunks="auto",
   engine="zarr",
   storage_options={"headers":
        {"Authorization": f"Bearer {token_info['access_token']}"}
   }
)

classmethod count_values(*facets: str, flavour: Literal['freva', 'cmip6', 'cmip5', 'cordex', 'nextgems', 'user'] = 'freva', time: str | None = None, host: str | None = None, time_select: Literal['flexible', 'strict', 'file'] = 'flexible', bbox: Tuple[float, float, float, float] | None = None, bbox_select: Literal['flexible', 'strict', 'file'] = 'flexible', multiversion: bool = False, fail_on_error: bool = False, extended_search: bool = False, **search_keys: str | List[str]) → Dict[str, Dict[str, int]]#

Count the number of objects in the databrowser.

Parameters#

*facets: str: If you are not sure about the correct search keys you can use positional arguments to search for any matching entries. For example ‘era5’ would allow you to search for any entries containing era5, regardless of project, product etc.
flavour: str, default: freva: The Data Reference Syntax (DRS) standard specifying the type of climate datasets to query.
time: str, default: “”: Special search facet to refine/subset search results by time. This can be a string representation of a time range or a single timestamp. The timestamp has to follow ISO-8601. Valid strings are %Y-%m-%dT%H:%M to %Y-%m-%dT%H:%M for time ranges and %Y-%m-%dT%H:%M for single time stamps.

Note

You don’t have to give the full string format to subset time steps; %Y, %Y-%m etc. are also valid.
time_select: str, default: flexible: Operator that specifies how the time period is selected. Choose from flexible (default), strict or file. strict returns only those files that have the entire time period covered. The time search 2000 to 2012 will not select files containing data from 2010 to 2020 with the strict method. flexible will select those files as flexible returns those files that have either start or end period covered. file will only return files where the entire time period is contained within one single file.
bbox: str, default: “”: Special search facet to refine/subset search results by spatial extent. This can be a list representation of a bounding box or a WKT polygon. Valid lists are min_lon max_lon min_lat max_lat for bounding boxes and Well-Known Text (WKT) format for polygons.
bbox_select: str, default: flexible: Operator that specifies how the spatial extent is selected. Choose from flexible (default), strict or file. strict returns only those files that fully contain the query extent. The bbox search -10 10 -10 10 will not select files covering only 0 5 0 5 with the strict method. flexible will select those files as it returns files that have any overlap with the query extent. file will only return files where the entire spatial extent is contained by the query geometry.
extended_search: bool, default: False: Retrieve information on additional search keys.
host: str, default: None: Override the host name of the databrowser server. This is usually the URL where the freva web site can be found, such as www.freva.dkrz.de. By default no host name is given and the host name will be taken from the freva config file.
multiversion: bool, default: False: Select all versions and not just the latest version (default).
fail_on_error: bool, default: False: Make the call fail if the connection to the databrowser could not be established.
**search_keys: str: The search constraints to be applied in the data search. If not given the whole dataset will be queried.
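The three bbox_select modes can be pictured with plain rectangle comparisons. The sketch below follows the wording of the bbox_select description literally: flexible means any overlap, strict means the file extent fully contains the query, and file means the file extent lies inside the query. It is an illustration only and not the server's implementation. The names BBox and is_selected are hypothetical, and boxes crossing the antimeridian are ignored.

```python
from typing import NamedTuple


class BBox(NamedTuple):
    """Bounding box in the order used by the bbox parameter."""

    min_lon: float
    max_lon: float
    min_lat: float
    max_lat: float


def contains(outer: BBox, inner: BBox) -> bool:
    """True if ``inner`` lies completely inside ``outer``."""
    return (
        outer.min_lon <= inner.min_lon
        and inner.max_lon <= outer.max_lon
        and outer.min_lat <= inner.min_lat
        and inner.max_lat <= outer.max_lat
    )


def overlaps(a: BBox, b: BBox) -> bool:
    """True if the two boxes share at least one point."""
    return (
        a.min_lon <= b.max_lon
        and b.min_lon <= a.max_lon
        and a.min_lat <= b.max_lat
        and b.min_lat <= a.max_lat
    )


def is_selected(query: BBox, file_extent: BBox, bbox_select: str = "flexible") -> bool:
    """Apply the selection rule as worded in the parameter description."""
    if bbox_select == "flexible":
        return overlaps(query, file_extent)
    if bbox_select == "strict":
        return contains(file_extent, query)  # file fully contains the query
    if bbox_select == "file":
        return contains(query, file_extent)  # file lies inside the query
    raise ValueError(f"Unknown bbox_select: {bbox_select!r}")


# The example from the description: query -10 10 -10 10, file 0 5 0 5.
query = BBox(-10, 10, -10, 10)
small_file = BBox(0, 5, 0, 5)
for mode in ("flexible", "strict", "file"):
    print(mode, is_selected(query, small_file, mode))
```

With these inputs the strict mode rejects the small file while flexible and file accept it, which matches the example given in the description.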

Returns#

dict[str, dict[str, int]]: Dictionary with the number of objects for each value of each search facet/key.

Example#

Code

from freva_client import databrowser
print(databrowser.count_values(experiment="cmorph"))

Results

{'ensemble': {'r1i1p1': 49}, 'experiment': {'cmorph': 49}, 'institute': {'cpc': 49}, 'model': {'cpc': 25, 'cpc-cmorph': 24}, 'product': {'grid': 49}, 'project': {'observations': 49}, 'realm': {'atmos': 49}, 'time_aggregation': {'mean': 49}, 'time_frequency': {'1min': 49}, 'variable': {'pr': 49}}
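Because the result is a nested dictionary (facet, then value, then count), it can be processed with ordinary dictionary operations. A small sketch using an excerpt of the output above; the helper most_common is hypothetical.

```python
from typing import Dict

# Excerpt of the count_values result printed above.
counts: Dict[str, Dict[str, int]] = {
    "experiment": {"cmorph": 49},
    "model": {"cpc": 25, "cpc-cmorph": 24},
    "variable": {"pr": 49},
}


def most_common(facet_counts: Dict[str, int]) -> str:
    """Return the facet value with the highest object count."""
    return max(facet_counts, key=facet_counts.get)


total = sum(counts["model"].values())
print(total)                         # 49
print(most_common(counts["model"]))  # cpc
```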

Code

from freva_client import databrowser
print(databrowser.count_values("model"))

Results

{'ensemble': {}, 'experiment': {}, 'institute': {}, 'model': {}, 'product': {}, 'project': {}, 'realm': {}, 'time_aggregation': {}, 'time_frequency': {}, 'variable': {}}

Code

from freva_client import databrowser
print(databrowser.count_values("reana*", realm="ocean", flavour="cmip6"))

Results

{'member_id': {'r1i1p1': 3}, 'experiment_id': {'oc5': 3}, 'institution_id': {'noaa': 3}, 'source_id': {'nodc': 3}, 'activity_id': {'reanalysis': 3}, 'mip_era': {'observations': 3}, 'realm': {'ocean': 3}, 'time_aggregation': {'mean': 3}, 'frequency': {'mon': 3}, 'variable_id': {'hc700': 3}}

intake_catalogue() → esm_datastore#

Create an intake esm catalogue object from the search.

This method creates an intake-esm catalogue from the current object search. Instead of having the original files as target objects you can also choose to stream the files via zarr.

Returns#

intake_esm.core.esm_datastore: intake-esm catalogue.

Raises#

ValueError: If user is not authenticated or catalogue creation failed.

Example#

Let’s create an intake-esm catalogue that allows for streaming the target data as zarr:

Code

from freva_client import databrowser
db = databrowser(dataset="cmip6-hsm", stream_zarr=True)
cat = db.intake_catalogue()
print(cat.df)

Results

                                                 uri project  ... grid_label format
0  http://fv-az816-966:7777/api/freva-nextgen/dat...   CMIP6  ...         gn     nc

[1 rows x 14 columns]

property metadata: Dict[str, List[str]]#

Get the metadata (facets) for the current databrowser query.

You can retrieve all information that is associated with your current databrowser search. This can be useful for reverse searches, for example for retrieving metadata of object stores or file/directory names.

Example#

Reverse search: retrieving metadata from a known file

Code

from freva_client import databrowser
db = databrowser(uri="slk:///arch/*/CPC/*")
print(db.metadata)

Results

{'cmor_table': ['30min'], 'dataset': ['obs-hsm'], 'driving_model': [], 'ensemble': ['r1i1p1'], 'experiment': ['cmorph'], 'format': ['nc'], 'fs_type': ['posix'], 'grid_id': [], 'grid_label': ['gn'], 'institute': ['cpc'], 'level_type': ['2d'], 'model': ['cpc'], 'product': ['grid'], 'project': ['observations'], 'rcm_name': [], 'rcm_version': [], 'realm': ['atmos'], 'time_aggregation': ['mean'], 'time_frequency': ['1min'], 'user': [], 'variable': ['pr']}

classmethod metadata_search(*facets: str, flavour: Literal['freva', 'cmip6', 'cmip5', 'cordex', 'nextgems', 'user'] = 'freva', time: str | None = None, host: str | None = None, time_select: Literal['flexible', 'strict', 'file'] = 'flexible', bbox: Tuple[float, float, float, float] | None = None, bbox_select: Literal['flexible', 'strict', 'file'] = 'flexible', multiversion: bool = False, fail_on_error: bool = False, extended_search: bool = False, **search_keys: str | List[str]) → Dict[str, List[str]]#

Search for data attributes (facets) in the databrowser.

The method queries the databrowser for available search facets (keys) like model, experiment etc.

Parameters#

*facets: str: If you are not sure about the correct search keys you can use positional arguments to search for any matching entries. For example ‘era5’ would allow you to search for any entries containing era5, regardless of project, product etc.
flavour: str, default: freva: The Data Reference Syntax (DRS) standard specifying the type of climate datasets to query.
time: str, default: “”: Special search facet to refine/subset search results by time. This can be a string representation of a time range or a single timestamp. The timestamp has to follow ISO-8601. Valid strings are %Y-%m-%dT%H:%M to %Y-%m-%dT%H:%M for time ranges and %Y-%m-%dT%H:%M for single time stamps.

Note

You don’t have to give the full string format to subset time steps; %Y, %Y-%m etc. are also valid.
time_select: str, default: flexible: Operator that specifies how the time period is selected. Choose from flexible (default), strict or file. strict returns only those files that have the entire time period covered. The time search 2000 to 2012 will not select files containing data from 2010 to 2020 with the strict method. flexible will select those files as flexible returns those files that have either start or end period covered. file will only return files where the entire time period is contained within one single file.
bbox: str, default: “”: Special search facet to refine/subset search results by spatial extent. This can be a list representation of a bounding box or a WKT polygon. Valid lists are min_lon max_lon min_lat max_lat for bounding boxes and Well-Known Text (WKT) format for polygons.
bbox_select: str, default: flexible: Operator that specifies how the spatial extent is selected. Choose from flexible (default), strict or file. strict returns only those files that fully contain the query extent. The bbox search -10 10 -10 10 will not select files covering only 0 5 0 5 with the strict method. flexible will select those files as it returns files that have any overlap with the query extent. file will only return files where the entire spatial extent is contained by the query geometry.
extended_search: bool, default: False: Retrieve information on additional search keys.
multiversion: bool, default: False: Select all versions and not just the latest version (default).
host: str, default: None: Override the host name of the databrowser server. This is usually the URL where the freva web site can be found, such as www.freva.dkrz.de. By default no host name is given and the host name will be taken from the freva config file.
fail_on_error: bool, default: False: Make the call fail if the connection to the databrowser could not be established.
**search_keys: str, list[str]: The facets to be applied in the data search. If not given the whole dataset will be queried.

Returns#

dict[str, list[str]]: Dictionary with a list of search facet values for each search facet key.

Example#

Code

from freva_client import databrowser
all_facets = databrowser.metadata_search(project='obs*')
print(all_facets)

Results

{'ensemble': ['r1i1p1'], 'experiment': ['cmorph', 'oc5'], 'institute': ['cpc', 'noaa'], 'model': ['cpc', 'cpc-cmorph', 'nodc'], 'product': ['grid', 'reanalysis'], 'project': ['observations'], 'realm': ['atmos', 'ocean'], 'time_aggregation': ['mean'], 'time_frequency': ['1min', 'mon'], 'variable': ['hc700', 'pr']}

You can also search for all metadata matching a search string:

Code

from freva_client import databrowser
spec_facets = databrowser.metadata_search("obs*")
print(spec_facets)

Results

{'ensemble': ['r1i1p1'], 'experiment': ['cmorph'], 'institute': ['cpc'], 'model': ['cpc', 'cpc-cmorph'], 'product': ['grid'], 'project': ['observations'], 'realm': ['atmos'], 'time_aggregation': ['mean'], 'time_frequency': ['1min'], 'variable': ['pr']}

Get all models that have a given time step:

Code

from freva_client import databrowser
model = databrowser.metadata_search(
    project="obs*",
    time="2016-09-02T22:10"
)
print(model)

Results

{'ensemble': ['r1i1p1'], 'experiment': ['cmorph'], 'institute': ['cpc'], 'model': ['cpc', 'cpc-cmorph'], 'product': ['grid'], 'project': ['observations'], 'realm': ['atmos'], 'time_aggregation': ['mean'], 'time_frequency': ['1min'], 'variable': ['pr']}

Reverse search: retrieving metadata from a known file

Code

from freva_client import databrowser
res = databrowser.metadata_search(file="/arch/*CPC/*")
print(res)

Results

{'ensemble': ['r1i1p1'], 'experiment': ['cmorph'], 'institute': ['cpc'], 'model': ['cpc'], 'product': ['grid'], 'project': ['observations'], 'realm': ['atmos'], 'time_aggregation': ['mean'], 'time_frequency': ['1min'], 'variable': ['pr']}

Code

from freva_client import databrowser
print(databrowser.metadata_search("reana*", realm="ocean", flavour="cmip6"))

Results

{'member_id': ['r1i1p1'], 'experiment_id': ['oc5'], 'institution_id': ['noaa'], 'source_id': ['nodc'], 'activity_id': ['reanalysis'], 'mip_era': ['observations'], 'realm': ['ocean'], 'time_aggregation': ['mean'], 'frequency': ['mon'], 'variable_id': ['hc700']}

In datasets with multiple versions only the latest version (i.e. highest version number) is returned by default. Querying a specific version from a multi-versioned dataset requires the multiversion flag in combination with the version special attribute:

Code

from freva_client import databrowser
res = databrowser.metadata_search(dataset="cmip6-fs",
    model="access-cm2", version="v20191108", extended_search=True,
    multiversion=True)
print(res)

Results

{'cmor_table': ['amon'], 'dataset': ['cmip6-fs'], 'driving_model': [], 'ensemble': ['r1i1p1f1'], 'experiment': ['amip'], 'format': ['nc'], 'fs_type': ['posix'], 'grid_id': [], 'grid_label': ['gn'], 'institute': ['csiro-arccss'], 'level_type': ['2d'], 'model': ['access-cm2'], 'product': ['cmip'], 'project': ['cmip6'], 'rcm_name': [], 'rcm_version': [], 'realm': ['atmos'], 'time_aggregation': ['mean'], 'time_frequency': ['mon'], 'user': [], 'variable': ['ua'], 'version': ['20191108']}

If no particular version is requested, information of all versions will be returned.
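If you request all versions and want to pick the latest one yourself, the "highest version number" rule can be applied locally. This is a minimal sketch that assumes version labels consist of an optional leading "v" followed by digits, as in the example above. The helper names are hypothetical, and only v20191108 is taken from the example; the other labels are made up.

```python
from typing import Iterable


def version_number(version: str) -> int:
    """Turn a label such as 'v20191108' or '20191108' into an integer."""
    return int(version.lstrip("vV"))


def latest_version(versions: Iterable[str]) -> str:
    """Return the label with the highest version number."""
    return max(versions, key=version_number)


# 'v20191108' appears in the example; the other labels are made up.
print(latest_version(["v20190101", "v20191108", "v20180615"]))  # v20191108
```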

classmethod overview(host: str | None = None) → str#

Get an overview of the available search options.

If you don’t know what search flavours or search keys you can use for searching the data, you can use this method to get an overview of what is available.

Parameters#

host: str, default: None: Override the host name of the databrowser server. This is usually the URL where the freva web site can be found, such as www.freva.dkrz.de. By default no host name is given and the host name will be taken from the freva config file.

Returns#

str: A string representation of what is available.

Example#

Code

from freva_client import databrowser
print(databrowser.overview())

Results

Available search flavours:
- freva
- cmip6
- cmip5
- cordex
- nextgems
- user
Search attributes by flavour:
  cmip5:
  - experiment
  - member_id
  - fs_type
  - grid_label
  - institution_id
  - model_id
  - project
  - product
  - realm
  - variable
  - time
  - bbox
  - time_aggregation
  - time_frequency
  - cmor_table
  - dataset
  - format
  - grid_id
  - level_type
  cmip6:
  - experiment_id
  - member_id
  - fs_type
  - grid_label
  - institution_id
  - source_id
  - mip_era
  - activity_id
  - realm
  - variable_id
  - time
  - bbox
  - time_aggregation
  - frequency
  - table_id
  - dataset
  - format
  - grid_id
  - level_type
  cordex:
  - experiment
  - ensemble
  - fs_type
  - grid_label
  - institution
  - model
  - project
  - domain
  - realm
  - variable
  - time
  - bbox
  - time_aggregation
  - time_frequency
  - cmor_table
  - dataset
  - driving_model
  - format
  - grid_id
  - level_type
  - rcm_name
  - rcm_version
  freva:
  - project
  - product
  - institute
  - model
  - experiment
  - time_frequency
  - realm
  - variable
  - ensemble
  - time_aggregation
  - fs_type
  - grid_label
  - cmor_table
  - format
  - grid_id
  - level_type
  - dataset
  - time
  - bbox
  - user
  nextgems:
  - simulation_id
  - member_id
  - fs_type
  - grid_label
  - institution_id
  - source_id
  - project
  - experiment_id
  - realm
  - variable_id
  - time
  - bbox
  - time_reduction
  - time_frequency
  - cmor_table
  - dataset
  - format
  - grid_id
  - level_type
  user:
  - project
  - product
  - institute
  - model
  - experiment
  - time_frequency
  - realm
  - variable
  - ensemble
  - time_aggregation
  - fs_type
  - grid_label
  - cmor_table
  - format
  - grid_id
  - level_type
  - dataset
  - time
  - bbox
  - user
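Since overview() returns a string, the listing has to be parsed if you want to work with it programmatically. The sketch below turns text in the layout printed above into a dictionary of flavour to search attributes using plain string handling. It assumes the layout stays exactly as shown; the function name attributes_by_flavour is hypothetical and the sample is a shortened copy of the output.

```python
from typing import Dict, List, Optional

# Shortened sample in the layout of the printed overview.
SAMPLE = """Available search flavours:
- freva
- cmip6
Search attributes by flavour:
  cmip6:
  - experiment_id
  - member_id
  freva:
  - project
  - product
"""


def attributes_by_flavour(overview: str) -> Dict[str, List[str]]:
    """Collect the search attributes listed under each flavour."""
    result: Dict[str, List[str]] = {}
    current: Optional[str] = None
    in_attributes = False
    for raw_line in overview.splitlines():
        line = raw_line.strip()
        if line == "Search attributes by flavour:":
            in_attributes = True  # everything before this is the flavour list
        elif not in_attributes or not line:
            continue
        elif line.endswith(":"):
            current = line[:-1]  # a new flavour block starts
            result[current] = []
        elif line.startswith("- ") and current is not None:
            result[current].append(line[2:])
    return result


print(attributes_by_flavour(SAMPLE))
```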

stac_catalogue(filename: str | Path | None = None, **kwargs: Any) → str#

Create a static STAC catalogue from the search.

Parameters#

filename: str, default: None: The filename of the STAC catalogue. If it is not given or does not exist, the STAC catalogue will be saved to the current working directory.
**kwargs: Any: Additional keyword arguments to be passed to the request.

Returns#

BinaryIO A zip file stream

Raises#

ValueError: If stac-catalogue creation failed.

Example#

Let’s create a static STAC catalogue:

Code

from tempfile import mktemp
temp_path = mktemp(suffix=".zip")

from freva_client import databrowser
db = databrowser(dataset="cmip6-hsm")
db.stac_catalogue(filename=temp_path)
print(f"STAC catalog saved to: {temp_path}")

Results

Downloading the STAC catalog started ...
STAC catalog saved to: /tmp/tmpxsvicw5i.zip
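Once the file has been saved you may want to check what it contains. The sketch below lists the members of a zip archive with the standard library, assuming the saved catalogue is a regular zip file as the .zip suffix and the return description suggest. To keep the sketch runnable without a server it builds a tiny stand-in archive first; the member name catalog.json and its content are made up and say nothing about the real catalogue layout.

```python
import json
import tempfile
import zipfile
from pathlib import Path
from typing import List


def list_archive(path: Path) -> List[str]:
    """Return the names of all members of a zip archive."""
    with zipfile.ZipFile(path) as archive:
        return archive.namelist()


# Build a tiny stand-in archive so the sketch runs without a server.
# The member name and its content are made up for illustration.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_zip = Path(tmp_dir) / "stac-demo.zip"
    with zipfile.ZipFile(demo_zip, "w") as archive:
        archive.writestr("catalog.json", json.dumps({"type": "Catalog"}))
    members = list_archive(demo_zip)
    print(members)  # ['catalog.json']
```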

property url: str#

Get the url of the databrowser API.

Example#

Code

from freva_client import databrowser
db = databrowser()
print(db.url)

Results

http://localhost:7777/api/freva-nextgen/databrowser

classmethod userdata(action: Literal['add', 'delete'], userdata_items: List[str | Dataset] | None = None, metadata: Dict[str, str] | None = None, host: str | None = None, fail_on_error: bool = False) → None#

Add or delete user data in the databrowser system.

Manage user data in the databrowser system by adding new data or deleting existing data.

For the “add” action, the user can provide data items (file paths or xarray datasets) along with metadata (key-value pairs) to categorize and organize the data.

For the “delete” action, the user provides metadata as search criteria to identify and remove the existing data from the system.

Parameters#

action: Literal[“add”, “delete”]: The action to perform: “add” to add new data, or “delete” to remove existing data.
userdata_items: List[Union[str, xr.Dataset]], optional: A list of user file paths or xarray datasets to add to the databrowser (required for “add”).
metadata: Dict[str, str], optional: Key-value metadata pairs to categorize the data (for “add”) or search and identify data for deletion (for “delete”).
host: str, optional: Override the host name of the databrowser server. This is usually the URL where the freva web site can be found, such as www.freva.dkrz.de. By default no host name is given and the host name will be taken from the freva config file.
fail_on_error: bool, optional: Make the call fail if the connection to the databrowser could not be established.

Raises#

ValueError: If the operation fails or required parameters are missing for the specified action.
FileNotFoundError: If no user data is provided for the “add” action.

Example#

Adding user data:

Code

from freva_client import authenticate, databrowser
import xarray as xr
token_info = authenticate(username="janedoe")
filenames = (
    "../freva-rest/src/freva_rest/databrowser_api/mock/data/model/regional/cordex/output/EUR-11/"
    "GERICS/NCC-NorESM1-M/rcp85/r1i1p1/GERICS-REMO2015/v1/3hr/pr/v20181212/*.nc"
)
filename1 = (
    "../freva-rest/src/freva_rest/databrowser_api/mock/data/model/regional/cordex/output/EUR-11/"
    "CLMcom/MPI-M-MPI-ESM-LR/historical/r0i0p0/CLMcom-CCLM4-8-17/v1/fx/orog/v20140515/"
    "orog_EUR-11_MPI-M-MPI-ESM-LR_historical_r1i1p1_CLMcom-CCLM4-8-17_v1_fx.nc"
)
xarray_data = xr.open_dataset(filename1)
databrowser.userdata(
    action="add",
    userdata_items=[xarray_data, filenames],
    metadata={"project": "cmip5", "experiment": "myFavExp"}
)

Results

1 have been successfully added to the databrowser. 1 files were duplicates and 
not added.
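The example above passes a glob pattern (*.nc) as one of the user data items. Before adding data it can be useful to preview which files such a pattern matches on your machine. The following sketch does this with the standard library only. The helper preview_matches is hypothetical and independent of how freva_client resolves the pattern internally; the file names are stand-ins created in a temporary directory.

```python
import tempfile
from glob import glob
from pathlib import Path
from typing import Iterable, List


def preview_matches(patterns: Iterable[str]) -> List[str]:
    """Expand glob patterns and return the sorted, de-duplicated file paths."""
    found = set()
    for pattern in patterns:
        found.update(glob(pattern))
    return sorted(found)


# Stand-in files so the sketch runs anywhere; the names are made up.
with tempfile.TemporaryDirectory() as tmp_dir:
    for name in ("pr_a.nc", "pr_b.nc", "notes.txt"):
        (Path(tmp_dir) / name).touch()
    matches = preview_matches([str(Path(tmp_dir) / "*.nc")])
    print([Path(match).name for match in matches])  # ['pr_a.nc', 'pr_b.nc']
```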

Deleting user data:

Code

from freva_client import authenticate, databrowser
token_info = authenticate(username="janedoe")
databrowser.userdata(
    action="delete",
    metadata={"project": "cmip5", "experiment": "myFavExp"}
)

Results

User data deleted successfully