Frequently Asked Questions and best practices#
This section gives an overview of the most commonly asked questions and best practices when it comes to plugin development.
How can I make use of Freva from jupyter notebooks?#
The easiest way to use freva in a jupyter notebook is to log on to the HPC system where freva is installed and load the freva module. The name of the specific freva module should be communicated by your admin team. If the name of the freva module is freva, then simply load it with:
module load freva
After that you can install a new jupyter kernel via:
jupyter-kernel-install python --name freva --display-name FrevaKernel
After you have installed the kernel you can use freva in your jupyter notebook:
[1]:
import freva
# Get an overview of the available freva plugins
freva.get_tools_list()
[1]:
Animator | Animate data on lon/lat grids |
DummyPlugin | A dummy plugin |
DummyPluginFolders | A dummy plugin with outputdir folder |
[2]:
# Get the documentation of a plugin
freva.plugin_doc("animator")
[2]:
Option | Description |
---|---|
input_file | NetCDF input file(s); you can choose multiple files separated by a , or use a glob pattern for multiple files. Choose this option only if you don't want Freva to find files by search facets (default: ) |
variable | Variable name (only applicable if you didn't choose an input file) (default: ) |
project | Project name (only applicable if you didn't choose an input file) (default: ) |
product | Product name (only applicable if you didn't choose an input file) (default: ) |
experiment | Experiment name (only applicable if you didn't choose an input file) (default: ) |
institute | Institute name (only applicable if you didn't choose an input file) (default: ) |
model | Model name (only applicable if you didn't choose an input file) (default: ) |
time_frequency | Time frequency name (only applicable if you didn't choose an input file) (default: ) |
ensemble | Ensemble name (only applicable if you didn't choose an input file) (default: ) |
start | Define the first time step to be plotted, leave blank if taken from data (default: ) |
end | Define the last time step to be plotted, leave blank if taken from data (default: ) |
time_mean | Select a time interval if time averaging/min/max/sum should be applied along the time axis. This can be D for daily, M for monthly, 6H for 6 hours etc. Leave blank if the time axis should not be resampled (default) or set to all if you want to collapse the time axis. (default: ) |
time_method | If resampling of the time axis is chosen (default: no) set the method: mean, max or min. Note: this only has an effect if the time_mean parameter above is set. (default: mean) |
lonlatbox | Set the extent of a rectangular lonlatbox (left_lon, right_lon, lower_lat, upper_lat) (default: ) |
output_unit | Set the output unit of the variable - leave blank for no conversion. This can be useful if the unit of the input files should be converted, for example for precipitation. Note: many conversions are supported via the `pint` conversion library. (default: ) |
vmin | Set the minimum plotting range (leave blank to calculate the 1st decile from the data) (default: ) |
vmax | Set the maximum plotting range (leave blank to calculate the 9th decile) (default: ) |
cmap | Set the colormap; more information on colormaps is available on the matplotlib website. (default: RdYlBu_r) |
linecolor | Color of the coast lines in the map (default: k) |
projection | Set the global map projection. Note: this should be the name of the cartopy projection method (e.g. PlateCarree for a cylindrical projection). Please refer to the cartopy website for details. (default: PlateCarree) |
proj_centre | Set the center longitude of the global map projection. (default: 50) |
pic_size | Set the size of the picture (in pixels) (default: 1360,900) |
plot_title | Set the plot title (default: ) |
cbar_label | Overwrite the default colorbar label with this value (default: ) |
suffix | File type of the animation (default: mp4) |
fps | Set the frames per second of the output animation. (default: 5) |
extra_scheduler_options | Set additional options for the job submission to the workload manager (, separated). Note: batch mode and web only. (default: --qos=test, --array=20) |
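With the options above, a plugin run from a notebook could look like the following sketch. It assumes your freva version exposes freva.run_plugin, and the facet values are purely illustrative:
import freva
# Illustrative values - pick facets that match data available on your system
freva.run_plugin("animator", project="observations", variable="pr", cmap="Blues")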
How can I use freva in my own python environment?#
If you don’t want to create a new jupyter kernel that points to the python environment where the freva system is installed, you can also install freva into your own python environment. You can either install freva via pip:
python3 -m pip install freva
or conda:
conda install -c conda-forge freva
Afterwards you can use freva in your own environment. The only difference to the approach above of using the freva python environment is that you have to tell freva to use the correct configuration.
import freva
# In order to use freva we have to tell it where to get the configuration
freva.config(config_file="/path/to/the/freva/config/file.conf")
# Talk to your admins to get the location of the config file.
Please ask your admins where the central freva configuration is located.
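Once the configuration has been loaded, every subsequent freva call in the session uses it. A minimal sketch, where the config path is a placeholder and the search facet is illustrative:
import freva
# Point freva to the central configuration (placeholder path)
freva.config(config_file="/path/to/the/freva/config/file.conf")
# Afterwards freva calls work as usual, e.g. a simple databrowser query
files = list(freva.databrowser(project="observations"))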
How can I add my own plugins?#
If you are working with python and want to load your own plugin definitions, you can also use freva.config to do so. Assuming your plugin definition is saved in ~/freva/myplugin/the_plugin.py, you can load it via:
import freva
freva.config(plugin_path="~/freva/myplugin,the_plugin")
You can load multiple plugins by using a list.
import freva
freva.config(plugin_path=["~/freva/myplugin,the_plugin_1", "~/freva/myplugin,the_plugin_2"])
Loading plugins can of course be combined with loading a different freva config:
freva.config(config_file="/path/to/the/freva/config/file.conf",
             plugin_path="~/freva/myplugin,the_plugin")
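After loading your plugin definitions you can check that they were picked up, for example by listing the available tools again. A small sketch using the illustrative plugin path from above:
import freva
freva.config(plugin_path="~/freva/myplugin,the_plugin")
# Your own plugin should now show up next to the centrally installed ones
freva.get_tools_list()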
How can I use Freva in my analysis workflows?#
You can use Freva without creating or applying data analysis plugins. One example would be using the databrowser command in your data analysis workflow:
# Use the databrowser search and pipe the output to ncremap
freva databrowser project=observations experiment=cmorph time_frequency=30min | ncremap -m map.nc -O drc_rgr
Below you can find a python example that you could use in a notebook:
[3]:
import freva
import xarray as xr
# Open the data with xarray
dset = xr.open_mfdataset(freva.databrowser(project="obs*", time_frequency='30min'), combine='by_coords')['pr']
dset
[3]:
<xarray.DataArray 'pr' (time: 48, lat: 412, lon: 687)> Size: 54MB
dask.array<concatenate, shape=(48, 412, 687), dtype=float32, chunksize=(1, 1, 687), chunktype=numpy.ndarray>
Coordinates:
  * time     (time) datetime64[ns] 384B 2016-09-02 ... 2016-09-02T23:30:00
  * lon      (lon) float32 3kB 255.0 255.1 255.2 255.3 ... 304.8 304.9 305.0
  * lat      (lat) float32 2kB 15.06 15.14 15.21 15.28 ... 44.83 44.9 44.97
Attributes:
    standard_name:  precipitation_flux
    long_name:      Precipitation
    units:          kg m-2 s-1
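The search results can feed straight into further analysis. As a small sketch continuing from the dataset opened above, the half-hourly precipitation could for example be resampled to daily means with xarray:
# Resample the half-hourly data to daily means and trigger the dask computation
daily_mean = dset.resample(time="1D").mean()
daily_mean.compute()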
Best practice: Using the Freva module in a plugin#
As above, you can use the Freva python module within the wrapper API code of your plugin:
[4]:
import json

from evaluation_system.api import plugin, parameters
import freva


class MyPlugin(plugin.PluginAbstract):
    """An analysis plugin that uses the databrowser search."""

    __parameters__ = parameters.ParameterDictionary(
        parameters.SolrField(name="project", facet="project", help="Set the project name"),
        parameters.SolrField(name="variable", facet="variable", help="Set the variable name"),
    )

    def run_tool(self, config_dict):
        """Main plugin method that makes Freva calls."""
        # Search for files matching the user's plugin configuration
        files = list(freva.databrowser(**config_dict))
        # Save the file list to a json file
        with open("/tmp/some_json_file.json", "w") as f:
            json.dump(files, f)
        self.call("external_command /tmp/some_json_file.json")
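The external command then only has to parse that single json file. A minimal sketch of the consuming side, using the illustrative file path from above and assuming the path is passed as the first argument:
import json
import sys

# Read the file list written by the plugin wrapper
with open(sys.argv[1]) as f:
    input_files = json.load(f)
for path in input_files:
    print(path)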
Best practice: Making calls to external commands with complex command line parameters#
If you have a plugin that makes calls to a command line interface (like a shell script) you should avoid long command line argument calls like this:
[5]:
import json
from evaluation_system.api import plugin, parameters
import freva
class MyPlugin(plugin.PluginAbstract):
    def run_tool(self, config_dict):
        result = self.call(
            '%s/main.py %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s'
            % (self.class_basedir, config_dict['inputdir'], config_dict['project'],
               config_dict['product'], config_dict['institute'], config_dict['model'],
               config_dict['experiment'], config_dict['time_frequency'],
               config_dict['variable'], config_dict['ensemble'], config_dict['cmor_table'],
               config_dict['outputdir'], config_dict['cache'], config_dict['cacheclear'],
               config_dict['seldate'], config_dict['latlon'], config_dict['area_weight'],
               config_dict['percentile_threshold'], config_dict['threshold'],
               config_dict['persistence'], config_dict['sel_extr'], config_dict['dryrun'])
        )
Such calls are very hard to read and should be avoided. Instead, use Freva as much as possible within the wrapper code and save the relevant results into a json file that is passed as a single argument to the tool. Using the above example, the code could be simplified as follows:
[6]:
import json
from tempfile import NamedTemporaryFile
from evaluation_system.api import plugin, parameters
import freva
class MyPlugin(plugin.PluginAbstract):
    def run_tool(self, config_dict):
        # Resolve the search facets to a list of files via the databrowser
        config_dict['input_files'] = list(
            freva.databrowser(
                product=config_dict.pop('product'), project=config_dict.pop('project'),
                institute=config_dict.pop('institute'), model=config_dict.pop('model'),
                experiment=config_dict.pop('experiment'), variable=config_dict.pop('variable'),
                ensemble=config_dict.pop('ensemble'), cmor_table=config_dict.pop('cmor_table')
            )
        )
        # Write the remaining configuration to a temporary json file and pass
        # that single file to the tool while the file still exists
        with NamedTemporaryFile(suffix='.json') as tf:
            with open(tf.name, 'w') as f:
                json.dump(config_dict, f)
            self.call(f"{self.class_basedir}/main.py {tf.name}")
Here we use json because most scripting and programming languages have json parsing functionality which can be conveniently used.
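For completeness, here is a sketch of how the called tool (the hypothetical main.py from the example above) could read the configuration back in, assuming the json path is passed as its only argument:
import json
import sys


def main(config_file: str) -> None:
    # Load the full plugin configuration, including the resolved input_files list
    with open(config_file) as f:
        config = json.load(f)
    print(config["input_files"])


if __name__ == "__main__":
    main(sys.argv[1])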