Home · CommonDataModel.jl

Data types

To implement a new dataset type based on CommonDataModel.jl, one has to create two types derived from:

AbstractVariable: a variable with named dimensions and metadata
AbstractDataset: a collection of variables with named dimensions, metadata and sub-groups. The sub-groups are also AbstractDataset.

CommonDataModel.jl also provides a type CFVariable which wraps a type derived from AbstractVariable and applies the scaling described in cfvariable.

Overview of methods:

	get names	get values	set value	property
Dimensions	`dimnames`	`dim`	`defDim`	`dim`
Attributes	`attribnames`	`attrib`	`defAttrib`	`attrib`
Variables	`varnames`	`variable`	`defVar`	-
Groups	`groupnames`	`group`	`defGroup`	`group`

For read-only datasets, the methods in the "set value" column do not need to be implemented. Attributes can also be deleted with the delAttrib function.
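As a rough illustration of the table above, the following sketch calls the function forms directly on a NetCDF file created with NCDatasets. It assumes that both NCDatasets and CommonDataModel are installed and that the installed NCDatasets version implements these CommonDataModel functions; the file, dimension, variable and group names are made up for the example.

```julia
using NCDatasets
import CommonDataModel as CDM

# temporary file name, only for this illustration
fname = tempname() * ".nc"
ds = NCDataset(fname, "c")   # "c" creates a new, writable file

# "set value" column
CDM.defDim(ds, "lon", 3)
CDM.defAttrib(ds, "title", "demo")
CDM.defVar(ds, "temp", Float64, ("lon",))
CDM.defGroup(ds, "forecast")

# "get names" column
dims = collect(CDM.dimnames(ds))
atts = collect(CDM.attribnames(ds))
vars = collect(keys(ds))            # variable names of the dataset
grps = collect(CDM.groupnames(ds))

# "get values" column
lon_len = CDM.dim(ds, "lon")
title = CDM.attrib(ds, "title")

# attributes can also be removed again
CDM.delAttrib(ds, "title")
atts_after = collect(CDM.attribnames(ds))

close(ds)
```

The qualified names (CDM.defDim and so on) are used here only to make clear which functions come from CommonDataModel; NCDatasets also exports several of them.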

Every struct deriving from AbstractDataset automatically has the special properties dim, attrib and group, which act like dictionaries (unless a field with this name already exists). For attrib, calls to keys, getindex, setindex! and delete! are dispatched to attribnames, attrib, defAttrib and delAttrib respectively (and likewise for the other properties). For example:

using NCDatasets
# "a" opens an existing file for modification (the default mode "r" is read-only)
ds = NCDataset("file.nc", "a")
# setindex!(ds.attrib,...) here automatically calls defAttrib(ds,...)
ds.attrib["title"] = "my amazing results";
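A slightly longer sketch of the same dictionary-like behaviour, written against a freshly created temporary file so that it does not depend on an existing file.nc. The file name and attribute values are illustrative only.

```julia
using NCDatasets

fname = tempname() * ".nc"
ds = NCDataset(fname, "c")

# setindex! on ds.attrib defines the attribute
ds.attrib["title"] = "my amazing results"

# getindex reads it back, keys lists all attribute names
title = ds.attrib["title"]
names_before = collect(keys(ds.attrib))

# delete! removes the attribute again
delete!(ds.attrib, "title")
has_title = haskey(ds.attrib, "title")

# the dim property behaves the same way for dimensions
ds.dim["lon"] = 3
lon_len = ds.dim["lon"]

close(ds)
```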

Variables can be accessed by directly indexing the AbstractDataset.

Every struct deriving from AbstractVariable has the properties dim and attrib.
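A minimal sketch of indexing a dataset by variable name and reading a variable's metadata, again using NCDatasets as the concrete implementation. The variable name, units and values are invented for the example.

```julia
using NCDatasets

fname = tempname() * ".nc"
ds = NCDataset(fname, "c")

defDim(ds, "lon", 3)
v = defVar(ds, "temp", Float64, ("lon",), attrib = ["units" => "degree_Celsius"])
v[:] = [10.0, 11.5, 13.0]

# indexing the dataset with a variable name returns the variable
temp = ds["temp"]
data = temp[:]                    # load all values into memory
units = temp.attrib["units"]      # attributes of the variable
dnames = collect(dimnames(temp))  # names of the variable's dimensions

close(ds)
```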

Current functionalities of CommonDataModel include:

virtually concatenating files along a given dimension
creating a virtual subset (view) by indices or by values of coordinate variables (select, @select)
grouping, mapping and reducing a variable (groupby, @groupby, rolling)
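The first two items can be sketched with NCDatasets as follows. This is only an illustration under the assumption that the installed NCDatasets version supports multi-file datasets through the aggdim keyword and view on variables; the two small files are created on the fly with made-up values, and grouping is not covered here.

```julia
using NCDatasets

# create two small files sharing the same layout
fnames = [tempname() * ".nc" for _ in 1:2]
for (k, fname) in enumerate(fnames)
    NCDataset(fname, "c") do ds
        defDim(ds, "time", 2)
        v = defVar(ds, "temp", Float64, ("time",))
        v[:] = [10.0 * k + 1, 10.0 * k + 2]
    end
end

# virtual concatenation of both files along the "time" dimension
mfds = NCDataset(fnames; aggdim = "time")
all_temp = mfds["temp"][:]
close(mfds)

# virtual subset of a variable by indices; data is read only when indexed
ds1 = NCDataset(fnames[1])
sub = view(ds1["temp"], 2:2)
sub_size = size(sub)
sub_data = sub[:]
close(ds1)
```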

API

CommonDataModel.AbstractDataset — Type

AbstractDataset is a collection of multidimensional variables (for example a NetCDF or GRIB file)

A data set ds of a type derived from AbstractDataset should implement at minimum:

Base.keys(ds): return a list of variable names as strings
variable(ds,varname::String): return an array-like data structure (derived from AbstractVariable) of the variable corresponding to varname. This array-like data structure should follow the CF semantics.
dimnames(ds): should be an iterable with all dimension names in the data set ds
dim(ds,name): dimension value corresponding to name
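The minimum interface listed above might be sketched as follows for a toy in-memory dataset. ToyDataset and ToyVariable, together with their field names, are hypothetical types written for this illustration and are not part of CommonDataModel.jl. The variable type assumes that AbstractVariable{T,N} behaves like an AbstractArray{T,N} and that name, dimnames, dataset, size and getindex are the methods a variable needs; attributes, groups and CF handling are left out.

```julia
import CommonDataModel as CDM

# hypothetical read-only variable backed by a plain Array
struct ToyVariable{T,N,TDS} <: CDM.AbstractVariable{T,N}
    parent::TDS
    varname::String
    dnames::NTuple{N,String}
    data::Array{T,N}
end

# hypothetical dataset; the fields avoid the names dim, attrib and group
# so that the automatic properties stay available
struct ToyDataset <: CDM.AbstractDataset
    dims::Dict{String,Int}
    arrays::Dict{String,Array}
    vardims::Dict{String,Tuple}
end

# dataset interface
CDM.varnames(ds::ToyDataset) = sort!(collect(keys(ds.arrays)))
Base.keys(ds::ToyDataset) = CDM.varnames(ds)
CDM.dimnames(ds::ToyDataset) = sort!(collect(keys(ds.dims)))
CDM.dim(ds::ToyDataset, name) = ds.dims[String(name)]
CDM.variable(ds::ToyDataset, varname::String) =
    ToyVariable(ds, varname, ds.vardims[varname], ds.arrays[varname])

# variable interface
CDM.name(v::ToyVariable) = v.varname
CDM.dimnames(v::ToyVariable) = v.dnames
CDM.dataset(v::ToyVariable) = v.parent
Base.size(v::ToyVariable) = size(v.data)
Base.getindex(v::ToyVariable, indices...) = v.data[indices...]

# usage with made-up data
ds = ToyDataset(
    Dict("x" => 3),
    Dict{String,Array}("a" => [10.0, 20.0, 30.0]),
    Dict{String,Tuple}("a" => ("x",)),
)
v = CDM.variable(ds, "a")
second = v[2]
x_len = CDM.dim(ds, "x")
```

Building the variable on demand in variable(ds, varname) keeps the two structs free of circular construction; a real implementation would more likely cache variables and add attribute support.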

Optionally a data set can have attributes and groups:

attribnames(ds): should be an iterable with all attribute names
attrib(ds,name): attribute value corresponding to name
groupnames(ds): should be an iterable with all group names
group(ds,name): group corresponding to the name

For a writable dataset, one should also implement:

defDim: define a dimension
defAttrib: define an attribute
defVar: define a variable
defGroup: define a group