Data types
In order to implement a new dataset based CommonDataModel.jl one has to create two types derived from:
AbstractVariable: a variable with named dimension and metadataAbstractDataset: a collection of variable with named dimension, metadata and sub-groups. The sub-groups are alsoAbstractDataset.
CommonDataModel.jl also provides a type CFVariable which wraps a type derived from AbstractVariable and applies the scaling described in cfvariable.
Overview of methods:
| get names | get values | set value | property | |
|---|---|---|---|---|
| Dimensions | dimnames | dim | defDim | dim | 
| Attributes | attribnames | attrib | defAttrib | attrib | 
| Variables | varnames | variable | defVar | - | 
| Groups | groupnames | group | defGroup | group | 
For read-only datasets, the methods in "set value" column are not to be implemented. Attributes can also be delete with the delAttrib functions.
Every struct deriving from AbstractDataset have automaticaly the special properties dim, attrib and group which act like dictionaries (unless a field with this name already exists). For attrib, calls to keys, getindex and setindex!, delete! are dispated to attribnames, attrib,defAttrib, and delAttrib respectively (and likewise for other properties). For example:
using NCDatasets
ds = NCDataset("file.nc")
# setindex!(ds.attrib,...) here automatically calls defAttrib(ds,...)
ds.attrib["title"] = "my amazing results";Variables can be accessed by directly indexing the AbstractDataset.
Every struct deriving from AbstractVariable has the properties dim, and attrib.
Current functionalities of CommonDataModel include:
- virtually concatenating files along a given dimension
 - create a virtual subset ((
view)) by indices or by values of coordinate variables (select,@select) - group, map and reduce a variable (
groupby,@groupby,rolling) 
API
CommonDataModel.AbstractDataset — TypeAbstractDataset is a collection of multidimensional variables (for example a NetCDF or GRIB file)
A data set ds of a type derived from AbstractDataset should implemented at minimum:
Base.key(ds): return a list of variable names as stringsvariable(ds,varname::String): return an array-like data structure (derived fromAbstractVariable) of the variables corresponding tovarname. This array-like data structure should follow the CF semantics.dimnames(ds): should be an iterable with all dimension names in the data setdsdim(ds,name): dimension value corresponding to name
Optionally a data set can have attributes and groups:
attribnames(ds): should be an iterable with all attribute namesattrib(ds,name): attribute value corresponding to namegroupnames(ds): should be an iterable with all group namesgroup(ds,name): group corresponding to the name
For a writable dataset, one should also implement:
defDim: define a dimensiondefAttrib: define a attributedefVar: define a variabledefGroup: define a group
CommonDataModel.AbstractVariable — TypeAbstractVariable{T,N} is a subclass of AbstractArray{T, N}. A variable v of a type derived from AbstractVariable should implement:
name(v): should be the name of variable within the data setdimnames(v): should be a iterable data structure with all dimension names of the variablevdataset(v): the parent dataset containingvBase.size(v): the size of the variableBase.getindex(v,indices...): get the data ofvat the provided indices
Optionally a variable can have attributes:
attribnames(v): should be an iterable with all attribute namesattrib(v,name): attribute value corresponding to name
For a writable dataset, one should also implement:
defAttrib: define a attributeBase.setindex!(v,data,indices...): set the data invat the provided indices
CommonDataModel.Attributes — TypeA collection of attributes with a Dict-like interface dispatching to attribnames, attrib, defAttrib for keys, getindex and setindex! respectively.
CommonDataModel.CFVariable — TypeVariable (with applied transformations following the CF convention) attrib can have different attributes as the parent variables (used in GRIBDatasets to map from grib attributes to CF attributes)
CommonDataModel.Dimensions — TypeA collection of dimensions with a Dict-like interface dispatching to dimnames, dim, defDim for keys, getindex and setindex! respectively.
CommonDataModel.Groups — TypeA collection of groups with a Dict-like interface dispatching to groupnames and group for keys and getindex respectively.
Base.collect — Methodcollect always returns an array. Even if the result of the indexing is a scalar, it is wrapped into a zero-dimensional array.
Base.delete! — MethodBase.delete!(a::Attributes, name)Delete the attribute name from the attribute list a.
Base.filter — Methoddata = CommonDataModel.filter(ncv, indices...; accepted_status_flags = nothing)Load and filter observations by replacing all variables without an acepted status flag to missing. It is used the attribute ancillary_variables to identify the status flag.
# da["data"] is 2D matrix
good_data = NCDatasets.filter(ds["data"],:,:, accepted_status_flags = ["good_data","probably_good_data"])Base.getindex — Methodv = getindex(ds::AbstractDataset, varname::SymbolOrString)Return the variable varname in the dataset ds as a CFVariable. The following CF convention are honored when the variable is indexed:
_FillValueormissing_value(which can be a list) will be returned asmissing.scale_factorandadd_offsetare applied (output =scale_factor*data_in_file+add_offset)- time variables (recognized by the units attribute and possibly the calendar attribute) are returned usually as 
DateTimeobject. Note thatCFTime.DateTimeAllLeap,CFTime.DateTimeNoLeapandCF.TimeDateTime360Daycannot be converted to the proleptic gregorian calendar used in julia and are returned as such. (SeeCFTime.jlfor more information about those date types.) If a calendar is defined but not among the ones specified in the CF convention, then the data in the file is not converted into a date structure. 
A call getindex(ds, varname) is usually written as ds[varname].
If variable represents a cell boundary, the attributes calendar and units of the related variables are used, if they are not specified. For example:
dimensions:
  time = UNLIMITED; // (5 currently)
  nv = 2;
variables:
  double time(time);
    time:long_name = "time";
    time:units = "hours since 1998-04-019 06:00:00";
    time:bounds = "time_bnds";
  double time_bnds(time,nv);In this case, the variable time_bnds uses the units and calendar of time because both variables are related thought the bounds attribute following the CF conventions.
See also cfvariable(ds, varname).
Base.getindex — Methodgetindex(a::Attributes,name::SymbolOrString)Return the value of the attribute called name from the attribute list a. Generally the attributes are loaded by indexing, for example:
using NCDatasets
ds = NCDataset("file.nc")
title = ds.attrib["title"]Base.getindex — Methodgroup = getindex(g::Groups,groupname::AbstractString)Return the NetCDF group with the name groupname from the parent group g.
For example:
ds = NCDataset("results.nc", "r");
forecast_group = ds.group["forecast"]
forecast_temp = forecast_group["temperature"]Base.getindex — Methodvar = getindex(ds::Union{AbstractDataset,AbstractVariable},cfname::CFStdName)Return the NetCDF variable var with the standard name cfname from a dataset. If the first argument is a variable, then the search is limited to all variables with the same dimension names.
Base.keys — MethodBase.keys(a::Attributes)Return a list of the names of all attributes.
Base.keys — Methodkeys(d::Dimensions)Return a list of all dimension names in NCDataset ds.
Examples
ds = NCDataset("results.nc", "r");
dimnames = keys(ds.dim)Base.keys — Methodnames = keys(g::Groups)Return the names of all subgroubs of the group g.
Base.merge — Methodmfds = merge(ds1, ds2...)Virtually merge all datasets (ds1,...) by concatenating the list of variables. Every variable which happend to be repeated in several datasets is should to be constant in all datasets.
Base.reduce — Methodgr = reduce(f,gv::GroupedVariable)Reduce the grouped variable gv along grouped dimension using the function f. The function f will be called as f(x,dims=d) where x array (an element of gv) and d is an integer of the dimension overwhich one need to reduce x.
Base.setindex! — MethodBase.setindex!(a::Attributes,data,name::SymbolOrString)Set the attribute called name to the value data in the attribute list a. data can be a vector or a scalar. A scalar is handeld as a vector with one element in the NetCDF data model.
Generally the attributes are defined by indexing, for example:
ds = NCDataset("file.nc","c")
ds.attrib["title"] = "my title"
close(ds)Base.setindex! — Methodsetindex!(d::Dimensions,len,name::AbstractString)Defines the dimension called name to the length len, for example:
ds = NCDataset("file.nc","c")
ds.dim["longitude"] = 100If len is the special value Inf, then the dimension is considered as unlimited, i.e. it will grow as data is added to the NetCDF file.
Base.size — Methodsz = size(var::CFVariable)Return a tuple of integers with the size of the variable var.
Base.view — Methodsv = view(v::CommonDataModel.AbstractVariable,indices...)Returns a view of the variable v where indices are only lazily applied. No data is actually copied or loaded. Modifications to a view sv, also modifies the underlying array v. All attributes of v are also present in sv.
Examples
using NCDatasets
fname = tempname()
data = zeros(Int,10,11)
ds = NCDataset(fname,"c")
ncdata = defVar(ds,"temp",data,("lon","lat"))
ncdata_view = view(ncdata,2:3,2:4)
size(ncdata_view)
# output (2,3)
ncdata_view[1,1] = 1
ncdata[2,2]
# outputs 1 as ncdata is also modified
close(ds)Base.write — Methodwrite(dest::AbstractDataset, src::AbstractDataset; include = keys(src), exclude = [])Write the variables of src dataset into an empty dest dataset (which must be opened in mode "a" or "c"). The keywords include and exclude configure which variable of src should be included (by default all), or which should be excluded (by default none).
If the first argument is a file name, then the dataset is open in create mode ("c").
This function is useful when you want to save the dataset from a multi-file dataset.
To save a subset, one can use the view function view to virtually slice a dataset:
Example
NCDataset(fname_src) do ds
    write(fname_slice,view(ds, lon = 2:3))
endAll variables in the source file fname_src with a dimension lon will be sliced along the indices 2:3 for the lon dimension. All attributes (and variables without a dimension lon) will be copied over unmodified.
CommonDataModel.ancillaryvariables — Methodncvar = CommonDataModel.ancillaryvariables(ncv::CFVariable,modifier)Return the first ancillary variables from the NetCDF (or other format) variable ncv with the standard name modifier modifier. It can be used for example to access related variable like status flags.
CommonDataModel.attrib — MethodCommonDatamodel.attrib(ds::Union{AbstractDataset,AbstractVariable},attribname::SymbolOrString)Return the value of the attribute attribname in the data set ds.
CommonDataModel.attribnames — MethodCommonDatamodel.attribnames(ds::Union{AbstractDataset,AbstractVariable})Return an iterable of all attribute names in ds.
CommonDataModel.bounds — Methodb = bounds(ncvar::NCDatasets.CFVariable)Return the CFVariable corresponding to the bounds attribute of the variable ncvar. The time units and calendar from the ncvar are used but not the attributes controling the packing of data scale_factor, add_offset and _FillValue.
CommonDataModel.cfvariable — Methodv = cfvariable(ds::NCDataset,varname::SymbolOrString; <attrib> = <value>)Return the variable varname in the dataset ds as a NCDataset.CFVariable. The keyword argument <attrib> are the attributes (fillvalue, missing_value, scale_factor, add_offset, units and calendar) relevant to the CF conventions. By specifing the value of these attributes, the one can override the value specified in the data set. If the attribute is set to nothing, then the attribute is not loaded and the corresponding transformation is ignored. This function is similar to ds[varname] with the additional flexibility that some variable attributes can be overridden.
Example:
NCDataset("foo.nc","c") do ds
  defVar(ds,"data",[10., 11., 12., 13.], ("time",), attrib = Dict(
      "add_offset" => 10.,
      "scale_factor" => 0.2))
end
# The stored (packed) valued are [0., 5., 10., 15.]
# since 0.2 .* [0., 5., 10., 15.] .+ 10 is [10., 11., 12., 13.]
ds = NCDataset("foo.nc");
@show ds["data"].var[:]
# returns [0., 5., 10., 15.]
@show cfvariable(ds,"data")[:]
# returns [10., 11., 12., 13.]
# neither add_offset nor scale_factor are applied
@show cfvariable(ds,"data", add_offset = nothing, scale_factor = nothing)[:]
# returns [0, 5, 10, 15]
# add_offset is applied but not scale_factor
@show cfvariable(ds,"data", scale_factor = nothing)[:]
# returns [10, 15, 20, 25]
# 0 is declared as the fill value (add_offset and scale_factor are applied as usual)
@show cfvariable(ds,"data", fillvalue = 0)[:]
# returns [missing, 11., 12., 13.]
# Use the time units: days since 2000-01-01
@show cfvariable(ds,"data", units = "days since 2000-01-01")[:]
# returns [DateTime(2000,1,11), DateTime(2000,1,12), DateTime(2000,1,13), DateTime(2000,1,14)]
close(ds)CommonDataModel.chunking — Methodstorage,chunksizes = chunking(v::MFVariable)
storage,chunksizes = chunking(v::MFCFVariable)Return the storage type (:contiguous or :chunked) and the chunk sizes of the varable v corresponding to the first file. If the first file in the collection is chunked then this storage attributes are returned. If not the first file is not contiguous, then multi-file variable is still reported as chunked with chunk size equal to the size of the first variable.
CommonDataModel.coord — Methodcv = coord(v::Union{CFVariable,Variable},standard_name)Find the coordinate of the variable v by the standard name standard_name or some standardized heuristics based on units. If the heuristics fail to detect the coordinate, consider to modify the file to add the standard_name attribute. All dimensions of the coordinate must also be dimensions of the variable v.
Example
using NCDatasets
ds = NCDataset("file.nc")
ncv = ds["SST"]
lon = coord(ncv,"longitude")[:]
lat = coord(ncv,"latitude")[:]
v = ncv[:]
close(ds)CommonDataModel.dataset — Methodds = CommonDataModel.dataset(v::AbstractVariable)Return the data set ds to which a the variable v belongs to.
CommonDataModel.defAttrib — MethodCommonDatamodel.defAttrib(ds::Union{AbstractDataset,AbstractVariable},name::SymbolOrString,data)Create an attribute with the name attrib in the data set or variable ds.
CommonDataModel.defDim — MethodCommonDatamodel.defDim(ds::AbstractDataset,name::SymbolOrString,len)Create dimension with the name name in the data set ds with the length len. len can be Inf for unlimited dimensions.
CommonDataModel.defGroup — Methodgroup = CommonDatamodel.defGroup(ds::AbstractDataset,name::SymbolOrString)Create an empty sub-group with the name name in the data set ds. The group is a sub-type of AbstractDataset.
CommonDataModel.defVar — Methodv = CommonDataModel.defVar(ds::AbstractDataset,src::AbstractVariable)
v = CommonDataModel.defVar(ds::AbstractDataset,name::SymbolOrString,src::AbstractVariable)Defines and return the variable in the data set ds copied from the variable src. The dimension name, attributes and data are copied from src as well as the variable name (unless provide by name).
CommonDataModel.delAttrib — MethodCommonDatamodel.delAttrib(ds::Union{AbstractDataset,AbstractVariable},name::SymbolOrString,data)Deletes an attribute with the name attrib in the data set or variable ds.
CommonDataModel.dim — MethodCommonDatamodel.dim(ds::AbstractDataset,dimname::SymbolOrString)Return the length of the dimension dimname in the data set ds.
CommonDataModel.dimnames — MethodCommonDataModel.dimnames(v::AbstractVariable)Return an iterable of the dimension names of the variable v.
CommonDataModel.dimnames — MethodCommonDatamodel.dimnames(ds::AbstractDataset)Return an iterable of all dimension names in ds. This information can also be accessed using the property ds.dim:
Examples
ds = NCDataset("results.nc", "r");
dimnames = keys(ds.dim)CommonDataModel.dimnames — Methoddimnames(v::CFVariable)Return a tuple of strings with the dimension names of the variable v.
CommonDataModel.dims — MethodCommonDatamodel.dims(ds::Union{AbstractDataset,AbstractVariable})Return a dict-like of all dimensions and their corresponding length defined in the the data set ds (or variable).
CommonDataModel.fillvalue — Methodfillvalue(::Type{Int8})
fillvalue(::Type{UInt8})
fillvalue(::Type{Int16})
fillvalue(::Type{UInt16})
fillvalue(::Type{Int32})
fillvalue(::Type{UInt32})
fillvalue(::Type{Int64})
fillvalue(::Type{UInt64})
fillvalue(::Type{Float32})
fillvalue(::Type{Float64})
fillvalue(::Type{Char})
fillvalue(::Type{String})Default fill-value for the given type from NetCDF.
CommonDataModel.find_gv — MethodA = find_gv(T,As)returns the first type T among the arguments.
CommonDataModel.group — MethodCommonDatamodel.group(ds::AbstractDataset,groupname::SymbolOrString)Return the sub-group data set with the name groupname.
CommonDataModel.groupby — Methodgv = CommonDataModel.groupby(v::AbstractVariable,:coordname => group_fun)
gv = CommonDataModel.groupby(v::AbstractVariable,"coordname" => group_fun)Create a grouped variable gv whose elements composed by all elements in v whose corresponding coordinate variable (with the name coordname) map to the same value once the group function group_fun is applied to the coordinate.
The grouped variable gv and be reduced using the functions sum mean, median, var or std, for example gr = mean(gv). The result gr is a lazy structure representing the outcome of these operations performed over the grouped dimension. Only when the result gr is indexed the actually values are computed.
Broadcasting for gv and gr is overloaded. Broadcasting over all elements of gv means that a mapping function is to be applied to all elements of gv before a possible the reduction. Broadcasting over gr, for example v .- gr mean that gr is broadcasted over the full size of v according to the grouping function.
Example:
using NCDatasets, Dates
using CommonDataModel: @groupby, groupby
# create same test data
time = DateTime(2000,1,1):Day(1):DateTime(2009,12,31);  # 10 years
data = rand(Float32.(-9:99),360,180,length(time));
fname = "test_file.nc"
ds = NCDataset(fname,"c");
defVar(ds,"time",time,("time",));
defVar(ds,"data",data,("lon","lat","time"));
# group over month
gv = @groupby(ds["data"],Dates.Month(time))
# or
# gv = groupby(ds["data"],:time => Dates.Month)
length(gv)
# output 12 as they are all 12 months in this dataset
size(gv[1])
# 360 x 180 x 310 array with all time slices corresponding to the 1st month
# the variable `gv` is equivalent to the following operation
time_month = Dates.Month.(ds[:time][:])
gv2 = [ds[:data][:,:,findall(time_month .== m)] for m in sort(unique(time_month))];
# compute basic statistics
using Statistics
monthly_mean = mean(gv);
size(monthly_mean)
# 360 x 180 x 12 array with the monthly mean
# get a regular julia array
monthly_mean_array = monthly_mean[:,:,:];
typeof(monthly_mean_array)
# Array{Float32, 3}
# substact from data the corresponding monthly mean
monthly_anomalies = data .- mean(gv);
close(ds)CommonDataModel.groupnames — MethodCommonDatamodel.groupnames(ds::AbstractDataset)All the subgroup names of the data set ds. For a data set containing only a single group, this will be an empty vector of String.
CommonDataModel.groups — MethodCommonDatamodel.groups(ds::AbstractDataset)Return all sub-group data as a dict-like object.
CommonDataModel.load! — MethodCommonDataModel.load!(ncvar::CFVariable, data, buffer, indices)Loads a NetCDF (or other format) variables ncvar in-place and puts the result in data (an array of eltype(ncvar)) along the specified indices. buffer is a temporary  array of the same size as data but the type should be eltype(parent(ncv)), i.e. the corresponding type in the files (before applying scale_factor, add_offset and masking fill values). Scaling and masking will be applied to the array data.
data and buffer can be the same array if eltype(ncvar) == eltype(ncvar.var).
Example:
# create some test array
Dataset("file.nc","c") do ds
    defDim(ds,"time",3)
    ncvar = defVar(ds,"vgos",Int16,("time",),attrib = ["scale_factor" => 0.1])
    ncvar[:] = [1.1, 1.2, 1.3]
    # store 11, 12 and 13 as scale_factor is 0.1
end
ds = Dataset("file.nc")
ncv = ds["vgos"];
# data and buffer must have the right shape and type
data = zeros(eltype(ncv),size(ncv)); # here Vector{Float64}
buffer = zeros(eltype(parent(ncv)),size(ncv)); # here Vector{Int16}
NCDatasets.load!(ncv,data,buffer,:,:,:)
close(ds)CommonDataModel.name — MethodCommonDatamodel.name(ds::AbstractDataset)Name of the group of the data set ds. For a data set containing only a single group, this will be always the root group "/".
CommonDataModel.name — MethodCommonDataModel.name(v::AbstractVariable)Return the name of the variable v as a string.
CommonDataModel.parentdataset — Methodpds = CommonDatamodel.parentdataset(ds::AbstractDataset)The data set pds containing ds as a sub-group. pds is nothing for the root group.
CommonDataModel.path — MethodCommonDatamodel.path(ds::AbstractDataset)File path of the data set ds.
CommonDataModel.rolling — Methodgv = CommonDataModel.rolling(v::AbstractVariable,:coordname => n)
gv = CommonDataModel.rolling(v::AbstractVariable,:coordname => rolling_classes)Create a grouped variable gv whose elements composed by all elements in v grouped by a rolling window of length n along the coordinate variable coordname. One can also specify a vector classes (rolling_classes) with as many elements as they are groups and the elements coorespond to the values of the coordinate. Unlike CommonDataModel.groupby, the groups defined by rolling can overlap.
The grouped variable gv and be reduced using the functions sum mean, median, var or std, for example gr = mean(gv). The result gr is a lazy structure representing the outcome of these operations performed over the grouped dimension. Only when the result gr is indexed the actually values are computed.
Broadcasting for gv is overloaded. Broadcasting over all elements of gv means that a mapping function is to be applied to all elements of gv before a possible the reduction.
This operations is also called "running mean" when using mean as reduction function.
Example:
using NCDatasets, Dates
using CommonDataModel: rolling
# create same test data
time = DateTime(2000,1,1):Day(1):DateTime(2009,12,31);  # 10 years
data = rand(Float32.(-9:99),360,180,length(time));
fname = "test_file.nc"
ds = NCDataset(fname,"c");
defVar(ds,"time",time,("time",));
defVar(ds,"data",data,("lon","lat","time"));
# running weekly mean
gv = rolling(ds["data"],:time => 7)
length(gv)
# output 3653 as a mean will be compute for all time instance (including a
# partial mean and the beginning and the end)
size(gv[1])
# output 360 x 180 x 4: the first time slice will only be a mean of 4 values
size(gv[4])
# output 360 x 180 x 7: data from the first 7 days
# compute basic statistics
using Statistics
weekly_running_mean = mean(gv);
size(weekly_running_mean)
# 360 x 180 x 3653 array with the running weekly mean
# get a regular julia array
weekly_running_mean_array = weekly_running_mean[:,:,:];
typeof(weekly_running_mean_array)
# Array{Float32, 3}
# computing a centred 3 monthly mean taking into account that month do not have the
# same length
rolling_classes = [(t - Dates.Month(1)):Day(1):(t + Dates.Month(1)) for t in time]
extrema(length.(rolling_classes))
# output (60, 63)
data_3monthly = mean(rolling(ds["data"],:time => rolling_classes))[:,:,:];
close(ds)CommonDataModel.select — Methodvsubset = CommonDataModel.select(v,param1 => condition1, param2 => condition2,...)
dssubset = CommonDataModel.select(ds,param1 => condition1, param2 => condition2,...)Return a subset of the variable v (or dataset ds) satisfying the conditions applied to the corresponding parameters. param1, param2 ... are symbols or strings with variable names and condition1, condition2 ... are functions taking as a argument the values of the corresponding parameter and return either false (ignore the corresponding elements) or true (select the corresponding elements for loading).
Examples
Create a sample file with random data:
using NCDatasets, Dates
using CommonDataModel: select, Near
fname = "sample_file.nc"
lon = -180:180
lat = -90:90
time = DateTime(2000,1,1):Day(1):DateTime(2000,1,3)
SST = randn(length(lon),length(lat),length(time));
ds = NCDataset(fname,"c")
defVar(ds,"lon",lon,("lon",));
defVar(ds,"lat",lat,("lat",));
defVar(ds,"time",time,("time",));
defVar(ds,"SST",SST,("lon","lat","time"));
# load by bounding box
v = select(ds["SST"],
           :lon => lon -> 30 <= lon <= 60,
           :lat => lat -> 40 <= lat <= 80)
# You can also select based on `ClosedInterval`s from `IntervalSets.jl`.
# Both 30..60 and 60 ± 20 construct `ClosedInterval`s, see their documentation for details. `∈` can be typed `\in` followed by the TAB-key.
using IntervalSets
v = select(ds["SST"], :lon => ∈(30..60), :lat => ∈(60 ± 20))
# get the indices matching the select query
(lon_indices,lat_indices,time_indices) = parentindices(v)
# get longitude matchting the select query
v_lon = v["lon"]
# find the nearest time instance
v = select(ds["SST"],:time => Near(DateTime(2000,1,4)))
# find the nearest time instance but not earlier or later than 2 hours
# an empty array is returned if no time instance is present
v = select(ds["SST"],:time => Near(DateTime(2000,1,3,1),Hour(2)))
close(ds)See also @select for more information.
CommonDataModel.set_attribute_color — Method    CommonDataModel.set_attribute_color(color::Symbol)Set the attribute color. The default color is cyan.
CommonDataModel.set_section_color — Method    CommonDataModel.set_section_color(color::Symbol)Set the section color. The default color is red.
CommonDataModel.set_variable_color — Method    CommonDataModel.set_variable_color(color::Symbol)Set the variable color. The default color is green.
CommonDataModel.show_dim — MethodCommonDatamodel.show_dim(io,dim)Print a list all dimensions (key/values pairs where key is the dimension names and value the corresponding length) in dim to IO stream io. The IO property :level is used for indentation.
CommonDataModel.time_factor — Methodtf = CommonDataModel.time_factor(v::CFVariable)The time unit in milliseconds. E.g. seconds would be 1000., days would be 86400000. The result can also be nothing if the variable has no time units.
CommonDataModel.unlimited — MethodCommonDatamodel.unlimited(ds::AbstractDataset)Iterator of strings with the name of the unlimited dimension.
CommonDataModel.unlimited — Methodunlimited(d::Dimensions)Return the names of all unlimited dimensions.
CommonDataModel.varbyattrib — Methodvarbyattrib(ds, attname = attval)Returns a list of variable(s) which has the attribute attname matching the value attval in the dataset ds. The list is empty if the none of the variables has the match. The output is a list of CFVariables.
Examples
Load all the data of the first variable with standard name "longitude" from the NetCDF file results.nc.
julia> ds = NCDataset("results.nc", "r");
julia> data = varbyattrib(ds, standard_name = "longitude")[1][:]CommonDataModel.variable — MethodCommonDataModel.variable(ds::AbstractDataset,variablename::SymbolOrString)Return the variable with the name variablename from the data set ds.
CommonDataModel.varnames — MethodCommonDataModel.varnames(ds::AbstractDataset)Return an iterable of all the variable name.
CommonDataModel.@groupby — Macrogv = CommonDataModel.@groupby(v,group_fun(coordname))Create a grouped variable gv whose elements are composed by all elements in v whose corresponding coordinate variable (with the name coordname) map to the same value once the group function group_fun is applied to the coordinate variable.
See groupby for more information.
CommonDataModel.@select — Macrovsubset = CommonDataModel.@select(v,expression)
dssubset = CommonDataModel.@select(ds,expression)Return a subset of the variable v (or dataset ds) satisfying the condition expression as a view. The condition has the following form:
condition₁ && condition₂ && condition₃ ... conditionₙ
Every condition should involve a single 1D variable (typically a coordinate variable, referred as coord below). If v is a variable, the related 1D variable should have a shared dimension with the variable v. All local variables need to have a $ prefix (see examples below). This macro is experimental and subjected to change.
Every condition can either perform:
a nearest match:
coord ≈ target_coord(for≈type\approxfollowed by the TAB-key). Only the data corresponding to the index closest totarget_coordis loaded.a nearest match with tolerance:
coord ≈ target_coord ± tolerance. As before, but if the difference between the closest value incoordandtarget_coordis larger (in absolute value) thantolerance, an empty array is returned.a condition operating on scalar values. For example, a
conditionequal to10 <= lon <= 20loads all data with the longitude between 10 and 20 orabs(lat) > 60loads all variables with a latitude north of 60° N and south of 60° S (assuming that the has the 1D variableslonandlatfor longitude and latitude).
Only the data which satisfies all conditions is loaded. All conditions must be chained with an && (logical and). They should not contain additional parenthesis or other logical operators such as || (logical or).
To convert the view into a regular array one can use collect, Array or regular indexing. As in julia, views of scalars are wrapped into a zero dimensional arrays which can be dereferenced by using []. Modifying a view will modify the underlying file (if the file is opened as writable, otherwise an error is issued).
As for any view, one can use parentindices(vsubset) to get the indices matching a select query.
Examples
Create a sample file with random data:
using NCDatasets, Dates
using CommonDataModel: @select
# or
# using NCDatasets: @select
fname = "sample_file.nc"
lon = -180:180
lat = -90:90
time = DateTime(2000,1,1):Day(1):DateTime(2000,1,3)
SST = randn(length(lon),length(lat),length(time))
ds = NCDataset(fname,"c")
defVar(ds,"lon",lon,("lon",));
defVar(ds,"lat",lat,("lat",));
defVar(ds,"time",time,("time",));
defVar(ds,"SST",SST,("lon","lat","time"));
# load by bounding box
v = @select(ds["SST"],30 <= lon <= 60 && 40 <= lat <= 80)
# substitute a local variable in condition using $
lonr = (30,60) # longitude range
latr = (40,80) # latitude range
v = @select(ds["SST"],$lonr[1] <= lon <= $lonr[2] && $latr[1] <= lat <= $latr[2])
# You can also select based on `ClosedInterval`s from `IntervalSets.jl`.
# Both 30..60 and 60 ± 20 construct `ClosedInterval`s, see their documentation for details.
lon_interval = 30..60
lat_interval = 60 ± 20
v = @select(ds["SST"], lon ∈ $lon_interval && lat ∈ $lat_interval)
# get the indices matching the select query
(lon_indices,lat_indices,time_indices) = parentindices(v)
# get longitude matchting the select query
v_lon = v["lon"]
# find the nearest time instance
v = @select(ds["SST"],time ≈ DateTime(2000,1,4))
# find the nearest time instance but not earlier or later than 2 hours
# an empty array is returned if no time instance is present
v = @select(ds["SST"],time ≈ DateTime(2000,1,3,1) ± Hour(2))
close(ds)Any 1D variable with the same dimension name can be used in @select. For example, if we have a time series of temperature and salinity, the temperature values can also be selected based on salinity:
using NCDatasets, Dates
using CommonDataModel: @select
fname = "sample_series.nc"
# create a sample time series
time = DateTime(2000,1,1):Day(1):DateTime(2009,12,31)
salinity = randn(length(time)) .+ 35
temperature = randn(length(time))
NCDataset(fname,"c") do ds
    defVar(ds,"time",time,("time",));
    defVar(ds,"salinity",salinity,("time",));
    defVar(ds,"temperature",temperature,("time",));
end
ds = NCDataset(fname)
# load all temperature data from January where the salinity is larger than 35.
v = @select(ds["temperature"],Dates.month(time) == 1 && salinity >= 35)
# this is equivalent to:
v2 = ds["temperature"][findall(Dates.month.(time) .== 1 .&& salinity .>= 35)]
v == v2
# returns true
close(ds)