Variables
Variables (e.g. CFVariable) are the quantities contained within a NetCDF dataset. See the Datasets page on how to obtain them from a dataset.
Different types of arrays are involved when working with NCDatasets. For instance, assume that test.nc is a file with a Float32 variable called var, and that we open this dataset in append mode ("a"):
using NCDatasets
ds = NCDataset("test.nc","a")
v_cf = ds["var"]
The variable v_cf has the type CFVariable. No data is actually loaded from disk, but you can query its size, number of dimensions, number of elements, etc., using the functions size, ndims and length, as if v_cf were an ordinary Julia array.
To load the variable v_cf in memory as numeric data you can convert it into an array (preserving its dimensionality structure) with
Array(v_cf)
The syntax v_cf[:] is equivalent to the above: it does not produce a Vector (as it would for ordinary Julia arrays).
You can also load only a subset of it into memory by indexing each dimension:
v_cf[1:5, 10:20]
(Here you must know the number of dimensions of the variable, as you must index all of them.)
NCDatasets.Variable and NCDatasets.CFVariable implement the interface of AbstractArray. It is thus possible to call any function that accepts an AbstractArray. But functions like mean, sum (and many more) would load every element individually which is very inefficient for large fields read from disk. You should instead convert such a variable to a standard Julia Array and then do computations with it. See also the performance tips for more information.
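As a minimal sketch of this advice, the snippet below writes a small file, loads the variable into an ordinary Array once, and only then computes a mean. The temporary file name and the sample values are assumptions made for illustration; Statistics is Julia's standard library.

```julia
using NCDatasets
using Statistics  # standard library, provides mean

# Hypothetical test file with a small Float32 variable "var"
fname = tempname() * ".nc"
NCDataset(fname, "c") do ds
    defVar(ds, "var", Float32.(reshape(1:12, 3, 4)), ("lon", "lat"))
end

ds = NCDataset(fname)
v_cf = ds["var"]      # CFVariable: no data loaded yet
data = Array(v_cf)    # one read from disk into memory
m = mean(data)        # computed on the in-memory Array
close(ds)
```

Calling mean(v_cf) directly would give the same number, but at the cost of element-by-element access to the file.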
The following functions are convenient for working with variables:
Base.size — Method
sz = size(var::CFVariable)
Return a tuple of integers with the size of the variable var.
Note that the size of a variable can change, e.g. for a variable with an unlimited dimension.
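The following sketch illustrates how the size of a variable can change when it has an unlimited dimension. The file name, the dimension names "lon" and "time", and the variable name "temp" are assumptions chosen for this example.

```julia
using NCDatasets

fname = tempname() * ".nc"   # hypothetical temporary file
ds = NCDataset(fname, "c")
defDim(ds, "lon", 3)
defDim(ds, "time", Inf)      # unlimited dimension
v = defVar(ds, "temp", Float64, ("lon", "time"))

sz_before = size(v)          # nothing written along "time" yet
v[:, 1] = [1.0, 2.0, 3.0]
v[:, 2] = [4.0, 5.0, 6.0]
sz_after = size(v)           # "time" has grown with the writes
names = dimnames(v)          # tuple of dimension names
close(ds)
```

Code that caches size(v) should therefore query it again after appending along the unlimited dimension.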
NCDatasets.dimnames — Function
dimnames(v::Variable)
Return a tuple of strings with the dimension names of the variable v.
dimnames(v::CFVariable)
Return a tuple of strings with the dimension names of the variable v.
NCDatasets.dimsize — Function
dimsize(v::CFVariable)
Get the size of a CFVariable as a named tuple of dimension → length.
NCDatasets.name — Function
name(v::Variable)
Return the name of the NetCDF variable v.
NCDatasets.renameVar — Function
renameVar(ds::NCDataset,oldname,newname)
Rename the variable called oldname to newname.
NCDatasets.NCDataset — Method
mfds = NCDataset(fnames,mode = "r"; aggdim = nothing, deferopen = true)
Opens a multi-file dataset in read-only ("r") or append ("a") mode. fnames is a vector of file names. Variables are aggregated over the first unlimited dimension or over the dimension aggdim if specified. The append mode is only implemented when deferopen is false.
All variables containing the dimension aggdim are aggregated. Variables that do not contain the dimension aggdim are assumed to be constant.
If deferopen is false, all files are opened at the same time. However, the operating system might limit the number of open files. On Linux, the limit can be controlled with the command ulimit.
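A minimal sketch of a multi-file dataset, assuming two hypothetical files part1.nc and part2.nc that share a variable "temp" with an unlimited "time" dimension. The files are created first so that the example is self-contained.

```julia
using NCDatasets

# Create two small files, each holding two time steps (illustrative data)
dir = mktempdir()
fnames = [joinpath(dir, "part$i.nc") for i in 1:2]
for (i, fname) in enumerate(fnames)
    NCDataset(fname, "c") do ds
        defDim(ds, "lon", 3)
        defDim(ds, "time", Inf)
        v = defVar(ds, "temp", Float64, ("lon", "time"))
        v[:, 1:2] = fill(Float64(i), 3, 2)
    end
end

# Open both files as one dataset, aggregated along "time"
mfds = NCDataset(fnames; aggdim = "time")
sz = size(mfds["temp"])      # time steps of both files combined
temp = mfds["temp"][:, :]
close(mfds)
```

Here aggdim is given explicitly; according to the description above, omitting it would aggregate over the first unlimited dimension instead.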
NCDatasets.nomissing — Function
a = nomissing(da)
Return the values of the array da of type Array{Union{T,Missing},N} (potentially containing missing values) as a regular Julia array a with element type T. It raises an error if the array contains at least one missing value.
a = nomissing(da,value)
Return the values of the array da of type Array{Union{T,Missing},N} as a regular Julia array a by replacing all missing values by value (converted to type T). This function is identical to coalesce.(da,T(value)) where T is the element type of da.
Example:
julia> nomissing([missing,1.,2.],NaN)
# returns [NaN, 1.0, 2.0]
NCDatasets.fillvalue — Function
fillvalue(::Type{Int8})
fillvalue(::Type{UInt8})
fillvalue(::Type{Int16})
fillvalue(::Type{UInt16})
fillvalue(::Type{Int32})
fillvalue(::Type{UInt32})
fillvalue(::Type{Int64})
fillvalue(::Type{UInt64})
fillvalue(::Type{Float32})
fillvalue(::Type{Float64})
fillvalue(::Type{Char})
fillvalue(::Type{String})
Default fill-value for the given type.
fv = fillvalue(v::Variable)
fv = fillvalue(v::CFVariable)
Return the fill-value of the variable v.
NCDatasets.loadragged — Function
data = loadragged(ncvar,index::Colon)
Load data from ncvar in the contiguous ragged array representation as a vector of vectors. It is typically used to load a list of profiles or time series, each of a different length.
The indexed ragged array representation is currently not supported.
NCDatasets.load! — Function
NCDatasets.load!(ncvar::Variable, data, indices)
Loads a NetCDF variable ncvar in-place and puts the result in data along the specified indices.
using NCDatasets
ds = NCDataset("file.nc")
ncv = ds["vgos"].var;
# data must have the right shape and type
data = zeros(eltype(ncv),size(ncv));
NCDatasets.load!(ncv,data,:,:,:)
# loading a subset
data = zeros(5); # must have the right shape and type
NCDatasets.load!(ds["temp"].var,data,:,1) # loads the 1st column
# close the dataset only after all data has been loaded
close(ds)
Creating a variable
NCDatasets.defVar — Function
defVar(ds::NCDataset,name,vtype,dimnames; kwargs...)
defVar(ds::NCDataset,name,data,dimnames; kwargs...)
Define a variable with the name name in the dataset ds. vtype can be any of the Julia types in the table below (with the corresponding NetCDF type). The parameter dimnames is a tuple with the names of the dimensions. For a scalar, this parameter is the empty tuple (). The variable is returned (of the type CFVariable).
Instead of providing the variable type, one can also directly provide the data data, which will be used to fill the NetCDF variable. In this case, the dimensions with the appropriate size will be created as required using the names in dimnames.
If data is a vector or array of DateTime objects, then the dates are saved as double-precision floats with the units "days since 1900-01-01 00:00:00" (unless a time unit is specified with the attrib keyword as described below). Dates are converted to the default calendar of the CF conventions, which is the mixed Julian/Gregorian calendar.
Keyword arguments
- fillvalue: A value filled in the NetCDF file to indicate missing data. It will be stored in the _FillValue attribute.
- chunksizes: Vector of integers setting the chunk size. The total size of a chunk must be less than 4 GiB.
- deflatelevel: Compression level: 0 (default) means no compression and 9 means maximum compression. Each chunk will be compressed individually.
- shuffle: If true, the shuffle filter is activated, which can improve the compression ratio.
- checksum: The checksum method can be :fletcher32 or :nochecksum (checksumming is disabled, which is the default).
- attrib: An iterable of attribute name and attribute value pairs, for example a Dict, DataStructures.OrderedDict or simply a vector of pairs (see example below).
- typename (string): The name of the NetCDF type required for vlen arrays.
chunksizes, deflatelevel, shuffle and checksum can only be set on NetCDF 4 files.
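The sketch below combines several of these keyword arguments and shows the effect of fillvalue on missing data. The file name, dimension sizes and the fill value -9999 are assumptions chosen for illustration.

```julia
using NCDatasets

fname = tempname() * ".nc"   # hypothetical temporary file
NCDataset(fname, "c") do ds
    defDim(ds, "lon", 2)
    defDim(ds, "lat", 3)
    v = defVar(ds, "temp", Float32, ("lon", "lat");
               fillvalue = -9999f0,      # stored as _FillValue
               deflatelevel = 5,         # compress each chunk
               shuffle = true,
               attrib = ["units" => "degree_Celsius"])
    # missing is written to the file as the fill value
    v[:, :] = [missing 1f0 2f0; 3f0 4f0 5f0]
end

ds = NCDataset(fname)
loaded = ds["temp"][:, :]                # the fill value is read back as missing
filled = nomissing(loaded, NaN)          # replace missing by NaN
fv = NCDatasets.fillvalue(ds["temp"])
close(ds)
```

The variable is defined by type here, so the dimensions are created explicitly with defDim before the call to defVar.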
NetCDF data types
| NetCDF Type | Julia Type |
|---|---|
| NC_BYTE | Int8 |
| NC_UBYTE | UInt8 |
| NC_SHORT | Int16 |
| NC_INT | Int32 |
| NC_INT64 | Int64 |
| NC_FLOAT | Float32 |
| NC_DOUBLE | Float64 |
| NC_CHAR | Char |
| NC_STRING | String |
Example:
In this example, scale_factor and add_offset are applied when the data is saved.
julia> using DataStructures
julia> data = randn(3,5)
julia> NCDataset("test_file.nc","c") do ds
           defVar(ds,"temp",data,("lon","lat"), attrib = OrderedDict(
               "units" => "degree_Celsius",
               "add_offset" => -273.15,
               "scale_factor" => 0.1,
               "long_name" => "Temperature"
           ))
       end;
If the attributes _FillValue, add_offset, scale_factor, units and calendar are used, they should be defined when calling defVar by using the parameter attrib as shown in the example above.
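To see what applying scale_factor and add_offset means in practice, the following sketch writes a packed variable and reads it back both through the CFVariable (unpacked values) and through its .var field (values as stored in the file). It assumes the usual CF relation unpacked = stored * scale_factor + add_offset; the file name is hypothetical, and a plain vector of pairs is used for attrib to avoid the DataStructures dependency.

```julia
using NCDatasets

data = randn(3, 5)
fname = tempname() * ".nc"   # hypothetical temporary file
NCDataset(fname, "c") do ds
    defVar(ds, "temp", data, ("lon", "lat"), attrib = [
        "units" => "degree_Celsius",
        "add_offset" => -273.15,
        "scale_factor" => 0.1,
    ])
end

ds = NCDataset(fname)
unpacked = ds["temp"][:, :]      # CF transformations applied on read
raw = ds["temp"].var[:, :]       # values as stored in the file
close(ds)
```

Up to floating-point rounding, unpacked matches data, while raw holds (data - add_offset) / scale_factor.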
Storage parameters of a variable
NCDatasets.chunking — Function
storage,chunksizes = chunking(v::Variable)
Return the storage type (:contiguous or :chunked) and the chunk sizes of the variable v.
NCDatasets.deflate — Function
isshuffled,isdeflated,deflate_level = deflate(v::Variable)
Return compression information of the variable v. If isshuffled is true, then shuffling (byte interlacing) is activated. If isdeflated is true, then the data chunks (see chunking) are compressed using the compression level deflate_level (0 means no compression and 9 means maximum compression).
NCDatasets.checksum — Function
checksummethod = checksum(v::Variable)
Return the checksum method of the variable v, which can be either :fletcher32 or :nochecksum.
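As a short sketch, the three query functions can be used to check the storage parameters that were requested in defVar. The file name, dimension sizes and chunk sizes below are assumptions for illustration. The functions are called with the module prefix and on the underlying Variable (the .var field), matching the signatures above.

```julia
using NCDatasets

fname = tempname() * ".nc"   # hypothetical temporary file
ds = NCDataset(fname, "c")
defDim(ds, "lon", 100)
defDim(ds, "lat", 50)
v = defVar(ds, "temp", Float32, ("lon", "lat");
           chunksizes = [50, 25],
           deflatelevel = 5,
           shuffle = true,
           checksum = :fletcher32)

# Query the storage parameters of the underlying Variable
storage, chunksizes = NCDatasets.chunking(v.var)
isshuffled, isdeflated, deflate_level = NCDatasets.deflate(v.var)
checksummethod = NCDatasets.checksum(v.var)
close(ds)
```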
Coordinate variables
NCDatasets.coord — Function
cv = coord(v::Union{CFVariable,Variable},standard_name)
Find the coordinate of the variable v by the standard name standard_name or by some standardized heuristics based on units. If the heuristics fail to detect the coordinate, consider modifying the netCDF file to add the standard_name attribute. All dimensions of the coordinate must also be dimensions of the variable v.
Example
using NCDatasets
ds = NCDataset("file.nc")
ncv = ds["SST"]
lon = coord(ncv,"longitude")[:]
lat = coord(ncv,"latitude")[:]
v = ncv[:]
close(ds)