NCDatasets.jl
Documentation for NCDatasets.jl
Datasets
NCDatasets.Dataset — Type.
Dataset(filename::AbstractString,mode::AbstractString = "r";
        format::Symbol = :netcdf4)
Create a new NetCDF file if the mode is "c". An existing file with the same name will be overwritten. If mode is "a", then an existing file is opened in append mode (i.e. existing data in the netCDF file is not overwritten and a variable can be added). With the mode set to "r", an existing netCDF file or OPeNDAP URL can be opened in read-only mode. The default mode is "r".
Supported formats:
:netcdf4 (default): HDF5-based NetCDF format.
:netcdf4_classic: Only netCDF 3 compatible API features will be used.
:netcdf3_classic: classic netCDF format supporting only files smaller than 2GB.
:netcdf3_64bit_offset: improved netCDF format supporting files larger than 2GB.
Files can also be opened and automatically closed with a do block.
Dataset("file.nc") do ds
    data = ds["temperature"][:,:]
end
Base.keys — Method.
keys(ds::Dataset)
Return a list of all variable names in Dataset ds.
Base.haskey — Function.
haskey(ds::Dataset,varname)
Return true if the Dataset ds has a variable with the name varname. For example:
ds = Dataset("/tmp/test.nc","r")
if haskey(ds,"temperature")
    println("The file has a variable 'temperature'")
end
This example checks if the file /tmp/test.nc has a variable with the name temperature.
Base.getindex — Method.
getindex(ds::Dataset,varname::AbstractString)
Return the NetCDF variable varname in the dataset ds as a NCDatasets.CFVariable. The CF conventions are honored when the variable is indexed:
_FillValue will be returned as missing (DataArrays)
scale_factor and add_offset are applied
time variables (recognized by the units attribute) are returned as DateTime objects.
A call getindex(ds,varname) is usually written as ds[varname].
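The CF transformations listed above can be illustrated without a NetCDF file. The following is a minimal sketch in plain Julia, assuming Julia 1.0 or later, where missing is built in (rather than provided by the DataArrays package mentioned above). The function name cf_unpack and the sample values are hypothetical and are not part of NCDatasets.

```julia
# Sketch of the CF unpacking rule: values equal to _FillValue become `missing`;
# all other values are mapped to raw * scale_factor + add_offset.
# `cf_unpack` is a hypothetical helper, not an NCDatasets function.
function cf_unpack(raw, fillvalue, scale_factor, add_offset)
    return [x == fillvalue ? missing : x * scale_factor + add_offset for x in raw]
end

# Packed 16-bit integers as they might be stored on disk (made-up values).
raw = Int16[10, -9999, 30]
unpacked = cf_unpack(raw, Int16(-9999), 0.5, 20.0)
```

Indexing a CFVariable is expected to perform this kind of conversion automatically, whereas variable(ds,varname) gives access to the raw values.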
NCDatasets.variable — Function.
variable(ds::Dataset,varname::String)
Return the NetCDF variable varname in the dataset ds as a NCDatasets.Variable. No scaling is applied when this variable is indexed.
NCDatasets.sync — Function.
sync(ds::Dataset)
Write all changes in Dataset ds to the disk.
Base.close — Function.
close(ds::Dataset)
Close the Dataset ds. All pending changes will be written to the disk.
NCDatasets.path — Function.
path(ds::Dataset)
Return the file path (or the OPeNDAP URL) of the Dataset ds.
Variables
NCDatasets.defVar — Function.
defVar(ds::Dataset,name,vtype,dimnames; kwargs...)
Define a variable with the name name in the dataset ds. vtype can be any of the Julia types in the table below (with the corresponding NetCDF type). The parameter dimnames is a tuple with the names of the dimensions. For a scalar, this parameter is the empty tuple (). The variable is returned (of the type CFVariable).
Keyword arguments
fillvalue: A value filled in the NetCDF file to indicate missing data. It will be stored in the _FillValue attribute.
chunksizes: Vector of integers setting the chunk size. The total size of a chunk must be less than 4 GiB.
deflatelevel: Compression level: 0 (default) means no compression and 9 means maximum compression. Each chunk will be compressed individually.
shuffle: If true, the shuffle filter is activated, which can improve the compression ratio.
checksum: The checksum method can be :fletcher32 or :nochecksum (checksumming is disabled, which is the default).
typename (string): The name of the NetCDF type required for vlen arrays [1].
chunksizes, deflatelevel, shuffle and checksum can only be set on NetCDF 4 files.
NetCDF data types
| NetCDF Type | Julia Type |
|---|---|
| NC_BYTE | Int8 |
| NC_UBYTE | UInt8 |
| NC_SHORT | Int16 |
| NC_INT | Int32 |
| NC_INT64 | Int64 |
| NC_FLOAT | Float32 |
| NC_DOUBLE | Float64 |
| NC_CHAR | Char |
| NC_STRING | String |
[1] https://web.archive.org/save/https://www.unidata.ucar.edu/software/netcdf/netcdf-4/newdocs/netcdf-c/nc_005fdef_005fvlen.html
NCDatasets.dimnames — Function.
dimnames(v::Variable)
Return a tuple of the dimension names of the variable v.
NCDatasets.name — Function.
name(v::Variable)
Return the name of the NetCDF variable v.
NCDatasets.chunking — Function.
storage,chunksizes = chunking(v::Variable)
Return the storage type (:contiguous or :chunked) and the chunk sizes of the variable v.
NCDatasets.deflate — Function.
shuffle,deflate,deflate_level = deflate(v::Variable)
Return compression information of the variable v. If shuffle is true, then shuffling (byte interlacing) is activated. If deflate is true, then the data chunks (see chunking) are compressed using the compression level deflate_level (0 means no compression and 9 means maximum compression).
NCDatasets.checksum — Function.
checksummethod = checksum(v::Variable)
Return the checksum method of the variable v, which can be either :fletcher32 or :nochecksum.
Different types of arrays are involved when working with NCDatasets. For instance, assume that test.nc is a file with a Float32 variable called var, and that we open this data set in append mode ("a"):
using NCDatasets
ds = Dataset("test.nc","a")
v_cf = ds["var"]
The variable v_cf has the type CFVariable. No data is actually loaded from disk, but you can query its size, number of dimensions and number of elements with the functions size, ndims and length, as for ordinary Julia arrays. Once you index the variable v_cf, the data is loaded and stored into a DataArray:
v_da = v_cf[:,:]
Attributes
The NetCDF dataset (as returned by Dataset or NetCDF groups) and the NetCDF variables (as returned by getindex, variable or defVar) have the field attrib which has the type NCDatasets.Attributes and behaves like a Julia dictionary.
Base.getindex — Method.
getindex(a::Attributes,name::AbstractString)
Return the value of the attribute called name from the attribute list a. Generally the attributes are loaded by indexing, for example:
ds = Dataset("file.nc")
title = ds.attrib["title"]
Base.setindex! — Method.
Base.setindex!(a::Attributes,data,name::AbstractString)
Set the attribute called name to the value data in the attribute list a. Generally the attributes are defined by indexing, for example:
ds = Dataset("file.nc","c")
ds.attrib["title"] = "my title"
Base.keys — Method.
Base.keys(a::Attributes)
Return a list of the names of all attributes.
Dimensions
NCDatasets.defDim — Function.
defDim(ds::Dataset,name,len)
Define a dimension in the data set ds with the given name and length len. If len is the special value Inf, then the dimension is considered as unlimited, i.e. it will grow as data is added to the NetCDF file.
For example:
ds = Dataset("/tmp/test.nc","c")
defDim(ds,"lon",100)
This defines the dimension lon with the size 100.
Base.setindex! — Method.
Base.setindex!(d::Dimensions,len,name::AbstractString)
Define the dimension called name with the length len. Generally dimensions are defined by indexing, for example:
ds = Dataset("file.nc","c")
ds.dim["longitude"] = 100
If len is the special value Inf, then the dimension is considered as unlimited, i.e. it will grow as data is added to the NetCDF file.
NCDatasets.dimnames — Method.
dimnames(v::Variable)
Return a tuple of the dimension names of the variable v.
Groups
NCDatasets.defGroup — Method.
defGroup(ds::Dataset,groupname)
Create the group with the name groupname in the dataset ds.
Base.getindex — Method.
group = getindex(g::NCDatasets.Groups,groupname::AbstractString)
Return the NetCDF group with the name groupname. For example:
julia> ds = Dataset("results.nc", "r");
julia> forecast_group = ds.group["forecast"]
julia> forecast_temp = forecast_group["temperature"]
Base.keys — Method.
Base.keys(g::NCDatasets.Groups)
Return the names of all subgroups of the group g.
Common methods
Explore a NetCDF dataset
Base.start — Method.
start(ds::NCDatasets.Dataset)
start(a::NCDatasets.Attributes)
start(d::NCDatasets.Dimensions)
start(g::NCDatasets.Groups)
Allow one to iterate over a dataset, attribute list, dimensions and NetCDF groups.
for (varname,var) in ds
    # all variables
    @show (varname,size(var))
end
for (dimname,dim) in ds.dim
    # all dimensions
    @show (dimname,dim)
end
for (attribname,attrib) in ds.attrib
    # all attributes
    @show (attribname,attrib)
end
for (groupname,group) in ds.group
    # all groups
    @show (groupname,group)
end
Time functions
NCDatasets.DateTimeStandard — Type.
NCDatasets.DateTimeStandard(y, [m, d, h, mi, s, ms]) -> NCDatasets.DateTimeStandard
Construct a NCDatasets.DateTimeStandard type by year (y), month (m, default 1), day (d, default 1), hour (h, default 0), minute (mi, default 0), second (s, default 0), millisecond (ms, default 0). All arguments must be convertible to Int64. NCDatasets.DateTimeStandard is a subtype of AbstractCFDateTime.
The netCDF CF calendars are defined at [1].
[1] https://web.archive.org/web/20180622080424/http://cfconventions.org/cf-conventions/cf-conventions.html#calendar
NCDatasets.DateTimeJulian — Type.
NCDatasets.DateTimeJulian(y, [m, d, h, mi, s, ms]) -> NCDatasets.DateTimeJulian
Construct a NCDatasets.DateTimeJulian type by year (y), month (m, default 1), day (d, default 1), hour (h, default 0), minute (mi, default 0), second (s, default 0), millisecond (ms, default 0). All arguments must be convertible to Int64. NCDatasets.DateTimeJulian is a subtype of AbstractCFDateTime.
The netCDF CF calendars are defined at [1].
[1] https://web.archive.org/web/20180622080424/http://cfconventions.org/cf-conventions/cf-conventions.html#calendar
NCDatasets.DateTimeProlepticGregorian — Type.
NCDatasets.DateTimeProlepticGregorian(y, [m, d, h, mi, s, ms]) -> NCDatasets.DateTimeProlepticGregorian
Construct a NCDatasets.DateTimeProlepticGregorian type by year (y), month (m, default 1), day (d, default 1), hour (h, default 0), minute (mi, default 0), second (s, default 0), millisecond (ms, default 0). All arguments must be convertible to Int64. NCDatasets.DateTimeProlepticGregorian is a subtype of AbstractCFDateTime.
The netCDF CF calendars are defined at [1].
[1] https://web.archive.org/web/20180622080424/http://cfconventions.org/cf-conventions/cf-conventions.html#calendar
NCDatasets.DateTimeAllLeap — Type.
NCDatasets.DateTimeAllLeap(y, [m, d, h, mi, s, ms]) -> NCDatasets.DateTimeAllLeap
Construct a NCDatasets.DateTimeAllLeap type by year (y), month (m, default 1), day (d, default 1), hour (h, default 0), minute (mi, default 0), second (s, default 0), millisecond (ms, default 0). All arguments must be convertible to Int64. NCDatasets.DateTimeAllLeap is a subtype of AbstractCFDateTime.
The netCDF CF calendars are defined at [1].
[1] https://web.archive.org/web/20180622080424/http://cfconventions.org/cf-conventions/cf-conventions.html#calendar
NCDatasets.DateTimeNoLeap — Type.
NCDatasets.DateTimeNoLeap(y, [m, d, h, mi, s, ms]) -> NCDatasets.DateTimeNoLeap
Construct a NCDatasets.DateTimeNoLeap type by year (y), month (m, default 1), day (d, default 1), hour (h, default 0), minute (mi, default 0), second (s, default 0), millisecond (ms, default 0). All arguments must be convertible to Int64. NCDatasets.DateTimeNoLeap is a subtype of AbstractCFDateTime.
The netCDF CF calendars are defined at [1].
[1] https://web.archive.org/web/20180622080424/http://cfconventions.org/cf-conventions/cf-conventions.html#calendar
NCDatasets.DateTime360Day — Type.
NCDatasets.DateTime360Day(y, [m, d, h, mi, s, ms]) -> NCDatasets.DateTime360Day
Construct a NCDatasets.DateTime360Day type by year (y), month (m, default 1), day (d, default 1), hour (h, default 0), minute (mi, default 0), second (s, default 0), millisecond (ms, default 0). All arguments must be convertible to Int64. NCDatasets.DateTime360Day is a subtype of AbstractCFDateTime.
The netCDF CF calendars are defined at [1].
[1] https://web.archive.org/web/20180622080424/http://cfconventions.org/cf-conventions/cf-conventions.html#calendar
Base.Dates.year — Method.
Dates.year(dt::AbstractCFDateTime) -> Int64
Extract the year-part of an AbstractCFDateTime as an Int64.
Base.Dates.month — Method.
Dates.month(dt::AbstractCFDateTime) -> Int64
Extract the month-part of an AbstractCFDateTime as an Int64.
Base.Dates.day — Method.
Dates.day(dt::AbstractCFDateTime) -> Int64
Extract the day-part of an AbstractCFDateTime as an Int64.
Base.Dates.hour — Method.
Dates.hour(dt::AbstractCFDateTime) -> Int64
Extract the hour-part of an AbstractCFDateTime as an Int64.
Base.Dates.minute — Method.
Dates.minute(dt::AbstractCFDateTime) -> Int64
Extract the minute-part of an AbstractCFDateTime as an Int64.
Base.Dates.second — Method.
Dates.second(dt::AbstractCFDateTime) -> Int64
Extract the second-part of an AbstractCFDateTime as an Int64.
Base.Dates.millisecond — Method.
Dates.millisecond(dt::AbstractCFDateTime) -> Int64
Extract the millisecond-part of an AbstractCFDateTime as an Int64.
Base.convert — Function.
dt2 = convert(::Type{T}, dt)
Convert a date time dt of type DateTimeStandard, DateTimeProlepticGregorian, DateTimeJulian or DateTime into the type T, which can also be either DateTimeStandard, DateTimeProlepticGregorian, DateTimeJulian or DateTime.
The conversion is done such that durations (differences of DateTime types) are preserved. For dates on and after 1582-10-15, the year, month and day are the same for the types DateTimeStandard, DateTimeProlepticGregorian and DateTime.
For dates before 1582-10-15, the year, month and day are the same for the types DateTimeStandard and DateTimeJulian.
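The date 1582-10-15 is where the standard calendar switches from Julian to Gregorian dates: 1582-10-04 (Julian) is directly followed by 1582-10-15 (Gregorian). The sketch below checks this with Julian Day Numbers in plain Julia (assuming Julia 1.0 or later with the Dates standard library). The helpers jdn_julian and jdn_gregorian are hypothetical names for illustration and are not part of NCDatasets.

```julia
using Dates

# Julian Day Number of a date given in the Julian calendar
# (standard integer formula; `÷` truncates toward zero).
function jdn_julian(y, m, d)
    return 367y - (7 * (y + 5001 + (m - 9) ÷ 7)) ÷ 4 + (275m) ÷ 9 + d + 1729777
end

# Julian Day Number of a date given in the proleptic Gregorian calendar,
# obtained from Julia's DateTime evaluated at noon (a whole number of days).
jdn_gregorian(y, m, d) = round(Int, Dates.datetime2julian(DateTime(y, m, d, 12)))

last_julian = jdn_julian(1582, 10, 4)         # last day with Julian dates
first_gregorian = jdn_gregorian(1582, 10, 15) # first day with Gregorian dates
gap = first_gregorian - last_julian           # elapsed days between the two
```

Because both dates map to consecutive day numbers, a duration computed across the switch stays the same whichever of the calendar types is used to label the dates.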
Base.reinterpret — Function.
dt2 = reinterpret(::Type{T}, dt)
Convert a variable dt of type DateTime, DateTimeStandard, DateTimeJulian, DateTimeProlepticGregorian, DateTimeAllLeap, DateTimeNoLeap or DateTime360Day into the date time type T using the same values for year, month, day, hour, minute, second and millisecond. The conversion might fail if a particular date does not exist in the target calendar.
NCDatasets.timedecode — Function.
dt = timedecode(data,units,calendar = "standard", prefer_datetime = true)
Decode the time information in data as given by the units units according to the specified calendar. Valid values for calendar are "standard", "gregorian", "proleptic_gregorian", "julian", "noleap", "365_day", "all_leap", "366_day" and "360_day".
If prefer_datetime is true (default), dates are converted to the DateTime type (for the calendars "standard", "gregorian", "proleptic_gregorian" and "julian"). Such a conversion is not possible for the other calendars.
| Calendar | Type (prefer_datetime=true) | Type (prefer_datetime=false) |
|---|---|---|
| standard, gregorian | DateTime | DateTimeStandard |
| proleptic_gregorian | DateTime | DateTimeProlepticGregorian |
| julian | DateTime | DateTimeJulian |
| noleap, 365_day | DateTimeNoLeap | DateTimeNoLeap |
| all_leap, 366_day | DateTimeAllLeap | DateTimeAllLeap |
| 360_day | DateTime360Day | DateTime360Day |
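To make the table more concrete, here is a minimal sketch of what decoding "days since ..." amounts to, written in plain Julia (assuming Julia 1.0 or later with the Dates standard library). It does not call timedecode; the helpers decode_days and decode_days_360 are hypothetical, and the 360_day helper assumes a reference date of January 1 of the year y0.

```julia
using Dates

# Standard calendar: "days since <t0>" becomes t0 plus an offset in milliseconds.
decode_days(data, t0::DateTime) =
    [t0 + Dates.Millisecond(round(Int64, x * 86_400_000)) for x in data]

# 360_day calendar: every year has 12 months of 30 days, so the result is
# returned as a (year, month, day) tuple rather than a DateTime.
function decode_days_360(days::Integer, y0::Integer)
    y, r = fldmod(days, 360)   # whole years and remaining days
    m, d = fldmod(r, 30)       # whole months and remaining days
    return (y0 + y, m + 1, d + 1)
end

standard = decode_days([0, 1.5], DateTime(2000, 1, 1))
feb30 = decode_days_360(59, 2000)
```

The second result is February 30, a date that Julia's DateTime cannot represent, which illustrates why the non-Gregorian calendars keep their own types regardless of prefer_datetime.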
NCDatasets.timeencode — Function.
data = timeencode(dt,units,calendar = "standard")
Convert a vector or array of DateTime (or DateTimeStandard, DateTimeProlepticGregorian, DateTimeJulian, DateTimeNoLeap, DateTimeAllLeap, DateTime360Day) according to the specified units (e.g. "days since 2000-01-01 00:00:00") using the calendar calendar. Valid values for calendar are: "standard", "gregorian", "proleptic_gregorian", "julian", "noleap", "365_day", "all_leap", "366_day", "360_day".
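For the units "days since 2000-01-01 00:00:00" and ordinary DateTime values, the encoding amounts to a difference divided by the length of a day. The following is a sketch in plain Julia (assuming Julia 1.0 or later with the Dates standard library); encode_days is a hypothetical helper and not the implementation of timeencode.

```julia
using Dates

# Number of (fractional) days between each DateTime and the reference time t0.
# Subtracting two DateTime values yields a Millisecond period.
encode_days(dts, t0::DateTime) = [Dates.value(dt - t0) / 86_400_000 for dt in dts]

t0 = DateTime(2000, 1, 1)
data = encode_days([DateTime(2000, 1, 1), DateTime(2000, 1, 2, 12)], t0)
```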
Utility functions
NCDatasets.ncgen — Method.
ncgen(fname; ...)
ncgen(fname,jlname; ...)
Generate the Julia code that would produce a NetCDF file with the same metadata as the NetCDF file fname. The code is placed in the file jlname or printed to the standard output. By default the new NetCDF file is called filename.nc. This can be changed with the optional parameter newfname.
NCDatasets.nomissing — Method.
a = nomissing(da::DataArray)
Return the values of the DataArray da as a regular Julia array a of the same element type, checking that no missing values are present.
NCDatasets.nomissing — Method.
a = nomissing(da::DataArray,value)
Return the values of the DataArray da as a regular Julia array a, replacing all missing values by value.
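The two behaviours can be mimicked with Julia's built-in missing value. This is only an analogy in plain Julia (assuming Julia 1.0 or later); it uses coalesce and convert from Base instead of nomissing and DataArrays, and the sample values are made up.

```julia
# Replace every `missing` by a fill value (analogous to nomissing(da, value)).
a = [1.0, missing, 3.0]
filled = coalesce.(a, -9999.0)

# Without a replacement value, the conversion succeeds only when nothing is
# missing (analogous to nomissing(da)); with a missing element, convert throws.
b = Union{Missing,Float64}[1.0, 2.0]
complete = convert(Vector{Float64}, b)
```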
NCDatasets.varbyattrib — Function.
varbyattrib(ds, attname = attval)
Return a list of the variables in the dataset ds whose attribute attname matches the value attval. The list is empty if none of the variables match. The output is a list of CFVariables.
Examples
Load all the data of the first variable with standard name "longitude" from the NetCDF file results.nc.
julia> ds = Dataset("results.nc", "r");
julia> data = varbyattrib(ds, standard_name = "longitude")[1][:]
Issues
libnetcdf not properly installed
If you see the following error,
ERROR: LoadError: LoadError: libnetcdf not properly installed. Please run Pkg.build("NCDatasets")
you can try to install netcdf explicitly with Conda:
using Conda
Conda.add("libnetcdf")
NetCDF: Not a valid data type or _FillValue type mismatch
Trying to define the _FillValue produces the following error:
ERROR: LoadError: NCDatasets.NetCDFError(-45, "NetCDF: Not a valid data type or _FillValue type mismatch")
The error could be generated by code like this:
using NCDatasets
# ...
tempvar = defVar(ds,"temp",Float32,("lonc","latc","time"))
tempvar.attrib["_FillValue"] = -9999.
In fact, _FillValue must have the same data type as the corresponding variable. In the case above, tempvar is a 32-bit float and the number -9999. is a 64-bit float (aka double, which is the default floating point type in Julia). It is sufficient to convert the value -9999. to a 32-bit float:
tempvar.attrib["_FillValue"] = Float32(-9999.)
Corner cases
An attribute representing a vector with a single value (e.g. [1]) will be read back as a scalar (1) (same behavior in python netCDF4 1.3.1).
NetCDF and Julia distinguish between a vector of chars and a string, but both are returned as a string for ease of use. In particular, an attribute representing a vector of chars ['u','n','i','t','s'] will be read back as the string "units".
An attribute representing a vector of chars ['u','n','i','t','s','\0'] will also be read back as the string "units" (issue #12).