Raster Histograms
First, add the required depedencies
using Rasters, NCDatasets, Downloads, CairoMakie
and the RasterHistograms
package.
using RasterHistograms
Using this package we can produce Histogram
s from data that is in a Raster
, RasterStack
or RasterSeries
, which are N-dimensional arrays, in a similar way that xhistogram works for xarray in python. This example is structured similarly to the xhistogram tutorial.
Randomly generated toy data
First we generate some randomly distributed data and form a Raster
.
x, t = range(-2π, 2π; length = 50), range(0, 4π; length = 100)
dimensions = (X(x), Ti(t))
rs = Raster(randn(length(x), length(t)), dimensions; name = :Toy_data)
50×100 Raster{Float64,2} Toy_data with dimensions:
X Sampled{Float64} -6.283185307179586:0.2564565431501872:6.283185307179586 ForwardOrdered Regular Points,
Ti Sampled{Float64} 0.0:0.12693303650867852:12.566370614359172 ForwardOrdered Regular Points
extent: Extent(X = (-6.283185307179586, 6.283185307179586), Ti = (0.0, 12.566370614359172))missingval: missingparent:
0.0 0.126933 … 12.3125 12.4394 12.5664
-6.28319 -1.82931 0.826835 1.05965 0.0852104 -0.536963
-6.02673 1.45437 1.44044 -0.246961 0.185595 0.0402522
-5.77027 0.582355 -0.772286 -0.43168 -0.9027 1.87734
-5.51382 -1.28468 1.34507 -3.24976 -0.928869 -1.19851
-5.25736 0.375997 0.513657 … -0.39049 -0.314387 2.10712
⋮ ⋱ ⋮
5.0009 0.0892556 -0.70489 0.343495 2.01332 -1.48377
5.25736 1.85664 1.07829 -0.475647 0.536484 -0.382936
5.51382 -0.294048 -0.540655 2.05521 0.625332 0.734095
5.77027 -1.0287 -0.516634 … -0.522387 -0.252286 -0.508618
6.02673 -0.044631 -0.280868 1.01865 1.71449 0.438216
6.28319 -0.267313 -0.617277 0.675261 -1.38207 -1.8626
The we can form a RasterLayerHistogram
for the :Toy_data
rs_hist = RasterLayerHistogram(rs)
RasterLayerHistogram for the variable Toy_data
┣━━ Layer dimensions: (:X, :Ti)
┣━━━━━━━━ Layer size: (50, 100)
┗━━━━━━━━━ Histogram: 1-dimensional
We can then plot the data and the Histogram
fig = Figure(size = (1000, 600))
ax1 = Axis(fig[1, 1];
title = "Toy data",
xlabel = "x",
ylabel = "time")
hm = heatmap!(ax1, x, t, rs.data)
Colorbar(fig[2, 1], hm; vertical = false, flipaxis = false)
ax2 = Axis(fig[1, 2];
title = "Histogram of Toy data",
xlabel = "Toy data", ylabel = "Counts")
plot!(ax2, rs_hist; color = :steelblue)
fig
By default the Histogram
has the counts in each bin. We can normalise the Histogram
by calling the normalize!
function on rs_hist
and choosing a mode
of normalisation. For more information about the possible modes of normalisation see here.
normalize!(rs_hist; mode = :pdf)
Then replot with the normalised histogram
fig = Figure(size = (900, 600))
ax1 = Axis(fig[1, 1];
title = "Toy data",
xlabel = "x",
ylabel = "time")
hm = heatmap!(ax1, x, t, rs.data)
Colorbar(fig[2, 1], hm; vertical = false, flipaxis = false)
ax2 = Axis(fig[1, 2];
title = "Histogram (pdf) of Toy data",
xlabel = "Toy data", ylabel = "density")
plot!(ax2, rs_hist; color = :steelblue)
fig
Plotting using Plots.jl is also possible. See the module documentation for more info.
Real world data example
This package is mainly concerned with ocean variables, so we now look at temperature and salinity distributions using ECCO model output.
Downloads.download("https://opendap.earthdata.nasa.gov/providers/POCLOUD/collections/ECCO%2520Ocean%2520Temperature%2520and%2520Salinity%2520-%2520Daily%2520Mean%25200.5%2520Degree%2520(Version%25204%2520Release%25204)/granules/OCEAN_TEMPERATURE_SALINITY_day_mean_2007-01-01_ECCO_V4r4_latlon_0p50deg.dap.nc4", "ECCO_data.nc")
This example also shows how the module works for 2-dimensional Histograms
though it can be generalised to N dimensions depending on the number of variables (i.e. layers in the RasterStack
) one is looking at.
Forming the RasterStack
We form a RasterStack
with only the :SALT
(practical salinity) and :THETA
(potential temperature) layers. This means the resulting RasterStackHistogram
will be 2 dimensional. Note the order of the variables matters here for plotting purposes. The first layer, in this case :SALT
will be the x-axis, and the second layer :THETA
will be the y-axis.
stack_TS = RasterStack("ECCO_data.nc"; name = (:SALT, :THETA))
edges = (31:0.025:38, -2:0.1:32)
stack_hist = RasterStackHistogram(stack_TS, edges)
Now we can plot, the histogram and look at the unweighted distribtution of temperature and salinity. By default the empty bins are plotted with the value of zero. To not plot the empty bins argument we pass show_empty_bins = false
to the plotting function.
fig = Figure(size = (500, 500))
ax = Axis(fig[1, 1];
title = "Temperature and salinity joint distribution (unweighted)",
xlabel = "Practical salinity (psu)",
ylabel = "Potential temperature (°C)")
hm = heatmap!(ax, stack_hist; colorscale = log10)
Colorbar(fig[1, 2], hm)
fig
Weighting the Histogram
The module also exports simple functions for calculating area and volume weights from the dimensions of the grid and plot the data. Where weights are available from model data they should be used in favour of the functions.
dV = volume_weights(stack_TS)
weighted_stack_hist = RasterStackHistogram(stack_TS, dV, edges)
fig = Figure(size = (500, 500))
ax = Axis(fig[1, 1];
title = "Temperature and salinity joint distribution (weighted)",
xlabel = "Practical salinity (psu)",
ylabel = "Potential temperature (°C)")
hm = heatmap!(ax, weighted_stack_hist; colorscale = log10)
Colorbar(fig[1, 2], hm)
fig
This page was generated using Literate.jl.